Scenario guide

Best AI models for Research / Analyst

Scenario: a research assistant that reads long PDFs and answers nuanced questions. The ranking is weighted toward reasoning, knowledge, long-context performance, and a saturation-resistant frontier capability score, so it stays meaningful as MMLU-style evals saturate.

Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
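
As a rough illustration of the normalisation described above, the sketch below min-max scales each benchmark across models and then takes a weighted mean that skips missing scores by renormalising over the weights that remain. The function names, the benchmark keys, the 0-100 scale, and the handling of a benchmark where all models tie are assumptions made for the example; the site's actual scenario weights and its cost blending are not specified here and are not reproduced.

```python
from typing import Optional


def min_max_normalise(scores: dict[str, Optional[float]]) -> dict[str, Optional[float]]:
    """Scale one benchmark's raw scores to 0..100 across models; None stays None."""
    present = [s for s in scores.values() if s is not None]
    if not present:
        return dict(scores)
    lo, hi = min(present), max(present)
    if hi == lo:
        # Degenerate case (assumption): every model that has a score gets 100.
        return {m: (None if s is None else 100.0) for m, s in scores.items()}
    return {
        m: (None if s is None else 100.0 * (s - lo) / (hi - lo))
        for m, s in scores.items()
    }


def scenario_score(
    normalised: dict[str, Optional[float]], weights: dict[str, float]
) -> Optional[float]:
    """Weighted mean over the benchmarks a model actually has.

    Missing benchmarks are skipped and the remaining weights are renormalised,
    so a gap neither raises nor lowers the result by itself.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for bench, w in weights.items():
        value = normalised.get(bench)
        if value is None:
            continue
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total if weight_total > 0 else None


# Hypothetical weights and benchmark keys, for illustration only.
weights = {"reasoning": 0.75, "knowledge": 0.25}
full = scenario_score({"reasoning": 80.0, "knowledge": 60.0}, weights)
partial = scenario_score({"reasoning": 80.0, "knowledge": None}, weights)
```

With these made-up inputs, the model missing its knowledge score is ranked on reasoning alone rather than being treated as if it had scored zero.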

Rank  Model             Provider        Score  Q     In
1     Gemini 3 Pro      Google          90.9   96.3  $1.25/M
2     GPT-5.5           OpenAI          87.9   93.4  $1.50/M
3     Gemini 3 Flash    Google          79.8   81.2  $0.30/M
4     Gemini 2.5 Pro    Google          78.5   82.5  $1.25/M
5     GPT-5.4           OpenAI          75.2   79.2  $1.50/M
6     DeepSeek V3       DeepSeek        73.6   73.6  $0.27/M
7     Gemini 2.0 Flash  Google          72.3   70.2  $0.10/M
8     Claude Opus 4.7   Anthropic       72.2   80.2  $15.00/M
9     Qwen3 235B        Alibaba (Qwen)  69.4   68.3  $0.20/M
10    Claude Opus 4.6   Anthropic       69.0   76.6  $15.00/M
11    Grok 3            xAI             68.4   72.8  $3.00/M
12    o3-mini           OpenAI          68.2   70.5  $1.10/M
13    o1                OpenAI          67.1   74.5  $15.00/M
14    DeepSeek R1       DeepSeek        67.1   67.9  $0.55/M
15    o3                OpenAI          66.2   72.6  $10.00/M
