Scenario guide
Best AI models for Research / Analyst
This scenario models a research assistant that reads long PDFs and answers nuanced questions. It is weighted toward reasoning, knowledge, long-context performance, and a saturation-resistant frontier capability score, so the ranking stays meaningful as MMLU-style evals saturate.
Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
- 1. Gemini 3 Pro (Google): Score 93.1, Q 98.8, In $1.25/M
- 2. GPT-5.5 (OpenAI): Score 87.9, Q 93.4, In $1.50/M
- 3. Gemini 3 Flash (Google): Score 81.2, Q 82.7, In $0.30/M
- 4. Gemini 2.5 Pro (Google): Score 79.3, Q 83.5, In $1.25/M
- 5. GPT-5.4 (OpenAI): Score 77.5, Q 81.8, In $1.50/M
- 6. DeepSeek V3 (DeepSeek): Score 74.4, Q 74.6, In $0.27/M
- 7. Claude Opus 4.7 (Anthropic): Score 73.4, Q 81.6, In $15.00/M
- 8. Gemini 2.0 Flash (Google): Score 72.7, Q 70.7, In $0.10/M
- 9. Claude Opus 4.6 (Anthropic): Score 70.3, Q 78.2, In $15.00/M
- 10. Qwen3 235B (Alibaba (Qwen)): Score 69.9, Q 68.9, In $0.20/M
- 11. Grok 3 (xAI): Score 68.9, Q 73.4, In $3.00/M
- 12. o3-mini (OpenAI): Score 68.8, Q 71.2, In $1.10/M
- 13. o1 (OpenAI): Score 67.9, Q 75.4, In $15.00/M
- 14. DeepSeek R1 (DeepSeek): Score 67.8, Q 68.6, In $0.55/M
- 15. o3 (OpenAI): Score 66.6, Q 73.1, In $10.00/M
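
The per-benchmark min-max normalisation and the "skipped without penalty" handling of missing scores described in the introduction could look roughly like the sketch below. It is an illustration under stated assumptions, not the leaderboard's actual code: the function names, the benchmark names, the weights, and the sample scores are all hypothetical. It also assumes that "without penalty" means the weights of the available benchmarks are renormalised to sum to 1. The cost blending step is left out because the page does not give its formula.

```python
from typing import Dict, Optional

# model name -> benchmark name -> raw score (None when the score is missing)
Scores = Dict[str, Dict[str, Optional[float]]]


def min_max_normalise(raw: Scores) -> Scores:
    """Rescale each benchmark to [0, 1] across the models that report it."""
    benchmarks = {b for scores in raw.values() for b in scores}
    out: Scores = {model: {} for model in raw}
    for bench in benchmarks:
        present = [s[bench] for s in raw.values() if s.get(bench) is not None]
        if not present:
            continue  # no model reports this benchmark
        lo, hi = min(present), max(present)
        for model, scores in raw.items():
            value = scores.get(bench)
            if value is None:
                out[model][bench] = None  # stays missing
            elif hi == lo:
                out[model][bench] = 1.0  # assumption: all models tie at the top
            else:
                out[model][bench] = (value - lo) / (hi - lo)
    return out


def weighted_score(
    normalised: Dict[str, Optional[float]], weights: Dict[str, float]
) -> Optional[float]:
    """Weighted mean on a 0-100 scale over the benchmarks that have a score.

    Assumption: a missing benchmark is dropped and the remaining weights
    are renormalised, so the model is not penalised for the gap.
    """
    used = [
        (normalised[b], w) for b, w in weights.items()
        if normalised.get(b) is not None
    ]
    total_weight = sum(w for _, w in used)
    if total_weight == 0:
        return None  # nothing to score on
    return 100 * sum(v * w for v, w in used) / total_weight


# Hypothetical data, for illustration only.
raw_scores: Scores = {
    "model-a": {"reasoning": 80, "knowledge": 90},
    "model-b": {"reasoning": 60, "knowledge": None},
    "model-c": {"reasoning": 70, "knowledge": 70},
}
scenario_weights = {"reasoning": 0.6, "knowledge": 0.4}

normalised_scores = min_max_normalise(raw_scores)
ranking = sorted(
    ((m, weighted_score(s, scenario_weights)) for m, s in normalised_scores.items()),
    key=lambda pair: pair[1] if pair[1] is not None else -1.0,
    reverse=True,
)
```

In this sketch `model-b` is scored on reasoning alone, so its missing knowledge score neither helps nor hurts it. One side effect of min-max normalisation is that every score is relative to the current model pool, so adding or removing a model can shift the others.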