Scenario guide

Best AI models for Multimodal Document Q&A

This scenario covers answering questions about scanned PDFs, charts, and screenshots. The ranking relies on vision benchmarks plus reasoning.

Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
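The normalisation step described above can be sketched roughly as follows. This is a minimal illustration, not the leaderboard's actual code: the benchmark names, model names, weights, and raw scores are all hypothetical, the 0-100 scale is assumed, and "skipped without penalty" is interpreted here as renormalising the weights over the benchmarks a model actually has. The cost blending is not shown because the page does not state its formula.

```python
from typing import Dict, Optional

# model name -> benchmark name -> raw score (None when the score is missing)
Scores = Dict[str, Dict[str, Optional[float]]]


def min_max_normalise(raw: Scores) -> Scores:
    """Rescale each benchmark to 0-100 across the models that report it."""
    benchmarks = {bench for per_model in raw.values() for bench in per_model}
    normalised: Scores = {model: {} for model in raw}
    for bench in benchmarks:
        present = [s[bench] for s in raw.values() if s.get(bench) is not None]
        if not present:
            continue  # no model reports this benchmark
        lo, hi = min(present), max(present)
        for model, per_model in raw.items():
            value = per_model.get(bench)
            if value is None:
                normalised[model][bench] = None  # stays missing
            elif hi == lo:
                # Assumption: when every model ties, use the midpoint.
                normalised[model][bench] = 50.0
            else:
                normalised[model][bench] = 100.0 * (value - lo) / (hi - lo)
    return normalised


def weighted_score(
    per_model: Dict[str, Optional[float]], weights: Dict[str, float]
) -> Optional[float]:
    """Weighted mean over available benchmarks; missing ones are skipped."""
    total = 0.0
    total_weight = 0.0
    for bench, weight in weights.items():
        value = per_model.get(bench)
        if value is None:
            continue  # skipped without penalty: its weight is dropped too
        total += weight * value
        total_weight += weight
    return total / total_weight if total_weight > 0 else None


# Hypothetical data, for illustration only.
raw_scores: Scores = {
    "model_a": {"vision_bench": 90.0, "reasoning_bench": 80.0},
    "model_b": {"vision_bench": 70.0, "reasoning_bench": None},
    "model_c": {"vision_bench": 50.0, "reasoning_bench": 60.0},
}
scenario_weights = {"vision_bench": 0.6, "reasoning_bench": 0.4}

normalised_scores = min_max_normalise(raw_scores)
for name, per_model in normalised_scores.items():
    print(name, weighted_score(per_model, scenario_weights))
```

Under this interpretation, a model with a missing benchmark is scored only on the benchmarks it has, so a gap neither raises nor lowers its result by itself.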

Rank  Model              Provider   Score  Q     In
1     Gemini 2.5 Pro     Google     85.1   95.9  $1.25/M
2     Claude Opus 4      Anthropic  73.1   91.4  $15.00/M
3     Gemini 2.0 Flash   Google     69.9   64.4  $0.10/M
4     Llama 4 Maverick   Meta       68.9   67.3  $0.27/M
5     Claude Sonnet 4    Anthropic  66.1   75.3  $3.00/M
6     Llama 4 Scout      Meta       60.2   54.7  $0.18/M
7     Claude 3.5 Sonnet  Anthropic  50.2   55.5  $3.00/M
8     GPT-4o             OpenAI     45.5   48.5  $2.50/M
9     Gemini 1.5 Pro     Google     44.7   44.3  $1.25/M
10    Gemini 1.5 Flash   Google     34.3   18.6  $0.08/M
11    GPT-4o mini        OpenAI     34.0   21.3  $0.15/M
12    GPT-5 nano         OpenAI     20.0   0.0   $0.05/M
13    GPT-5 mini         OpenAI     14.2   0.0   $0.25/M
14    Gemini 2.5 Flash   Google     13.4   0.0   $0.30/M
15    Gemini 3 Flash     Google     13.4   0.0   $0.30/M
