
Scenario guide

Best AI models for Multimodal Document Q&A

This scenario covers answering questions about scanned PDFs, charts, and screenshots; rankings rely on vision benchmarks plus reasoning benchmarks.

Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
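The scoring described above can be sketched in a few lines: normalise each benchmark to a common 0–100 scale via min-max, then take a weighted mean over only the benchmarks a model actually has a score for. The benchmark names, weights, and model scores below are illustrative assumptions, not the leaderboard's real configuration.

```python
def min_max_normalise(scores):
    """Map raw scores to 0-100 within one benchmark; None stays None."""
    present = [s for s in scores.values() if s is not None]
    lo, hi = min(present), max(present)
    span = (hi - lo) or 1.0
    return {m: (None if s is None else 100 * (s - lo) / span)
            for m, s in scores.items()}

# Hypothetical raw benchmark results (None = model was not evaluated).
benchmarks = {
    "doc_vqa":  {"model_a": 92.0, "model_b": 81.0, "model_c": None},
    "chart_qa": {"model_a": 78.0, "model_b": 85.0, "model_c": 70.0},
}
# Hypothetical scenario weights.
weights = {"doc_vqa": 0.6, "chart_qa": 0.4}

normalised = {b: min_max_normalise(s) for b, s in benchmarks.items()}

def blended(model):
    """Weighted mean over available benchmarks; missing scores are
    skipped without penalty by renormalising the weights."""
    num = den = 0.0
    for b, w in weights.items():
        s = normalised[b][model]
        if s is not None:
            num += w * s
            den += w
    return num / den if den else 0.0
```

Because the weight denominator only counts benchmarks with a score, a model missing `doc_vqa` is ranked purely on `chart_qa` rather than being dragged down by a zero.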

  1. Gemini 2.5 Pro (Google): Score 85.1, Quality 95.9, Input $1.25/M
  2. Claude Opus 4 (Anthropic): Score 73.1, Quality 91.4, Input $15.00/M
  3. Gemini 2.0 Flash (Google): Score 69.9, Quality 64.4, Input $0.10/M
  4. Llama 4 Maverick (Meta): Score 68.9, Quality 67.3, Input $0.27/M
  5. Claude Sonnet 4 (Anthropic): Score 66.1, Quality 75.3, Input $3.00/M
  6. Llama 4 Scout (Meta): Score 60.2, Quality 54.7, Input $0.18/M
  7. Claude 3.5 Sonnet (Anthropic): Score 50.2, Quality 55.5, Input $3.00/M
  8. GPT-4o (OpenAI): Score 45.5, Quality 48.5, Input $2.50/M
  9. Gemini 1.5 Pro (Google): Score 44.7, Quality 44.3, Input $1.25/M
  10. Gemini 1.5 Flash (Google): Score 34.3, Quality 18.6, Input $0.08/M
  11. GPT-4o mini (OpenAI): Score 34.0, Quality 21.3, Input $0.15/M
  12. GPT-5 nano (OpenAI): Score 20.0, Quality 0.0, Input $0.05/M
  13. GPT-5 mini (OpenAI): Score 14.2, Quality 0.0, Input $0.25/M
  14. Gemini 2.5 Flash (Google): Score 13.4, Quality 0.0, Input $0.30/M
  15. Gemini 3 Flash (Google): Score 13.4, Quality 0.0, Input $0.30/M