
Value ranking

Best value on MathVista

Math reasoning over visual contexts (charts, figures, geometry).

“Value” is the normalized benchmark score (0–100 within this leaderboard cohort) divided by the input price per million tokens. Displayed scores and prices are rounded; the value column is computed from unrounded figures, so dividing the displayed numbers may differ slightly. Higher means more capability per dollar on this axis only; always sanity-check latency, context length, and your real workload.
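For concreteness, here is a minimal sketch of how a value figure like this can be computed, assuming the normalization is min-max scaling of raw benchmark scores across the cohort (so the best model maps to 100 and the worst to 0, matching the 100.0 and 0.0 endpoints in the table below). The model names, scores, and prices in the snippet are placeholders, not the leaderboard's source data.

```python
def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Scale raw benchmark scores to 0-100 across the cohort.

    Min-max scaling is an assumption about how this leaderboard
    normalizes; the site does not document its exact scheme.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # degenerate cohort: every model gets the top score
        return {m: 100.0 for m in scores}
    return {m: 100.0 * (s - lo) / (hi - lo) for m, s in scores.items()}


def value(normalized_score: float, input_price_per_mtok: float) -> float:
    """Value = normalized score / input price per million input tokens."""
    return normalized_score / input_price_per_mtok


# Placeholder raw scores and input prices ($ per million input tokens).
raw_scores = {"model_a": 86.5, "model_b": 62.0, "model_c": 41.3}
prices = {"model_a": 10.00, "model_b": 0.10, "model_c": 3.00}

norm = min_max_normalize(raw_scores)
for m in sorted(norm, key=lambda m: value(norm[m], prices[m]), reverse=True):
    print(f"{m}: value={value(norm[m], prices[m]):.2f} "
          f"({norm[m]:.1f} / ${prices[m]:.2f}/M)")
```

Because price sits in the denominator, very cheap models dominate this metric even at modest scores; that is why a 62-point model can top the table while a 95-point model sits mid-pack.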

| Rank | Model             | Vendor    | Value  | Score / input price |
|-----:|-------------------|-----------|-------:|---------------------|
| 1    | Gemini 2.0 Flash  | Google    | 622.60 | 62.3 / $0.10/M      |
| 2    | Llama 4 Scout     | Meta      | 309.17 | 55.6 / $0.18/M      |
| 3    | Gemini 1.5 Flash  | Google    | 290.13 | 21.8 / $0.075/M     |
| 4    | Llama 4 Maverick  | Meta      | 236.70 | 63.9 / $0.27/M      |
| 5    | GPT-4o mini       | OpenAI    | 113.87 | 17.1 / $0.15/M      |
| 6    | o3-mini           | OpenAI    | 76.38  | 84.0 / $1.10/M      |
| 7    | Gemini 2.5 Pro    | Google    | 76.03  | 95.0 / $1.25/M      |
| 8    | Gemini 1.5 Pro    | Google    | 29.53  | 36.9 / $1.25/M      |
| 9    | Grok 2            | xAI       | 25.48  | 51.0 / $2.00/M      |
| 10   | Claude Sonnet 4   | Anthropic | 25.25  | 75.8 / $3.00/M      |
| 11   | Grok 3            | xAI       | 22.50  | 67.5 / $3.00/M      |
| 12   | o1-mini           | OpenAI    | 17.91  | 53.7 / $3.00/M      |
| 13   | Claude 3.5 Sonnet | Anthropic | 15.79  | 47.4 / $3.00/M      |
| 14   | GPT-4o            | OpenAI    | 14.66  | 36.6 / $2.50/M      |
| 15   | o3                | OpenAI    | 10.00  | 100.0 / $10.00/M    |
| 16   | Claude Opus 4     | Anthropic | 6.15   | 92.3 / $15.00/M     |
| 17   | o1                | OpenAI    | 4.30   | 64.5 / $15.00/M     |
| 18   | Claude 3 Opus     | Anthropic | 0.00   | 0.0 / $15.00/M      |

AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.