Value ranking

Best value on FrontierMath Tiers 1-3

FrontierMath consists of mathematical research problems spanning analysis, algebra, combinatorics and number theory. Tiers 1-3 are progressively harder, and even frontier reasoning models solve only a small fraction of them. This page describes it as the hardest publicly reported benchmark for general mathematical reasoning.

“Value” is the normalized benchmark score (0–100 for this leaderboard cohort) divided by the input price in USD per million tokens. A higher value means more benchmark score per input-token dollar and nothing else, so always sanity-check latency, context length, and your real workload.
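
As a minimal sketch of the formula above, the value figure can be recomputed from a normalized score and an input price. The function name `value_score` is hypothetical and not part of any published API; the inputs are the rounded figures shown in the ranking on this page.

```python
def value_score(normalized_score: float, input_price_per_million: float) -> float:
    """Normalized benchmark score (0-100) divided by input price (USD per million tokens)."""
    if input_price_per_million <= 0:
        raise ValueError("input price must be positive")
    return normalized_score / input_price_per_million


# Figures taken from the ranking on this page.
print(round(value_score(100.0, 1.50), 2))  # GPT-5.5, prints 66.67
print(round(value_score(38.0, 5.00), 2))   # Grok 4, prints 7.6
```

Both results match the listed values (66.67 and 7.60). Because the formula divides by price, a cheap model with a low score can outrank a much stronger but more expensive one.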

  1. GPT-5 nano, OpenAI: value 320.20 (score 16.0 / $0.05/M)
  2. Gemini 3 Flash, Google: value 229.80 (score 68.9 / $0.30/M)
  3. GPT-5 mini, OpenAI: value 210.76 (score 52.7 / $0.25/M)
  4. Kimi K2, Moonshot (Kimi): value 89.95 (score 54.0 / $0.60/M)
  5. Qwen3 235B (Thinking), Alibaba (Qwen): value 82.00 (score 16.4 / $0.20/M)
  6. GPT-5.5, OpenAI: value 66.67 (score 100.0 / $1.50/M)
  7. GPT-5.2, OpenAI: value 62.98 (score 78.7 / $1.25/M)
  8. GPT-5.4, OpenAI: value 61.38 (score 92.1 / $1.50/M)
  9. Gemini 3 Pro, Google: value 58.18 (score 72.7 / $1.25/M)
  10. GPT-5, OpenAI: value 50.16 (score 62.7 / $1.25/M)
  11. GPT-5.1, OpenAI: value 48.02 (score 60.0 / $1.25/M)
  12. o4-mini, OpenAI: value 43.65 (score 48.0 / $1.10/M)
  13. Gemini 2.0 Flash, Google: value 33.30 (score 3.3 / $0.10/M)
  14. Gemini 2.5 Flash, Google: value 31.23 (score 9.4 / $0.30/M)
  15. Gemini 2.5 Pro, Google: value 21.88 (score 27.4 / $1.25/M)
  16. o3-mini, OpenAI: value 21.83 (score 24.0 / $1.10/M)
  17. Claude Sonnet 4.6, Anthropic: value 20.89 (score 62.7 / $3.00/M)
  18. GLM-4.6, Zhipu AI (GLM): value 14.78 (score 7.4 / $0.50/M)
  19. DeepSeek V3, DeepSeek: value 12.33 (score 3.3 / $0.27/M)
  20. Claude Haiku 4.5, Anthropic: value 11.42 (score 11.4 / $1.00/M)
  21. Claude Sonnet 4.5, Anthropic: value 9.82 (score 29.4 / $3.00/M)
  22. GLM-4.7, Zhipu AI (GLM): value 9.44 (score 4.7 / $0.50/M)
  23. Grok 4, xAI: value 7.60 (score 38.0 / $5.00/M)
  24. Claude Opus 4.7, Anthropic: value 5.65 (score 84.7 / $15.00/M)
  25. Claude Opus 4.6, Anthropic: value 5.25 (score 78.7 / $15.00/M)
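
The listed value figures appear to be computed from unrounded scores, so dividing the displayed score by the displayed price reproduces them only approximately (for example, 16.0 / 0.05 gives 320.0 against a listed 320.20). The sketch below checks a few rows from the list above against a rounding tolerance. The row selection, the helper names, and the assumption that scores are rounded to one decimal and values to two are illustrative, not something the page states.

```python
# (model, displayed score, input price in USD per million tokens, listed value)
ROWS = [
    ("GPT-5 nano", 16.0, 0.05, 320.20),
    ("Gemini 3 Flash", 68.9, 0.30, 229.80),
    ("Kimi K2", 54.0, 0.60, 89.95),
    ("Claude Opus 4.7", 84.7, 15.00, 5.65),
]


def rounding_tolerance(price: float) -> float:
    # Assumption: a score shown to one decimal can be off by up to 0.05,
    # and a value shown to two decimals can be off by up to 0.005.
    return 0.05 / price + 0.005


def check_rows(rows):
    """Return (model, recomputed value, listed value, consistent?) per row."""
    results = []
    for model, score, price, listed in rows:
        recomputed = score / price
        ok = abs(recomputed - listed) <= rounding_tolerance(price)
        results.append((model, round(recomputed, 2), listed, ok))
    return results


for model, recomputed, listed, ok in check_rows(ROWS):
    print(f"{model}: recomputed {recomputed}, listed {listed}, consistent={ok}")
```

The tolerance widens as the price falls, which is why the cheapest models show the largest gap between the recomputed and listed values.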

AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.