Value ranking

Best value on OTIS Mock AIME 2024-2025

AIME-style competition problems written specifically for the OTIS mock contest and run as an evaluation by Epoch AI. The set is close in spirit to the public AIME, but its problems are novel and unlikely to appear in training data.

“Value” is the normalized benchmark score (0–100 for this leaderboard cohort) divided by the input price in US dollars per million tokens. A higher value means more capability per dollar on this axis only; always sanity-check latency, context length, and your real workload.
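As a rough illustration of the definition above, the sketch below recomputes the value figure for a few rows and sorts them. The function name `value_score` and the tuple layout are hypothetical, chosen only for this example; the scores and prices are copied from the leaderboard. Because the displayed scores and prices are rounded, a recomputed value can differ slightly from the listed one (for example, 86.2 / 0.20 gives 431.00, while the list shows 430.85).

```python
def value_score(normalized_score: float, input_price_per_million: float) -> float:
    """Return normalized score (0-100) divided by input price in USD per million tokens."""
    if input_price_per_million <= 0:
        raise ValueError("input price must be positive")
    return normalized_score / input_price_per_million


# (model, normalized score, input price in USD per million tokens),
# taken from the leaderboard rows; the layout is an assumption for this sketch.
rows = [
    ("Claude Haiku 4.5", 65.4, 1.00),
    ("GPT-5.5", 100.0, 1.50),
    ("GPT-5 nano", 80.4, 0.05),
]

# Rank by value, highest first, as the leaderboard does.
ranked = sorted(rows, key=lambda r: value_score(r[1], r[2]), reverse=True)

for name, score, price in ranked:
    print(f"{name}: {value_score(score, price):.2f} ({score} / ${price:.2f}/M)")
```

Note that the highest-scoring model in this sample (GPT-5.5, at 100.0) does not rank first on value, since the metric divides by input price alone.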

  1. GPT-5 nano, OpenAI: value 1608.00 (score 80.4 / $0.05/M)
  2. Qwen3 235B (Thinking), Alibaba (Qwen): value 430.85 (score 86.2 / $0.20/M)
  3. GPT-5 mini, OpenAI: value 344.68 (score 86.2 / $0.25/M)
  4. Gemini 3 Flash, Google: value 308.37 (score 92.5 / $0.30/M)
  5. Gemini 2.0 Flash, Google: value 285.30 (score 28.5 / $0.10/M)
  6. Gemini 1.5 Flash, Google: value 174.80 (score 13.1 / $0.08/M)
  7. GLM-4.7, Zhipu AI (GLM): value 165.42 (score 82.7 / $0.50/M)
  8. DeepSeek R1, DeepSeek: value 158.84 (score 87.4 / $0.55/M)
  9. Kimi K2, Moonshot (Kimi): value 153.18 (score 91.9 / $0.60/M)
  10. DeepSeek V3, DeepSeek: value 131.30 (score 35.5 / $0.27/M)
  11. GPT-5.2, OpenAI: value 76.78 (score 96.0 / $1.25/M)
  12. Gemini 3 Pro, Google: value 76.35 (score 95.4 / $1.25/M)
  13. o4-mini, OpenAI: value 73.62 (score 81.0 / $1.10/M)
  14. GPT-5, OpenAI: value 72.86 (score 91.1 / $1.25/M)
  15. GPT-5.1, OpenAI: value 70.54 (score 88.2 / $1.25/M)
  16. o3-mini, OpenAI: value 69.16 (score 76.1 / $1.10/M)
  17. Gemini 2.5 Pro, Google: value 66.86 (score 83.6 / $1.25/M)
  18. GPT-5.5, OpenAI: value 66.67 (score 100.0 / $1.50/M)
  19. Claude Haiku 4.5, Anthropic: value 65.42 (score 65.4 / $1.00/M)
  20. GPT-5.4, OpenAI: value 63.41 (score 95.1 / $1.50/M)
  21. Claude Sonnet 4.6, Anthropic: value 28.42 (score 85.3 / $3.00/M)
  22. Claude Sonnet 4.5, Anthropic: value 25.65 (score 77.0 / $3.00/M)
  23. Llama 4 Scout, Meta: value 24.00 (score 4.3 / $0.18/M)
  24. GPT-4o mini, OpenAI: value 23.07 (score 3.5 / $0.15/M)
  25. Claude 3.7 Sonnet, Anthropic: value 18.73 (score 56.2 / $3.00/M)

AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.