Value ranking
Best value on MathVista
MathVista measures math reasoning over visual contexts (charts, figures, geometry).
“Value” is the normalized benchmark score (0–100 within this leaderboard cohort) divided by the input price in dollars per million tokens. Higher means more capability per dollar on this axis only; always sanity-check latency, context length, and your real workload. A short sketch reproducing the computation follows the table.
| Rank | Model | Vendor | Value (score ÷ $/M) | Normalized score | Input price ($/M tokens) |
|-----:|-------|--------|--------------------:|-----------------:|-------------------------:|
| 1 | Gemini 2.0 Flash | Google | 622.60 | 62.3 | 0.10 |
| 2 | Llama 4 Scout | Meta | 309.17 | 55.6 | 0.18 |
| 3 | Gemini 1.5 Flash | Google | 290.13 | 21.8 | 0.08 |
| 4 | Llama 4 Maverick | Meta | 236.70 | 63.9 | 0.27 |
| 5 | GPT-4o mini | OpenAI | 113.87 | 17.1 | 0.15 |
| 6 | o3-mini | OpenAI | 76.38 | 84.0 | 1.10 |
| 7 | Gemini 2.5 Pro | Google | 76.03 | 95.0 | 1.25 |
| 8 | Gemini 1.5 Pro | Google | 29.53 | 36.9 | 1.25 |
| 9 | Grok 2 | xAI | 25.48 | 51.0 | 2.00 |
| 10 | Claude Sonnet 4 | Anthropic | 25.25 | 75.8 | 3.00 |
| 11 | Grok 3 | xAI | 22.50 | 67.5 | 3.00 |
| 12 | o1-mini | OpenAI | 17.91 | 53.7 | 3.00 |
| 13 | Claude 3.5 Sonnet | Anthropic | 15.79 | 47.4 | 3.00 |
| 14 | GPT-4o | OpenAI | 14.66 | 36.6 | 2.50 |
| 15 | o3 | OpenAI | 10.00 | 100.0 | 10.00 |
| 16 | Claude Opus 4 | Anthropic | 6.15 | 92.3 | 15.00 |
| 17 | o1 | OpenAI | 4.30 | 64.5 | 15.00 |
| 18 | Claude 3 Opus | Anthropic | 0.00 | 0.0 | 15.00 |
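The metric is simple enough to reproduce yourself. Below is a minimal Python sketch, assuming min-max normalization of raw scores within the cohort (consistent with o3 at 100.0 and Claude 3 Opus at 0.0 above) and value = normalized score ÷ input price. The `Row` dataclass and function names are illustrative, not part of any published tooling.

```python
from dataclasses import dataclass

@dataclass
class Row:
    model: str
    norm_score: float   # benchmark score, min-max normalized to 0-100 within the cohort
    input_price: float  # USD per million input tokens

def minmax_100(raw: list[float]) -> list[float]:
    """Min-max normalize raw benchmark scores to 0-100 within the cohort
    (assumed; consistent with the 100.0 and 0.0 endpoints in the table)."""
    lo, hi = min(raw), max(raw)
    return [100 * (s - lo) / (hi - lo) for s in raw]

def value(row: Row) -> float:
    """Capability per dollar on this single axis: normalized score / price."""
    return row.norm_score / row.input_price

# A few rows copied from the table above.
rows = [
    Row("Gemini 2.0 Flash", 62.3, 0.10),
    Row("o3-mini", 84.0, 1.10),
    Row("o3", 100.0, 10.00),
]

for r in sorted(rows, key=value, reverse=True):
    print(f"{r.model}: {value(r):.2f}")
# Gemini 2.0 Flash: 623.00
# o3-mini: 76.36
# o3: 10.00
```

The reproduced values come out close to, but not identical with, the table (623.00 vs. 622.60 for Gemini 2.0 Flash), which suggests the leaderboard divides unrounded scores by unrounded prices before rounding for display.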
AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.