Scenario guide
Best AI models for Realtime Chat / Voice
Scenario: a streaming consumer chat or voice assistant. Speed and time-to-first-token matter as much as raw quality: a slightly less capable model that responds instantly often beats a frontier model that pauses to think.
Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
- 1. Gemini 2.0 Flash | Google | Score 90.4 | Q 88.6 | In $0.10/M
- 2. DeepSeek V3 | DeepSeek | Score 88.8 | Q 94.1 | In $0.27/M
- 3. Gemini 1.5 Flash | Google | Score 88.3 | Q 83.3 | In $0.08/M
- 4. DeepSeek R1 | DeepSeek | Score 85.1 | Q 94.5 | In $0.55/M
- 5. Gemini 3 Flash | Google | Score 83.7 | Q 91.1 | In $0.30/M
- 6. Gemini 2.5 Flash | Google | Score 81.8 | Q 88.4 | In $0.30/M
- 7. Gemini 3 Pro | Google | Score 80.3 | Q 97.3 | In $1.25/M
- 8. GLM-4.6 | Zhipu AI (GLM) | Score 78.1 | Q 83.2 | In $0.50/M
- 9. Qwen3 235B (Thinking) | Alibaba (Qwen) | Score 78.0 | Q 75.2 | In $0.20/M
- 10. DeepSeek V3 (Thinking) | DeepSeek | Score 77.6 | Q 78.2 | In $0.27/M
- 11. GLM-4.7 | Zhipu AI (GLM) | Score 77.2 | Q 81.9 | In $0.50/M
- 12. GPT-5.5 | OpenAI | Score 76.2 | Q 92.9 | In $1.50/M
- 13. GPT-5.4 | OpenAI | Score 75.4 | Q 91.7 | In $1.50/M
- 14. Gemini 1.5 Pro | Google | Score 74.5 | Q 85.8 | In $1.25/M
- 15. Claude 3.5 Haiku | Anthropic | Score 72.8 | Q 80.8 | In $0.80/M
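The per-benchmark min-max normalisation and the "skipped without penalty" rule mentioned above can be illustrated with a minimal Python sketch. This is not the leaderboard's actual code: the model names, benchmark names, raw numbers and weights are all hypothetical, and "without penalty" is interpreted here as renormalising the weights over the benchmarks a model actually has. Cost blending is left out entirely.

```python
from typing import Dict, Optional

# model -> benchmark -> raw score (None means the score is missing)
Scores = Dict[str, Dict[str, Optional[float]]]


def min_max_normalise(scores: Scores) -> Scores:
    """Rescale each benchmark to [0, 1] across models; missing scores stay None."""
    benchmarks = {bench for row in scores.values() for bench in row}
    normalised: Scores = {model: {} for model in scores}
    for bench in benchmarks:
        present = [row[bench] for row in scores.values() if row.get(bench) is not None]
        lo = min(present) if present else 0.0
        hi = max(present) if present else 0.0
        for model, row in scores.items():
            raw = row.get(bench)
            if raw is None:
                normalised[model][bench] = None
            elif hi == lo:
                # Assumption: when every model ties, assign a neutral 0.5.
                normalised[model][bench] = 0.5
            else:
                normalised[model][bench] = (raw - lo) / (hi - lo)
    return normalised


def weighted_score(row: Dict[str, Optional[float]],
                   weights: Dict[str, float]) -> Optional[float]:
    """Weighted mean over the benchmarks that are present.

    Missing benchmarks are skipped and the remaining weights are
    renormalised, so a gap does not pull the score towards zero.
    """
    total = 0.0
    weight_sum = 0.0
    for bench, weight in weights.items():
        value = row.get(bench)
        if value is None:
            continue
        total += weight * value
        weight_sum += weight
    return total / weight_sum if weight_sum else None


# Hypothetical data, for illustration only.
RAW: Scores = {
    "model_a": {"reasoning": 60.0, "speed": 200.0},
    "model_b": {"reasoning": 80.0, "speed": None},  # speed not measured
    "model_c": {"reasoning": 70.0, "speed": 100.0},
}
WEIGHTS = {"reasoning": 0.5, "speed": 0.5}  # hypothetical scenario weights

NORMALISED = min_max_normalise(RAW)
RESULTS = {model: weighted_score(row, WEIGHTS) for model, row in NORMALISED.items()}

for model, score in sorted(RESULTS.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: {score:.2f}")
```

In this sketch model_b has no speed score, so it is judged on reasoning alone rather than being treated as scoring zero on speed. A real leaderboard might handle ties and sparse rows differently.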