
Throughput

Fastest text models (hosted APIs)

Sorted by our normalized output-speed score (higher is faster relative to the models we track). Latency and TTFT are measured separately; use the interactive leaderboard's realtime-chat scenario for a blended view.

  1. o1-mini - OpenAI - 100.0
  2. Claude 3.5 Sonnet - Anthropic - 100.0
  3. Claude 3.5 Haiku - Anthropic - 100.0
  4. Claude 3 Opus - Anthropic - 100.0
  5. Claude 3.7 Sonnet - Anthropic - 100.0
  6. Gemini 1.5 Pro - Google - 100.0
  7. Gemini 1.5 Flash - Google - 100.0
  8. Gemini 2.0 Flash - Google - 100.0
  9. Grok 2 - xAI - 100.0
  10. DeepSeek V3 - DeepSeek - 100.0
  11. DeepSeek R1 - DeepSeek - 100.0
  12. Gemini 2.5 Flash - Google - 86.8
  13. o4-mini - OpenAI - 57.9
  14. o3-mini - OpenAI - 57.9
  15. GPT-4o - OpenAI - 55.3
  16. Llama 4 Maverick - Meta - 47.4
  17. Gemini 2.5 Pro - Google - 44.7
  18. Llama 4 Scout - Meta - 44.7
  19. o1 - OpenAI - 36.8
  20. GPT-5 - OpenAI - 31.6
  21. Llama 3.3 70B Instruct - Meta - 31.6
  22. o3 - OpenAI - 23.7
  23. GPT-4o mini - OpenAI - 18.4
  24. Qwen3 235B - Alibaba (Qwen) - 18.4
  25. Grok 3 - xAI - 13.2
  26. Qwen2.5 72B Instruct - Alibaba (Qwen) - 13.2
  27. Claude Sonnet 4 (Thinking) - Anthropic - 10.5
  28. Claude Sonnet 4 - Anthropic - 7.9
  29. Grok 4 - xAI - 7.9
  30. Mistral Large 2 - Mistral - 5.3
  31. GPT-4 Turbo - OpenAI - 2.6
  32. Claude Opus 4 - Anthropic - 2.6
  33. Claude Opus 4 (Thinking) - Anthropic - 2.6
  34. Llama 3.1 70B Instruct - Meta - 2.6
  35. Llama 3.1 405B Instruct - Meta - 0.0

Speed figures are aggregated from observable public sources; see the AI Model Analyzer About page for methodology.
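Because the scores above run from exactly 100.0 at the top down to exactly 0.0 at the bottom, the normalization is consistent with simple min-max scaling of a raw throughput figure (e.g. tokens per second) across the tracked models. The sketch below illustrates that scheme; the function name, the example model names, and their raw numbers are all hypothetical, not the leaderboard's actual data or methodology.

```python
def normalize_speeds(raw_tps):
    """Min-max scale raw throughput (tokens/sec) to a 0-100 score.

    The fastest tracked model maps to 100.0, the slowest to 0.0;
    everything else lands proportionally in between.
    """
    lo, hi = min(raw_tps.values()), max(raw_tps.values())
    span = hi - lo
    # If all models tie, give everyone the top score rather than divide by zero.
    return {
        model: round(100 * (tps - lo) / span, 1) if span else 100.0
        for model, tps in raw_tps.items()
    }

# Hypothetical raw measurements, tokens per second:
speeds = {"model-a": 220.0, "model-b": 95.0, "model-c": 40.0}
scores = normalize_speeds(speeds)
# model-a -> 100.0, model-b -> 30.6, model-c -> 0.0
```

One consequence of this kind of scaling, visible in the table above, is that any model at or above the benchmark's measurement ceiling ties at 100.0, so the top ranks are not meaningfully ordered among themselves.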