LLM Rankings by Performance

Models ranked by what they can do, not what they cost. Price is excluded from all rankings on this page. For budget-aware picks, see best by price.

Overall leader: Gemini 3 Pro (8.8/10)
Intelligence leader: Gemini 3 Pro (AA 48.44)
Context leader: Llama 4 Scout (10M tokens)
Speed leader: Llama 4 Scout (180 t/s)

Overall quality score

0–10 score

A composite quality score that combines intelligence, context, speed, accessibility, and trust. It is the most useful single ranking.


Reasoning & intelligence

AA Index score

Ranked by the Artificial Analysis Intelligence Index, an independently measured composite of 10 benchmarks, including GPQA Diamond, SWE-bench, and Humanity's Last Exam.


Context window

Max tokens

How many tokens can the model process in a single request? This matters for long documents, full codebases, and multi-turn conversations.


Speed

Tokens/sec

Output tokens per second via the API, a measure of how fast the model actually responds. Sourced from the Artificial Analysis speed leaderboard.
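To make the context and speed figures concrete, here is a rough sketch of how the two numbers translate into practical limits. The function names are hypothetical and not part of any ranking methodology. The sketch assumes a steady output rate (it ignores time to first token) and assumes that input and output tokens share one context window, which may not hold for every model.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate the time to produce output_tokens at a steady output rate.

    Assumption: ignores time to first token and any network latency.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def fits_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output fits in the window.

    Assumption: input and output tokens count against the same window.
    """
    return prompt_tokens + max_output_tokens <= context_window


# Figures taken from the leaders listed on this page: 180 t/s and 10M tokens.
print(generation_seconds(900, 180.0))
print(fits_context(9_500_000, 4_000, 10_000_000))
```

At 180 t/s, a 900-token answer takes roughly five seconds of pure generation time under these assumptions. Real latency will usually be somewhat higher.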


How rankings are generated

Every number on these pages comes from a defined source: Artificial Analysis for intelligence and speed, official documentation for context windows. Nothing is estimated or manually assigned except where labeled “est.” See how we rate for the full methodology.