How We Rate LLMs

Every rating on this site is computed from a defined formula — not assigned by hand. The quality score measures what a model can do. Price is kept separate so cost does not distort capability rankings. For budget-aware recommendations, see the best value, free web chat, and budget API pages.

Why price is not in the quality score

“Which model is smarter?” and “Which is cheapest for my workload?” are different questions. Mixing them into a single score produces misleading results — a cheap model with mediocre capability can outscore an excellent model just because it costs less. Quality and price are both shown on every model page; they are just scored separately.

Quality scoring categories

Intelligence

live

40 pts max

Artificial Analysis Intelligence Index v4.0, an independently measured composite of 10 standard benchmarks: GPQA Diamond, Humanity's Last Exam, SWE-bench Verified, Terminal-Bench Hard, τ²-Bench Telecom, GDPval-AA, SciCode, AA-LCR, AA-Omniscience, and IFBench. Standard/medium mode only; extended thinking scores are excluded. Normalized against the current field: the floor is AA Index 15 (below any practical frontier model) and the ceiling is 5% above the current leader. Recalibrates automatically when models are added.

Current field range: 15.0 (floor) – 50.9 (ceiling, 5% above field leader)
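
The normalization can be sketched roughly as follows. The linear scale, the clamping to 0–40, the rounding to a whole point, and the name `intelligence_points` are assumptions for illustration, not the site's published code. The floor (15.0) comes from the field range above, and the leader value (48.44) is the highest AA Index in the audit listing.

```python
def intelligence_points(aa_index, floor=15.0, leader=48.44, max_points=40):
    """Map an AA Intelligence Index value onto 0..max_points.

    Assumes a linear scale between the floor and a ceiling 5% above the leader.
    """
    ceiling = leader * 1.05                      # 48.44 * 1.05 is about 50.9
    fraction = (aa_index - floor) / (ceiling - floor)
    fraction = min(max(fraction, 0.0), 1.0)      # clamp out-of-range values (assumption)
    return round(fraction * max_points)


print(intelligence_points(48.44))  # 37
print(intelligence_points(18))     # 3
```

Under these assumptions the sketch reproduces the Intelligence points shown for every model in the audit listing.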

Context Window

15 pts max

Absolute thresholds: qualitative capability differences that matter regardless of competition. <32K=2, 32K–128K=5, 128K–200K=7, 200K–400K=9, 400K–1M=11, 1M–5M=13, 5M–10M=14, 10M+=15 pts. Each band includes its lower bound, so a 128K model scores 7 and a 200K model scores 9. Steps extended to 1B tokens for future-proofing.
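
A minimal sketch of the threshold lookup, assuming 1K = 1,000 tokens and that each band includes its lower bound (consistent with the audit listing, where 128K tokens scores 7). The names `CONTEXT_BOUNDS`, `CONTEXT_POINTS`, and `context_points` are hypothetical.

```python
from bisect import bisect_right

# Lower bound (in tokens) at which each higher band starts.
CONTEXT_BOUNDS = [32_000, 128_000, 200_000, 400_000, 1_000_000, 5_000_000, 10_000_000]
# Points for each band; one more entry than bounds, because the first band is "<32K".
CONTEXT_POINTS = [2, 5, 7, 9, 11, 13, 14, 15]


def context_points(tokens):
    """Return the context-window points for a window of `tokens` tokens."""
    # bisect_right counts how many lower bounds are <= tokens,
    # which is exactly the index of the band the value falls in.
    return CONTEXT_POINTS[bisect_right(CONTEXT_BOUNDS, tokens)]


print(context_points(128_000))     # 7
print(context_points(2_000_000))   # 13
```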

Speed

10 pts max

Output tokens per second (Artificial Analysis speed leaderboard). <15 t/s=1, 15–30=2, 30–50=4, 50–80=6, 80–120=8, 120–200=9, 200+=10 pts. Steps extended to 2000+ t/s for future inference hardware.
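
The speed bands can be expressed the same way. This sketch assumes each band includes its lower bound, which the text does not state explicitly; `SPEED_BANDS` and `speed_points` are illustrative names.

```python
# (lower bound in tokens/sec, points), checked from the fastest band down.
SPEED_BANDS = [(200, 10), (120, 9), (80, 8), (50, 6), (30, 4), (15, 2)]


def speed_points(tokens_per_second):
    """Return the speed points for a measured output rate."""
    for lower_bound, points in SPEED_BANDS:
        if tokens_per_second >= lower_bound:
            return points
    return 1  # anything below 15 t/s


print(speed_points(55))   # 6
print(speed_points(180))  # 9
```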

Accessibility

10 pts max

Practical access for everyday users: free tier (+4), chat UI (+3), mobile app (+2), open source / self-hostable (+1). Max 10 pts.
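
As an illustration, the additive score could be computed like this. The feature labels mirror the ones used in the audit listing; the dictionary and function names are assumptions.

```python
ACCESS_POINTS = {
    "free tier": 4,
    "chat UI": 3,
    "mobile app": 2,
    "open source": 1,   # open source / self-hostable
}


def accessibility_points(features):
    """Sum the points for each distinct feature, capped at 10."""
    return min(10, sum(ACCESS_POINTS[name] for name in set(features)))


print(accessibility_points(["free tier", "chat UI", "mobile app"]))  # 9
print(accessibility_points(["free tier", "open source"]))            # 5
```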

Trust & Privacy

5 pts max

Company jurisdiction: US/EU = 3 pts, other = 2 pts, Chinese company = 1 pt. Strong published privacy policy: +2 pts.
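
A small sketch of the same rule in code. The jurisdiction keys and the boolean `strong_privacy_policy` flag are hypothetical ways of encoding the inputs; how the site decides that a policy counts as "strong" is not specified here.

```python
JURISDICTION_POINTS = {"US/EU": 3, "other": 2, "Chinese": 1}


def trust_points(jurisdiction, strong_privacy_policy):
    """Jurisdiction base points plus a 2-point bonus for a strong published policy."""
    bonus = 2 if strong_privacy_policy else 0
    return JURISDICTION_POINTS[jurisdiction] + bonus


print(trust_points("US/EU", True))     # 5
print(trust_points("Chinese", False))  # 1
```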

Final Rating

total / 8

Total score (max 80) divided by 8, rounded to one decimal place. This is the quality rating shown on all model pages and comparisons. No pricing influence.
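
A sketch of the final step, assuming halves round up. That rounding mode is inferred from the listed ratings (70 points shown as 8.8, 66 points as 8.3) rather than stated in the text. Python's built-in `round()` rounds exact halves to the nearest even digit, so `round(8.25, 1)` gives 8.2; the sketch uses `decimal` instead.

```python
from decimal import Decimal, ROUND_HALF_UP


def final_rating(total_points):
    """Divide an 80-point total by 8 and round half up to one decimal place."""
    rating = Decimal(total_points) / Decimal(8)
    return rating.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(final_rating(70))  # 8.8
print(final_rating(66))  # 8.3
print(final_rating(64))  # 8.0
```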

Quality score audit — all models

Sorted by total quality score. Click a model name to view its full review.

| Model | Total | Rating | Intelligence (AA Index) | Context Window | Speed | Accessibility | Trust & Privacy |
|---|---|---|---|---|---|---|---|
| (name missing) | 70/80 | 8.8 | 48.44 (37/40) | 1M tokens (13/15) | 55 t/s est. (6/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| GPT-5.2 (OpenAI) | 66/80 | 8.3 | 46.58 (35/40) | 400K tokens (11/15) | 65 t/s est. (6/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| (name missing) | 64/80 | 8.0 | 44.33 (33/40) | 200K tokens (9/15) | 85 t/s est. (8/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| (name missing) | 64/80 | 8.0 | 41.43 (29/40) | 2M tokens (13/15) | 90 t/s est. (8/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| (name missing) | 60/80 | 7.5 | 38.5 est. (26/40) | 10M tokens (15/15) | 180 t/s (9/10) | free tier, open source (5/10) | US/EU company (5/5) |
| (name missing) | 60/80 | 7.5 | 46 (35/40) | 200K tokens (9/15) | 67 t/s (6/10) | chat UI, mobile app (5/10) | US/EU company (5/5) |
| (name missing) | 58/80 | 7.3 | 39 (27/40) | 400K tokens (11/15) | 73 t/s (6/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| (name missing) | 58/80 | 7.3 | 35 (22/40) | 1M tokens (13/15) | 170 t/s (9/10) | free tier, chat UI, mobile app (9/10) | US/EU company (5/5) |
| (name missing) | 50/80 | 6.3 | 41.61 (30/40) | 128K tokens (7/15) | 45 t/s est. (4/10) | free tier, chat UI, open source (8/10) | Chinese company (1/5) |
| (name missing) | 37/80 | 4.6 | 23 (9/40) | 256K tokens (9/15) | 56 t/s (6/10) | free tier, chat UI, open source (8/10) | US/EU company (5/5) |
| (name missing) | 35/80 | 4.4 | 18 (3/40) | 1M tokens (13/15) | 125 t/s (9/10) | free tier, open source (5/10) | US/EU company (5/5) |

Data sources

  • Intelligence Index: Artificial Analysis Intelligence Index v4.0. Independently measured — not self-reported by providers. Incorporates 10 benchmarks: GPQA Diamond, Humanity's Last Exam, SWE-bench Verified, Terminal-Bench Hard, τ²-Bench Telecom, GDPval-AA, SciCode, AA-LCR, AA-Omniscience, IFBench. Standard/medium inference mode only.
  • Speed (tokens/sec): Artificial Analysis speed leaderboard. API-measured output tokens per second, 72-hour rolling average.
  • Pricing: Official provider API documentation and product pages. Verified monthly.
  • Context window, accessibility, trust: Manually verified from official product pages and provider privacy policies.

Values labeled “est.” in the audit table above have not yet been directly measured by Artificial Analysis for that specific model version. They are extrapolated from the closest available AA measurement. These will be updated to verified values as AA completes indexing.

Quality scores recalculate automatically when model data is updated. Last data verification: February 2026.