Rankings

Best LLMs Overall

Ranked by composite quality score across 5 dimensions. Price is not included — this ranking answers “which model is most capable,” not “which is cheapest.” For price-aware picks, see best by price.

Score = Intelligence (35) + Tool Use (30) + Context (10) + Trust (15) + Speed (10) = 100 pts; divide by 10 to get the rating.

Tool Use (verified capability checklist): tool calling API (+10), MCP support (+10), parallel tool calls (+5).

Trust: US/EU (+7), privacy (+5), open source (+3).
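As a rough illustration of the arithmetic, the sketch below adds up per-category points and divides by 10 to get the rating. It assumes the category caps displayed in the ranking rows (Intelligence 40, Reliability 15, Access 10, Context 10, Trust 15, Speed 10) and assumes half-up rounding to one decimal, which is consistent with the listed ratings but is not stated on the page. The names `CATEGORY_MAX`, `trust_score`, and `composite` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Category caps as displayed in the ranking rows (assumption: taken from the
# rows, not from the formula line).
CATEGORY_MAX = {
    "Intelligence": Decimal("40"),
    "Reliability": Decimal("15"),
    "Access": Decimal("10"),
    "Context": Decimal("10"),
    "Trust": Decimal("15"),
    "Speed": Decimal("10"),
}

# Trust checklist points as stated in the text.
TRUST_POINTS = {"us_eu": 7, "privacy": 5, "open_source": 3}


def trust_score(us_eu: bool, privacy: bool, open_source: bool) -> int:
    """Sum the Trust checklist items that apply (maximum 15)."""
    flags = {"us_eu": us_eu, "privacy": privacy, "open_source": open_source}
    return sum(TRUST_POINTS[name] for name, on in flags.items() if on)


def composite(points: dict) -> tuple:
    """Return (total out of 100, rating out of 10) for one model.

    `points` maps each category name to its awarded points, given as strings
    so that Decimal arithmetic stays exact.
    """
    total = Decimal("0")
    for name, cap in CATEGORY_MAX.items():
        value = Decimal(points[name])
        if not (0 <= value <= cap):
            raise ValueError(f"{name} must be between 0 and {cap}")
        total += value
    # Assumed rounding rule: half-up to one decimal place.
    rating = (total / 10).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return total, rating


# Category points listed for the top pick (Gemini 3.1 Pro).
top_pick = {
    "Intelligence": "40", "Reliability": "15", "Access": "10",
    "Context": "5.9", "Trust": "12", "Speed": "3.7",
}
total, rating = composite(top_pick)
print(total, rating)  # 86.6 8.7
```

With the top pick's listed points this reproduces the 86.6/100 total and 8.7 rating shown on its card. A Trust score of 12 corresponds to the US/EU and privacy items without the open source item.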

Top pick overall

Gemini 3.1 Pro

Google · Quality score 8.7/10 · 86.6/100 pts

| Rank | Model | Provider | Intelligence (/40) | Reliability (/15) | Access (/10) | Context (/10) | Trust (/15) | Speed (/10) | Rating (/10) | Total (/100) | Price |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | 40 | 15 | 10 | 5.9 | 12 | 3.7 | 8.7 | 86.6 | Free (limited) · $4.50/1M API |
| 2 |  |  | 32.3 | 12.4 | 10 | 5.8 | 12 | 5.1 | 7.8 | 77.6 | Free (limited) · $4.50/1M API |
| 3 |  |  | 30.6 | 11.8 | 10 | 5.8 | 12 | 7.3 | 7.8 | 77.5 | Free consumer product |
| 4 | GPT-5.2 | OpenAI | 34.9 | 9.9 | 10 | 4.1 | 12 | 3.7 | 7.5 | 74.6 | Free (limited) · $4.81/1M API |
| 5 |  |  | 37.2 | 11.5 | 3 | 4.1 | 12 | 3.9 | 7.2 | 71.7 | $4.81/1M API blended |
| 6 |  |  | 28.9 | 9.6 | 10 | 2.8 | 12 | 2.6 | 6.6 | 65.9 | Free (limited) · $6.00/1M API |
| 7 |  |  | 30.7 | 10.5 | 5 | 2.8 | 12 | 3 | 6.4 | 64 | $10.00/1M API blended |
| 8 |  |  | 26.1 | 7.4 | 10 | 4.1 | 12 | 3.2 | 6.3 | 62.8 | Free (limited) · $0.69/1M API |
| 9 |  |  | 22.6 | 8.6 | 10 | 2.8 | 12 | 2.1 | 5.8 | 58.1 | Free (limited) · $6.00/1M API |
| 10 |  |  | 17.3 | 8.9 | 10 | 2.8 | 12 | 4.3 | 5.5 | 55.3 | Free (limited) · $2.00/1M API |
| 11 | GPT OSS 120B (open weights) | OpenAI | 19.2 | 2.4 | 5 | 2 | 15 | 10 | 5.4 | 53.6 | Free (limited) · $0.26/1M API |
| 12 |  |  | 10.7 | 2.3 | 10 | 7 | 12 | 4.9 | 4.7 | 46.9 | Free (limited) · $0.25/1M API |
| 13 | DeepSeek V3.2 (open weights) | DeepSeek | 18.2 | 2.9 | 10 | 2 | 5 | 2.2 | 4.0 | 40.3 | Free consumer product |
| 14 | Llama 4 Maverick (open weights) | Meta | 6.2 | 3.7 | 10 | 5.8 | 10 | 4.4 | 4.0 | 40.1 | Free (limited) · $0.31/1M API |
| 15 | Kimi K2 (open weights) | Moonshot AI | 13.1 | 5.9 | 10 | 3.3 | 5 | 2.2 | 4.0 | 39.5 | Free (limited) · $0.77/1M API |
| 16 | Llama 4 Scout (open weights) | Meta | 2 | 2.1 | 10 | 10 | 10 | 5 | 3.9 | 39.1 | Free (limited) · $0.17/1M API |
| 17 | Mistral Large 3 | Mistral AI | 10.1 | 4 | 0 | 3.3 | 12 | 2.4 | 3.2 | 31.8 | $0.75/1M API blended |
| 18 | Qwen 3 235B (open weights) | Alibaba | 5 | 2 | 10 | 3.3 | 5 | 2 | 2.7 | 27.3 | Free (limited) · $0.37/1M API |

Pending Review

These models lack enough independently verified benchmark data for a reliable score. Categories showing 0 are zeroed until we have the data to back them up. They'll move into the main rankings once testing is complete.

| Model | Provider | Intelligence (/40) | Reliability (/15) | Access (/10) | Context (/10) | Trust (/15) | Speed (/10) | Rating (/10) | Total (/100) | Price |
|---|---|---|---|---|---|---|---|---|---|---|
|  |  | 0 | 5 | 5 | 3.3 | 7 | 0 | 2.0 | 20.3 | $9.00/1M API blended |

Missing: AA Intelligence Index (estimated; not yet measured by Artificial Analysis) · Speed / TPS (estimated; not yet measured by Artificial Analysis) · AA-Omniscience (no data; scored as neutral 5/15)
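The handling of missing data for a pending model can be sketched as follows. This is only an illustration: it assumes that unmeasured Intelligence and Speed are set to 0, and that Reliability falls back to the neutral 5/15 when AA-Omniscience data is absent. The link between Reliability and AA-Omniscience is inferred from the "Missing" note rather than stated outright, and the function name `pending_total` is hypothetical.

```python
from decimal import Decimal
from typing import Optional

# "Scored as neutral 5/15" per the Missing note (assumed to apply to Reliability).
NEUTRAL_RELIABILITY = Decimal("5")


def pending_total(
    intelligence: Optional[str],
    reliability: Optional[str],
    speed: Optional[str],
    access: str,
    context: str,
    trust: str,
) -> Decimal:
    """Total out of 100 for a model with possibly unmeasured categories.

    None means "no verified data yet": Intelligence and Speed are zeroed,
    Reliability gets the neutral placeholder.
    """
    parts = [
        Decimal("0") if intelligence is None else Decimal(intelligence),
        NEUTRAL_RELIABILITY if reliability is None else Decimal(reliability),
        Decimal("0") if speed is None else Decimal(speed),
        Decimal(access),
        Decimal(context),
        Decimal(trust),
    ]
    return sum(parts, Decimal("0"))


# The pending entry above: Access 5, Context 3.3, Trust 7, nothing else measured.
print(pending_total(None, None, None, access="5", context="3.3", trust="7"))  # 20.3
```

Under these assumptions the pending entry's listed points add up to its 20.3/100 total.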

Last updated February 2026. Intelligence scores from Artificial Analysis. Speed from AA speed leaderboard. See how we rate for full methodology.