Rankings

Best LLMs Overall

Ranked by composite quality score across 5 dimensions. Price is not included — this ranking answers “which model is most capable,” not “which is cheapest.” For price-aware picks, see best by price.

Score = Intelligence (35) + Tool Use (30) + Context (10) + Trust (15) + Speed (10) = 100 pts → ÷10 = rating. Tool Use: tool calling API (+10), MCP support (+10), parallel tool calls (+5) — verified capability checklist. Trust: US/EU (+7), privacy (+5), open source (+3).
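The Trust checklist above is additive, so it can be sketched in a few lines. This is a minimal illustration, not the site's implementation: the class and field names are hypothetical, and the exact criteria behind "US/EU", "privacy", and "open source" are not defined in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustChecklist:
    """Hypothetical container for the three Trust items named in the text."""
    us_or_eu: bool      # "US/EU" item, worth +7
    privacy: bool       # "privacy" item, worth +5
    open_source: bool   # "open source" item, worth +3


def trust_points(checklist: TrustChecklist) -> int:
    """Add up the points for every checklist item that is satisfied (max 15)."""
    return (
        7 * checklist.us_or_eu
        + 5 * checklist.privacy
        + 3 * checklist.open_source
    )


# Illustrative flag combinations only; they are not taken from any model's review.
print(trust_points(TrustChecklist(us_or_eu=True, privacy=True, open_source=False)))  # 12
print(trust_points(TrustChecklist(us_or_eu=True, privacy=True, open_source=True)))   # 15
```

Totals such as 12/15 and 15/15 do appear on the cards, but which items each model satisfies is not stated here, so the flags in the example are assumptions.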

Top pick overall: Gemini 3.1 Pro
Google · Quality score 8.7/10 · 86.6/100 pts

Gemini 3.1 Pro · Google
Intelligence 40/40 · Reliability 15/15 · Access 10/10 · Context 5.9/10 · Trust 12/15 · Speed 3.7/10
Quality score 8.7/10 · 86.6/100 pts
Free (limited) · $4.50/1M API

Full review →
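As a check on the arithmetic, the six category scores on the Gemini 3.1 Pro card sum to the 86.6/100 shown, and dividing by 10 gives the 8.7 rating. The sketch below reproduces that calculation. It assumes the category maxima printed on the cards (40, 15, 10, 10, 15, 10), which differ from the weights in the formula line, and it assumes half-up rounding to one decimal; the function and dictionary names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Category maxima as printed on the per-model cards (assumed to be the live weights).
CATEGORY_MAX = {
    "intelligence": Decimal("40"),
    "reliability": Decimal("15"),
    "access": Decimal("10"),
    "context": Decimal("10"),
    "trust": Decimal("15"),
    "speed": Decimal("10"),
}


def composite(points: dict[str, str]) -> tuple[Decimal, Decimal]:
    """Return (total out of 100, rating out of 10) for one model's category points."""
    total = Decimal("0")
    for name, cap in CATEGORY_MAX.items():
        value = Decimal(points[name])
        if not Decimal("0") <= value <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}, got {value}")
        total += value
    # Rating is the total divided by 10, rounded to one decimal (rounding mode assumed).
    rating = (total / 10).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return total, rating


# Category points copied from the Gemini 3.1 Pro card.
gemini_31_pro = {
    "intelligence": "40",
    "reliability": "15",
    "access": "10",
    "context": "5.9",
    "trust": "12",
    "speed": "3.7",
}

total, rating = composite(gemini_31_pro)
print(total, rating)  # 86.6 8.7
```

Decimal inputs are used instead of floats so that totals ending in .5, such as 77.5 or 39.5, round the same way the page displays them (7.8 and 4.0).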

Gemini 3 Pro · Google
Intelligence 32.3/40 · Reliability 12.4/15 · Access 10/10 · Context 5.8/10 · Trust 12/15 · Speed 5.1/10
Quality score 7.8/10 · 77.6/100 pts
Free (limited) · $4.50/1M API

Full review →

Gemini 3 Flash · Google
Intelligence 30.6/40 · Reliability 11.8/15 · Access 10/10 · Context 5.8/10 · Trust 12/15 · Speed 7.3/10
Quality score 7.8/10 · 77.5/100 pts
Free consumer product

Full review →

GPT-5.4 · OpenAI
Intelligence 38.5/40 (est.) · Reliability 5/15 · Access 10/10 · Context 5.8/10 · Trust 12/15 · Speed 3.8/10 (est.)
Quality score 7.5/10 · 75.1/100 pts
Free (limited) · $5.63/1M API

Full review →

GPT-5.2 · OpenAI
Intelligence 34.9/40 · Reliability 9.9/15 · Access 10/10 · Context 4.1/10 · Trust 12/15 · Speed 3.7/10
Quality score 7.5/10 · 74.6/100 pts
Free (limited) · $4.81/1M API

Full review →

GPT-5.3-Codex · OpenAI
Intelligence 37.2/40 · Reliability 11.5/15 · Access 3/10 · Context 4.1/10 · Trust 12/15 · Speed 3.9/10
Quality score 7.2/10 · 71.7/100 pts
$4.81/1M API blended

Full review →

Claude Sonnet 4.6 · Anthropic
Intelligence 28.9/40 · Reliability 9.6/15 · Access 10/10 · Context 2.8/10 · Trust 12/15 · Speed 2.6/10
Quality score 6.6/10 · 65.9/100 pts
Free (limited) · $6.00/1M API

Full review →

Claude Opus 4.6 · Anthropic
Intelligence 30.7/40 · Reliability 10.5/15 · Access 5/10 · Context 2.8/10 · Trust 12/15 · Speed 3/10
Quality score 6.4/10 · 64/100 pts
$10.00/1M API blended

Full review →

GPT-5 Mini · OpenAI
Intelligence 26.1/40 · Reliability 7.4/15 · Access 10/10 · Context 4.1/10 · Trust 12/15 · Speed 3.2/10
Quality score 6.3/10 · 62.8/100 pts
Free (limited) · $0.69/1M API

Full review →

Claude Sonnet 4.5 · Anthropic
Intelligence 22.6/40 · Reliability 8.6/15 · Access 10/10 · Context 2.8/10 · Trust 12/15 · Speed 2.1/10
Quality score 5.8/10 · 58.1/100 pts
Free (limited) · $6.00/1M API

Full review →

Claude Haiku 4.5 · Anthropic
Intelligence 17.3/40 · Reliability 8.9/15 · Access 10/10 · Context 2.8/10 · Trust 12/15 · Speed 4.3/10
Quality score 5.5/10 · 55.3/100 pts
Free (limited) · $2.00/1M API

Full review →

GPT OSS 120B · OpenAI · open weights
Intelligence 19.2/40 · Reliability 2.4/15 · Access 5/10 · Context 2/10 · Trust 15/15 · Speed 10/10
Quality score 5.4/10 · 53.6/100 pts
Free (limited) · $0.26/1M API

Full review →

Grok 4.1 · xAI
Intelligence 10.7/40 · Reliability 2.3/15 · Access 10/10 · Context 7/10 · Trust 12/15 · Speed 4.9/10
Quality score 4.7/10 · 46.9/100 pts
Free (limited) · $0.25/1M API

Full review →

DeepSeek V3.2 · DeepSeek · open weights
Intelligence 18.2/40 · Reliability 2.9/15 · Access 10/10 · Context 2/10 · Trust 5/15 · Speed 2.2/10
Quality score 4.0/10 · 40.3/100 pts
Free consumer product

Full review →

Llama 4 Maverick · Meta · open weights
Intelligence 6.2/40 · Reliability 3.7/15 · Access 10/10 · Context 5.8/10 · Trust 10/15 · Speed 4.4/10
Quality score 4.0/10 · 40.1/100 pts
Free (limited) · $0.31/1M API

Full review →

Kimi K2 · Moonshot AI · open weights
Intelligence 13.1/40 · Reliability 5.9/15 · Access 10/10 · Context 3.3/10 · Trust 5/15 · Speed 2.2/10
Quality score 4.0/10 · 39.5/100 pts
Free (limited) · $0.77/1M API

Full review →

Llama 4 Scout · Meta · open weights
Intelligence 2/40 · Reliability 2.1/15 · Access 10/10 · Context 10/10 · Trust 10/15 · Speed 5/10
Quality score 3.9/10 · 39.1/100 pts
Free (limited) · $0.17/1M API

Full review →

Mistral Large 3 · Mistral AI
Intelligence 10.1/40 · Reliability 4/15 · Access 0/10 · Context 3.3/10 · Trust 12/15 · Speed 2.4/10
Quality score 3.2/10 · 31.8/100 pts
$0.75/1M API blended

Full review →

Qwen 3 235B · Alibaba · open weights
Intelligence 5/40 · Reliability 2/15 · Access 10/10 · Context 3.3/10 · Trust 5/15 · Speed 2/10
Quality score 2.7/10 · 27.3/100 pts
Free (limited) · $0.37/1M API

Full review →

Pending Review

These models lack enough independently verified benchmark data for a reliable score. Categories showing 0 are zeroed until we have the data to back them up. They'll move into the main rankings once testing is complete.

—

Grok 4.2 · xAI
Intelligence 0/40 · Reliability 5/15 · Access 5/10 · Context 3.3/10 · Trust 7/15 · Speed 0/10
Missing: AA Intelligence Index (estimated; not yet measured by Artificial Analysis) · Speed / TPS (estimated; not yet measured by Artificial Analysis) · AA-Omniscience (no data; scored as neutral 5/15)
Quality score 2.0/10 · 20.3/100 pts
$9.00/1M API blended

Full review →
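The pending-review rule described above (unverified categories are zeroed, while missing AA-Omniscience data is scored as a neutral 5/15) can be sketched as a small fill-in step before summing. This is an assumption-laden illustration: the function name and dictionary keys are hypothetical, and mapping AA-Omniscience to the Reliability category is inferred from the 5/15 shown on the Grok 4.2 card rather than stated in the text.

```python
from typing import Optional

# Fallbacks for categories with no verified data. Anything not listed here is zeroed.
# Assumption: AA-Omniscience feeds the Reliability category, which gets a neutral 5/15.
NEUTRAL_DEFAULTS = {"reliability": 5.0}


def fill_missing(points: dict[str, Optional[float]]) -> tuple[dict[str, float], list[str]]:
    """Replace missing (None) category scores and report which ones were missing."""
    filled: dict[str, float] = {}
    missing: list[str] = []
    for name, value in points.items():
        if value is None:
            missing.append(name)
            filled[name] = NEUTRAL_DEFAULTS.get(name, 0.0)
        else:
            filled[name] = value
    return filled, missing


# Grok 4.2 card: None marks the three inputs listed as missing.
grok_42 = {
    "intelligence": None,
    "reliability": None,
    "access": 5.0,
    "context": 3.3,
    "trust": 7.0,
    "speed": None,
}

filled, missing = fill_missing(grok_42)
total = round(sum(filled.values()), 1)
print(missing)  # ['intelligence', 'reliability', 'speed']
print(total)    # 20.3
```

Keeping the list of missing categories alongside the filled scores makes it easy to show a "Missing:" note next to a provisional total.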

Last updated February 2026. Intelligence scores from Artificial Analysis. Speed from AA speed leaderboard. See how we rate for full methodology.