Smartest LLMs — Ranked by Intelligence
Ranked by Artificial Analysis (AA) Intelligence Index v4.0, an independently measured composite of 10 standard benchmarks. All models are run under the same conditions (standard/medium mode, no extended thinking). Because AA measures every model the same way rather than relying on self-reported results, the scores are directly comparable across models.
What the AA Index measures
Source: artificialanalysis.ai · Standard/medium inference mode · Not extended thinking
Most intelligent (measured)
Gemini 3 Pro
Google · AA Index 48.44
Why AA Index instead of individual benchmarks?
Single benchmarks saturate (models score 90%+), can be gamed by training on leaked test sets, and each measure only a narrow capability. The AA Index combines 10 complementary benchmarks under controlled conditions, including hard reasoning (GPQA Diamond, HLE), real-world coding (SWE-bench, Terminal-Bench), and scientific and professional tasks (SciCode, GDPval). No scores are self-reported; every model is run the same way.
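To make the idea of a composite concrete, here is a minimal sketch of how several benchmark scores could be folded into one number. It assumes an equal-weight average on a 0-100 scale, which may not match AA's actual weighting or normalization. The function names (composite_index, saturated) and every score below are hypothetical placeholders, not AA measurements.

```python
from statistics import mean

# Placeholder scores on a 0-100 scale. These are invented for illustration
# and are NOT real Artificial Analysis measurements for any model.
example_scores = {
    "GPQA Diamond": 80.0,
    "HLE": 20.0,
    "SciCode": 40.0,
    "Terminal-Bench": 30.0,
}


def composite_index(benchmark_scores):
    """Equal-weight average of per-benchmark scores (an assumed weighting)."""
    if not benchmark_scores:
        raise ValueError("need at least one benchmark score")
    return mean(list(benchmark_scores.values()))


def saturated(benchmark_scores, threshold=90.0):
    """Names of benchmarks at or above the threshold, i.e. near their ceiling."""
    return [name for name, s in benchmark_scores.items() if s >= threshold]


if __name__ == "__main__":
    print(composite_index(example_scores))
    print(saturated(example_scores))
```

Averaging across benchmarks means no single saturated or gamed test dominates the result, which is the motivation given above for using a composite.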
Values marked “est.” are extrapolated from the nearest available AA measurement for that model version. They are replaced with measured values once AA completes direct indexing.