LLM Rankings by Capability
Pure capability — price not included. For budget-aware picks see best by price. For image and video generators see Images and Video.
Overall leader
Gemini 3.1 Pro
8.7/10
Intelligence leader
Gemini 3.1 Pro
AA 57.18
Context leader
Llama 4 Scout
10M tokens
Speed leader
GPT OSS 120B
304 t/s
Overall quality score
0–10 score. The composite quality score blends intelligence, context, speed, accessibility, and trust. The most useful single ranking.
Reasoning & intelligence
AA Index score. Ranked by the Artificial Analysis Intelligence Index, an independently measured composite of 10 benchmarks including GPQA Diamond, τ²-Bench, and Humanity's Last Exam.
Context window
Max tokens. How many tokens can the model process in a single request? Matters for long documents, full codebases, and multi-turn conversations.
Speed
Tokens/sec. Output tokens per second via API, i.e. how fast the model actually responds. Sourced from the Artificial Analysis speed leaderboard.
Coding
τ²-bench / LCB. Best LLMs for code generation, debugging, and software engineering tasks. Ranked using τ²-bench and LiveCodeBench scores from Artificial Analysis.
Multimodal
Quality score. LLMs that accept text and image input, ranked by overall quality score. Useful for document analysis, screenshot debugging, and visual Q&A.
How rankings are generated
Every number on these pages comes from a defined source — Artificial Analysis for intelligence and speed, official documentation for context windows. Nothing is estimated or manually assigned except where labeled “est.” See how we rate for the full methodology.
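A composite score like the 0–10 overall quality score is, at its simplest, a weighted average of normalized sub-scores. The sketch below is illustrative only: the dimension names come from this page, but the weights and sub-score values are hypothetical assumptions, not the actual methodology.

```python
# Illustrative sketch of a 0-10 composite quality score.
# Dimension names match the page; weights and sub-scores are hypothetical.
def composite_score(subscores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 0-10 sub-scores, normalized by total weight."""
    total_weight = sum(weights.values())
    return sum(subscores[k] * weights[k] for k in weights) / total_weight

# Hypothetical weighting: intelligence counts most, trust least.
weights = {"intelligence": 0.40, "context": 0.15, "speed": 0.20,
           "accessibility": 0.15, "trust": 0.10}
# Hypothetical sub-scores for one model, each already on a 0-10 scale.
subscores = {"intelligence": 9.1, "context": 7.5, "speed": 8.0,
             "accessibility": 8.8, "trust": 9.0}

score = composite_score(subscores, weights)
print(f"{score:.1f}")
```

Normalizing by the total weight means the weights need not sum exactly to 1, and a model strong on one heavily weighted dimension can't fully mask weakness elsewhere.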