Meta

Llama 4 Maverick

Open Source
4.0
out of 10

Meta's mid-sized open-weights model and the most capable Llama 4 variant for general use. Maverick uses a mixture-of-experts architecture with 400B total parameters but only 17B active per token, which keeps inference fast (115 t/s, AA-measured) while scoring an AA Intelligence Index of 18. It's natively multimodal, handles 1M tokens of context, and can be self-hosted. The trade-off: it trails frontier closed models significantly on every AA-measured benchmark.

Context window

1.0M tokens

API (blended)

$0.31/1M

Consumer access

Free (limited)

Multimodal

Yes

Score Breakdown

Total: 40.1/100 → 4.0/10

Intelligence, Reliability, Speed, and Context are field-relative — scores shift as models are added. Accessibility and Trust are absolute checklists. Full methodology →

Strengths

  • +Open weights (Llama 4 Community License) — auditable, self-hostable, no vendor lock-in
  • +1M token context window at a fraction of closed-model pricing
  • +Natively multimodal: image + text input in a single API call
  • +115 t/s output speed (AA-measured) — fast inference even at large parameter counts
  • +Extremely cheap API: ~$0.31/1M blended via Groq
  • +Active ecosystem: supported by Groq, Together, Fireworks, Ollama, and others
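As a sketch of the multimodal point above: most of the providers listed expose an OpenAI-compatible chat endpoint, where image and text go in as separate content parts of one message. The model ID and image URL below are placeholders, not verified values — check the provider's docs for the exact ID.

```python
# Sketch: an OpenAI-compatible chat payload combining image + text input
# in a single request. Model ID and URL are illustrative assumptions.
payload = {
    "model": "meta-llama/llama-4-maverick",  # assumed ID; varies by provider
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this chart?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
}
```

The same payload shape works across Groq, Together, and Fireworks, since all three mirror the OpenAI chat-completions format.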

Weaknesses

  • -AA Intelligence Index 18 — significantly trails frontier models (Gemini 3.1 Pro: 57, Claude Opus: 46)
  • -GPQA Diamond 67.1%, HLE 4.8% (AA-measured) — limited on hard science and expert reasoning
  • -τ²-bench 17.8% (AA-measured) — weak agentic tool use vs frontier alternatives
  • -Meta data practices: not ideal for privacy-sensitive enterprise use
  • -No MCP support
  • -Weights require significant hardware to self-host at full precision
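To put the last point in numbers, a back-of-the-envelope estimate of weight storage at common precisions — the rule of thumb is bytes per parameter times parameter count, ignoring KV cache and activation overhead:

```python
def weight_memory_gb(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB: billions of params x bytes each."""
    return params_b * bytes_per_param  # 1B params at 1 byte each = 1 GB

total = 400  # Maverick's total parameter count, in billions

print(weight_memory_gb(total, 2))    # fp16/bf16 -> 800.0 GB
print(weight_memory_gb(total, 0.5))  # 4-bit quantized -> 200.0 GB
```

Even quantized to 4 bits, the full model needs on the order of 200 GB of memory, which is why self-hosting at full precision is a multi-GPU proposition.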

Best for

self-hosting · open-source projects · budget API use · high-volume classification · on-premise deployment

Not ideal for

expert reasoning tasks · complex agentic pipelines · sensitive enterprise data

Pricing details

Subscription plans

  • Free (self-hosted): full model weights via HuggingFace, run on your own infrastructure. Free
  • Free (meta.ai): web chat access with rate limits (daily message caps; no API access). Free

API pricing

  • Groq: very fast inference; free tier available with rate limits. $0.19 in / $0.65 out per 1M tokens
  • Together AI: good reliability for production workloads. $0.27 in / $0.85 out per 1M tokens
  • Fireworks AI: fast inference, competitive pricing. $0.22 in / $0.88 out per 1M tokens

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 27, 2026