Meta

Llama 4 Maverick

Open Source
4.0
out of 10

Meta's mid-sized open-weights model and the most capable Llama 4 variant for general use. Maverick uses a mixture-of-experts architecture with 400B total parameters but only 17B active per token, which keeps inference fast (115 t/s, AA-measured) while scoring an AA Intelligence Index of 18. It's natively multimodal, handles 1M tokens of context, and can be self-hosted. The trade-off: it trails frontier closed models significantly on every AA-measured benchmark.

Context window

1.0M tokens

API (blended)

$0.31/1M

Consumer access

Free (limited)

Multimodal

Yes

Score Breakdown

Total: 40.1/100 → 4.0/10

Intelligence, Reliability, Speed, and Context are field-relative — scores shift as models are added. Accessibility and Trust are absolute checklists. Full methodology →

Strengths

  • +Open weights (Llama 4 Community License) — auditable, self-hostable, no vendor lock-in
  • +1M token context window at a fraction of closed-model pricing
  • +Natively multimodal: image + text input in a single API call
  • +115 t/s output speed (AA-measured) — fast inference even at large parameter counts
  • +Extremely cheap API: ~$0.31/1M blended via Groq
  • +Active ecosystem: supported by Groq, Together, Fireworks, Ollama, and others
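As a sketch of the multimodal point above: most of the providers listed expose an OpenAI-compatible chat endpoint, where image and text go in as separate content parts of one message. The model ID and image URL below are placeholders, not verified values — check the provider's docs for the exact ID.

```python
# Sketch: an OpenAI-compatible chat payload combining image + text input
# in a single request. Model ID and URL are illustrative assumptions.
payload = {
    "model": "meta-llama/llama-4-maverick",  # assumed ID; varies by provider
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this chart?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
}
```

The same payload shape works across Groq, Together, and Fireworks, since all three mirror the OpenAI chat-completions format.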

Weaknesses

  • -AA Intelligence Index 18 — significantly trails frontier models (Gemini 3.1 Pro: 57, Claude Opus: 46)
  • -GPQA Diamond 67.1%, HLE 4.8% (AA-measured) — limited on hard science and expert reasoning
  • -τ²-bench 17.8% (AA-measured) — weak agentic tool use vs frontier alternatives
  • -Meta data practices: not ideal for privacy-sensitive enterprise use
  • -No MCP support
  • -Weights require significant hardware to self-host at full precision
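To put the last point in numbers, a back-of-the-envelope estimate of weight storage at common precisions — the rule of thumb is bytes per parameter times parameter count, ignoring KV cache and activation overhead:

```python
def weight_memory_gb(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB: billions of params x bytes each."""
    return params_b * bytes_per_param  # 1B params at 1 byte each = 1 GB

total = 400  # Maverick's total parameter count, in billions

print(weight_memory_gb(total, 2))    # fp16/bf16 -> 800.0 GB
print(weight_memory_gb(total, 0.5))  # 4-bit quantized -> 200.0 GB
```

Even quantized to 4 bits, the full model needs on the order of 200 GB of memory, which is why self-hosting at full precision is a multi-GPU proposition.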

Best for

self-hosting · open-source projects · budget API use · high-volume classification · on-premise deployment

Not ideal for

expert reasoning tasks · complex agentic pipelines · sensitive enterprise data

Pricing details

Subscription plans

  • Free (self-hosted): full model weights via HuggingFace, run on your own infrastructure. Free
  • Free (meta.ai): web chat access with rate limits (daily message caps; no API access). Free

API pricing

  • Groq: very fast inference; free tier available with rate limits. $0.19 in / $0.65 out per 1M tokens
  • Together AI: good reliability for production workloads. $0.27 in / $0.85 out per 1M tokens
  • Fireworks AI: fast inference, competitive pricing. $0.22 in / $0.88 out per 1M tokens

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 27, 2026