
Meta

Llama 4 Scout

Open Source
3.9
out of 10

Meta's ultra-long-context open-weights model with a 10M token window — the largest of any publicly available model. Scout is the smaller MoE variant in the Llama 4 family (109B total parameters, ~17B active) optimized for speed and context length over raw intelligence. At 135 t/s and AA Intelligence Index 13, it's the right call when you need to process enormous documents or codebases that would overflow any other model.

Context window

10.0M tokens

API (blended)

$0.17/1M

Consumer access

Free (limited)

Multimodal

Yes

Score Breakdown

Total: 39.1/100 → 3.9/10

Intelligence, Reliability, Speed, and Context are field-relative — scores shift as models are added. Accessibility and Trust are absolute checklists. Full methodology →

Strengths

  • +10M token context — largest of any publicly available model; processes entire repositories or books in one call
  • +Open weights (Llama 4 Community License) — fully self-hostable and auditable
  • +135 t/s (AA-measured) — fastest of any model reviewed, including much smaller ones
  • +Ultra-cheap API: ~$0.17/1M blended via Groq — cheapest in the dataset
  • +Natively multimodal: image + text input
  • +Active ecosystem: Groq, Together, Fireworks, Ollama, and LM Studio support
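Since the 10M window is Scout's headline feature, it helps to estimate up front whether a corpus will actually fit. A minimal Python sketch, assuming a rough 4-characters-per-token heuristic (a common rule of thumb; the real count depends on the model's tokenizer):

```python
# Rough check of whether a text corpus fits in a context window.
# Assumes ~4 characters per token; the model's actual tokenizer
# will produce a different (usually similar-magnitude) count.
CHARS_PER_TOKEN = 4

def estimated_tokens(text: str) -> int:
    """Return a rough token estimate for `text`."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(texts: list[str], window: int = 10_000_000) -> bool:
    """Check whether the combined corpus fits under `window` tokens."""
    return sum(estimated_tokens(t) for t in texts) <= window

# Example: a 2 MB file is roughly 500k tokens — comfortably under 10M.
print(fits_in_context(["x" * 2_000_000]))  # True
```

Note that hosted providers often cap the window well below 10M (see the weaknesses below), so pass the provider's actual limit as `window` rather than the model's theoretical maximum.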

Weaknesses

  • -AA Intelligence Index 13 — the lowest in this dataset; not suitable for complex reasoning tasks
  • -GPQA Diamond 58.7%, HLE 4.3% (AA-measured) — near the lower bound for frontier science tasks
  • -τ²-bench 15.5% (AA-measured) — very limited agentic tool use capability
  • -Meta data practices: not ideal for privacy-sensitive enterprise use
  • -10M context not available through all hosted providers — check limits before integrating
  • -No MCP support

Best for

  • Ultra-long document processing
  • Large codebase analysis
  • Self-hosting
  • High-throughput summarization
  • Budget-constrained pipelines

Not ideal for

  • Expert reasoning
  • Complex agentic tasks
  • High-quality creative work
  • Sensitive enterprise data

Pricing details

Subscription plans

  • Free (self-hosted) — Full model weights via Hugging Face; run on your own infrastructure. Price: Free
  • Free (meta.ai) — Web chat access with rate limits. Daily message caps; full 10M context only via API/self-host. Price: Free

API pricing

  • Groq (free tier) — Extremely fast; free tier with rate limits; context window varies by provider. $0.11 / $0.34 per 1M tokens (input / output)
  • Together AI — Competitive pricing; good for production workloads. $0.18 / $0.66 per 1M tokens (input / output)
  • Fireworks AI — Fast inference options available. $0.15 / $0.60 per 1M tokens (input / output)

Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.

Last updated: February 27, 2026