Best Open-Source LLMs
Open-weight models you can self-host, fine-tune, or run through third-party providers like Groq and Together AI. Ranked by the same composite quality score as the overall rankings: intelligence, reliability, context window, speed, accessibility, and trust. Price is not part of the score.
Top open-source LLM
GPT OSS 120B
OpenAI · Quality 5.4/10 · AA Index 33.27
GPT OSS 120B is OpenAI's first large open-weight language model, released August 2025. It uses a Mixture-of-Experts architecture with 117 billion total parameters and 5.1 billion active per forward pass, sized so it can run on a single H100 GPU. With an AA Intelligence Index of 33 (#1 of 50 among open-weight reasoning models), it's the most capable open-weight model officially released by a frontier lab. At $0.15/$0.60 per 1M tokens and 336 tokens/second, it's both cheap and fast. The weights are available on Hugging Face and can be self-hosted. A smaller companion model, GPT OSS 20B, runs on consumer 16GB GPUs at $0.05/$0.20 per 1M.
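The "total vs. active parameters" split comes from Mixture-of-Experts routing: a small router scores every expert for each token, but only the top-k experts actually run, so per-token compute tracks the active count rather than the total. A toy sketch of top-k routing (the expert count and k below are illustrative, not GPT OSS's real configuration):

```python
import math

def topk_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their mixing weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i])[-k:]
    w = [math.exp(router_logits[i]) for i in top]
    s = sum(w)
    return top, [x / s for x in w]

# 128 experts, 4 active per token (illustrative counts only):
logits = [((i * 37) % 113) / 10.0 for i in range(128)]  # deterministic fake router scores
experts, weights = topk_route(logits, 4)
print(len(experts), round(sum(weights), 6))  # → 4 1.0
```

Only the selected experts' parameters are touched for that token, which is why a 117B-parameter model can have the inference cost profile of a ~5B dense one.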
DeepSeek V3.2
DeepSeek's open-weights frontier model and one of the most cost-effective APIs available. V3.2 punches far above its price: at $0.28/$1.10 per 1M tokens it costs roughly 20× less than Claude Sonnet while delivering an AA Intelligence Index of 32. Strong on coding and reasoning tasks, but the first-party API is hosted in China, with the privacy implications that brings.
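Per-1M-token prices like those quoted throughout this list translate directly into job cost: multiply each side by the token counts and divide by a million. A minimal sketch (prices are the V3.2 figures above; the token counts are made-up example numbers):

```python
def job_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Dollar cost of a job given per-1M-token input/output prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# DeepSeek V3.2 at $0.28 in / $1.10 out per 1M tokens,
# for a hypothetical job of 2M input + 0.5M output tokens:
print(f"${job_cost(2_000_000, 500_000, 0.28, 1.10):.2f}")  # → $1.11
```

The same two-rate arithmetic applies to every model on this page; only the per-1M prices change.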
Llama 4 Maverick
Meta's mid-sized open-weights model and the most capable Llama 4 variant for general use. Maverick uses a mixture-of-experts architecture with 400B total parameters but only 17B active, giving it good speed at 115 t/s while maintaining an AA Intelligence Index of 18. It's multimodal, handles 1M tokens of context, and can be self-hosted. The trade-off: it trails frontier closed models significantly on all AA-measured benchmarks.
Kimi K2 (0905)
Kimi K2 (0905) is the flagship model from Moonshot AI, a Beijing-based startup, and the current default model on T3.chat. It's a Mixture-of-Experts model with 1 trillion total parameters and 32 billion active per forward pass, released September 2025. Kimi K2 scores an AA Intelligence Index of 31 (#6 of 36 among large open-weights models) and is available as open weights under a permissive license. At $0.39/$1.90 per 1M tokens via the Moonshot API with a 262K context window, it offers strong capability per dollar. It also offers a dedicated Thinking mode for complex reasoning. A newer version (Kimi K2.5) has since launched.
Llama 4 Scout
Meta's ultra-long-context open-weights model with a 10M-token window, the largest of any publicly available model. Scout is a smaller MoE variant (109B total, ~17B active) optimized for speed and context length over raw intelligence. At 135 t/s and an AA Intelligence Index of 14, it's the right call when you need to process enormous documents or codebases that would overflow any other model.
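To judge whether a corpus actually fits in a window this size, a common rule of thumb is roughly 4 characters per token for English text and code (an approximation; real tokenizers vary by content and model). A rough sizing sketch under that assumption:

```python
CHARS_PER_TOKEN = 4  # rough heuristic; actual tokenization varies

def fits_in_context(total_bytes, context_tokens, chars_per_token=CHARS_PER_TOKEN):
    """Rough estimate: does a text corpus of this byte size fit in the window?"""
    est_tokens = total_bytes / chars_per_token
    return est_tokens <= context_tokens

# A ~30 MB codebase against a 10M-token window (estimate only):
print(fits_in_context(30_000_000, 10_000_000))  # → True (≈7.5M tokens)
```

By the same estimate, a 10M-token window holds on the order of 40 MB of text, versus roughly 1 MB for the 262K windows elsewhere on this list.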
Qwen 3 235B
Qwen 3 235B is Alibaba's largest open-source language model, released April 2025 under the Apache 2.0 license. It uses a Mixture-of-Experts architecture with 235 billion total parameters and 22 billion active per forward pass. With a 262K context window and pricing as low as $0.20/$0.88 per 1M tokens on Alibaba Cloud, it's one of the most capable open models available at scale. Qwen 3 235B supports both a standard instruct mode and a Thinking mode for step-by-step reasoning. Its AA Intelligence Index of 17 reflects its April 2025 release date; newer open models have since surpassed it. A newer Qwen3 235B 2507 revision has since been released for those wanting the latest.
What counts as open source?
We include models that publish their weights for download, even if the license restricts commercial use or imposes other conditions. The industry calls these “open weights” rather than true open source (which would require open training data and code too). For practical purposes, if you can download and run the model locally or host it on your own infrastructure, it qualifies for this list.
Last updated March 2026. Intelligence scores from Artificial Analysis. See our how-we-rate page for the full methodology.