Anthropic
Claude Sonnet 4.6
Released February 17, 2026, Claude Sonnet 4.6 is the model most people should use. It's Anthropic's default for free and paid users on Claude.ai for a reason: near-Opus performance at 40% less cost, a newer knowledge cutoff, and, in a twist worth understanding, it actually beats Claude Opus 4.6 on the everyday tasks that matter most: office productivity, financial analysis, and real-world tool use. For coding agents, it's the model serious engineers preferred 59% of the time over the previous-generation flagship.
Context window
200K tokens
API (blended)
$6.00/1M
Consumer access
Free (limited) / $20/mo
Multimodal
Yes
Score Breakdown
65.9/100 → 6.6/10. Intelligence, Reliability, Speed, and Context are field-relative; scores shift as models are added. Accessibility and Trust are absolute checklists.
Strengths
- Best writing quality of any mainstream model
- Highly reliable instruction-following
- 200K context handles most real-world long-document tasks
- Strong at nuanced reasoning and analysis
- More honest about uncertainty than most models
Weaknesses
- Most expensive API in the group at $6/1M blended
- Smallest context window of the six models reviewed
- No image generation capability
Adaptive Thinking — Same Engine as Opus
Sonnet 4.6 runs the same four-tier adaptive thinking system as Opus 4.6. The difference isn't the ceiling — it's that Sonnet reaches it at a fraction of the cost.
| Effort level | Latency | Best for | Cost impact |
|---|---|---|---|
| Low | Fast | Data retrieval, formatting, simple Q&A | Minimal |
| Medium | Moderate | Summaries, code tasks, email drafts, analysis | Standard |
| High (default) | Slower | Complex reasoning, multi-step research, debugging | Standard |
| Max | Slowest | Hard constraint problems, deep architecture planning | Highest |
At medium effort, Sonnet 4.6 matches or beats Opus 4.5 performance while consuming dramatically fewer tokens. Match effort to task complexity — max isn't always better, and it's always more expensive.
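As a rough illustration of matching effort to task complexity, the tiers above can be expressed as a simple lookup. The tier names mirror the table; the task categories and the `select_effort` helper are hypothetical and not part of any Anthropic SDK.

```python
# Hypothetical effort router mirroring the four-tier table above.
# Task categories are illustrative only; extend them for your workload.
EFFORT_BY_TASK = {
    "formatting": "low",
    "simple_qa": "low",
    "summary": "medium",
    "code_task": "medium",
    "email_draft": "medium",
    "debugging": "high",
    "multi_step_research": "high",
    "constraint_problem": "max",
    "architecture_planning": "max",
}

def select_effort(task_type: str) -> str:
    """Return the effort tier for a task, defaulting to 'high' (the model default)."""
    return EFFORT_BY_TASK.get(task_type, "high")
```

Defaulting unknown tasks to high matches the model's own default, while routine categories drop to cheaper tiers explicitly.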
How It Benchmarks vs. Competitors
Pass@1 (single-attempt) scores, all AA-measured in standard mode with no extended thinking, so comparisons are apples-to-apples.
Knowledge & Reasoning (AA-measured)
| Benchmark | Claude Sonnet 4.6 | Claude Opus 4.6 | GPT-5.2 | Gemini 3.1 Pro |
|---|---|---|---|---|
| GPQA Diamond (PhD science reasoning) | 79.9% | 84.0% | 90.3% | 94.1% |
| HLE (expert-level knowledge) | 13.2% | 18.6% | 35.4% | 44.7% |
Sonnet trails Opus on deep science reasoning — GPQA Diamond is the honest gap. If your work requires expert-level hard-science QA at scale, Opus or Gemini 3.1 Pro is the right choice. For most knowledge work, the gap is smaller in practice than the raw numbers suggest.
Coding & Tool Use (AA-measured)
| Benchmark | Claude Sonnet 4.6 | Claude Opus 4.6 | GPT-5.2 | Gemini 3.1 Pro |
|---|---|---|---|---|
| τ²-bench (real-world tool use) | 79.5% | 84.8% | 84.8% | 95.6% |
| AA Coding Index | 46.43 | 47.56 | 48.67 | 55.5 |
On tool use, Sonnet is within about five points of Opus and GPT-5.2. For most agentic pipelines, that gap is not decision-relevant; the cost difference ($3/$15 vs $5/$25 per MTok) almost certainly is.
Where Sonnet 4.6 Actually Beats Opus 4.6
This is the part of the marketing narrative Anthropic underplays. On several high-value real-world tasks, Sonnet wins — not just ties.
Tasks where Sonnet 4.6 leads or ties
| Task | Sonnet 4.6 | Opus 4.6 | Edge |
|---|---|---|---|
| GDPval-AA (office productivity, Elo) | 1,633 | 1,559 | Sonnet +74 Elo |
| Finance Agent v1.1 (financial analysis) | 63.3% | 62.0% | Sonnet +1.3pp |
| MCP-Atlas (scaled tool use) | 61.3% | 60.3% | Sonnet +1.0pp |
| OSWorld (computer use) | 72.5% | 72.7% | Essentially tied (−0.2pp) |
| Knowledge cutoff (training data through) | Jan 2026 | Aug 2025 | Sonnet +5 months newer |
GDPval-AA is Anthropic's measure of everyday office AI tasks — writing, analysis, scheduling, planning. A 74-Elo gap is meaningful at the top of the distribution. The knowledge cutoff difference matters for anything involving events from late 2025 onward.
Context Window & Output Capacity
Sonnet's context window is identical to Opus's; the real spec differences are output capacity and knowledge cutoff.
| Capability | Claude Sonnet 4.6 | Claude Opus 4.6 | Notes |
|---|---|---|---|
| Standard context | 200,000 tokens | 200,000 tokens | ~150K words — handles most real-world documents |
| Extended context (beta) | 1,000,000 tokens | 1,000,000 tokens | Available via API beta flag (Tier 4+); same as Opus |
| Max output tokens | 64,000 tokens | 128,000 tokens | Sonnet gets half — still large enough for full reports |
| Knowledge cutoff (reliable) | August 2025 | May 2025 | Sonnet is more current on recent events |
| Training data through | January 2026 | August 2025 | Sonnet trained on ~5 months more data |
64K output is enough for nearly all production use cases: full code files, long-form reports, migration plans. You only need 128K for very large single-document generation. If that's your use case, Opus is the right call.
Claude Code — Preferred Over the Previous Flagship
Sonnet 4.6 is the default model in Claude Code. The preference numbers are stark.
Claude Code model preference (internal A/B testing)
| Comparison | Sonnet 4.6 preference | Sample |
|---|---|---|
| Sonnet 4.6 vs Claude Opus 4.5 (previous flagship) | 59% | Production Claude Code sessions |
| Sonnet 4.6 vs Claude Opus 4.6 (current flagship) | Varies by task type | Coding tasks: often similar |
The 59% preference stat means: when engineers had a choice between Sonnet 4.6 and the previous-generation flagship Opus 4.5, they chose Sonnet 4.6 more than half the time — a remarkable result for a model at one-fifth the cost. Common reason: Sonnet 4.6 applies effort more efficiently and doesn't over-reason on simple tasks.
Practical routing recommendation
Most Claude Code power users route by task type: Sonnet 4.6 for incremental development, PR reviews, and test writing; Opus 4.6 for greenfield architecture, hard debugging sessions, and anything requiring sustained multi-file reasoning over hours. If you're unsure, start with Sonnet — upgrade to Opus only when you hit a real capability wall.
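That routing habit can be sketched as a small helper. The task labels and model names here are placeholders for illustration, not verified API model identifiers.

```python
# Hypothetical task-type router following the recommendation above.
# Defaults to Sonnet; escalate to Opus only for the heavy categories.
SONNET_TASKS = {"incremental_dev", "pr_review", "test_writing"}
OPUS_TASKS = {"greenfield_architecture", "hard_debugging", "multi_file_reasoning"}

def pick_model(task_type: str) -> str:
    """Return a placeholder model name for a Claude Code task type."""
    if task_type in OPUS_TASKS:
        return "opus-4.6"
    # Unknown tasks start on Sonnet: upgrade only at a real capability wall.
    return "sonnet-4.6"
```

The asymmetry is deliberate: the default branch is Sonnet, which mirrors the "start with Sonnet, upgrade on a wall" advice above.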
Computer Use — Near-Opus Performance
Sonnet 4.6 and Opus 4.6 score within 0.2 percentage points on OSWorld. For most computer-use deployments, the choice comes down to cost, not capability.
Computer use performance
| Benchmark | Sonnet 4.6 | Opus 4.6 | GPT-5.2 |
|---|---|---|---|
| OSWorld (GUI navigation & desktop tasks) | 72.5% | 72.7% | 38.2% |
OSWorld measures a model's ability to navigate operating system GUIs, run terminal commands, and interact with applications autonomously. Claude models lead the field by a significant margin — GPT-5.2's 38.2% reflects that OpenAI hasn't invested comparably in computer-use training.
Safety Profile — More Reassuring Than Opus
Sonnet 4.6's system card was notably more positive than Opus 4.6's; the red-team findings differ meaningfully between the two tiers.
System card summary
Anthropic's evaluation described Sonnet 4.6 as showing 'a broadly warm, honest, prosocial, and at times funny character, very strong safety behaviors, and no signs of major concerns.' Prompt injection resistance improved significantly over Sonnet 4.5. One documented caveat: in computer use contexts, Sonnet 4.6 showed 'overeager behavior' in some GUI tasks — acting before confirming intent. For autonomous computer-use agents, add confirmation checkpoints.
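One way to add the confirmation checkpoints recommended above is to gate risky GUI actions behind a callback before executing them. The action names and the `run_action` interface are illustrative, not part of any computer-use SDK.

```python
from typing import Callable

# Actions that should never run without explicit confirmation.
# The set is illustrative; tune it to your own deployment.
RISKY_ACTIONS = {"delete_file", "submit_form", "send_email", "execute_command"}

def run_action(action: str,
               execute: Callable[[str], None],
               confirm: Callable[[str], bool]) -> bool:
    """Execute a GUI action, pausing at a confirmation checkpoint for risky ones.

    Returns True if the action ran, False if confirmation was declined.
    """
    if action in RISKY_ACTIONS and not confirm(action):
        return False  # checkpoint declined: do not act
    execute(action)
    return True
```

In a real agent loop, `confirm` would surface the pending action to a human (or a stricter policy model) instead of acting on the model's first impulse.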
Key safety contrast: Opus 4.6 vs Sonnet 4.6
| Concern | Opus 4.6 | Sonnet 4.6 |
|---|---|---|
| Scheming / manipulation in agentic contexts | Documented in system card | No major concerns flagged |
| Prompt injection resistance | Improved (0.77% mitigated ASR) | Significantly improved vs 4.5 |
| ASL classification | ASL-3 | ASL-2 / ASL-3 boundary |
| Morally-motivated sabotage | Occasional 'whistleblowing' in edge cases | Not documented |
| Computer use overeagerness | Not flagged | Documented — add confirmation steps |
ASL-3 is Anthropic's classification for models that 'substantially increase the risk of catastrophic misuse.' Opus 4.6 operates under full ASL-3 protections. Sonnet 4.6's classification reflects a lower risk profile for the same type of deployment.
Pricing — The Real Reason Sonnet Wins
At 60% of Opus's blended cost ($6 vs $10 per 1M tokens) with comparable performance on most tasks, the math is straightforward for most workloads.
API pricing comparison
| Model | Input (per MTok) | Output (per MTok) | Blended (3:1 ratio) | vs Sonnet |
|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 | $15.00 | $6.00 | — |
| Claude Opus 4.6 | $5.00 | $25.00 | $10.00 | +67% more expensive |
| GPT-5.2 | $1.25 | $5.00 | $2.19 | 64% cheaper than Sonnet |
| Gemini 3.1 Pro | $1.25 | $10.00 | $3.44 | 43% cheaper than Sonnet |
Blended cost calculated at 3:1 input:output ratio. Sonnet's $6/1M blended puts it above GPT-5.2 and Gemini 3.1 Pro on raw price — but Claude's lower iteration count and superior instruction-following mean total cost per completed task is often comparable. Batch API (50% discount) and prompt caching (90% savings on cached reads) can bring Sonnet's effective cost well below GPT-5.2 for repeated-context workloads.
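The blended figures in the table follow directly from the 3:1 weighting, and the discount math is easy to check. The 80% cache-hit rate below is an assumption for illustration; real billing depends on your actual hit rate.

```python
def blended(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Blended $/MTok at a given input:output token ratio (3:1 here)."""
    return (ratio * input_price + output_price) / (ratio + 1)

print(round(blended(3.00, 15.00), 2))   # Claude Sonnet 4.6 -> 6.0
print(round(blended(5.00, 25.00), 2))   # Claude Opus 4.6   -> 10.0
print(round(blended(1.25, 5.00), 2))    # GPT-5.2           -> 2.19
print(round(blended(1.25, 10.00), 2))   # Gemini 3.1 Pro    -> 3.44

# Effective Sonnet input cost if 80% of input tokens hit the prompt cache
# (90% off cached reads) -- an assumed hit rate, for illustration only.
cached_input = 0.8 * (3.00 * 0.10) + 0.2 * 3.00  # -> 0.84 $/MTok
```

At that assumed hit rate, Sonnet's effective input price drops from $3.00 to $0.84/MTok, which is how it can undercut GPT-5.2 on repeated-context workloads despite the higher list price.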
Consumer plan pricing
| Plan | Price/month | Sonnet 4.6 access | Opus 4.6 access |
|---|---|---|---|
| Free | $0 | ✓ (with daily limits) | ✗ |
| Pro | $20 | ✓ (full access) | ✓ |
| Max (5×) | $100 | ✓ | ✓ (5× Pro capacity) |
| Max (20×) | $200 | ✓ | ✓ (20× Pro capacity) |
| Team | $30/user/mo | ✓ | ✓ |
Sonnet 4.6 is the default for Free and Pro users. Pro at $20/month gets you Opus 4.6 too. Max plans are for power users who exhaust Pro limits daily.
Sonnet 4.6 vs Sonnet 4.5 — What Actually Changed
Both cost $3/$15 per 1M tokens. Sonnet 4.6 is the free upgrade — here's what you get.
| Dimension | Sonnet 4.6 | Sonnet 4.5 | Delta |
|---|---|---|---|
| AA Intelligence Index | 44.38 | 37.14 | +7.2 pts |
| OSWorld (computer use) | 72.5% | 61.4% | +11.1pp — major jump |
| GDPval-AA (office tasks Elo) | 1,633 | Not reported | New leading score |
| Knowledge cutoff (training) | Jan 2026 | July 2025 | +6 months newer |
| SWE-bench Verified (provider) | 79.6% | 77.2% | +2.4pp |
| Claude Code preference vs 4.5 | 59% preferred | 41% preferred | 4.6 wins majority |
| Adaptive thinking tiers | 4 tiers | 4 tiers | Same |
| API price | $3/$15 | $3/$15 | No cost to upgrade |
The OSWorld jump (+11.1pp) is the biggest practical improvement. Computer use in Sonnet 4.5 worked — in Sonnet 4.6, it's substantially more reliable. If your team has autonomous browser or desktop agents on Sonnet 4.5, this is the upgrade that matters.
Sonnet 4.6 vs Gemini 3.1 Pro: Where Each Wins
The most common $3-tier decision for professional API users.
| Dimension | Claude Sonnet 4.6 | Gemini 3.1 Pro |
|---|---|---|
| GPQA Diamond (AA) | 79.9% | 94.1% — Gemini leads |
| HLE (AA) | 13.2% | 44.7% — Gemini leads significantly |
| τ²-bench (AA) | 79.5% | 95.6% — Gemini leads significantly |
| GDPval-AA (office tasks) | 1,633 Elo | 1,317 Elo — Sonnet leads |
| Writing quality | Best-in-class | Strong but less nuanced prose |
| Context window | 200K (1M beta) | 1M (GA) |
| API price (blended) | $6.00/1M | $3.44/1M — Gemini cheaper |
| Data jurisdiction | US / EU (Anthropic) | US (Google Cloud) |
The decision tree: if you need hard science reasoning, agentic tool use, or production 1M-context — Gemini 3.1 Pro. If you need the best writing, nuanced analysis, or GDPval-style office work — Claude Sonnet. If price is the constraint: Gemini 3.1 Pro is 43% cheaper at blended rates.
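The same decision tree, written out as a sketch; the requirement labels are invented for illustration and the priority order follows the prose above.

```python
def choose_model(needs: set[str]) -> str:
    """Encode the $3-tier decision tree above; requirement labels are illustrative."""
    if needs & {"hard_science", "agentic_tools", "1m_context_ga"}:
        return "Gemini 3.1 Pro"
    if needs & {"best_writing", "nuanced_analysis", "office_work"}:
        return "Claude Sonnet 4.6"
    # No hard requirement: price decides, and Gemini is 43% cheaper blended.
    return "Gemini 3.1 Pro"
```

Note the checks run in the order the text gives them, so a workload that needs both hard-science reasoning and office work resolves to Gemini.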
Bottom line
Claude Sonnet 4.6 is the right default choice for the vast majority of professional users. It beats the previous-generation flagship on office tasks, matches Opus on computer use, has a more current knowledge cutoff, and costs 40% less. The cases where you actually need Opus 4.6 are narrower than the marketing suggests: sustained multi-file reasoning over hours, hard science QA at scale, or maximum-output-token generation (128K). For everything else — writing, analysis, coding, agentic workflows, customer-facing products — Sonnet is the answer.
Prices verified February 2026. LLM pricing changes frequently — verify at the provider's site before budgeting.
Last updated: February 17, 2026