🔍 Reviews | May 31, 2026 | 19 min read

By AI Tool Briefing Team

Claude Opus 4.8 Review: Fast Mode Got 3x Cheaper

Anthropic shipped Claude Opus 4.8 on May 28, 2026. That’s 41 days after Opus 4.7. For a company that used to space major model releases six months apart, this is the fastest version cadence Anthropic has ever run, and the release is built around a pricing decision that matters more than most of the benchmark numbers it shipped with.

Fast Mode dropped from $30/$150 per million input/output tokens to $10/$50, while running at 2.5x the speed of standard Opus 4.7 inference. Standard Opus 4.8 stays at $5/$25. If you’ve been paying for Opus over the last six months and ever wondered whether Fast Mode was worth the markup, the answer just changed.

The other two headline moves — Dynamic Workflows for 1,000-agent-per-execution orchestration and a 27-point leap on USAMO 2026 math — are the kind of release notes that read like incremental improvements until you sit with what they actually enable. Repo-scale migrations that used to require human orchestration now run in background. Math benchmarks that were the last serious capability gap against frontier-tuned reasoning models just closed.

Quick Verdict

Aspect Rating
Overall Score ★★★★★ (4.8/5)
Best For Heavy Opus users on Claude Code; agentic workloads at scale
Standard Pricing $5/$25 per MTok (unchanged)
Fast Mode Pricing $10/$50 per MTok (down from $30/$150)
Capability Gain Moderate on standard tasks, large on math + agentic
Honesty Behavior 0% on uncritically reporting flawed results

Bottom line: For teams already paying Anthropic for Opus tier access, this is the rare release where the right move is to upgrade Fast Mode usage up, not down. The economics flipped.

See the official Opus 4.8 release notes

Aspect	Rating
Overall Score	★★★★★ (4.8/5)
Best For	Heavy Opus users on Claude Code; agentic workloads at scale
Standard Pricing	$5/$25 per MTok (unchanged)
Fast Mode Pricing	$10/$50 per MTok (down from $30/$150)
Capability Gain	Moderate on standard tasks, large on math + agentic
Honesty Behavior	0% on uncritically reporting flawed results

The 41-Day Cadence Is the Real Story

Set the benchmarks aside for a second.

Opus 4.5 shipped late 2025. Opus 4.6 followed in February. 4.7 came in mid-April. 4.8 hit on May 28 — 41 days behind 4.7. That’s not the release rhythm of a research lab. That’s the release rhythm of a company that has built a continuous training pipeline and decided to ship every meaningful capability gain instead of saving it for a quarterly marketing moment.

The reason the cadence matters for buyers is that the pricing discipline Anthropic demonstrated with the 4.7 release — holding API rates flat at $5/$25 while delivering meaningful capability gains — wasn’t a one-time decision. It was the start of a recurring pattern. 4.8 ships another step on that same curve. The new Fast Mode pricing is the customer-facing evidence that Anthropic’s cost-per-token on the Opus tier dropped again, and the company is passing it through rather than banking the margin.

That’s the same playbook DeepSeek pulled with the V4-Pro permanent price cut earlier this month, except Anthropic is doing it on the premium tier instead of the value tier. Two different positions in the market, same underlying signal: the cost basis of frontier-class inference is dropping faster than the prices on the public pricing pages.

For teams that lock in annual commitments, the implication is uncomfortable. The vendor that quoted you on April Opus economics has materially cheaper Opus economics today. Renewal math should reflect that, not the pricing page that existed when the contract was negotiated.

What’s Actually New in Opus 4.8

Three substantive changes ship in this release. Each one is doing different work.

Fast Mode pricing dropped 3x

This is the headline. Per Anthropic’s announcement, Fast Mode now runs at $10 per million input tokens and $50 per million output tokens, down from $30/$150. Standard Opus pricing stays at $5/$25.

The speed claim — 2.5x faster than standard Opus 4.7 inference — comes from Anthropic’s published latency tables. Real-world latency depends on prompt length, output length, and tool-call topology, but on coding agent workloads where output tokens dominate, the speedup is consistently in the 2-3x range. That makes Fast Mode the right default for any task where the human is waiting on the model.

The pricing math is the part that should reshape your routing logic. At the old rates, Fast Mode cost 6x more per output token than standard. For interactive coding work, the 2.5x latency gain rarely justified the 6x cost premium. At the new rates, Fast Mode costs 2x more per output token than standard — almost a wash with the 2.5x speedup. The breakeven flips for nearly every interactive workload.

Dynamic Workflows scale to 1,000 total agents per execution

This is the change that will matter most over the next 12 months, even though it’ll get the least attention this week.

Dynamic Workflows lets Claude Code spawn up to 1,000 total agents per execution from a single orchestrating agent, running asynchronously in background. Up to 16 agents run concurrently at any moment — the 1,000 figure is a per-run capacity cap, not a simultaneous parallelism number. Even at 16 concurrent subagents, the fan-out turns a different class of work into a single Claude Code command.

Concretely, the kinds of tasks this enables:

Repo-scale migrations. Rename an API across 800 files with context-aware diffs. Previously a sequential job that took hours. Now a fan-out where each subagent owns a file or directory, and the orchestrator merges results.
Cross-repo audits. Security review across 200 microservices. Each subagent reviews one repo against the same checklist. Reports merge.
Documentation backfill. Generate docstrings or README sections for every function in a large codebase. Each subagent handles one module.
Test generation. Build a parallel coverage pass against an entire codebase, where each subagent writes tests for one source file.

The Dynamic Workflows surface is documented in the Claude Code routines guide update Anthropic pushed alongside the release. For enterprise developer teams running Claude Code at scale, the engineering work to actually use 1,000-way fan-out is non-trivial — orchestration logic, result merging, failure handling — but the ceiling on what a single human-initiated command can accomplish just moved by an order of magnitude.

Benchmark gains: USAMO math is the leap

The benchmark moves are real, but the magnitudes vary. Three numbers are worth attention.

Benchmark	Opus 4.7	Opus 4.8	Delta
SWE-Bench Verified	87.6%	88.6%	+1.0
SWE-Bench Pro	64.3%	69.2%	+4.9
USAMO 2026 Math	69.3%	96.7%	+27.4

SWE-Bench Verified moves a single point (87.6% → 88.6%) — a real but modest gain on a benchmark where Opus 4.7 was already the top model. SWE-Bench Pro moves in the 4-5 point range, which is more meaningful. The narrow Verified delta reflects that 4.7 had already closed most of the gap to perfection on that benchmark; 4.8 inches it forward while making larger gains on Pro, which is the harder of the two.

The USAMO 2026 result is the discontinuity. A 27-point jump on the United States of America Mathematical Olympiad benchmark, landing at 96.7%, puts Opus 4.8 at near-saturation on what was a serious capability gap against reasoning-tuned frontier models like GPT-5.5 with extended thinking. The implication for users who don’t care about competition math is that whatever architectural work produced the math leap is also reflected in adjacent reasoning tasks — proof construction, symbolic manipulation, formal logic, and multi-step deductive chains.

The 4x improvement on shipping unflagged flawed code is the safety-adjacent benchmark that procurement teams will care about most. Per Anthropic’s published evaluations, Opus 4.8 is roughly four times less likely than 4.7 to ship code with unflagged flaws. That number folds into the broader honesty story: on the “uncritically reporting flawed results” eval, 4.8 scores 0%. The model now consistently flags its own uncertainty and identifies flaws in results it produces, rather than presenting them as clean.

How Fast Mode Pricing Reshapes Routing

Here’s the practical Fast Mode question every Opus customer should run this week.

At the old rates, the routing rule looked like this:

Use Fast Mode for tasks where latency is the bottleneck and budget is not the bottleneck.

At the new rates, the routing rule looks like this:

Use Fast Mode for tasks where the human is in the loop. Use Standard for batch and async.

The cost premium of Fast Mode over Standard dropped from roughly 6x to roughly 2x per output token. For interactive coding work — Claude Code sessions, IDE-resident chat, agent loops where the human is reviewing each step — the new economics almost always favor Fast Mode. The 2.5x speedup means the human spends less time waiting, which means more iterations per hour, which means more value per dollar even at the higher rate.

The math on a Claude Code session running 500K output tokens per engineer per day:

Mode	Per-Day Cost	Per-Engineer-Month (20 days)
Standard Opus 4.8	$12.50	$250
Fast Mode Opus 4.8 (new)	$25.00	$500
Fast Mode Opus 4.7 (old)	$75.00	$1,500

The new Fast Mode tier is $250 more per engineer per month than Standard. If that delta produces 30 minutes of recovered engineering time per day, the math pays back at any reasonable loaded engineer cost. For most teams, the recovered time at 2.5x speed is more than 30 minutes.

For batch and async workloads — overnight migration runs, agentic background work via Dynamic Workflows, data processing pipelines — Standard is still the right default. The human isn’t waiting, the latency premium produces no value, and the cost gap matters at scale.

The routing logic that needs to update is the Claude Code defaults inside enterprise teams. Most internal deployments are still configured around the old 6x premium, which biased every default toward Standard. The new economics flip that default, and the routing config update is a one-line change that produces an immediate latency improvement across an engineering org.

Where Opus 4.8 Still Struggles

The release is genuinely strong, but the honest limitations remain.

Cost basis vs frontier-cheap models. At $5/$25 standard pricing, Opus 4.8 is still 6-7x more expensive per token than DeepSeek V4-Pro. The gap we covered in the DeepSeek price-cut analysis didn’t close. For non-regulated, routine, latency-flexible workloads, the cost case for routing some traffic away from Opus to a cheaper model is unchanged.

Capacity story. Anthropic’s chronic capacity issue — peak-hour rate limits, dedicated capacity sold to enterprise tiers first — didn’t get solved by the release. The block-or-moat capacity debate holds. Heavy Fast Mode adoption following this release is likely to make peak-hour capacity tighter before it gets better.

Dynamic Workflows requires real engineering. The 1,000-total-agent-per-execution capability is a ceiling, not a default. Teams that want to actually use it need to design orchestration topology, handle subagent failures, merge results, and manage the rate-limit accounting across the fan-out. For most teams, productive use of the new ceiling will land in Q3, not this week.

Multimodal lag. Opus 4.8 didn’t ship meaningful gains on the vision and audio surfaces. Gemini 3 Pro still leads on document-with-images workflows and any task where the input is screenshot-heavy. If your use case is multimodal-first, the Opus 4.8 release doesn’t change the comparison.

The pricing tier dance. Fast Mode at $10/$50 is cheaper than it was, but it’s a new tier above Standard rather than a replacement for it. Anthropic is increasingly running a tiered pricing strategy — Pro Plus on the subscription side, Fast Mode on the API side, dedicated capacity above that — and the procurement complexity is going up. That’s a defensible business strategy for a vendor heading into the October IPO, but it adds work for buyers who need to model their total spend.

Who Should Upgrade Their Usage

A few specific recommendations, by buyer profile.

Claude Code users on Standard

Move your interactive sessions to Fast Mode this week. The latency improvement is large enough that engineers will notice it inside the first day, and the cost delta at the new pricing is well below the productivity gain at typical engineer loaded costs.

Keep batch jobs, scheduled runs, and Dynamic Workflows background fan-outs on Standard. The latency premium is wasted on workloads where no human is waiting on a response.

Heavy Opus API customers (>$50K/month spend)

Renegotiate. The Fast Mode price cut is a public reset on Opus tier economics. Any contract negotiated against 4.7 pricing — including any quote you have outstanding from an Anthropic AE this quarter — is now on the wrong side of the public rate. The account team has flexibility this week that they won’t have once the next two weeks of customer renegotiations land.

If your contract is up in Q3 or Q4, open the conversation now. Multi-year commitments at the new Fast Mode pricing are likely available in exchange for predictable revenue, which Anthropic prioritizes ahead of the IPO.

Pro and Team subscribers

Nothing changes immediately. Pro subscribers don’t see API-tier Fast Mode pricing in the same way. What you do see is that the underlying model serving the Pro tier is now Opus 4.8 by default, which means the benchmark gains land on your queries without any price change. That’s the cleanest win in the release for individual users — better model, same price.

The Pro Plus tier rumors from the Series H coverage didn’t materialize in this release. Expect a Pro Plus announcement separately, probably tied to higher rate caps and broader Fast Mode access on the consumer side, sometime in Q3.

Teams running multi-agent harnesses

The Dynamic Workflows surface is the right place to focus. The multi-agent harness coverage from earlier this year sketched the general shape; 4.8 makes the 1,000-parallel ceiling real. The engineering work to actually exercise that ceiling is non-trivial, but the ceiling itself unlocks workload shapes that weren’t tractable on prior releases.

The right first project is the repo-scale migration use case. It’s the workload where the orchestration logic is simplest (file-level fan-out, simple merge), the value is concrete (engineering hours saved on tedious work), and the failure modes are well-understood (failed migrations are visible and easy to retry).

Opus 4.8 vs Opus 4.7: When the Upgrade Is Worth Defending

For most users, Opus 4.8 is a drop-in replacement for 4.7 at the same Standard price point. The capability gains are real but incremental on routine work; you’ll feel them on math-heavy tasks and complex agentic flows but not necessarily on standard chat or document drafting.

The cases where the upgrade has a hard, defensible ROI:

Anything where the human is waiting on the model. Fast Mode at the new pricing wins the latency-vs-cost trade on interactive workloads.
Coding agent workloads with multi-step reasoning. The 1-point gain on SWE-Bench Verified compounds across long agent loops where each step’s accuracy gates the next.
Anything involving formal reasoning, proofs, or symbolic math. The USAMO leap reflects an underlying capability shift that lands on adjacent tasks.
Repo-scale agentic work. Dynamic Workflows is the feature; the breakeven is engineering hours spent on orchestration vs hours saved on the migration.
Workflows where flawed-result detection matters. The 4x improvement on shipping unflagged flaws is the kind of safety gain that procurement teams will care about for regulated workloads.

For everything else — drafting, summarization, general chat, light coding, customer-facing chatbots — the 4.7-to-4.8 upgrade is more about ambient quality than a feature you’ll point to. That’s still a fine reason to be on 4.8 by default, but the business case for actively rerouting traffic to capture the gains is narrower than the headline benchmark moves suggest.

How to Start With Opus 4.8

Three concrete steps, in order.

Update your Claude Code defaults. Switch interactive sessions to Fast Mode and confirm the model is set to Opus 4.8. The configuration change is documented in the Claude Code routines guide. Most teams will see the latency improvement inside the first day.
Audit your batch and async jobs. Anything currently routed to Fast Mode for cost reasons can now be evaluated honestly — at the new rates, the routing decision is purely about whether a human is waiting. Standard is still the right default for everything else.
Start one Dynamic Workflows project. Pick a single repo-scale task — migration, audit, doc backfill — and build the orchestration. The engineering work is real but bounded, and the proof point on a single project unlocks the budget conversation for broader adoption.

The Bottom Line

The 41-day cadence is the structural story. The Fast Mode price cut is the immediate story. The Dynamic Workflows ceiling is the 12-month story. All three follow from the same underlying engine: Anthropic has built a continuous training pipeline running on a falling cost basis, and the company has decided to ship every meaningful step rather than save them up.

For teams already on Opus, the right move this week is to upgrade Fast Mode usage and renegotiate the renewal. For teams on the fence about Claude Code adoption, the math just got easier — the interactive cost case at Fast Mode pricing is materially better than it was 60 days ago.

For everyone watching the broader market, this release is the first concrete evidence that the pricing pressure DeepSeek established is forcing US frontier labs to respond — not on headline rates, but on tier engineering. Anthropic didn’t cut Standard pricing. They cut the premium tier that most aggressive users were already paying. The competitive read is that the lab figured out it could move that lever without trading away its main pricing power, and the result is a release that improves the value proposition for the customers Anthropic most wants to keep.

The next data point is whether OpenAI matches on its own equivalent tier inside the next 60 days. If GPT-5.5’s “Pro” or extended-reasoning pricing moves, the cost-basis arms race is officially the dominant frame for AI economics in 2026. If it doesn’t, Anthropic widens its developer-economics lead heading into the October IPO. Either way, the Opus 4.8 release is the cleanest signal yet that the rate at which frontier-class inference gets cheaper is accelerating, not slowing.

Frequently Asked Questions

When did Claude Opus 4.8 ship?

May 28, 2026 — 41 days after Opus 4.7, per Anthropic’s announcement. It is the fastest cadence between major Opus releases the company has run.

What changed in Fast Mode pricing?

Fast Mode dropped from $30 per million input tokens and $150 per million output tokens to $10 and $50 respectively. That’s a 3x cut. The mode runs at roughly 2.5x the speed of standard Opus 4.7 inference. Standard Opus 4.8 pricing stays at $5/$25 per million tokens, unchanged from 4.7.

What are Dynamic Workflows?

Dynamic Workflows is an agentic orchestration surface that lets Claude Code spawn up to 1,000 total agents per execution from a single orchestrating agent, running asynchronously in background. Up to 16 agents run concurrently at any given moment — the 1,000 figure is a per-run capacity cap, not a simultaneous concurrency number. The ceiling unlocks repo-scale migrations, cross-repo audits, parallel test generation, and other workload shapes that weren’t tractable on prior releases.

How much better are the benchmarks?

SWE-Bench Verified moved from 87.6% to 88.6% — a 1-point gain. SWE-Bench Pro moved from 64.3% to 69.2%. The USAMO 2026 math benchmark jumped from 69.3% to 96.7% — a 27.4-point leap that closes a serious capability gap on formal reasoning tasks. Opus 4.8 is also roughly 4x less likely than 4.7 to ship code with unflagged flaws, and scores 0% on the “uncritically reports flawed results” honesty eval.

Should I upgrade from Opus 4.7 to 4.8?

Yes, for most users, since Standard pricing is unchanged and the model is a drop-in replacement. The harder question is whether to start routing interactive workloads through Fast Mode at the new pricing. For Claude Code sessions and IDE-resident chat, the answer is almost always yes — the 2.5x latency gain at 2x cost beats the prior 2.5x latency gain at 6x cost.

How does Opus 4.8 compare to GPT-5.5?

The GPT-5.5 vs Opus 4.7 coding comparison covered the prior matchup. The 4.8 gains on SWE-Bench and the math benchmarks push the comparison further in Anthropic’s direction on coding and formal reasoning. GPT-5.5 still leads on a few multimodal and ecosystem dimensions. The honest read is that 4.8 widens the Anthropic developer-economics lead heading into Q3.

Is Fast Mode the same model as Standard?

Yes. Fast Mode and Standard Opus 4.8 are the same underlying model, served on different infrastructure that prioritizes latency. The capability is identical; the trade-off is speed for cost. The pricing now reflects a much smaller premium for the latency optimization than the prior release did.

What about Sonnet and Haiku 4.8?

Anthropic did not announce updated Sonnet or Haiku models alongside the Opus 4.8 release. The expectation across most analysts is that Sonnet and Haiku 4.8 follow within 4-6 weeks on the established cadence. Until then, Sonnet 4.7 and Haiku 4.7 remain the middle and bottom of the price-performance stack.

Does the Dynamic Workflows ceiling apply to Pro subscribers?

Not at the 1,000-parallel ceiling. The Dynamic Workflows surface is primarily a Claude Code and API feature, and the practical fan-out limits on Pro and Team plans are well below the API ceiling. Pro subscribers do get the Opus 4.8 model upgrade for chat-tier work at no price change, which is the cleanest win in the release for individual users.

Last updated: May 31, 2026. Sources: Anthropic Opus 4.8 announcement · Anthropic pricing · Claude Code documentation.