Google I/O 2026 Preview: What Actually Matters for AI Buyers and Builders
Google I/O 2026 starts in seven days, and almost every preview piece you can find right now is about Android 17. That’s fine if you sell phones. It’s useless if you buy or build AI tools for a living. The keynote on May 19 is going to be model-heavy, infrastructure-heavy, and, based on what’s already leaked out of Mountain View, the biggest reset to Google’s developer surface since Vertex AI got folded into the Gemini Enterprise Agent Platform last month.
This is the preview for the people who actually have to make procurement decisions afterward.
Quick Summary: What Matters for AI Pros at Google I/O 2026
| Detail | Info |
|---|---|
| Dates | May 19-20, 2026 — Shoreline Amphitheatre, Mountain View |
| Main keynote | May 19, 10:00 AM PT |
| Developer keynote | May 19, 1:30 PM PT |
| Confirmed AI-track session | "Agent-first workflows from prompt to production" |
| Headline model | Gemini 4 — confirmed 10M+ token context, native multimodal |
| Leaked video model | Gemini Omni — chat-driven editing, watermark removal, in-clip object swap |
| Leaked Flash tier | Gemini 3.2 Flash — $0.25/M input, $2.00/M output |
| Surface story | Firebase pivoting to agent-native, Antigravity agent harness inside Google AI Studio |

Bottom line: This is the I/O where Google answers two open questions at once — can it ship a frontier model that closes the gap with GPT-5 and Claude Opus on context and reasoning, and can it turn Firebase into the default deployment surface for agent-built apps? If both land, the enterprise math on Google Cloud shifts again. If only one lands, expect a partial reset.
The schedule is published. The main keynote livestreams from Shoreline at 10:00 AM PT. The developer keynote follows at 1:30 PM PT. After that, Google has scheduled four 3:30 PM sessions — “What’s new in Google AI,” “What’s new in Android,” “What’s new in Chrome,” and “Agent-first workflows from prompt to production.”
That last session is the tell. Google doesn’t put “agent-first workflows” into the post-keynote slot unless the morning keynote is going to set up something concrete that the afternoon session has to explain in detail. The shape we’re looking at: a model announcement in the morning, a developer-platform story in the afternoon, and a Firebase-and-agent story stitching them together.
Everything streams free at io.google. No badge required. The people you actually want to listen to are the developer relations folks running the breakout sessions. They show up live in the YouTube comments and on the io.google chat. That’s the channel for technical questions the keynote skips.
Google has confirmed Gemini 4 is the I/O model. The two specs that have made it out of the rumor mill into something close to consensus reporting:
10M+ token context window. This is the headline. Gemini 3.1 Pro launched at 1M tokens and already led the field on long-context work. A 10x jump puts Gemini 4 into territory where it can hold an entire enterprise codebase, a multi-year deal archive, or a research corpus in working memory without retrieval gymnastics. For the agent use case — long-running tasks that span days and need persistent state — this number matters more than any reasoning benchmark.
Native multimodal end-to-end. Gemini has always advertised native multimodality. In practice, Gemini 3 still preprocessed video and audio into intermediate representations before reasoning over them. Gemini 4, per the same set of leaks, removes that preprocessing step. Audio comes in as audio. Video comes in as video. The reasoning happens directly on the raw modality. If that pans out, it closes the gap with OpenAI’s GPT-Realtime-2 on the voice side and meaningfully widens Gemini’s lead on the video side.
What we do not have confirmed: pricing, exact context-window number (10M is the lower bound in the rumor set), or whether Gemini 4 ships in Pro and Flash variants on day one. The smart bet, based on the Gemini 3.2 Flash leak in the iOS app, is that Flash gets its own announcement track and Pro lands as the morning’s flagship.
For most AI buyers, the reasoning-benchmark gap between frontier models has stopped mattering operationally. GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro all post numbers that are good enough for everything inside a normal enterprise workflow. The differentiator that actually changes architecture decisions is context.
A 1M context model lets you stuff a single mid-size project into the prompt. A 10M context model lets you stuff a department’s entire knowledge base in. That’s not a feature upgrade — it’s a different deployment shape. RAG pipelines built around chunking, embeddings, and retrieval start to look like overhead instead of necessity. Long-running agents stop needing external memory layers cobbled together with vector databases. The agent-platform consolidation Google announced at Cloud Next is positioned for exactly this kind of model.
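To put rough numbers on that shape change, here's a quick capacity check. The corpus sizes and the tokens-per-word ratio below are illustrative assumptions, not measurements:

```ts
// Capacity check: which corpora fit a 1M vs a 10M token window?
// Corpus sizes and the tokens-per-word ratio are illustrative assumptions.
const TOKENS_PER_WORD = 1.3; // rough heuristic for English text

const corpora: Record<string, number> = {
  "single mid-size project (~600k words)": 600_000,
  "multi-year deal archive (~4M words)": 4_000_000,
  "department knowledge base (~7M words)": 7_000_000,
};

for (const [name, words] of Object.entries(corpora)) {
  const tokens = words * TOKENS_PER_WORD;
  const at1M = tokens <= 1_000_000 ? "fits in prompt" : "needs RAG";
  const at10M = tokens <= 10_000_000 ? "fits in prompt" : "needs RAG";
  console.log(`${name}: ~${(tokens / 1e6).toFixed(1)}M tokens | 1M: ${at1M} | 10M: ${at10M}`);
}
// 600k words (~0.8M tokens) fits today's 1M window; the 4M- and 7M-word
// corpora only fit once the window hits 10M, at which point the RAG layer
// becomes optional rather than mandatory.
```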
The risk: 10M context is expensive to serve at inference time. If Gemini 4 Pro costs three or four times what 3.1 Pro costs per million tokens, the math only works for the workflows where the long context is load-bearing. Watch the pricing slide carefully.
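And the cost side, as a minimal sketch. Gemini 4 pricing is unannounced, so the $2/M baseline below is a pure placeholder standing in for current Pro-tier input pricing; only the "three or four times" multiplier comes from the scenario above:

```ts
// Cost check: what does one fully loaded 10M-token call cost?
// Gemini 4 pricing is unannounced; the baseline rate is a placeholder.
const ASSUMED_PRO_INPUT_PER_M = 2.0; // $/M input tokens, assumed baseline
const CONTEXT_TOKENS = 10_000_000;   // the rumored window, fully used

for (const multiplier of [1, 3, 4]) { // 1x, plus the "three or four times" risk case
  const rate = ASSUMED_PRO_INPUT_PER_M * multiplier;
  const costPerCall = (CONTEXT_TOKENS / 1_000_000) * rate;
  console.log(`${multiplier}x baseline ($${rate}/M): $${costPerCall} per max-context call`);
}
// At 3-4x a $2/M baseline, every max-context call burns $60-80 in input
// tokens alone: fine for load-bearing work, ruinous as a default.
```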
The most interesting leak of the past two weeks did not come from a benchmark or a TechCrunch source. It came from a UI string.
On May 2, X user @Thomas16937378 spotted a placeholder line inside Gemini’s video generation tab that read “Start with an idea or try a template. Powered by Omni.” The string sat next to “Toucan,” the known internal codename for Google’s current Veo-3.1-backed video pathway. TestingCatalog confirmed the find the same day and walked through the demo footage that surfaced alongside it.
Two things make this leak more interesting than a typical pre-keynote rumor.
First, the placement. UI copy that names a brand reaches that state of polish when the team is preparing a public release, not while they’re still A/B testing internally. Whoever flipped that string from the codename “Toucan” to the user-facing “Omni” did it because they expected users to see it within days, not months.
Second, the editing demos. The footage that leaked alongside the string showed three things working unusually well for a first public glimpse: watermark removal from existing video, in-clip object swapping (“replace the red car with a black one”), and scene rewriting via chat instructions (“make the dialogue more apologetic, keep the framing”). Each of those individually is interesting. The combination — chat-driven, frame-by-frame, in-place editing instead of regeneration — is the kind of capability that resets a product category.
The category in question is video editing. If Omni ships on May 19 with even half the editing capability the leaks suggest, then the best AI video generators roundup needs a rewrite, the AI video editing landscape gets a new top-line entrant, and Runway, Pika, and the standalone video AI startups need to have a hard conversation with their boards.
The most legally questionable part of the leak deserves a call-out. Watermark removal as a marketed feature is going to trigger a fight with rights holders before Google ships it. The legitimate use case — removing your own watermark from your own footage — is real. The illegitimate use case — circumventing rights management on someone else's clip — is exactly the kind of liability that compounds an already unsustainable cost structure, as Sora's shutdown demonstrated.
If Omni ships with watermark removal as advertised in the leak, watch for the announcement to be hedged with usage policies, attribution requirements, or limits on third-party content. If Google does not hedge, the lawsuit clock starts on May 19.
The flagship is the headline, but the Flash-tier announcement is where the procurement math lives. Gemini 3.2 Flash already leaked in Google’s iOS app and AI Studio on May 5 with concrete pricing pulled from the AI Studio billing screen.
$0.25 per million input tokens. $2.00 per million output tokens.
That input price matches Gemini 3.1 Flash-Lite. The output price comes in below Gemini 3 Flash’s $3.00/M, and well below the $5-10/M range that Claude Haiku and GPT-5 Mini occupy. For volume use cases — high-throughput classification, summarization, agent step-execution, voice transcription — this is the new floor.
If you are running an agent platform on GPT-5 Mini or Claude Haiku today and your output tokens dominate your bill, Gemini 3.2 Flash is going to force a rerun of your unit economics. The model has reportedly outperformed Gemini 3.1 Pro on coding tasks in the leaked LM Arena evaluations, which is the part that makes the pricing genuinely disruptive rather than just a discount. A faster, cheaper model that performs above the previous-generation Pro is a price reset, not a tier addition.
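A rough version of that rerun, using the leaked Flash rate and the $5-10/M incumbent range cited above; the monthly volume is an assumed figure for illustration:

```ts
// Output-token bill for an output-heavy agent workload. The Flash rate is
// the leaked figure; the incumbent range is the $5-10/M cited above.
// Monthly volume is an assumed number for illustration.
const MONTHLY_OUTPUT_TOKENS_M = 500; // 500M output tokens/month, assumed

const outputRates: Record<string, number> = {
  "Gemini 3.2 Flash (leaked)": 2.0,
  "incumbent small tier, low end": 5.0,
  "incumbent small tier, high end": 10.0,
};

for (const [name, perM] of Object.entries(outputRates)) {
  console.log(`${name}: $${MONTHLY_OUTPUT_TOKENS_M * perM}/month`);
}
// 500M output tokens/month: $1,000 on the leaked rate vs $2,500-5,000 on
// the incumbent range. A 2.5-5x gap on the dominant line item is the rerun.
```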
What I’d watch for in the keynote: rate limits, regional availability, and whether the published $0.25/M number is the Standard tier or the Batch tier. Google has historically reserved its sharpest pricing for batch inference, and a Standard-tier reveal that’s 50% higher than the leak would still be aggressive but materially different from the headline.
The Cloud Next keynote in April was about agents at the enterprise platform layer. I/O is going to be about agents at the developer-platform layer, and the surface for that story is Firebase.
The pieces are already public:

- Google AI Studio's new build experience runs a coding agent on the same harness that ships inside Google Antigravity.
- That agent already auto-provisions Cloud Firestore for persistent storage and Firebase Authentication for user identity when it detects those needs in the prompt.
- Firebase has previewed its agent-native pivot in two separate blog posts, both linked in the sources below.
What this adds up to: Firebase is positioning to be the default backend for agent-generated full-stack apps. Not a side product. The thing the agent reaches for by default when it needs a database, an auth system, or a deployment target.
The competitive frame matters here. Vercel has v0 and the Next.js stack. Supabase has its own AI integrations. Cloudflare has Workers AI and a growing developer-tool surface. Firebase has been around longer than any of them but has not had a clean agent story until now. If the I/O announcement lands cleanly, Google’s pitch to a startup founder is no longer “use Firebase because it’s mature.” It’s “describe what you want to build, and Firebase is what the agent provisions.”
That’s a different sale.
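For a sense of what "the thing the agent reaches for" looks like in practice, here's a minimal sketch of agent-emitted scaffolding using today's Firebase modular Web SDK. The provisioning flow itself hasn't been shown publicly, so the config plumbing is an assumption; the SDK calls are the current public API:

```ts
// A sketch of the wiring an agent would emit when it reaches for Firebase
// as its default backend: today's modular Web SDK, nothing more. Config
// values are placeholders the agent would fill from the project it
// provisioned; the provisioning step itself is not shown (not public).
import { initializeApp } from "firebase/app";
import { getAuth, onAuthStateChanged } from "firebase/auth";
import { getFirestore, collection, addDoc } from "firebase/firestore";

const app = initializeApp({
  apiKey: "<generated-by-provisioning-step>",
  authDomain: "<project-id>.firebaseapp.com",
  projectId: "<project-id>",
});

const auth = getAuth(app);    // Firebase Authentication for user identity
const db = getFirestore(app); // Cloud Firestore for persistent storage

// The kind of app-level code the agent would generate around that wiring:
onAuthStateChanged(auth, async (user) => {
  if (!user) return;
  await addDoc(collection(db, "notes"), {
    owner: user.uid,
    body: "first note",
    createdAt: Date.now(),
  });
});
```

The point is less the code than the default: if the agent emits this wiring unprompted, Firebase wins the backend decision before a human ever compares vendors.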
| | Google I/O 2026 (May) | OpenAI DevDay (Q4 2025) | AWS re:Invent (Dec 2025) | Anthropic Builder Days (Mar 2026) |
|---|---|---|---|---|
| Headline model story | Gemini 4 (10M context, native multimodal) | GPT-5.5, Realtime API | Bedrock model catalog, AgentCore | Claude Opus 4.7, Code Routines |
| Developer-tool story | AI Studio + Antigravity + Firebase | Responses API, Realtime API, Apps in ChatGPT | AgentCore Runtime, AgentCore Identity | Claude Code Routines, MCP servers |
| Deployment surface | Firebase agent-native | Microsoft 365, ChatGPT app store | Bedrock | Claude Code, third-party IDEs |
| Enterprise angle | Gemini Enterprise Agent Platform extension | Azure AI Foundry, AWS Bedrock | Bedrock, AgentCore Identity | Vertex AI, Bedrock, direct API |
| Pricing story | 3.2 Flash at $0.25/M input | GPT-5 Mini tier, batch discounts | Bedrock provisioned throughput pricing | Haiku tier, prompt caching |
The pattern is clear. The big events in this cycle are increasingly about deployment, not models. The model story is the keynote attention-grabber. The deployment story is what determines whether the model gets used.
Five concrete moves for the week before I/O if AI tools sit anywhere in your roadmap:

1. Hold any new Gemini contract signature until after May 19; pre-I/O commitments are likely to be re-priced within thirty days.
2. Ask existing vendors for a price-protection clause tied to the May 19 announcements.
3. Rerun unit economics on your output-heavy workloads against the leaked Gemini 3.2 Flash pricing.
4. Audit which of your RAG pipelines exist only to work around context limits; a 10M-token window changes that math.
5. Register at io.google for the live developer-relations chat, where the substantive technical Q&A happens during breakout sessions.
The teams that ran this playbook before OpenAI’s GPT-Realtime-2 launch saved themselves three weeks of post-keynote scrambling. The pattern repeats every major announcement cycle.
The Android coverage is going to dominate the live tweets. Ignore it.
The signal that matters at I/O 2026 is whether Google can ship a frontier model that is genuinely competitive on context and multimodality, and whether it can stitch that model into a developer surface that turns “I want to build an app” into a one-prompt operation. If both land, Google’s enterprise AI story finally lines up — frontier model, agent platform, developer toolchain, deployment surface, partner ecosystem — in a way that matches what AWS and Microsoft already have.
The risk for Google is that the model lands and the platform story under-delivers, or the other way around. A great Gemini 4 with a confusing Firebase story is a model that gets used inside other people’s platforms. A great Firebase agent story without a frontier model is a deployment surface for someone else’s frontier work. Google needs both to land.
The leak signal is unusually strong this cycle. Gemini Omni is real enough to have UI strings in production. Gemini 3.2 Flash is real enough to have pricing visible in the iOS app. Firebase’s agent-native pivot has already been previewed in two separate blog posts. The keynote is unlikely to be a surprise. It’s likely to be a confirmation, a price announcement, and a release date.
For AI buyers, that’s actually the better outcome. The hardest planning moments come from the events that surprise the field. I/O 2026 is going to confirm what most of us already suspect. The planning value is in moving on Day 8, not Day 1.
We’ll publish the post-keynote analysis the morning of May 20, with the actual numbers, the actual launch dates, and the actual surprises — because there are always two or three. For now, the preview is enough to start the procurement conversations that have to happen anyway.
When is Google I/O 2026? Google I/O 2026 runs May 19-20, 2026 at Shoreline Amphitheatre in Mountain View, with all keynotes livestreamed for free at io.google. The main keynote starts at 10:00 AM PT on May 19, with the developer-focused keynote at 1:30 PM PT the same day.
Is Gemini 4 launching at Google I/O 2026? Gemini 4 is confirmed as the I/O 2026 model announcement. Pre-event reporting points to a 10M+ token context window and native end-to-end multimodality (audio and video processed directly, without an intermediate representation step). Pricing and exact availability dates are expected at the May 19 keynote.
What is the Gemini Omni video model? Gemini Omni is an unreleased Google video model that surfaced via a UI string in the Gemini app on May 2, 2026. Leaked demos show chat-driven editing, watermark removal, in-clip object swapping (“swap the red car for a black one”), and scene rewriting via natural-language instructions. It is expected to be announced at Google I/O 2026.
How much will Gemini 3.2 Flash cost? Per the leaked pricing visible in Google AI Studio on May 5, 2026, Gemini 3.2 Flash is set at $0.25 per million input tokens and $2.00 per million output tokens. That’s the same input rate as Gemini 3.1 Flash-Lite and below Gemini 3 Flash’s $3.00/M output rate. The numbers are subject to confirmation at the I/O keynote.
What does “Firebase agent-native” actually mean? Firebase is being integrated with Google AI Studio’s new build experience, which uses a coding agent built on the same harness that ships inside Google Antigravity. The agent auto-provisions Cloud Firestore for persistent storage and Firebase Authentication for user identity when it detects those needs in the prompt. The full agent-native deployment story is expected to be detailed in the May 19 developer keynote.
Should I wait for I/O 2026 before signing a new Gemini contract? For most teams, yes. The leaked pricing on Gemini 3.2 Flash and the expected Gemini 4 announcement mean any commitment signed in the week before I/O is likely to be re-priced within thirty days. Existing customers can ask for a price-protection clause tied to the May 19 announcements.
How does Google I/O 2026 compare to OpenAI DevDay or AWS re:Invent? All three events have shifted from model-only stories to deployment-platform stories. Google’s I/O 2026 emphasis on Firebase agent-native development is the developer-platform parallel to AWS Bedrock AgentCore and Microsoft Azure AI Foundry. For enterprise procurement, the enterprise AI deployment guide walks through the cross-platform decision matrix.
Is Gemini 4 going to beat GPT-5.5 and Claude Opus 4.7? Unknown until the benchmarks land. The context-window story (10M+ tokens) is a clear lead. On reasoning, the current Gemini 3.1 Pro versus Opus 4.6 versus GPT-5.2 comparison gives a baseline for where the field stands going in. Watch the GPQA and ARC-AGI numbers on the keynote slide.
Where do I watch Google I/O 2026 if I’m not attending in person? Everything streams free at io.google/2026. No registration required to watch the keynotes. Registration on the I/O site gives access to the live developer-relations chat, which is where most of the substantive technical Q&A happens during breakout sessions.
Last updated: May 12, 2026. Sources: Google I/O 2026 official event page · Google “save the date” announcement · TestingCatalog coverage of the Gemini Omni leak · ChromeUnboxed Omni leak coverage · AIxploria Gemini 3.2 Flash leak coverage · Firebase agent-native announcement · Google AI Studio full-stack vibe coding announcement.
Related reading: Gemini 3.1 Pro Review · Google Antigravity Review · Google Cloud Next 2026: Gemini Enterprise Agent Platform · OpenAI GPT-Realtime-2 Launch · Best AI Video Generators 2026 · Best AI Video Editing Tools 2026 · Enterprise AI Deployment Guide · Gemini 3.1 Pro vs Claude Opus 4.6 vs GPT-5.2