I/O 2025: Google DeepMind + Gemini for Devs
Year: 2025
Demis Hassabis (CEO) · Tulsee Doshi (Head of Product, Gemini)
Segments (8)
- 00:00:07 · Introduction to the Gemini Era — Demis Hassabis
- Demis Hassabis introduces the rapid progress in AI and highlights the capabilities and developer adoption of the Gemini 2.5 family of models.
- 00:02:41 · 30 Things You Can Build With Gemini
- A fast-paced video montage showcasing creative and technical applications built with Gemini, from simulations to coding with voice.
- 00:03:58 · Gemini 2.5 for Developers — Tulsee Doshi
- Tulsee Doshi takes the stage to detail developer-focused improvements for Gemini 2.5, including new capabilities, security, cost-efficiency, and control.
- 00:07:39 · Demo: Coding a 3D Web App with Gemini 2.5 Pro — Tulsee Doshi
- A live demo shows Gemini 2.5 Pro in Google AI Studio transforming a hand-drawn sketch into a functional 3D photo sphere web application.
- 00:13:00 · Introducing Gemini Diffusion — Tulsee Doshi
- Introduction of Gemini Diffusion, a new experimental text diffusion model designed for extremely low-latency text generation.
- 00:14:38 · The Future of Gemini: World Models & Project Astra — Demis Hassabis
- Demis Hassabis returns to discuss the vision for Gemini, including the concept of ‘World Models’, Gemini Robotics, and the universal AI assistant, Project Astra.
- 00:19:26 · Demo: Project Astra in Action
- A pre-recorded video demonstrates Project Astra’s capabilities as a multimodal AI assistant helping a user identify bike parts, find manuals, and make calls.
- 00:21:51 · AI for Science and Accessibility — Demis Hassabis
- Demis Hassabis covers breakthroughs in AI for science (AlphaProof, Co-Scientist, AlphaFold 3) and a partnership with Aira to help the visually impaired using Project Astra technology.
Products Announced (8)
- 00:01:47 · Gemini 2.5 Flash (Updated Preview)
- Improved reasoning, code, and long context capabilities · High speed and low cost · Ranks #2 on LMArena leaderboard, second only to 2.5 Pro
- Generally available in early June
- 00:04:25 · Gemini 2.5 Native Audio Output (Preview)
- First-of-its-kind multi-speaker support for two voices · Expressive speech with nuanced tones, including whispering · Supports over 24 languages and code-switching
- Available in the Gemini API today
- 00:06:14 · Gemini 2.5 Thought Summaries (Experimental)
- Organizes the model’s raw thoughts into a clear, structured format · Provides transparency into the model’s reasoning process · Aids in debugging and understanding model actions
- Included in 2.5 Pro and Flash via Gemini API and Vertex AI
- 00:07:00 · Gemini 2.5 Pro Thinking Budgets (Coming Soon)
- Gives developers control over the trade-off between cost/latency and quality · Allows setting a token budget for the model’s ‘thinking’ phase · Can be turned off for faster, less deliberative responses
- Coming soon to 2.5 Pro
- 00:12:27 · Jules (Public Beta)
- Asynchronous coding agent based on Gemini 2.5 Pro · Tackles complex tasks in large codebases (e.g., version upgrades) · Integrates with GitHub and works autonomously
- Sign-ups available now at jules.google
- 00:14:58 · Gemini 2.5 Pro Deep Think (Trusted Tester)
- A new mode that pushes model performance to its limits · Uses cutting-edge research in thinking and reasoning, including parallel techniques · Achieves groundbreaking results on difficult math and coding benchmarks
- Available to trusted testers via Gemini API
- 00:18:02 · Gemini Robotics (Research)
- Specialized model to teach robots useful tasks · Enables robots to grasp objects, follow instructions, and adapt · Leverages world model understanding of physical environments
- Demo available in the AI Sandbox at the event
- 00:22:12 · AlphaProof, Co-Scientist, AlphaEvolve, AMIE Medical, AlphaFold 3 (Research)
- AI models for advancing scientific discovery · Solving math problems, collaborating with researchers, and predicting molecular structures · Revolutionizing drug discovery and AI training itself
- Research publications and models
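The Thinking Budgets and Thought Summaries entries above describe two request-level controls: a token cap on the model's thinking phase and an option to return a structured summary of its reasoning. The sketch below only illustrates that trade-off. It does not call the Gemini API, and `ThinkingSettings`, `build_request`, and the field names `thinking_budget` and `include_thoughts` are hypothetical names chosen for this example. Treating a budget of 0 as "thinking turned off" is also an assumption; the talk does not say how the switch is expressed.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ThinkingSettings:
    """Hypothetical container for the two controls described in the talk."""

    # Maximum tokens the model may spend thinking; None means "model default".
    thinking_budget: Optional[int] = None
    # Whether to request a thought summary alongside the answer.
    include_thoughts: bool = False

    def __post_init__(self) -> None:
        if self.thinking_budget is not None and self.thinking_budget < 0:
            raise ValueError("thinking_budget must be non-negative")


def build_request(prompt: str, settings: ThinkingSettings) -> Dict[str, Any]:
    """Assemble an illustrative payload (not a real wire format)."""
    thinking = asdict(settings)
    if thinking["thinking_budget"] is None:
        # Leave the budget out entirely so the model default applies.
        del thinking["thinking_budget"]
    return {"prompt": prompt, "thinking": thinking}


# Low latency and cost: assume a budget of 0 disables thinking.
fast = build_request("Summarize this changelog.", ThinkingSettings(thinking_budget=0))

# Higher quality on a hard task: larger budget, plus a thought summary for debugging.
careful = build_request(
    "Find the race condition in this code.",
    ThinkingSettings(thinking_budget=8192, include_thoughts=True),
)
```

The design point is that the same prompt can be sent with different settings, so an application could route routine requests through the cheap configuration and reserve the larger budget for tasks where answer quality matters more than latency.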
Benchmarks Shown (5)
- 00:01:17 · WebDev Arena: 1415
- Gemini 2.5 Pro tops the leaderboard, +142 vs March release.
- 00:02:01 · LMArena: 1424
- Gemini 2.5 Flash is ranked #2, behind Gemini 2.5 Pro.
- 00:15:19 · USAMO 2025 (Mathematics): 49.4%
- Gemini 2.5 Pro Deep Think significantly outperforms Gemini 2.5 Pro (34.5%) and OpenAI models.
- 00:15:19 · LiveCodeBench v6 (Code): 80.4%
- Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
- 00:15:19 · MMMU (Multimodality): 84.0%
- Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
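A short worked calculation helps put the figures above in context. The sketch derives the March WebDev Arena score implied by the "+142" gain and the USAMO gap between Deep Think and standard 2.5 Pro. It then converts the 142-point gap into an expected head-to-head win rate using the conventional Elo formula. Applying that formula, with its 400-point scale, is an assumption: the talk shows an Elo score but does not describe how the leaderboard computes it. The function names are illustrative only.

```python
def implied_previous_score(current: int, gain: int) -> int:
    """Score before the reported gain was applied."""
    return current - gain


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Conventional Elo expected score of A against B (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# WebDev Arena: 1415 now, reported as +142 over the March release.
march_webdev = implied_previous_score(1415, 142)

# Expected win rate of the current model over the March release, under the
# Elo assumption stated above.
win_prob = elo_expected_score(1415, march_webdev)

# USAMO 2025: Deep Think (49.4%) vs. standard 2.5 Pro (34.5%), in percentage points.
usamo_gain = round(49.4 - 34.5, 1)

print(march_webdev, round(win_prob, 2), usamo_gain)
```

Under these assumptions the March release would have scored 1273, the 142-point gap corresponds to the newer model being preferred in roughly 69% of head-to-head comparisons, and Deep Think adds 14.9 percentage points on USAMO 2025.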
Commitments / Timelines (5)
- 00:02:07 (in early June) — Gemini 2.5 Flash will be generally available.
- 00:02:09 (soon after) — Gemini 2.5 Pro will be generally available.
- 00:05:27 (today) — Native audio output is available in the Gemini API.
- 00:12:29 (now) — Jules is now in public beta.
- 00:21:39 (soon) — Project Astra capabilities are coming to Gemini Live, Search Live, and the Live API for developers.
Demos (4)
- 00:04:40 ✓ · Gemini 2.5 Native Audio Output — Tulsee Doshi
- A demonstration of the model’s new text-to-speech capabilities, including expressive tones, whispering, and seamlessly switching between English and Hindi.
- 00:07:39 ✓ · Coding a 3D Web App from a Sketch — Tulsee Doshi
- In Google AI Studio, a hand-drawn sketch of a photo sphere was uploaded, and Gemini 2.5 Pro generated the HTML, CSS, and JavaScript (using three.js) to create an interactive 3D web application.
- 00:13:45 ✓ · Gemini Diffusion Real-Time Generation — Tulsee Doshi
- A math problem was given as a prompt, and the Gemini Diffusion model generated the step-by-step solution almost instantaneously, demonstrating its low latency.
- 00:19:26 ✓ · Project Astra: AI Assistant for Bike Repair
- A pre-recorded video showed a user interacting with Project Astra on a phone. The AI identified bike parts, searched for manuals, found YouTube tutorials, read emails to find part sizes, and initiated a call to a bike shop.
Notable Quotes (7)
- 00:00:51 — Demis Hassabis:
Gemini 2.5 Pro is our most intelligent model ever and the best foundation model in the world.
- 00:04:30 — Tulsee Doshi:
These now have a first-of-its-kind multi-speaker support for two voices, built on native audio output.
- 00:06:07 — Tulsee Doshi:
So Gemini 2.5 is our most secure model yet.
- 00:14:58 — Demis Hassabis:
Today, we’re making 2.5 Pro even better by introducing a new mode we’re calling Deep Think.
- 00:16:41 — Demis Hassabis:
We’re working hard to extend it to become what we call a world model.
- 00:18:41 — Demis Hassabis:
This is our ultimate vision for the Gemini app: to transform it into a universal AI assistant.
- 00:23:13 — Demis Hassabis:
I’ve always believed, if done safely and responsibly, it has the potential to accelerate scientific discovery and be the most beneficial technology ever invented.
Visual Signals (Beyond the Transcript)
On-Screen Text Moments (8)
- 00:00:11 · Google DeepMind
- Brands the following segment as coming from Google’s core AI research lab.
- 00:00:50 · Gemini 2.5 Pro - Our most intelligent model ever
- A clear, bold claim about the model’s superiority.
- 00:01:17 · WebDev Arena - 1415 Elo score
- Shows a specific benchmark score to substantiate claims of coding leadership.
- 00:02:07 · Gemini 2.5 Flash - Generally available in early June
- Announces the general availability timeline for the new Flash model.
- 00:04:11 · List of Gemini 2.5 improvements: Improved capabilities, Enhanced security and transparency, Better c
- Outlines the key themes of the developer-focused updates.
- 00:12:30 · jules.google
- Provides the direct URL for developers to sign up for the new coding agent.
- 00:15:19 · Bar charts for Mathematics (USAMO 2025), Code (LiveCodeBench v6), and Multimodality (MMMU) benchmark
- Visually compares the performance of Gemini 2.5 Pro Deep Think against other models, showing significant leads.
- 00:24:25 · A summary slide showing all the announced products and concepts under the Gemini umbrella.
- Recaps the entire presentation, connecting various projects like Gemini Live, Project Astra, and AI for Science into a cohesive vision.
Stage Moments (7)
- 00:00:07 · The presentation begins with a pre-recorded segment featuring Demis Hassabis in a studio.
- 00:00:11 · Demis Hassabis walks onto the live Google I/O stage in front of a large audience.
- 00:02:11 · The audience applauds loudly for the announcement of Gemini 2.5 Flash’s general availability.
- 00:03:51 · Tulsee Doshi walks onto the stage as Demis Hassabis introduces her.
- 00:09:34 · The audience applauds enthusiastically after the successful demo of generating a 3D web app from a sketch.
- 00:14:30 · Demis Hassabis walks back onto the stage to take over from Tulsee Doshi.
- 00:15:00 · The audience murmurs and applauds at the announcement of ‘Deep Think’ mode.
Visual Demos (5)
- 00:01:08 · A demo of Gemini 2.5 Pro turning a hand-drawn sketch of an earthquake into an interactive 3D city simulation.
- A split screen showing a simple drawing on the left and a complex, interactive 3D city model being generated on the right.
- 00:07:50 · Tulsee Doshi demonstrates coding a 3D photo gallery in Google AI Studio.
- The AI Studio interface with a prompt area, code editor, and a live preview pane. A hand-drawn sketch is uploaded, and the model generates code that creates a 3D sphere of photos.
- 00:13:47 · A demonstration of Gemini Diffusion’s speed.
- A complex math problem is shown as a prompt, and the full, step-by-step solution appears almost instantly on the screen, emphasizing the low latency.
- 00:17:15 · A video generated by Genie 2, a generative world model.
- A playable 3D video game world featuring a robot in a futuristic city, generated from a single image prompt.
- 00:19:26 · A pre-recorded demo of Project Astra.
- A first-person view from a smartphone camera. The AI assistant highlights objects, understands spoken commands, navigates web pages and other apps on the phone, and even interrupts and resumes conversation naturally.
Production Signals (5)
- 00:00:07 · Pre-recorded segment
- 00:00:11 · Transition to live on-stage presentation
- 00:02:41 · Pre-produced demo reel
- 00:19:26 · Pre-recorded, edited product demo video
- 00:24:39 · Pre-produced short film (Aira partnership)
Key Topics
Gemini 2.5 Pro · Gemini 2.5 Flash · AI for Developers · Multimodality · Native Audio Output · AI Agents · Coding with AI · AI for Science · Project Astra · World Models · Gemini Diffusion · Robotics · AI Safety · Low-latency Models
Takeaways
- Google is rapidly iterating on its flagship Gemini models, with Gemini 2.5 Pro positioned as the world’s best foundation model and 2.5 Flash as a highly efficient, low-cost alternative.
- The focus is shifting towards making AI a proactive, universal assistant, as shown by the Project Astra vision, which integrates memory, context, and the ability to act across apps and devices.
- Developer experience is paramount, with new tools like native multi-speaker audio output, ‘Thinking Budgets’ for cost control, and ‘Thought Summaries’ for transparency.
- Google is pushing the research frontier with concepts like ‘World Models’ that can simulate reality, ‘Gemini Diffusion’ for ultra-fast text generation, and ‘Deep Think’ mode for complex reasoning.
- AI is being applied to solve grand challenges, with significant progress in AI for science (AlphaFold 3, AlphaProof) and new applications in robotics (Gemini Robotics) and accessibility (Aira partnership).
- Multimodality is expanding beyond text and images to include deeply integrated, expressive, and context-aware audio and video understanding, forming the basis for next-generation AI assistants.