I/O 2025: Google DeepMind + Gemini for Devs

Year: 2025

Demis Hassabis (CEO) · Tulsee Doshi (Head of Product, Gemini)

Segments (8)

00:00:07 · Introduction to the Gemini Era — Demis Hassabis
- Demis Hassabis introduces the rapid progress in AI and highlights the capabilities and developer adoption of the Gemini 2.5 family of models.
00:02:41 · 30 Things You Can Build With Gemini
- A fast-paced video montage showcasing creative and technical applications built with Gemini, from simulations to coding with voice.
00:03:58 · Gemini 2.5 for Developers — Tulsee Doshi
- Tulsee Doshi takes the stage to detail developer-focused improvements for Gemini 2.5, including new capabilities, security, cost-efficiency, and control.
00:07:39 · Demo: Coding a 3D Web App with Gemini 2.5 Pro — Tulsee Doshi
- A live demo shows Gemini 2.5 Pro in Google AI Studio transforming a hand-drawn sketch into a functional 3D photo sphere web application.
00:13:00 · Introducing Gemini Diffusion — Tulsee Doshi
- Introduction of Gemini Diffusion, a new experimental text diffusion model designed for extremely low-latency text generation.
00:14:38 · The Future of Gemini: World Models & Project Astra — Demis Hassabis
- Demis Hassabis returns to discuss the vision for Gemini, including the concept of ‘World Models’, Gemini Robotics, and the universal AI assistant, Project Astra.
00:19:26 · Demo: Project Astra in Action
- A pre-recorded video demonstrates Project Astra’s capabilities as a multimodal AI assistant helping a user identify bike parts, find manuals, and make calls.
00:21:51 · AI for Science and Accessibility — Demis Hassabis
- Demis Hassabis covers breakthroughs in AI for science (AlphaProof, Co-Scientist, AlphaFold 3) and a partnership with Aira to help the visually impaired using Project Astra technology.

Products Announced (8)

00:01:47 · Gemini 2.5 Flash (Updated Preview)
- Improved reasoning, coding, and long-context capabilities · High speed and low cost · Ranks #2 on the LMArena leaderboard, second only to 2.5 Pro
- Generally available in early June
00:04:25 · Gemini 2.5 Native Audio Output (Preview)
- First-of-its-kind multi-speaker support for two voices · Expressive speech with nuanced tones, including whispering · Supports over 24 languages and code-switching
- Available in the Gemini API today
00:06:14 · Gemini 2.5 Thought Summaries (Experimental)
- Organizes the model’s raw thoughts into a clear, structured format · Provides transparency into the model’s reasoning process · Aids in debugging and understanding model actions
- Included in 2.5 Pro and Flash via Gemini API and Vertex AI
00:07:00 · Gemini 2.5 Pro Thinking Budgets (Coming Soon)
- Gives developers control over the trade-off between cost/latency and quality · Allows setting a token budget for the model’s ‘thinking’ phase · Can be turned off for faster, less deliberative responses
- Coming soon to 2.5 Pro
00:12:27 · Jules (Public Beta)
- Asynchronous coding agent based on Gemini 2.5 Pro · Tackles complex tasks in large codebases (e.g., version upgrades) · Integrates with GitHub and works autonomously
- Sign-ups available now at jules.google
00:14:58 · Gemini 2.5 Pro Deep Think (Trusted Tester)
- A new mode that pushes model performance to its limits · Uses cutting-edge research in thinking and reasoning, including parallel techniques · Achieves groundbreaking results on difficult math and coding benchmarks
- Available to trusted testers via Gemini API
00:18:02 · Gemini Robotics (Research)
- Specialized model to teach robots useful tasks · Enables robots to grasp objects, follow instructions, and adapt · Leverages world model understanding of physical environments
- Demo available in the AI Sandbox at the event
00:22:12 · AlphaProof, Co-Scientist, AlphaEvolve, AMIE Medical, AlphaFold 3 (Research)
- AI models for advancing scientific discovery · Solving math problems, collaborating with researchers, and predicting molecular structures · Revolutionizing drug discovery and AI training itself
- Research publications and models
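
Sketch: Thinking Budgets and Thought Summaries as request settings
- The talk describes a token budget for the model's thinking phase and an option to return summarized thoughts, but does not show the request format. The following is a minimal sketch that only builds the configuration fragment and makes no network call. The field names `thinkingConfig`, `thinkingBudget`, and `includeThoughts` are assumptions based on the public Gemini REST API and should be checked against the current reference; `build_generation_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Optional


def build_generation_config(
    thinking_budget: Optional[int] = None,
    include_thoughts: bool = False,
) -> Dict[str, Any]:
    """Build a generationConfig fragment for a request body.

    Field names are assumptions (not stated in the talk); verify them
    against the current API reference before use.
    """
    thinking: Dict[str, Any] = {}
    if thinking_budget is not None:
        # Assumed semantics: 0 turns thinking off, larger values
        # allow more deliberation at higher cost and latency.
        thinking["thinkingBudget"] = thinking_budget
    if include_thoughts:
        # Assumed flag requesting thought summaries with the answer.
        thinking["includeThoughts"] = True
    return {"thinkingConfig": thinking} if thinking else {}


# Fast, less deliberative response: thinking switched off.
fast = build_generation_config(thinking_budget=0)

# Higher-quality response with summarized reasoning for debugging.
careful = build_generation_config(thinking_budget=2048, include_thoughts=True)

print(json.dumps(careful, indent=2))
```

- The budget value 2048 is an arbitrary illustration, not a recommendation from the talk.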

Benchmarks Shown (5)

00:01:17 · WebDev Arena: 1415
- Gemini 2.5 Pro tops the leaderboard, +142 vs March release.
00:02:01 · LMArena: 1424
- Gemini 2.5 Flash is ranked #2, behind Gemini 2.5 Pro.
00:15:19 · USAMO 2025 (Mathematics): 49.4%
- Gemini 2.5 Pro Deep Think significantly outperforms Gemini 2.5 Pro (34.5%) and OpenAI models.
00:15:19 · LiveCodeBench v6 (Code): 80.4%
- Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
00:15:19 · MMMU (Multimodality): 84.0%
- Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
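
Sketch: reading the benchmark deltas
- The figures above can be put in context with a little arithmetic. This sketch assumes that "+142" is an Elo-point gain and that the arena rating follows the conventional Elo logistic curve with a 400-point scale. Neither assumption is confirmed in the talk, and arena leaderboards may use a related but different rating model.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected head-to-head score of A against B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# WebDev Arena: current score and the stated gain over the March release.
webdev_now = 1415
webdev_gain = 142
webdev_march = webdev_now - webdev_gain  # implied March score

# Under the assumed Elo model, a 142-point gap corresponds to the newer
# release being preferred in roughly 69% of head-to-head comparisons.
win_rate = elo_expected_score(webdev_now, webdev_march)

# USAMO 2025: Deep Think versus standard 2.5 Pro.
deep_think = 49.4
base_pro = 34.5
absolute_gain = deep_think - base_pro    # percentage points
relative_gain = absolute_gain / base_pro  # fraction of the base score

print(f"Implied March WebDev score: {webdev_march}")
print(f"Expected preference rate:   {win_rate:.1%}")
print(f"USAMO gain: {absolute_gain:.1f} points ({relative_gain:.0%} relative)")
```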

Commitments / Timelines (5)

00:02:07 (in early June) — Gemini 2.5 Flash will be generally available.
00:02:09 (soon after) — Gemini 2.5 Pro will be generally available.
00:05:27 (today) — Native audio output is available in the Gemini API.
00:12:29 (now) — Jules is now in public beta.
00:21:39 (soon) — Project Astra capabilities are coming to Gemini Live, Search Live, and the Live API for developers.

Demos (4)

00:04:40 ✓ · Gemini 2.5 Native Audio Output — Tulsee Doshi
- A demonstration of the model’s new text-to-speech capabilities, including expressive tones, whispering, and seamlessly switching between English and Hindi.
00:07:39 ✓ · Coding a 3D Web App from a Sketch — Tulsee Doshi
- In Google AI Studio, a hand-drawn sketch of a photo sphere was uploaded, and Gemini 2.5 Pro generated the HTML, CSS, and JavaScript (using three.js) to create an interactive 3D web application.
00:13:45 ✓ · Gemini Diffusion Real-Time Generation — Tulsee Doshi
- A math problem was given as a prompt, and the Gemini Diffusion model generated the step-by-step solution almost instantaneously, demonstrating its low latency.
00:19:26 ✓ · Project Astra: AI Assistant for Bike Repair
- A pre-recorded video showed a user interacting with Project Astra on a phone. The AI identified bike parts, searched for manuals, found YouTube tutorials, read emails to find part sizes, and initiated a call to a bike shop.
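
Sketch: a two-speaker audio configuration
- The native audio output demo above uses two voices, but the talk does not show how a request is configured. The following is a minimal sketch that builds only the configuration fragment and sends nothing. The field names (`responseModalities`, `speechConfig`, `multiSpeakerVoiceConfig`, `speakerVoiceConfigs`, `prebuiltVoiceConfig`) are assumptions based on the public Gemini REST API, the voice names are placeholders, and `build_two_speaker_config` is a hypothetical helper.

```python
from typing import Any, Dict


def build_two_speaker_config(voices: Dict[str, str]) -> Dict[str, Any]:
    """Map two speaker labels to voice names in a speech config fragment.

    Field names are assumptions; verify them against the current
    API reference before use.
    """
    if len(voices) != 2:
        # The talk describes multi-speaker support for two voices.
        raise ValueError("this sketch expects exactly two speakers")

    speaker_configs = [
        {
            "speaker": speaker,
            "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}},
        }
        for speaker, voice in voices.items()
    ]
    return {
        "responseModalities": ["AUDIO"],
        "speechConfig": {
            "multiSpeakerVoiceConfig": {"speakerVoiceConfigs": speaker_configs}
        },
    }


# "Host" and "Guest" are hypothetical speaker labels that would also
# appear in the prompt transcript; the voice names are placeholders.
config = build_two_speaker_config({"Host": "Kore", "Guest": "Puck"})
```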

Notable Quotes (7)

00:00:51 — Demis Hassabis:

Gemini 2.5 Pro is our most intelligent model ever and the best foundation model in the world.
00:04:30 — Tulsee Doshi:

These now have a first-of-its-kind multi-speaker support for two voices, built on native audio output.
00:06:07 — Tulsee Doshi:

So Gemini 2.5 is our most secure model yet.
00:14:58 — Demis Hassabis:

Today, we’re making 2.5 Pro even better by introducing a new mode we’re calling Deep Think.
00:16:41 — Demis Hassabis:

We’re working hard to extend it to become what we call a world model.
00:18:41 — Demis Hassabis:

This is our ultimate vision for the Gemini app: to transform it into a universal AI assistant.
00:23:13 — Demis Hassabis:

I’ve always believed, if done safely and responsibly, it has the potential to accelerate scientific discovery and be the most beneficial technology ever invented.

Visual Signals (Beyond the Transcript)

On-Screen Text Moments (8)

00:00:11 · Google DeepMind
- Brands the following segment as coming from Google’s core AI research lab.
00:00:50 · Gemini 2.5 Pro - Our most intelligent model ever
- A clear, bold claim about the model’s superiority.
00:01:17 · WebDev Arena - 1415 Elo score
- Shows a specific benchmark score to substantiate claims of coding leadership.
00:02:07 · Gemini 2.5 Flash - Generally available in early June
- Announces the general availability timeline for the new Flash model.
00:04:11 · List of Gemini 2.5 improvements: Improved capabilities, Enhanced security and transparency, Better c
- Outlines the key themes of the developer-focused updates.
00:12:30 · jules.google
- Provides the direct URL for developers to sign up for the new coding agent.
00:15:19 · Bar charts for Mathematics (USAMO 2025), Code (LiveCodeBench v6), and Multimodality (MMMU) benchmark
- Visually compares the performance of Gemini 2.5 Pro Deep Think against other models, showing significant leads.
00:24:25 · A summary slide showing all the announced products and concepts under the Gemini umbrella.
- Recaps the entire presentation, connecting various projects like Gemini Live, Project Astra, and AI for Science into a cohesive vision.

Stage Moments (7)

00:00:07 · The presentation begins with a pre-recorded segment featuring Demis Hassabis in a studio.
00:00:11 · Demis Hassabis walks onto the live Google I/O stage in front of a large audience.
00:02:11 · The audience applauds loudly for the announcement of Gemini 2.5 Flash’s general availability.
00:03:51 · Tulsee Doshi walks onto the stage as Demis Hassabis introduces her.
00:09:34 · The audience applauds enthusiastically after the successful demo of generating a 3D web app from a sketch.
00:14:30 · Demis Hassabis walks back onto the stage to take over from Tulsee Doshi.
00:15:00 · The audience murmurs and applauds at the announcement of ‘Deep Think’ mode.

Visual Demos (5)

00:01:08 · A demo of Gemini 2.5 Pro turning a hand-drawn sketch of an earthquake into an interactive 3D city simulation.
- A split screen showing a simple drawing on the left and a complex, interactive 3D city model being generated on the right.
00:07:50 · Tulsee Doshi demonstrates coding a 3D photo gallery in Google AI Studio.
- The AI Studio interface with a prompt area, code editor, and a live preview pane. A hand-drawn sketch is uploaded, and the model generates code that creates a 3D sphere of photos.
00:13:47 · A demonstration of Gemini Diffusion’s speed.
- A complex math problem is shown as a prompt, and the full, step-by-step solution appears almost instantly on the screen, emphasizing the low latency.
00:17:15 · A video generated by Genie 2, a generative world model.
- A playable 2D video game world featuring a robot in a futuristic city, generated from a single image prompt.
00:19:26 · A pre-recorded demo of Project Astra.
- A first-person view from a smartphone camera. The AI assistant highlights objects, understands spoken commands, navigates web pages and other apps on the phone, and handles interruptions, resuming the conversation naturally.

Production Signals (5)

00:00:07 · Pre-recorded segment
00:00:11 · Transition to live on-stage presentation
00:02:41 · Pre-produced demo reel
00:19:26 · Pre-recorded, edited product demo video
00:24:39 · Pre-produced short film (Aira partnership)

Key Topics

Gemini 2.5 Pro · Gemini 2.5 Flash · AI for Developers · Multimodality · Native Audio Output · AI Agents · Coding with AI · AI for Science · Project Astra · World Models · Gemini Diffusion · Robotics · AI Safety · Low-latency Models

Takeaways

Google is rapidly iterating on its flagship Gemini models, with Gemini 2.5 Pro positioned as the world’s best foundation model and 2.5 Flash as a highly efficient, low-cost alternative.
The focus is shifting towards making AI a proactive, universal assistant, as shown by the Project Astra vision, which integrates memory, context, and the ability to act across apps and devices.
Developer experience is paramount, with new tools like native multi-speaker audio output, ‘Thinking Budgets’ for cost control, and ‘Thought Summaries’ for transparency.
Google is pushing the research frontier with concepts like ‘World Models’ that can simulate reality, ‘Gemini Diffusion’ for ultra-fast text generation, and ‘Deep Think’ mode for complex reasoning.
AI is being applied to solve grand challenges, with significant progress in AI for science (AlphaFold 3, AlphaProof) and new applications in robotics (Gemini Robotics) and accessibility (Aira partnership).
Multimodality is expanding beyond text and images to include deeply integrated, expressive, and context-aware audio and video understanding, forming the basis for next-generation AI assistants.