I/O 2025: Google DeepMind + Gemini for Devs

Year: 2025

Demis Hassabis (CEO) · Tulsee Doshi (Head of Product, Gemini)

Segments (8)

  • 00:00:07 · Introduction to the Gemini Era — Demis Hassabis
    • Demis Hassabis introduces the rapid progress in AI and highlights the capabilities and developer adoption of the Gemini 2.5 family of models.
  • 00:02:41 · 30 Things You Can Build With Gemini
    • A fast-paced video montage showcasing creative and technical applications built with Gemini, from simulations to coding with voice.
  • 00:03:58 · Gemini 2.5 for Developers — Tulsee Doshi
    • Tulsee Doshi takes the stage to detail developer-focused improvements for Gemini 2.5, including new capabilities, security, cost-efficiency, and control.
  • 00:07:39 · Demo: Coding a 3D Web App with Gemini 2.5 Pro — Tulsee Doshi
    • A live demo shows Gemini 2.5 Pro in Google AI Studio transforming a hand-drawn sketch into a functional 3D photo sphere web application.
  • 00:13:00 · Introducing Gemini Diffusion — Tulsee Doshi
    • Introduction of Gemini Diffusion, a new experimental text diffusion model designed for extremely low-latency text generation.
  • 00:14:38 · The Future of Gemini: World Models & Project Astra — Demis Hassabis
    • Demis Hassabis returns to discuss the vision for Gemini, including the concept of ‘World Models’, Gemini Robotics, and the universal AI assistant, Project Astra.
  • 00:19:26 · Demo: Project Astra in Action
    • A pre-recorded video demonstrates Project Astra’s capabilities as a multimodal AI assistant helping a user identify bike parts, find manuals, and make calls.
  • 00:21:51 · AI for Science and Accessibility — Demis Hassabis
    • Demis Hassabis covers breakthroughs in AI for science (AlphaProof, Co-Scientist, AlphaFold 3) and a partnership with Aira to help the visually impaired using Project Astra technology.

Products Announced (8)

  • 00:01:47 · Gemini 2.5 Flash (Updated Preview)
    • Improved reasoning, code, and long context capabilities · High speed and low cost · Ranks #2 on LMArena leaderboard, second only to 2.5 Pro
    • Generally available in early June
  • 00:04:25 · Gemini 2.5 Native Audio Output (Preview)
    • First-of-its-kind multi-speaker support for two voices · Expressive speech with nuanced tones, including whispering · Supports over 24 languages and code-switching
    • Available in the Gemini API today
  • 00:06:14 · Gemini 2.5 Thought Summaries (Experimental)
    • Organizes the model’s raw thoughts into a clear, structured format · Provides transparency into the model’s reasoning process · Aids in debugging and understanding model actions
    • Included in 2.5 Pro and Flash via Gemini API and Vertex AI
  • 00:07:00 · Gemini 2.5 Pro Thinking Budgets (Coming Soon)
    • Gives developers control over the trade-off between cost/latency and quality · Allows setting a token budget for the model’s ‘thinking’ phase · Can be turned off for faster, less deliberative responses
    • Coming soon to 2.5 Pro
  • 00:12:27 · Jules (Public Beta)
    • Asynchronous coding agent based on Gemini 2.5 Pro · Tackles complex tasks in large codebases (e.g., version upgrades) · Integrates with GitHub and works autonomously
    • Sign-ups available now at jules.google
  • 00:14:58 · Gemini 2.5 Pro Deep Think (Trusted Tester)
    • A new mode that pushes model performance to its limits · Uses cutting-edge research in thinking and reasoning, including parallel techniques · Achieves groundbreaking results on difficult math and coding benchmarks
    • Available to trusted testers via Gemini API
  • 00:18:02 · Gemini Robotics (Research)
    • Specialized model to teach robots useful tasks · Enables robots to grasp objects, follow instructions, and adapt · Leverages world model understanding of physical environments
    • Demo available in the AI Sandbox at the event
  • 00:22:12 · AlphaProof, Co-Scientist, AlphaEvolve, AMIE Medical, AlphaFold 3 (Research)
    • AI models for advancing scientific discovery · Solving math problems, collaborating with researchers, and predicting molecular structures · Revolutionizing drug discovery and AI training itself
    • Research publications and models
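
The 'Thinking Budgets' and 'Thought Summaries' items above describe per-request controls. The sketch below shows how such a request body might be assembled. It is illustrative only: the field names `thinkingConfig`, `thinkingBudget`, and `includeThoughts` are assumptions modeled on the Gemini API's JSON conventions and were not shown in the session, and `build_request` is a hypothetical helper, not part of any SDK.

```python
import json
from typing import Any, Dict, Optional


def build_request(
    prompt: str,
    thinking_budget: Optional[int] = None,
    include_thoughts: bool = False,
) -> Dict[str, Any]:
    """Assemble a generateContent-style JSON body (field names are assumed)."""
    thinking_config: Dict[str, Any] = {}
    if thinking_budget is not None:
        # Token budget for the model's thinking phase; 0 is assumed to mean "off".
        thinking_config["thinkingBudget"] = thinking_budget
    if include_thoughts:
        # Ask for a structured summary of the model's thoughts.
        thinking_config["includeThoughts"] = True

    body: Dict[str, Any] = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_config:
        body["generationConfig"] = {"thinkingConfig": thinking_config}
    return body


if __name__ == "__main__":
    # Cheaper, faster call: thinking disabled.
    print(json.dumps(build_request("Summarize this changelog.", thinking_budget=0)))
    # Higher-quality call: larger budget plus thought summaries for debugging.
    print(json.dumps(build_request("Plan a schema migration.", 2048, True)))
```

A smaller budget trades answer quality for lower cost and latency, which is the trade-off the session describes. Check the current API reference for the exact field names before relying on this shape.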

Benchmarks Shown (5)

  • 00:01:17 · WebDev Arena: 1415
    • Gemini 2.5 Pro tops the leaderboard, +142 vs March release.
  • 00:02:01 · LMArena: 1424
    • Gemini 2.5 Flash is ranked #2, behind Gemini 2.5 Pro.
  • 00:15:19 · USAMO 2025 (Mathematics): 49.4%
    • Gemini 2.5 Pro Deep Think significantly outperforms Gemini 2.5 Pro (34.5%) and OpenAI models.
  • 00:15:19 · LiveCodeBench v6 (Code): 80.4%
    • Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
  • 00:15:19 · MMMU (Multimodality): 84.0%
    • Gemini 2.5 Pro Deep Think outperforms Gemini 2.5 Pro and OpenAI models.
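
To put the WebDev Arena figure in perspective, the +142 gain can be turned into an expected head-to-head win rate. This is a minimal sketch assuming the leaderboard uses the conventional Elo logistic scale with a 400-point divisor; the session did not state the scale, and `elo_expected_score` is an illustrative name.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


current = 1415             # score shown for Gemini 2.5 Pro
previous = current - 142   # implied score of the March release
p = elo_expected_score(current, previous)

if __name__ == "__main__":
    print(f"{previous} -> {current}: expected win rate {p:.3f}")
```

Under that assumption, the newer release would be preferred in roughly 69% of pairwise comparisons against the March release.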

Commitments / Timelines (5)

  • 00:02:07 (in early June) — Gemini 2.5 Flash will be generally available.
  • 00:02:09 (soon after) — Gemini 2.5 Pro will be generally available.
  • 00:05:27 (today) — Native audio output is available in the Gemini API.
  • 00:12:29 (now) — Jules is now in public beta.
  • 00:21:39 (soon) — Project Astra capabilities are coming to Gemini Live, Search Live, and the Live API for developers.

Demos (4)

  • 00:04:40 ✓ · Gemini 2.5 Native Audio Output — Tulsee Doshi
    • A demonstration of the model’s new text-to-speech capabilities, including expressive tones, whispering, and seamlessly switching between English and Hindi.
  • 00:07:39 ✓ · Coding a 3D Web App from a Sketch — Tulsee Doshi
    • In Google AI Studio, a hand-drawn sketch of a photo sphere was uploaded, and Gemini 2.5 Pro generated the HTML, CSS, and JavaScript (using three.js) to create an interactive 3D web application.
  • 00:13:45 ✓ · Gemini Diffusion Real-Time Generation — Tulsee Doshi
    • A math problem was given as a prompt, and the Gemini Diffusion model generated the step-by-step solution almost instantaneously, demonstrating its low latency.
  • 00:19:26 ✓ · Project Astra: AI Assistant for Bike Repair
    • A pre-recorded video showed a user interacting with Project Astra on a phone. The AI identified bike parts, searched for manuals, found YouTube tutorials, read emails to find part sizes, and initiated a call to a bike shop.

Notable Quotes (7)

  • 00:00:51 — Demis Hassabis:

    Gemini 2.5 Pro is our most intelligent model ever and the best foundation model in the world.

  • 00:04:30 — Tulsee Doshi:

    These now have a first-of-its-kind multi-speaker support for two voices, built on native audio output.

  • 00:06:07 — Tulsee Doshi:

    So Gemini 2.5 is our most secure model yet.

  • 00:14:58 — Demis Hassabis:

    Today, we’re making 2.5 Pro even better by introducing a new mode we’re calling Deep Think.

  • 00:16:41 — Demis Hassabis:

    We’re working hard to extend it to become what we call a world model.

  • 00:18:41 — Demis Hassabis:

    This is our ultimate vision for the Gemini app: to transform it into a universal AI assistant.

  • 00:23:13 — Demis Hassabis:

    I’ve always believed, if done safely and responsibly, it has the potential to accelerate scientific discovery and be the most beneficial technology ever invented.

Visual Signals (Beyond the Transcript)

On-Screen Text Moments (8)

  • 00:00:11 · Google DeepMind
    • Brands the following segment as coming from Google’s core AI research lab.
  • 00:00:50 · Gemini 2.5 Pro - Our most intelligent model ever
    • A clear, bold claim about the model’s superiority.
  • 00:01:17 · WebDev Arena - 1415 Elo score
    • Shows a specific benchmark score to substantiate claims of coding leadership.
  • 00:02:07 · Gemini 2.5 Flash - Generally available in early June
    • Announces the general availability timeline for the new Flash model.
  • 00:04:11 · List of Gemini 2.5 improvements: Improved capabilities, Enhanced security and transparency, Better c
    • Outlines the key themes of the developer-focused updates.
  • 00:12:30 · jules.google
    • Provides the direct URL for developers to sign up for the new coding agent.
  • 00:15:19 · Bar charts for Mathematics (USAMO 2025), Code (LiveCodeBench v6), and Multimodality (MMMU) benchmark
    • Visually compares the performance of Gemini 2.5 Pro Deep Think against other models, showing significant leads.
  • 00:24:25 · A summary slide showing all the announced products and concepts under the Gemini umbrella.
    • Recaps the entire presentation, connecting various projects like Gemini Live, Project Astra, and AI for Science into a cohesive vision.

Stage Moments (7)

  • 00:00:07 · The presentation begins with a pre-recorded segment featuring Demis Hassabis in a studio.
  • 00:00:11 · Demis Hassabis walks onto the live Google I/O stage in front of a large audience.
  • 00:02:11 · The audience applauds loudly for the announcement of Gemini 2.5 Flash’s general availability.
  • 00:03:51 · Tulsee Doshi walks onto the stage as Demis Hassabis introduces her.
  • 00:09:34 · The audience applauds enthusiastically after the successful demo of generating a 3D web app from a sketch.
  • 00:14:30 · Demis Hassabis walks back onto the stage to take over from Tulsee Doshi.
  • 00:15:00 · The audience murmurs and applauds at the announcement of ‘Deep Think’ mode.

Visual Demos (5)

  • 00:01:08 · A demo of Gemini 2.5 Pro turning a hand-drawn sketch of an earthquake into an interactive 3D city simulation.
    • A split screen showing a simple drawing on the left and a complex, interactive 3D city model being generated on the right.
  • 00:07:50 · Tulsee Doshi demonstrates coding a 3D photo gallery in Google AI Studio.
    • The AI Studio interface with a prompt area, code editor, and a live preview pane. A hand-drawn sketch is uploaded, and the model generates code that creates a 3D sphere of photos.
  • 00:13:47 · A demonstration of Gemini Diffusion’s speed.
    • A complex math problem is shown as a prompt, and the full, step-by-step solution appears almost instantly on the screen, emphasizing the low latency.
  • 00:17:15 · A video generated by Genie 2, a generative world model.
    • A playable 3D video game world featuring a robot in a futuristic city, generated from a single image prompt.
  • 00:19:26 · A pre-recorded demo of Project Astra.
    • A first-person view from a smartphone camera. The AI assistant highlights objects, understands spoken commands, navigates web pages and other apps on the phone, and even interrupts and resumes conversation naturally.

Production Signals (5)

  • 00:00:07 · Pre-recorded segment
  • 00:00:11 · Transition to live on-stage presentation
  • 00:02:41 · Pre-produced demo reel
  • 00:19:26 · Pre-recorded, edited product demo video
  • 00:24:39 · Pre-produced short film (Aira partnership)

Key Topics

Gemini 2.5 Pro · Gemini 2.5 Flash · AI for Developers · Multimodality · Native Audio Output · AI Agents · Coding with AI · AI for Science · Project Astra · World Models · Gemini Diffusion · Robotics · AI Safety · Low-latency Models

Takeaways

  • Google is rapidly iterating on its flagship Gemini models, with Gemini 2.5 Pro positioned as the world’s best foundation model and 2.5 Flash as a highly efficient, low-cost alternative.
  • The focus is shifting towards making AI a proactive, universal assistant, as shown by the Project Astra vision, which integrates memory, context, and the ability to act across apps and devices.
  • Developer experience is paramount, with new tools like native multi-speaker audio output, ‘Thinking Budgets’ for cost control, and ‘Thought Summaries’ for transparency.
  • Google is pushing the research frontier with concepts like ‘World Models’ that can simulate reality, ‘Gemini Diffusion’ for ultra-fast text generation, and ‘Deep Think’ mode for complex reasoning.
  • AI is being applied to solve grand challenges, with significant progress in AI for science (AlphaFold 3, AlphaProof) and new applications in robotics (Gemini Robotics) and accessibility (Aira partnership).
  • Multimodality is expanding beyond text and images to include deeply integrated, expressive, and context-aware audio and video understanding, forming the basis for next-generation AI assistants.