I/O 2024: Google DeepMind
Year: 2024
Demis Hassabis (Co-Founder & CEO) · Doug Eck (Senior Research Director, AI)
Segments (8)
- 00:00 · Introduction to Google DeepMind — Demis Hassabis
- Demis Hassabis introduces the mission of Google DeepMind: to build AGI responsibly to benefit humanity, highlighting recent breakthroughs.
- 02:20 · Introducing Gemini 1.5 Flash — Demis Hassabis
- A new, lighter-weight Gemini model is announced, designed for speed, efficiency, and lower cost at scale.
- 03:50 · Introducing Project Astra — Demis Hassabis
- The vision for a universal, multimodal AI agent is unveiled, capable of conversational, real-time understanding and interaction with the world.
- 08:17 · Generative Media: Bringing Creative Ideas to Life — Doug Eck
- Doug Eck takes the stage to introduce a series of updates to Google’s generative media tools for image, music, and video.
- 09:08 · Introducing Imagen 3 — Doug Eck
- Doug Eck announces Imagen 3, Google’s most capable text-to-image model, with improved photorealism, detail, and text rendering.
- 10:17 · Music AI Sandbox — Doug Eck
- A suite of AI tools for musicians is showcased through a video featuring artists like Wyclef Jean and Marc Rebillet.
- 12:56 · Introducing Veo — Demis Hassabis
- Demis Hassabis announces Veo, Google’s most capable generative video model, for creating high-quality 1080p video from various prompts.
- 14:29 · Veo Collaboration with Donald Glover — Demis Hassabis
- A short film is presented, created in collaboration with Donald Glover and his studio Gilga, demonstrating Veo’s filmmaking capabilities.
Products Announced (5)
- 02:20 · Gemini 1.5 Flash (New Model) - Lighter-weight and cost-efficient · Optimized for speed and low latency · Maintains multimodal reasoning and long context window
- Available today in Google AI Studio and Vertex AI.
- 03:55 · Project Astra (Vision / Project) - Universal AI agent · Real-time, multimodal understanding (vision and speech) · Conversational and context-aware
- Capabilities coming to Google products later this year.
- 09:11 · Imagen 3 (New Model) - Improved photorealism and detail · Better understanding of natural language prompts · Advanced text rendering capabilities
- Sign-ups open today for private preview in ImageFX.
- 10:35 · Music AI Sandbox (Tool Suite) - Create new instrumental sections from scratch · Transfer styles between tracks · Collaborative tool for artists
- In development with YouTube and artists.
- 13:09 · Veo (New Model) - Generates high-quality 1080p video clips longer than a minute · Understands cinematic terms and visual styles · Maintains consistency of subjects across shots
- Waitlist open now for VideoFX; available to select creators in the coming weeks.
Commitments / Timelines (6)
- 02:49 (today) — Gemini 1.5 Flash and 1.5 Pro are available with up to 1 million tokens in Google AI Studio and Vertex AI.
- 02:58 (today) — Developers can sign up to try a 2 million token context window.
- 07:59 (later this year) — Some Project Astra agent capabilities will come to Google products like the Gemini app.
- 10:05 (today) — Sign-ups are open to try Imagen 3 in ImageFX.
- 16:13 (over the coming weeks) — Some Veo features will be available to select creators through VideoFX.
- 16:20 (now) — The waitlist for VideoFX with Veo is open.
Demos (3)
- 05:23 ✓ · Project Astra — Unnamed Google employee (in video)
- A pre-recorded, first-person demo of the AI agent identifying objects, explaining code, remembering previous context (location of glasses), and engaging in creative tasks on a phone and through smart glasses.
- 11:02 ✓ · Music AI Sandbox — Wyclef Jean, Marc Rebillet (in video)
- A pre-recorded video of professional musicians using the AI tools to generate, sample, and modify musical loops and tracks in a studio setting.
- 14:30 ✓ · Veo Filmmaking — Donald Glover and his team (in video)
- A pre-recorded video showcasing a creative team using Veo to generate various video clips from text prompts to brainstorm and create a short film.
Notable Quotes (8)
- 00:37 — Demis Hassabis:
I co-founded DeepMind in 2010 with the goal of one day building AGI, artificial general intelligence.
- 02:20 — Demis Hassabis:
So today, we’re introducing Gemini 1.5 Flash.
- 04:02 — Demis Hassabis:
For a long time, we’ve wanted to build a universal AI agent that can be truly helpful in everyday life.
- 09:09 — Doug Eck:
Today, I’m so excited to introduce Imagen 3, our most capable image generation model yet.
- 13:04 — Demis Hassabis:
Today, I’m excited to announce our newest, most capable generative video model, called Veo.
- 15:39 — Donald Glover:
Everybody’s going to become a director, and everybody should be a director.
- 15:44 — Donald Glover:
Because at the heart of all of this is just storytelling.
- 16:50 — Demis Hassabis:
We knew that one day it would change everything. Now that time is here.
Visual Signals (Beyond the Transcript)
On-Screen Text Moments (9)
- 00:00 · Google DeepMind - Sets the topic for the entire presentation segment.
- 02:22 · Gemini 1.5 Flash - Official branding for the newly announced model.
- 02:51 · Available in Google AI Studio and Vertex AI / 1M tokens - Key availability and capability announcement for developers.
- 03:55 · Project Astra - Official name for Google’s AI agent vision.
- 04:03 · A universal AI agent helpful in everyday life - The core mission statement for Project Astra.
- 09:11 · Imagen 3 - Official branding for the new text-to-image model.
- 10:36 · Music AI Sandbox - Official name for the suite of music creation tools.
- 13:09 · Veo - Official branding for the new text-to-video model.
- 16:02 · A collaboration between Google DeepMind, Donald Glover, and Gilga. Coming soon. - Credits the high-profile collaboration for the Veo demo.
Stage Moments (5)
- 00:01 · Demis Hassabis walks on stage, shaking hands with Sundar Pichai who is exiting.
- 02:27 · The audience applauds the announcement of Gemini 1.5 Flash.
- 08:40 · Demis Hassabis introduces Doug Eck, who walks on stage to take over the presentation.
- 12:43 · Demis Hassabis returns to the stage after the Music AI Sandbox video.
- 16:07 · The audience gives a strong round of applause following the Veo demo video with Donald Glover.
Visual Demos (5)
- 05:23 · Project Astra Demo
- A first-person view from a phone camera, where the AI identifies objects in an office, explains code, and remembers the location of glasses. The demo transitions to a view through smart glasses.
- 09:16 · Imagen 3 Examples
- A series of high-quality, photorealistic and artistic images generated by Imagen 3, including a wolf, people laughing in sunlight, a landscape, and the word ‘LIGHT’ made of feathers.
- 11:02 · Music AI Sandbox Demo
- Musicians Wyclef Jean and Marc Rebillet in a studio interacting with a UI to generate and combine musical elements, showing prompts and resulting audio waveforms.
- 13:15 · Veo Examples
- A montage of diverse, high-quality, 1080p video clips generated by Veo, including a dog in a bathtub, an aerial shot of a lighthouse, a blooming sunflower, and a car driving through a city.
- 14:30 · Veo Filmmaking Demo with Donald Glover
- Donald Glover and his creative team using a text-prompt interface to generate various video shots (a car driving to a palace, a sailboat, a jungle trail) for a short film project.
Production Signals (3)
- 05:23 · Pre-recorded demo segment for Project Astra, labeled ‘Prototype shown’.
- 11:02 · Pre-recorded video segment showcasing the Music AI Sandbox with artists.
- 14:30 · Pre-recorded video segment showcasing Veo in collaboration with Donald Glover.
Key Topics
Artificial General Intelligence (AGI) · Multimodal AI · AI Agents · Generative AI · Text-to-Video Generation · Text-to-Image Generation · AI for Creativity · Music Generation · Google DeepMind · Project Astra · Gemini Models · Veo · Imagen 3 · AI Responsibility
Takeaways
- Google DeepMind is positioned as the core engine driving Google’s most ambitious AI research, with a clear long-term goal of achieving AGI.
- The Gemini model family is diversifying to serve different needs: Gemini 1.5 Pro for peak capability and the new Gemini 1.5 Flash for speed and cost-efficiency at scale.
- Project Astra represents Google’s vision for the future of AI assistants: a proactive, conversational, and multimodal agent that understands the world through sight and sound in real time.
- Google is making a major push into generative media, launching powerful new tools for creators across video (Veo), image (Imagen 3), and music (Music AI Sandbox).
- High-profile collaborations with creators like Donald Glover and Wyclef Jean are a key part of Google’s strategy to develop and validate its creative AI tools.
- For Project Astra, the core technical challenge is reducing response latency and improving contextual memory so that AI interactions feel natural and truly helpful in everyday life.
- Google is advancing the state of the art in generative video with Veo, focusing on high-resolution output, longer clip duration, and visual consistency across shots.