The Future of AI Infrastructure

Year: 2026

Amin Vahdat (SVP and Chief Technologist, AI and Infrastructure) · Ben Gilbert (Co-Founder / Co-Host, Acquired Podcast) · David Rosenthal (Co-Founder / Co-Host, Acquired Podcast)

Segments (7)

00:00:11 · Introduction: The Foundation of AI — Amin Vahdat
- The speaker welcomes the audience to an exclusive preview of AI infrastructure announcements, emphasizing that infrastructure is foundational to everything in AI.
00:00:36 · Google’s Mission and the Evolution of Infrastructure — Amin Vahdat
- Connects Google’s mission to the custom infrastructure that web search once required and to the new infrastructure now needed to solve for intelligence.
00:02:05 · The Agentic Era and the AI Stack — Amin Vahdat
- Describing the unprecedented demand on infrastructure from the ‘agentic era’ and introducing Google’s vertically integrated AI stack, from energy to services.
00:05:16 · A Decade of TPU Supercomputing — Amin Vahdat
- A historical overview of Google’s custom silicon journey, showcasing the rapid innovation and increasing cadence of TPU generations from v1 to Ironwood.
00:07:01 · Announcement: Eighth Generation TPUs — Amin Vahdat
- The main announcement of the event, introducing two distinct, custom-built eighth-generation TPUs: TPU 8t for training and TPU 8i for inference.
00:11:56 · Panel Discussion: The ‘Why’ Behind Custom Silicon — Amin Vahdat, Ben Gilbert, David Rosenthal
- The panel discusses the origins of Google’s custom silicon efforts, driven by the massive computational cost of voice recognition, and the strategic decision to build specialized chips.
00:26:34 · Panel Discussion: The Next Bottlenecks and the Future — Amin Vahdat, Ben Gilbert, David Rosenthal
- The discussion focuses on future challenges, identifying reliability at massive scale, silent data corruption, and the need for specialized network topologies as the next major problems to solve.

Products Announced (2)

00:07:35 · TPU 8t (8th Generation)
- Optimized for large-scale training (‘The training powerhouse’) · 4x scale-out networking bandwidth over previous generation · Features advanced numerics and next-gen engineering
- Generally available later this year (2026)
00:07:35 · TPU 8i (8th Generation)
- Optimized for low-latency inference (‘The reasoning engine’) · Features accelerated agent processing and ‘Boardify’ networking · Designed for cost efficiency and breaking the ‘memory wall’
- Generally available later this year (2026)

Customer Stories (1)

00:26:02 · Citadel — Using TPUs in their securities trading systems to improve efficiency by 2-4x and reduce costs by 30%.

Benchmarks Shown (4)

00:08:05 · TPU 8t pod compute vs. Ironwood: 121 EFlops (2.8x improvement)
- Compared to Ironwood (42.5 EFlops)
00:08:05 · TPU 8t scale-out networking bandwidth vs. Ironwood: 400 Gb/s (4x improvement)
- Compared to Ironwood (100 Gb/s)
00:08:05 · TPU 8i pod compute vs. Ironwood: 11.6 EFlops (9.8x improvement)
- Compared to Ironwood (1.2 EFlops)
00:08:05 · TPU 8i pod HBM capacity vs. Ironwood: 331.8 TB (6.8x improvement)
- Compared to Ironwood (49.2 TB)
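
The multipliers above are simple ratios of the new figure to the Ironwood baseline. A minimal Python sketch of that arithmetic follows, using only the numbers listed in this section. The metric labels are inferred from the units and the takeaways rather than quoted from the slide, and `BENCHMARKS` and `improvement` are hypothetical names chosen for illustration.

```python
# Figures as listed in the benchmark summary:
# (label, new value, Ironwood baseline, stated multiplier).
# Labels are inferred from the units and the takeaways, not quoted from the slide.
BENCHMARKS = [
    ("TPU 8t pod compute (EFlops)", 121.0, 42.5, 2.8),
    ("TPU 8t scale-out networking bandwidth (Gb/s)", 400.0, 100.0, 4.0),
    ("TPU 8i pod compute (EFlops)", 11.6, 1.2, 9.8),
    ("TPU 8i pod HBM capacity (TB)", 331.8, 49.2, 6.8),
]


def improvement(new: float, baseline: float) -> float:
    """Return how many times larger `new` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return new / baseline


# Print the computed ratio next to the multiplier stated in the summary.
for label, new, baseline, stated in BENCHMARKS:
    ratio = improvement(new, baseline)
    print(f"{label}: computed {ratio:.2f}x, stated {stated}x")
```

Ratios computed this way from the rounded figures may differ slightly from the stated multipliers, presumably because the on-screen values were themselves rounded.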

Commitments (1)

00:11:08 (Later this year (2026)) — TPU 8t and 8i will be generally available.

Notable Quotes (4)

00:01:41 — Amin Vahdat:

The infrastructure required to solve for intelligence doesn’t yet exist. We’re in the process of defining it.
00:13:32 — Amin Vahdat:

We would actually have to go build two or three additional, complete Googles if we wanted every Google user to interact via voice for 30 seconds a day.
00:29:38 — Amin Vahdat:

Goodput: you’re making progress. 97% means that you’re finding those issues when they happen, and they happen, and you’re recovering from them super quickly.
00:31:35 — Amin Vahdat:

CPUs are going to make a comeback. That’s a prediction I’ll make here today.

Visual Signals

On-screen (12)

00:00:09 · Amin Vahdat, SVP and Chief Technologist, AI and Infrastructure, Google
- Introduces the main speaker and his role.
00:00:24 · A Next '26 Exclusive Event, The Future of AI Infrastructure
- Sets the context for the presentation as a preview for the main conference.
00:00:39 · Google: Organize the world's information and make it universally accessible and useful
- Displays Google’s mission statement, which frames the entire talk.
00:02:06 · The agentic era is placing unprecedented demand on infrastructure
- Headline for the section discussing the need for new, powerful infrastructure.
00:02:41 · AI stack
- Introduces the core concept of Google’s vertically integrated approach.
00:02:54 · A diagram of the AI stack with layers: Service, Models (Gemini 3), AI infrastructure software, AI in
- Visually breaks down Google’s full-stack AI strategy.
00:05:16 · A timeline of TPU Supercomputing from 2015 (TPU v1) to 2025 (Ironwood).
- Provides historical context and shows the accelerating pace of Google’s custom silicon development.
00:07:02 · Introducing our eighth generation TPUs
- The main product announcement title card.
00:07:35 · Images of two new chips labeled 'TPU 8t' and 'TPU 8i'.
- The first visual reveal of the new TPU products.
00:08:05 · A detailed comparison table of TPU 8t and TPU 8i versus the previous generation (Ironwood).
- Provides specific performance benchmarks and specifications for the new chips.
00:12:14 · Ben Gilbert, Co-Founder / Co-Host, Acquired Podcast
- Introduces one of the panel moderators.
00:13:49 · David Rosenthal, Co-Founder / Co-Host, Acquired Podcast
- Introduces the second panel moderator.

Stage (2)

00:00:04 · Amin Vahdat walks onto the stage.
00:11:48 · Ben Gilbert and David Rosenthal walk on stage to join Amin Vahdat for a panel discussion.

Visual demos (1)

00:02:06 · An aerial shot of a massive data center campus.
- Rows of large cooling units outside multiple large buildings, identified as the Clarksville, TN data center.

Key Topics

AI Infrastructure · Custom Silicon · Tensor Processing Unit (TPU) · Generative AI · AI Agents · Agentic Era · Data Centers · Vertical Integration · Model Training · Model Inference · TPU 8t · TPU 8i · Supercomputing · System Reliability · Hardware Co-design

Takeaways

Google announced its 8th generation of TPUs, splitting the product line into two specialized chips for the first time: TPU 8t for training and TPU 8i for inference.
The new chips offer significant performance improvements, with TPU 8t providing 2.8x the EFlops and 4x the networking bandwidth, and TPU 8i offering 9.8x the EFlops and 6.8x the HBM capacity compared to the previous generation.
Google’s strategy of deep vertical integration, from energy and data centers up to models and services, is a key competitive advantage that allows for holistic system optimization.
The ‘agentic era’ requires a shift in infrastructure focus towards ultra-low latency for inference, which drove the specialized design of the TPU 8i and its novel ‘Boardify’ networking.
As AI systems scale to tens of thousands of chips, the primary engineering challenge is no longer just raw performance but maintaining reliability at massive scale, sustaining high ‘goodput’, and mitigating silent data corruption.
Google’s custom silicon journey began over a decade ago out of necessity, as running voice recognition on general-purpose CPUs would have been economically unfeasible.
The future of AI hardware will see continued specialization for different workloads, and a predicted resurgence of general-purpose CPUs to handle the complex orchestration required by AI agents.
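
The 97% goodput figure quoted in the panel can be made concrete with a short calculation. The talk does not give a formula, so the sketch below assumes a common working definition: goodput is the fraction of wall-clock time a job spends making useful progress, with the remainder lost to failures, detection, and recovery. The function names and the example durations are hypothetical.

```python
def goodput(total_hours: float, lost_hours: float) -> float:
    """Fraction of wall-clock time spent making useful progress.

    Assumed definition (not from the talk): (total - lost) / total, where
    `lost_hours` covers failures, detection delay, and recovery or restart.
    """
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= lost_hours <= total_hours:
        raise ValueError("lost_hours must be between 0 and total_hours")
    return (total_hours - lost_hours) / total_hours


def max_lost_hours(total_hours: float, target_goodput: float) -> float:
    """Largest amount of lost time that still meets `target_goodput`."""
    if not 0 <= target_goodput <= 1:
        raise ValueError("target_goodput must be between 0 and 1")
    return total_hours * (1 - target_goodput)


# Hypothetical example: a 30-day (720-hour) training run.
run_hours = 720.0
budget = max_lost_hours(run_hours, 0.97)
print(f"Lost-time budget at 97% goodput: {budget:.1f} hours")
print(f"Goodput with 3 lost hours in 100: {goodput(100.0, 3.0):.2%}")
```

Under this assumed definition, a 97% target leaves only a small lost-time budget across an entire run, which is why fast failure detection and recovery matter as much as raw performance.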