GTC March 2024 Keynote (Blackwell announcement)

Category: Main Keynote · Year: 2024

Speakers: Jensen Huang - Founder and CEO, NVIDIA

Segments (15)

00:00 · Intro: I am AI
- A cinematic video showcasing the diverse applications of AI across industries.
03:34 · Welcome to GTC
- Jensen Huang welcomes attendees to the developer conference and highlights the massive industry representation.
07:08 · The Journey of Accelerated Computing and AI
- A historical overview of computing milestones leading to the generative AI revolution.
10:30 · Simulation and Omniverse
- Showcasing how Omniverse is used to simulate physical worlds and train AI.
15:00 · Accelerating the EDA and Simulation Industry
- Partnerships with Ansys, Synopsys, and Cadence to accelerate engineering and chip design.
19:50 · The Need for Larger Models and Bigger GPUs
- Explaining the exponential growth in compute required for training trillion-parameter models.
26:05 · Introducing the Blackwell Platform
- The official unveiling of the Blackwell GPU architecture and the GB200 Superchip.
39:50 · Blackwell Performance and Architecture
- Detailing the performance leaps of Blackwell over Hopper, including new FP4 precision.
42:30 · Scaling to the Data Center: NVLink Switch and GB200 NVL72
- Introducing the networking and rack-scale systems required to build AI factories.
50:00 · Training and Inference on Blackwell
- Comparing the power and time required for training and inference on large models with Blackwell versus Hopper.
01:08:00 · Earth-2: Climate Tech and Weather Prediction
- Using AI and digital twins to predict extreme weather events with high resolution.
01:11:00 · Healthcare and BioNeMo
- Applying generative AI to biology for drug discovery and protein generation.
01:16:16 · Generative AI Microservices (NIMs)
- Introducing NIMs as a new way to package and deploy AI models as software.
01:38:00 · Omniverse and Industrial Digital Twins
- How companies like Wistron, Siemens, and Nissan use Omniverse for industrial digitalization.
01:42:50 · The Next Wave of AI: Robotics
- Announcing new platforms and foundation models for physical AI and humanoid robots.

Product Announcements (7)

[26:40] Blackwell GPU
- Next-generation AI GPU architecture.
- specs: 208 billion transistors, TSMC 4NP process, 20 petaFLOPS AI performance.
- availability: Not specified
[27:20] GB200 Grace Blackwell Superchip
- A superchip combining two Blackwell GPUs and one Grace CPU.
- specs: 40 petaFLOPS AI performance, 864 GB fast memory, 3.6 TB/s NVLink bandwidth.
- availability: Not specified
[40:00] NVLink Switch Chip
- A networking chip to connect multiple GPUs at high speeds.
- specs: 50 billion transistors, 7.2 TB/s full-duplex bandwidth, 4 NVLinks at 1.8 TB/s each.
- availability: Not specified
[48:40] GB200 NVL72
- A liquid-cooled rack-scale system acting as a single giant GPU.
- specs: 72 Blackwell GPUs, 36 Grace CPUs, 1.44 exaFLOPS inference performance, 130 TB/s bandwidth.
- availability: Not specified
[01:16:16] NVIDIA NIM (Inference Microservice)
- Pre-trained AI models packaged and optimized to run across the CUDA installed base.
- specs: Includes industry standard APIs, Triton Inference Server, and enterprise management tools.
- availability: Available at ai.nvidia.com
[01:40:00] Omniverse Cloud APIs
- APIs to stream Omniverse digital twins to devices like the Apple Vision Pro.
- specs: Enables data interoperability and physics-based rendering at industrial scale.
- availability: Not specified
[01:51:50] Project GR00T
- A general-purpose foundation model for humanoid robot learning.
- specs: Takes multimodal instructions and past interactions to produce actions for robots.
- availability: Not specified
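
The quoted Blackwell, GB200, NVLink Switch, and NVL72 figures can be cross-checked against each other with simple arithmetic. The sketch below is only a consistency check on the numbers listed above. It assumes that system-level figures are plain multiples of the per-GPU and per-link figures, and that each GPU contributes 1.8 TB/s of NVLink bandwidth; neither assumption is stated in the summary. All names are illustrative.

```python
import math

# Figures quoted in the announcement list above.
BLACKWELL_PFLOPS = 20.0    # AI performance per Blackwell GPU
GB200_PFLOPS = 40.0        # AI performance per GB200 Superchip (two GPUs)
GB200_NVLINK_TBPS = 3.6    # NVLink bandwidth per GB200 Superchip
NVLINK_TBPS = 1.8          # bandwidth per NVLink on the switch chip
SWITCH_TBPS = 7.2          # full-duplex bandwidth per NVLink Switch Chip
NVL72_GPUS = 72            # Blackwell GPUs per GB200 NVL72 rack
NVL72_EXAFLOPS = 1.44      # rack inference performance
NVL72_TBPS = 130.0         # rack bandwidth


def derived_specs():
    """Recompute system-level figures from per-GPU and per-link figures."""
    return {
        "gb200_pflops": 2 * BLACKWELL_PFLOPS,
        # Assumption: each of the two GPUs contributes 1.8 TB/s.
        "gb200_nvlink_tbps": 2 * NVLINK_TBPS,
        "switch_tbps": 4 * NVLINK_TBPS,
        "nvl72_exaflops": NVL72_GPUS * BLACKWELL_PFLOPS / 1000,
        # Assumption: rack bandwidth is the sum of per-GPU NVLink bandwidth.
        "nvl72_tbps": NVL72_GPUS * NVLINK_TBPS,
    }


if __name__ == "__main__":
    for name, value in derived_specs().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the derived values line up with the quoted ones; the rack bandwidth comes out slightly below 130 TB/s, which suggests the quoted figure is rounded.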

Specific Numbers (6)

Timestamp	Metric	Value	Context
06:33	Industry Representation	$100 Trillion	The value of the world’s industries represented by attendees in the room.
26:48	Transistors	208 Billion	The number of transistors in the Blackwell GPU.
27:00	AI Performance	20 PetaFLOPS	The AI performance of a single Blackwell GPU.
48:40	GPUs per Rack	72	The number of Blackwell GPUs in a single GB200 NVL72 rack.
50:10	Power Consumption	4 Megawatts	Power required to train a 1.8T-parameter model in 90 days using 2,000 Blackwell GPUs (down from 15 megawatts with Hopper).
01:11:00	Inference Throughput	30x	The performance increase of Blackwell over Hopper for inference on a 1.8T parameter model.
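
The power figures in the table can be turned into energy totals for the 90-day run. This is a rough sketch: it assumes constant power draw for the full 90 days, and it assumes the Hopper figure refers to the same 90-day duration, which the table implies but does not state. The per-GPU figure is total facility power divided by GPU count, so it includes everything else in the system, not just the GPU.

```python
HOURS_PER_DAY = 24


def training_energy_mwh(power_mw, days):
    """Energy in megawatt-hours, assuming constant power draw for the whole run."""
    return power_mw * HOURS_PER_DAY * days


# Figures from the table: 4 MW (Blackwell) vs 15 MW (Hopper), 90 days.
blackwell_mwh = training_energy_mwh(4, 90)
hopper_mwh = training_energy_mwh(15, 90)

# Ratio of Hopper power to Blackwell power for the same job.
power_ratio = 15 / 4

# Average power per Blackwell GPU in kilowatts (4 MW spread over 2,000 GPUs).
kw_per_gpu = 4 * 1000 / 2000

if __name__ == "__main__":
    print(f"Blackwell: {blackwell_mwh} MWh, Hopper: {hopper_mwh} MWh")
    print(f"Power ratio: {power_ratio}x, per-GPU average: {kw_per_gpu} kW")
```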

Benchmark Claims (3)

[39:50] Training Performance: 2.5x
- vs: Hopper (FP8)
- gain: 2.5 times faster training performance per chip.
[39:50] Inference Performance: 5x
- vs: Hopper (FP8 vs new FP4)
- gain: 5 times faster inference performance per chip using the new FP4 format.
[01:11:00] Large Model Inference Throughput: 30x
- vs: Hopper
- gain: 30 times higher throughput for a 1.8T parameter MoE model on the GB200 NVL72 system.
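
The three claims above can be related to each other with a few ratios. The sketch below only does the arithmetic; treating the gains as multiplicative factors is an assumption, and the summary does not say how the 30x system-level figure breaks down. It also shows why the FP8-to-FP4 change matters for a 1.8T-parameter model: halving the bits per weight halves the weight storage.

```python
PARAMETERS = 1_800_000_000_000  # 1.8T parameters, as quoted above


def weight_bytes(parameters, bits_per_weight):
    """Bytes needed to store the weights alone (no activations or caches)."""
    return parameters * bits_per_weight // 8


fp8_terabytes = weight_bytes(PARAMETERS, 8) / 1e12
fp4_terabytes = weight_bytes(PARAMETERS, 4) / 1e12

# Per-chip inference gain relative to per-chip training gain.
inference_over_training = 5 / 2.5

# Part of the 30x system claim not covered by the 5x per-chip claim,
# assuming the factors multiply (an assumption, not stated in the source).
system_over_chip = 30 / 5

if __name__ == "__main__":
    print(f"FP8 weights: {fp8_terabytes} TB, FP4 weights: {fp4_terabytes} TB")
    print(f"Inference vs training gain: {inference_over_training}x")
    print(f"System vs per-chip gain: {system_over_chip}x")
```

The 2x ratio between the inference and training claims matches the 2x reduction in bits per value from FP8 to FP4; the remaining 6x in the system-level claim would have to come from factors beyond per-chip arithmetic.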

Customer Stories (2)

[01:03:00] Wistron
- Built digital twins of their factories using Omniverse.
- outcome: Brought the factory online in 2.5 months instead of 5, increased worker efficiency by 51%, and reduced cycle times by 50%.
[01:39:00] Siemens / HD Hyundai
- Integrated Omniverse into Teamcenter X to build digital twins of massive ships.
- outcome: Unified engineering data, enabled interactive visualization, and eliminated waste and errors in manufacturing.

Key Technologies (4)

2nd Gen Transformer Engine: Dynamically scales precision down to FP4, doubling throughput and effective memory bandwidth for AI inference relative to FP8.
5th Gen NVLink: Provides high-speed, coherent interconnect between GPUs, enabling them to act as a single massive GPU.
NVIDIA NIM: Packages AI models with optimized inference engines and APIs for easy deployment.
Omniverse: A platform for building and operating physically based digital twins.
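
To make the idea of scaling values into FP4 concrete, here is a toy block quantizer. It is a minimal sketch and not NVIDIA's Transformer Engine implementation: it assumes the 4-bit E2M1 layout (1 sign, 2 exponent, 1 mantissa bit), uses one shared scale per block chosen from the largest magnitude, and all function names are hypothetical.

```python
# Magnitudes representable by a 4-bit E2M1 float (assumed layout).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block_fp4(values):
    """Quantize one block of floats to E2M1 values with a shared scale.

    Returns (scale, quantized) such that value is approximately
    scale * quantized for each element.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return 1.0, [0.0] * len(values)
    # Map the largest magnitude in the block onto the largest FP4 value.
    scale = max_abs / E2M1_MAGNITUDES[-1]
    quantized = []
    for v in values:
        target = abs(v) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        quantized.append(nearest if v >= 0 else -nearest)
    return scale, quantized


def dequantize_block(scale, quantized):
    """Recover approximate floats from the scale and FP4 values."""
    return [scale * q for q in quantized]


if __name__ == "__main__":
    scale, q = quantize_block_fp4([12.0, 5.5, -1.0, 0.2])
    print(scale, q, dequantize_block(scale, q))
```

Each value is stored in 4 bits plus a share of the block's scale, which is where the storage saving over FP8 comes from; the cost is coarser rounding, visible in the difference between the inputs and the dequantized outputs.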

Demos Shown (5)

[01:08:00] Earth-2 predicting extreme weather events like typhoons with high resolution.
- Yes
[01:11:00] BioNeMo generating protein structures and molecules.
- Yes
[01:24:00] NeMo Retriever interacting with a PDF document to answer questions.
- Yes
[01:38:00] Omniverse digital twins of factories and ships.
- Yes
[01:51:50] Robots learning tasks in Isaac Sim and transferring skills to the real world.
- Yes

Predictions / Commitments (2)

[01:28:30, The future] Future data centers will be thought of as AI factories, whose goal is to generate intelligence.
[01:44:50, The future] Everything that moves will be robotic.

Companies Mentioned (4)

Ansys, Synopsys, Cadence · TSMC · AWS, Google Cloud, Oracle, Microsoft Azure · SAP, ServiceNow, Cohesity, Snowflake, NetApp, Dell

Notable Quotes (3)

We need bigger GPUs. — Jensen Huang @ 22:15

Blackwell is not a chip, Blackwell is the name of a platform. — Jensen Huang @ 31:18

The future is generative. — Jensen Huang @ 01:15:50

Key Topics

Accelerated Computing · Generative AI · Blackwell Architecture · Large Language Models · Digital Twins · Omniverse · Robotics · Healthcare AI · Climate Tech · AI Factories · NVIDIA NIM · NVLink

Takeaways

NVIDIA is transitioning from a chip company to a full-stack platform company.
The Blackwell architecture delivers massive leaps in performance and efficiency, specifically designed for trillion-parameter generative AI models.
Generative AI is expanding beyond text to include video, 3D, and physical simulation.
NVIDIA NIMs (Inference Microservices) simplify the deployment of custom AI models for enterprises.
Omniverse and digital twins are becoming critical tools for industrial digitalization, manufacturing, and robotics training.
The next major wave of AI is physical AI and robotics, powered by platforms like Project GR00T and Isaac.