GTC March 2024 Keynote (Blackwell announcement)
Category: Main Keynote · Year: 2024 · ▶ Watch
Speakers: Jensen Huang - Founder and CEO, NVIDIA
Segments (15)
- 00:00 · Intro: I am AI
- A cinematic video showcasing the diverse applications of AI across industries.
- 03:34 · Welcome to GTC
- Jensen Huang welcomes attendees to the developer conference and highlights the massive industry representation.
- 07:08 · The Journey of Accelerated Computing and AI
- A historical overview of computing milestones leading to the generative AI revolution.
- 10:30 · Simulation and Omniverse
- Showcasing how Omniverse is used to simulate physical worlds and train AI.
- 15:00 · Accelerating the EDA and Simulation Industry
- Partnerships with Ansys, Synopsys, and Cadence to accelerate engineering and chip design.
- 19:50 · The Need for Larger Models and Bigger GPUs
- Explaining the exponential growth in compute required for training trillion-parameter models.
- 26:05 · Introducing the Blackwell Platform
- The official unveiling of the Blackwell GPU architecture and the GB200 Superchip.
- 39:50 · Blackwell Performance and Architecture
- Detailing the performance leaps of Blackwell over Hopper, including new FP4 precision.
- 42:30 · Scaling to the Data Center: NVLink Switch and GB200 NVL72
- Introducing the networking and rack-scale systems required to build AI factories.
- 50:00 · Training and Inference on Blackwell
- Comparing the power and time required to train and infer large models on Blackwell versus Hopper.
- 01:08:00 · Earth-2: Climate Tech and Weather Prediction
- Using AI and digital twins to predict extreme weather events with high resolution.
- 01:11:00 · Healthcare and BioNeMo
- Applying generative AI to biology for drug discovery and protein generation.
- 01:16:16 · Generative AI Microservices (NIMs)
- Introducing NIMs as a new way to package and deploy AI models as software.
- 01:38:00 · Omniverse and Industrial Digital Twins
- How companies like Wistron, Siemens, and Nissan use Omniverse for industrial digitalization.
- 01:42:50 · The Next Wave of AI: Robotics
- Announcing new platforms and foundation models for physical AI and humanoid robots.
Product Announcements (7)
- [26:40] Blackwell GPU
- Next-generation AI GPU architecture.
- specs: 208 billion transistors, TSMC 4NP process, 20 petaFLOPS AI performance.
- availability: Not specified
- [27:20] GB200 Grace Blackwell Superchip
- A superchip combining two Blackwell GPUs and one Grace CPU.
- specs: 40 petaFLOPS AI performance, 864GB fast memory, 3.6 TB/s NVLink bandwidth.
- availability: Not specified
- [40:00] NVLink Switch Chip
- A networking chip to connect multiple GPUs at high speeds.
- specs: 50 billion transistors, 7.2 TB/s full-duplex bandwidth, 4 NVLinks at 1.8 TB/s each.
- availability: Not specified
- [48:40] GB200 NVL72
- A liquid-cooled rack-scale system acting as a single giant GPU.
- specs: 72 Blackwell GPUs, 36 Grace CPUs, 1.44 exaFLOPS inference performance, 130 TB/s bandwidth.
- availability: Not specified
- [01:16:16] NVIDIA NIM (Inference Microservice)
- Pre-trained AI models packaged and optimized to run across the CUDA installed base.
- specs: Includes industry standard APIs, Triton Inference Server, and enterprise management tools.
- availability: Available at ai.nvidia.com
- [01:40:00] Omniverse Cloud APIs
- APIs to stream Omniverse digital twins to devices like the Apple Vision Pro.
- specs: Enables data interoperability and physics-based rendering to industrial scale.
- availability: Not specified
- [01:51:50] Project GR00T
- A general-purpose foundation model for humanoid robot learning.
- specs: Takes multimodal instructions and past interactions to produce actions for robots.
- availability: Not specified
Specific Numbers (6)
| Timestamp | Metric | Value | Context |
|---|---|---|---|
| 06:33 | Industry Representation | $100 Trillion | The value of the world’s industries represented by attendees in the room. |
| 26:48 | Transistors | 208 Billion | The number of transistors in the Blackwell GPU. |
| 27:00 | AI Performance | 20 PetaFLOPS | The AI performance of a single Blackwell GPU. |
| 48:40 | GPUs per Rack | 72 | The number of Blackwell GPUs in a single GB200 NVL72 rack. |
| 50:10 | Power Consumption | 4 Megawatts | Power required to train a 1.8T parameter model in 90 days using 2000 Blackwell GPUs (down from 15MW with Hopper). |
| 01:11:00 | Inference Throughput | 30x | The performance increase of Blackwell over Hopper for inference on a 1.8T parameter model. |
Benchmark Claims (3)
- [39:50] Training Performance: 2.5x
- vs: Hopper (FP8)
- gain: 2.5 times faster training performance per chip.
- [39:50] Inference Performance: 5x
- vs: Hopper (FP8 vs new FP4)
- gain: 5 times faster inference performance per chip using the new FP4 format.
- [01:11:00] Large Model Inference Throughput: 30x
- vs: Hopper
- gain: 30 times higher throughput for a 1.8T parameter MoE model on the GB200 NVL72 system.
Customer Stories (2)
- [01:03:00] Wistron
- Built digital twins of their factories using Omniverse.
- outcome: Brought the factory online in 2.5 months instead of 5, increased worker efficiency by 51%, and reduced cycle times by 50%.
- [01:39:00] Siemens / HD Hyundai
- Integrated Omniverse into Teamcenter X to build digital twins of massive ships.
- outcome: Unified engineering data, enabled interactive visualization, and eliminated waste and errors in manufacturing.
Key Technologies (4)
- 2nd Gen Transformer Engine: Dynamically scales precision to FP4 to double throughput and memory bandwidth for AI inference.
- 5th Gen NVLink: Provides high-speed, coherent interconnect between GPUs, enabling them to act as a single massive GPU.
- NVIDIA NIM: Packages AI models with optimized inference engines and APIs for easy deployment.
- Omniverse: A platform for building and operating physically based digital twins.
Demos Shown (5)
- [01:08:00] Earth-2 predicting extreme weather events like typhoons with high resolution.
- Yes
- [01:11:00] BioNeMo generating protein structures and molecules.
- Yes
- [01:24:00] NeMo Retriever interacting with a PDF document to answer questions.
- Yes
- [01:38:00] Omniverse digital twins of factories and ships.
- Yes
- [01:51:50] Robots learning tasks in Isaac Sim and transferring skills to the real world.
- Yes
Predictions / Commitments (2)
- [01:28:30, The future] Future data centers will be thought of as AI factories, whose goal is to generate intelligence.
- [01:44:50, The future] Everything that moves will be robotic.
Companies Mentioned (4)
Ansys, Synopsys, Cadence · TSMC · AWS, Google Cloud, Oracle, Microsoft Azure · SAP, ServiceNow, Cohesity, Snowflake, NetApp, Dell
Notable Quotes (3)
We need bigger GPUs. — Jensen Huang @ 22:15
Blackwell is not a chip, Blackwell is the name of a platform. — Jensen Huang @ 31:18
The future is generative. — Jensen Huang @ 01:15:50
Key Topics
Accelerated Computing · Generative AI · Blackwell Architecture · Large Language Models · Digital Twins · Omniverse · Robotics · Healthcare AI · Climate Tech · AI Factories · NVIDIA NIM · NVLink
Takeaways
- NVIDIA is transitioning from a chip company to a full-stack platform company.
- The Blackwell architecture delivers massive leaps in performance and efficiency, specifically designed for trillion-parameter generative AI models.
- Generative AI is expanding beyond text to include video, 3D, and physical simulation.
- NVIDIA NIMs (Inference Microservices) simplify the deployment of custom AI models for enterprises.
- Omniverse and digital twins are becoming critical tools for industrial digitalization, manufacturing, and robotics training.
- The next major wave of AI is physical AI and robotics, powered by platforms like Project GR00T and Isaac.