GTC March 2024 Keynote (Blackwell announcement)

Category: Main Keynote · Year: 2024

Speakers: Jensen Huang - Founder and CEO, NVIDIA

Segments (15)

  • 00:00 · Intro: I am AI
    • A cinematic video showcasing the diverse applications of AI across industries.
  • 03:34 · Welcome to GTC
    • Jensen Huang welcomes attendees to the developer conference and highlights the massive industry representation.
  • 07:08 · The Journey of Accelerated Computing and AI
    • A historical overview of computing milestones leading to the generative AI revolution.
  • 10:30 · Simulation and Omniverse
    • Showcasing how Omniverse is used to simulate physical worlds and train AI.
  • 15:00 · Accelerating the EDA and Simulation Industry
    • Partnerships with Ansys, Synopsys, and Cadence to accelerate engineering and chip design.
  • 19:50 · The Need for Larger Models and Bigger GPUs
    • Explaining the exponential growth in compute required for training trillion-parameter models.
  • 26:05 · Introducing the Blackwell Platform
    • The official unveiling of the Blackwell GPU architecture and the GB200 Superchip.
  • 39:50 · Blackwell Performance and Architecture
    • Detailing the performance leaps of Blackwell over Hopper, including new FP4 precision.
  • 42:30 · Scaling to the Data Center: NVLink Switch and GB200 NVL72
    • Introducing the networking and rack-scale systems required to build AI factories.
  • 50:00 · Training and Inference on Blackwell
    • Comparing the power and time required to train large models and run inference on them with Blackwell versus Hopper.
  • 01:08:00 · Earth-2: Climate Tech and Weather Prediction
    • Using AI and digital twins to predict extreme weather events with high resolution.
  • 01:11:00 · Healthcare and BioNeMo
    • Applying generative AI to biology for drug discovery and protein generation.
  • 01:16:16 · Generative AI Microservices (NIMs)
    • Introducing NIMs as a new way to package and deploy AI models as software.
  • 01:38:00 · Omniverse and Industrial Digital Twins
    • How companies like Wistron, Siemens, and Nissan use Omniverse for industrial digitalization.
  • 01:42:50 · The Next Wave of AI: Robotics
    • Announcing new platforms and foundation models for physical AI and humanoid robots.

Product Announcements (7)

  • [26:40] Blackwell GPU
    • Next-generation AI GPU architecture.
    • specs: 208 billion transistors, TSMC 4NP process, 20 petaFLOPS AI performance.
    • availability: Not specified
  • [27:20] GB200 Grace Blackwell Superchip
    • A superchip combining two Blackwell GPUs and one Grace CPU.
    • specs: 40 petaFLOPS AI performance, 864GB fast memory, 3.6 TB/s NVLink bandwidth.
    • availability: Not specified
  • [40:00] NVLink Switch Chip
    • A networking chip to connect multiple GPUs at high speeds.
    • specs: 50 billion transistors, 7.2 TB/s full-duplex bandwidth, 4 NVLinks at 1.8 TB/s each.
    • availability: Not specified
  • [48:40] GB200 NVL72
    • A liquid-cooled rack-scale system acting as a single giant GPU.
    • specs: 72 Blackwell GPUs, 36 Grace CPUs, 1.44 exaFLOPS inference performance, 130 TB/s bandwidth.
    • availability: Not specified
  • [01:16:16] NVIDIA NIM (Inference Microservice)
    • Pre-trained AI models packaged and optimized to run across the CUDA installed base.
    • specs: Includes industry standard APIs, Triton Inference Server, and enterprise management tools.
    • availability: Available at ai.nvidia.com
  • [01:40:00] Omniverse Cloud APIs
    • APIs to stream Omniverse digital twins to devices like the Apple Vision Pro.
    • specs: Enables data interoperability and physics-based rendering at industrial scale.
    • availability: Not specified
  • [01:51:50] Project GR00T
    • A general-purpose foundation model for humanoid robot learning.
    • specs: Takes multimodal instructions and past interactions to produce actions for robots.
    • availability: Not specified

Specific Numbers (6)

Timestamp | Metric | Value | Context
06:33 | Industry Representation | $100 trillion | The value of the world's industries represented by attendees in the room.
26:48 | Transistors | 208 billion | The number of transistors in the Blackwell GPU.
27:00 | AI Performance | 20 petaFLOPS | The AI performance of a single Blackwell GPU.
48:40 | GPUs per Rack | 72 | The number of Blackwell GPUs in a single GB200 NVL72 rack.
50:10 | Power Consumption | 4 megawatts | Power required to train a 1.8T-parameter model in 90 days using 2,000 Blackwell GPUs (down from 15 MW with Hopper).
01:11:00 | Inference Throughput | 30x | The performance increase of Blackwell over Hopper for inference on a 1.8T-parameter model.
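
The power comparison can be turned into an energy comparison, since both runs are quoted at 90 days. This is a minimal sketch that assumes each power figure is a constant draw for the whole run, which the keynote summary does not state. The function and variable names are hypothetical.

```python
HOURS_PER_DAY = 24

def training_energy_mwh(power_mw, days):
    """Energy in megawatt-hours, assuming a constant power draw for `days`."""
    return power_mw * days * HOURS_PER_DAY

# Figures from the table: 15 MW (Hopper) versus 4 MW (Blackwell), 90 days each.
hopper_mwh = training_energy_mwh(15, 90)
blackwell_mwh = training_energy_mwh(4, 90)

# Fraction of energy saved by the Blackwell run under the assumption above.
saving = 1 - blackwell_mwh / hopper_mwh

# Average power budget per Blackwell GPU: 4 MW spread across 2,000 GPUs.
# This covers everything inside the quoted 4 MW, not just the GPU itself.
kw_per_gpu = 4 * 1000 / 2000

print(f"Hopper run:    {hopper_mwh} MWh")
print(f"Blackwell run: {blackwell_mwh} MWh")
print(f"Energy saved:  {saving:.1%}")
print(f"Power per Blackwell GPU: {kw_per_gpu} kW")
```

Because the duration is the same in both cases, the energy ratio equals the power ratio of 15 to 4, or 3.75x.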

Benchmark Claims (3)

  • [39:50] Training Performance: 2.5x
    • vs: Hopper (FP8)
    • gain: 2.5 times faster training performance per chip.
  • [39:50] Inference Performance: 5x
    • vs: Hopper (FP8 vs new FP4)
    • gain: 5 times faster inference performance per chip using the new FP4 format.
  • [01:11:00] Large Model Inference Throughput: 30x
    • vs: Hopper
    • gain: 30 times higher throughput for a 1.8T parameter MoE model on the GB200 NVL72 system.
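
One way to read the three claims together is as a product of factors. The sketch below assumes that the 2.5x figure reflects the generation-over-generation gain at equal precision (FP8) and that moving from FP8 to FP4 doubles throughput, as the Transformer Engine description in this summary suggests. The keynote summary gives no breakdown of the 30x system-level figure, so the residual factor is only what remains after the per-chip numbers are accounted for. It is not an attributed measurement.

```python
# Per-chip claims from the list above.
same_precision_gain = 2.5    # Blackwell vs Hopper, both at FP8
fp4_over_fp8 = 2.0           # assumption: FP4 doubles throughput relative to FP8

# The product of the two factors reproduces the per-chip inference claim.
per_chip_inference_gain = same_precision_gain * fp4_over_fp8

# System-level claim for a 1.8T-parameter MoE model on GB200 NVL72.
system_inference_gain = 30.0

# Whatever the per-chip figures do not explain (interconnect, rack-scale
# design, software); the source does not itemize this.
residual_factor = system_inference_gain / per_chip_inference_gain

print(f"Per-chip inference gain: {per_chip_inference_gain}x")
print(f"Residual system-level factor: {residual_factor}x")
```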

Customer Stories (2)

  • [01:03:00] Wistron
    • Built digital twins of their factories using Omniverse.
    • outcome: Brought the factory online in 2.5 months instead of 5 months, increased worker efficiency by 51%, and reduced cycle times by 50%.
  • [01:39:00] Siemens / HD Hyundai
    • Integrated Omniverse into Teamcenter X to build digital twins of massive ships.
    • outcome: Unified engineering data, enabled interactive visualization, and eliminated waste and errors in manufacturing.

Key Technologies (4)

  • 2nd Gen Transformer Engine: Dynamically scales precision down to FP4, doubling compute throughput and effective memory bandwidth for AI inference relative to FP8.
  • 5th Gen NVLink: Provides high-speed, coherent interconnect between GPUs, enabling them to act as a single massive GPU.
  • NVIDIA NIM: Packages AI models with optimized inference engines and APIs for easy deployment.
  • Omniverse: A platform for building and operating physically based digital twins.
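
To make "scaling precision to FP4" concrete, here is a minimal sketch of block-scaled 4-bit quantization. The keynote summary does not specify the FP4 encoding or how scale factors are chosen. The sketch assumes the E2M1 value grid used by the OCP Microscaling formats and a simple per-block max-abs scale. It is an illustration of the general idea, not NVIDIA's Transformer Engine implementation, and all function names are hypothetical.

```python
# Magnitudes representable in an E2M1 4-bit float (assumed encoding).
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)

def quantize_block(values):
    """Return (scale, codes) such that each value is roughly scale * code."""
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return 1.0, [0.0] * len(values)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = peak / FP4_GRID[-1]
    codes = []
    for v in values:
        # Snap the scaled magnitude to the nearest representable grid point.
        magnitude = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(magnitude if v >= 0 else -magnitude)
    return scale, codes

def dequantize_block(scale, codes):
    """Reconstruct approximate values from a scale and its 4-bit codes."""
    return [scale * c for c in codes]

block = [0.02, -0.11, 0.47, 1.2, -0.6]
scale, codes = quantize_block(block)
restored = dequantize_block(scale, codes)

print("codes:   ", codes)
print("restored:", [round(r, 3) for r in restored])
```

Each code needs 4 bits instead of the 8 bits of an FP8 value, so the same memory traffic carries twice as many values. The cost is coarser rounding: small entries such as 0.02 collapse to zero in this block.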

Demos Shown (5)

  • [01:08:00] Earth-2 predicting extreme weather events like typhoons with high resolution.
  • [01:11:00] BioNeMo generating protein structures and molecules.
  • [01:24:00] NeMo Retriever interacting with a PDF document to answer questions.
  • [01:38:00] Omniverse digital twins of factories and ships.
  • [01:51:50] Robots learning tasks in Isaac Sim and transferring skills to the real world.

Predictions / Commitments (2)

  • [01:28:30, The future] Future data centers will be thought of as AI factories, whose goal is to generate intelligence.
  • [01:44:50, The future] Everything that moves will be robotic.

Companies Mentioned (4)

Ansys, Synopsys, Cadence · TSMC · AWS, Google Cloud, Oracle, Microsoft Azure · SAP, ServiceNow, Cohesity, Snowflake, NetApp, Dell

Notable Quotes (3)

We need bigger GPUs. — Jensen Huang @ 22:15

Blackwell is not a chip, Blackwell is the name of a platform. — Jensen Huang @ 31:18

The future is generative. — Jensen Huang @ 01:15:50

Key Topics

Accelerated Computing · Generative AI · Blackwell Architecture · Large Language Models · Digital Twins · Omniverse · Robotics · Healthcare AI · Climate Tech · AI Factories · NVIDIA NIM · NVLink

Takeaways

  • NVIDIA is transitioning from a chip company to a full-stack platform company.
  • The Blackwell architecture, designed specifically for trillion-parameter generative AI models, delivers per-chip gains over Hopper of 2.5x in training and 5x in inference.
  • Generative AI is expanding beyond text to include video, 3D, and physical simulation.
  • NVIDIA NIMs (Inference Microservices) simplify the deployment of custom AI models for enterprises.
  • Omniverse and digital twins are becoming critical tools for industrial digitalization, manufacturing, and robotics training.
  • The next major wave of AI is physical AI and robotics, powered by platforms like Project GR00T and Isaac.