GTC March 2026 Keynote (Vera Rubin Ultra, AI Factory)

Category: Main Keynote · Year: 2026

Segments (19)

  • 00:00 · Introduction: The AI Factory
    • A cinematic intro showing how AI factories manufacture intelligence in the form of tokens.
  • 03:14 · Welcome to GTC 2026
    • Jensen Huang takes the stage to outline the focus on technology, platforms, and ecosystems.
  • 06:00 · 20 Years of the CUDA Flywheel
    • Reflecting on two decades of CUDA, its installed base, and how it accelerates breakthroughs.
  • 10:22 · GeForce and the Origins of AI
    • How GeForce funded the development of CUDA and programmable shaders, leading to the AI revolution.
  • 13:31 · Announcing DLSS 5
    • Introduction of DLSS 5, fusing 3D graphics with generative AI for neural rendering.
  • 16:19 · Data Processing: cuDF and cuVS
    • Accelerating structured (cuDF) and unstructured (cuVS) data processing, the ground truth for AI.
  • 20:00 · Enterprise Data Partnerships
    • Highlighting partnerships with IBM, Dell, and Google Cloud to accelerate enterprise data pipelines.
  • 26:30 · Cloud Partnerships and Confidential Computing
    • Showcasing integrations with Azure, Oracle, AWS, and the importance of confidential computing.
  • 33:35 · CUDA-X Libraries
    • Overview of domain-specific libraries accelerating various industries from quantum to robotics.
  • 42:30 · The Inference Inflection
    • The shift of compute demand toward inference, driven by reasoning models such as o1 and agentic tools such as Claude Code.
  • 1:00:00 · Extreme Co-Design: GB300 NVL72
    • How co-designing chips, systems, and software drastically reduces token generation costs.
  • 1:08:00 · Announcing Vera Rubin Architecture
    • Unveiling the next-generation Vera Rubin platform for the agentic AI era.
  • 1:28:00 · Uniting with Groq for Extreme Inference
    • Announcing the integration of Groq’s LPU technology into the NVIDIA ecosystem via the LPX chip.
  • 1:36:00 · Roadmap: Oberon and Feynman
    • Revealing the future architecture roadmap extending to 2028.
  • 1:39:00 · DSX AI Factory Platform
    • Introducing the blueprint and tools for building gigawatt-scale AI factories.
  • 1:46:00 · OpenClaw: The Agentic OS
    • Highlighting the explosive growth of OpenClaw, an open-source framework for agentic AI.
  • 1:56:00 · Nemotron and Open Models
    • Announcing Nemotron 3 Super and NVIDIA’s commitment to open-source frontier models.
  • 2:04:00 · Physical AI and Robotics
    • The transition of AI into the physical world using Isaac Lab, Cosmos, and GR00T.
  • 2:12:00 · Disney Research Demo
    • A live demonstration of a physical robot (Olaf) powered by NVIDIA AI.

Product Announcements (12)

  • [13:47] DLSS 5
    • 3D-Guided Neural Rendering technology
    • specs: Fuses structured 3D data with generative AI for highly realistic, controllable graphics.
    • availability: N/A
  • [1:08:00] Vera Rubin Architecture
    • Next-generation AI computing platform
    • specs: Designed for agentic AI, featuring massive memory bandwidth and compute.
    • availability: N/A
  • [1:10:00] Vera Rubin NVL72
    • Rack-scale AI supercomputer
    • specs: 3.6 Exaflops compute, 260 TB/s NVLink bandwidth, 72 GPUs.
    • availability: N/A
  • [1:11:00] Rubin GPU
    • Next-gen data center GPU
    • specs: 288 GB HBM4, 22 TB/s bandwidth, 50 PFLOPS NVFP4, 336B transistors.
    • availability: N/A
  • [1:12:00] Vera CPU
    • Next-gen data center CPU
    • specs: Uses LPDDR5X, extreme single-thread performance, designed for agentic workflows.
    • availability: N/A
  • [1:14:00] BlueField-4 STX
    • Storage and networking DPU
    • specs: Co-packaged optics, designed for high-bandwidth AI storage access.
    • availability: N/A
  • [1:15:00] Spectrum-X6
    • Ethernet switch
    • specs: 800G Ethernet, co-packaged optics.
    • availability: N/A
  • [1:16:00] NVLink 6 Switch
    • GPU interconnect switch
    • specs: 3600 GB/s bandwidth.
    • availability: N/A
  • [1:32:00] Groq 3 LPX
    • Inference accelerator chip
    • specs: 315 PFLOPS, 128 GB SRAM, 40 PB/s memory bandwidth, deterministic data flow.
    • availability: Available 2H26
  • [1:45:00] Space-1 Vera Rubin Module
    • Space-grade AI compute module
    • specs: Radiation approved, designed for satellite and space data center deployments.
    • availability: N/A
  • [1:56:00] NemoClaw Reference OpenClaw
    • Agentic AI toolkit
    • specs: Includes OpenShell policy engine, integrates with cuDF, cuVS, and LLMs.
    • availability: Available now
  • [1:59:00] Nemotron 3 Super
    • Open frontier AI model
    • specs: Optimized for OpenClaw, tops leaderboards for reasoning and agentic tasks.
    • availability: N/A
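
The rack-level Vera Rubin NVL72 figures can be cross-checked against the per-GPU figures listed above. The sketch below is only a consistency check, not NVIDIA's own accounting. It assumes that the rack totals are simple sums over 72 GPUs, that the 3.6 exaflops figure refers to NVFP4, and that the 3600 GB/s listed for the NVLink 6 Switch is a per-GPU bandwidth. None of these assumptions is stated in the summary.

```python
# Per-unit figures quoted in the Product Announcements list.
GPUS_PER_RACK = 72
NVFP4_PFLOPS_PER_GPU = 50      # Rubin GPU
HBM4_GB_PER_GPU = 288          # Rubin GPU
HBM_TB_S_PER_GPU = 22          # Rubin GPU
NVLINK_GB_S_PER_GPU = 3600     # NVLink 6 Switch figure; per-GPU is an assumption


def rack_totals(gpus=GPUS_PER_RACK):
    """Sum the per-GPU figures over one rack (assumes plain linear scaling)."""
    return {
        "compute_exaflops": gpus * NVFP4_PFLOPS_PER_GPU / 1000,
        "nvlink_tb_s": gpus * NVLINK_GB_S_PER_GPU / 1000,
        "hbm4_capacity_tb": gpus * HBM4_GB_PER_GPU / 1000,
        "hbm4_bandwidth_tb_s": gpus * HBM_TB_S_PER_GPU,
    }


if __name__ == "__main__":
    for name, value in rack_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the compute total comes to 3.6 exaflops and the NVLink total to 259.2 TB/s, in line with the quoted 3.6 Exaflops and the rounded 260 TB/s. The HBM4 totals (about 20.7 TB of capacity and 1584 TB/s of bandwidth) are derived values that do not appear in the summary.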

Specific Numbers (10)

Timestamp | Metric | Value | Context
05:18 | Industry representation | $100 trillion | Value of the industries represented by companies at GTC.
21:35 | Cost savings | 83% | Nestle’s cost savings using IBM Watsonx.data on NVIDIA GPUs.
22:46 | Cost savings | 76% | Snap’s cost savings using Google Cloud AI Hypercomputer.
1:01:00 | Performance/watt | 50x higher | GB300 NVL72 compared to H200 NVL8.
1:01:00 | Cost reduction | 35x lower | Token cost on GB300 NVL72 compared to H200 NVL8.
1:10:00 | Compute | 3.6 exaflops | Compute power of the Vera Rubin NVL72 system.
1:11:00 | Memory | 288 GB HBM4 | Memory capacity of a single Rubin GPU.
1:11:00 | Transistors | 336 billion | Transistor count of the Rubin GPU.
1:32:00 | SRAM bandwidth | 40 PB/s | Memory bandwidth of the Groq 3 LPX chip.
1:42:00 | Infrastructure cost | $40 billion | Estimated cost to build a 1-gigawatt AI factory.
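
The figures above mix two conventions: percentage savings (83%, 76%) and "N times lower" ratios (35x). The sketch below converts between them so the claims can be compared on one scale, and normalizes the AI factory estimate to dollars per watt. The function names are illustrative, and the conversions are plain arithmetic rather than anything stated in the keynote.

```python
def ratio_from_savings(savings_pct):
    """Convert 'X% lower cost' into an 'N times cheaper' ratio."""
    return 1 / (1 - savings_pct / 100)


def savings_from_ratio(ratio):
    """Convert 'N times lower cost' into a percentage saving."""
    return (1 - 1 / ratio) * 100


def dollars_per_watt(capex_usd, power_watts):
    """Normalize a build cost to dollars per watt of capacity."""
    return capex_usd / power_watts


if __name__ == "__main__":
    print(f"Nestle, 83% savings: {ratio_from_savings(83):.2f}x cheaper")
    print(f"Snap, 76% savings:   {ratio_from_savings(76):.2f}x cheaper")
    print(f"35x lower cost:      {savings_from_ratio(35):.1f}% savings")
    print(f"1 GW at $40 billion: ${dollars_per_watt(40e9, 1e9):.0f} per watt")
```

Read this way, an 83% saving is roughly a 5.9x cost reduction, a 76% saving is roughly 4.2x, and the 35x token cost claim corresponds to a saving of about 97%.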

Benchmark Claims (3)

  • [1:01:00] Token Cost / Performance per Watt (GB300 NVL72): 50x higher perf/watt, 35x lower token cost
    • vs: H200 NVL8
    • gain: Massive reduction in inference costs enabling new business models.
  • [1:23:00] Inference Throughput (iso-power): 35x higher
    • vs: Hopper Architecture
    • gain: 35x more tokens generated for the same power footprint.
  • [1:59:00] OpenClaw Agentic Benchmarks: 85.6%
    • vs: Claude Opus, GPT-4
    • gain: Nemotron 3 Super is the best open model for agentic workflows.

Customer Stories (3)

  • [21:08] Nestle
    • Used IBM Watsonx.data accelerated by NVIDIA GPUs for their order-to-cash data mart.
    • outcome: Achieved 5x faster updates and 83% lower costs compared to CPUs.
  • [22:42] Snap
    • Used Google Cloud AI Hypercomputer with cuDF for A/B experimentation.
    • outcome: Lowered costs by 76% and scaled experiments across petabytes of data.
  • [29:00] Palantir
    • Deployed their Ontology platform on Dell AI infrastructure.
    • outcome: Enabled secure, on-premise, air-gapped AI deployments for enterprises.

Key Technologies (6)

  • DLSS 5: Uses 3D-guided neural rendering to generate high-fidelity graphics.
  • cuDF / cuVS: Accelerates structured data (dataframes) and unstructured data (vector search) processing.
  • Vera Rubin Architecture: Next-gen platform combining CPU, GPU, and networking for agentic AI.
  • Groq LPU (LPX): Deterministic data flow processor for ultra-low latency token generation.
  • OpenClaw: An open-source framework and OS for building and orchestrating AI agents.
  • Isaac Lab & Cosmos: Simulation environments and world models for training physical robots.

Demos Shown (3)

  • [13:50] DLSS 5 side-by-side comparisons in games like Resident Evil and EA Sports FC.
  • [1:47:00] OpenClaw agent modifying a video of a street scene based on a prompt.
  • [2:08:00] Live on-stage demo of a physical Olaf robot walking and interacting.

Predictions / Commitments (4)

  • [1:36:00, 2027-2028] NVIDIA will release the Oberon architecture in 2027 and Feynman in 2028.
  • [1:41:00, Near future] AI factories will scale to multi-gigawatt power consumption.
  • [1:53:00, Ongoing/Near future] Every SaaS company will transition into an Agent-as-a-Service company.
  • [2:03:00, Long-term] Agentic AI will expand the $2 Trillion IT industry into a multi-trillion dollar industry.

Companies Mentioned (7)

IBM · Google Cloud · Microsoft Azure · Oracle · Groq · OpenClaw · Disney Research

Notable Quotes (4)

Tokens are the new commodity. Compute is revenue. — Jensen Huang @ 1:07:00

Accelerated computing is not a chip problem. Accelerated computing is not a systems problem. Accelerated computing has a missing word: application acceleration. — Jensen Huang @ 1:11:00

Every single SaaS company will become a gas company… an Agent-as-a-Service company. — Jensen Huang @ 1:53:00

We are a vertically integrated computing company with open horizontal integration with the world. — Jensen Huang @ 2:04:00

Key Topics

AI Factories · Agentic AI · Vera Rubin Architecture · Inference Scaling · DLSS 5 · Data Processing (cuDF/cuVS) · Confidential Computing · OpenClaw Framework · Groq LPU Integration · Physical AI · Robotics Simulation · NVLink Networking

Takeaways

  • The industry is shifting from an AI training focus to an AI inference focus, driven by reasoning models.
  • Agentic AI (agents using tools and reasoning) is the next major computing platform, replacing traditional SaaS.
  • NVIDIA’s Vera Rubin architecture is purpose-built for this agentic era, emphasizing massive memory bandwidth and CPU/GPU co-design.
  • Extreme co-design across chips, systems, and software is required to drive down the cost of token generation.
  • NVIDIA is embracing open-source frameworks like OpenClaw to accelerate the adoption of agentic workflows.
  • Physical AI is becoming a reality, with simulation tools like Isaac Lab enabling robots to learn before deployment.