GTC 2023 Keynote (Hopper / GPT-era inflection)

Category: Main Keynote · Year: 2023

Speakers: Jensen Huang - CEO, NVIDIA · Milan Nedeljković - Member of the Board of Management of BMW AG, Production

Segments (15)

  • 00:00 · Introduction & The AI Era
    • Jensen Huang discusses the end of Moore’s Law and the rise of accelerated computing and AI.
  • 04:40 · I am AI
    • A video montage showcasing various applications of AI across different industries.
  • 07:47 · Accelerated Computing & AI Milestones
    • Reviewing the history of AI breakthroughs from AlexNet to ChatGPT.
  • 10:10 · Acceleration Libraries & Quantum Computing
    • Updates on CUDA libraries for CFD and the cuQuantum platform for quantum simulation.
  • 14:05 · Data Processing & Optimization
    • Introducing Spark RAPIDS, RAFT for vector databases, and cuOpt for logistics.
  • 17:53 · AI Inference
    • Highlighting TensorRT, Triton, and new libraries for computer vision and video processing.
  • 20:30 · Healthcare & Genomics
    • Advancements in genomics with Parabricks and medical devices with Clara Holoscan.
  • 22:30 · Computational Lithography
    • Unveiling cuLitho to accelerate semiconductor manufacturing with TSMC, ASML, and Synopsys.
  • 26:43 · Grace CPU & BlueField-3
    • Detailing the Grace CPU Superchip for AI workloads and the BlueField-3 DPU.
  • 31:40 · DGX H100 & DGX Cloud
    • Announcing that the DGX H100 AI supercomputer is in full production and introducing the new DGX Cloud service.
  • 35:30 · Generative AI & AI Foundations
    • Introducing NeMo, Picasso, and BioNeMo cloud services for building custom generative AI models.
  • 50:00 · New Inference Platforms
    • Launching L4, L40, H100 NVL, and Grace-Hopper platforms for diverse AI inference workloads.
  • 53:30 · Omniverse & Industrial Digitalization
    • Showcasing Omniverse as the platform for building digital twins and automating factories.
  • 01:06:00 · BMW Virtual Factory Demo
    • BMW demonstrates planning and operating a virtual EV factory using Omniverse.
  • 01:11:15 · Omniverse Cloud on Azure & Conclusion
    • Announcing Omniverse Cloud hosted on Microsoft Azure and closing remarks.

Product Announcements (10)

  • [13:43] Quantum Control Link
    • A link connecting NVIDIA GPUs to quantum computers for error correction.
    • specs: Developed in partnership with Quantum Machines for high-speed error correction.
    • availability: N/A
  • [15:04] RAFT
    • An acceleration library for indexing and retrieving data in vector databases.
    • specs: Integrated into Meta’s FAISS, Milvus, and Redis.
    • availability: Open source
  • [21:21] Parabricks 4.1
    • A suite of AI-accelerated libraries for genomics analysis.
    • specs: Available in public clouds and genomic platforms.
    • availability: Available now
  • [25:21] cuLitho
    • A library for computational lithography.
    • specs: Accelerates the process by over 40x, reducing power and server requirements.
    • availability: Qualifying for production starting in June
  • [31:24] BlueField-3 DPU
    • Data Processing Unit for offloading infrastructure software.
    • specs: Adopted by major cloud service providers.
    • availability: In production
  • [34:18] NVIDIA DGX Cloud
    • An AI supercomputing service accessed via a web browser.
    • specs: Hosted on Azure, GCP, and OCI; includes NVIDIA AI Enterprise software.
    • availability: Available soon
  • [39:52] NVIDIA AI Foundations
    • Cloud services for building custom generative AI models (NeMo, Picasso, BioNeMo).
    • specs: Allows enterprises to train models with proprietary data with guardrails.
    • availability: Available now/Early access
  • [51:05] New Inference Platforms
    • L4, L40, H100 NVL, and Grace-Hopper hardware platforms.
    • specs: Optimized for video, graphics, LLMs, and recommender systems respectively.
    • availability: N/A
  • [01:10:07] Omniverse Workstations and OVX Servers
    • Hardware optimized for running NVIDIA Omniverse.
    • specs: Powered by Ada RTX GPUs and BlueField-3.
    • availability: Starting in March
  • [01:11:11] NVIDIA Omniverse Cloud on Microsoft Azure
    • A fully managed cloud service for Omniverse.
    • specs: Connects to Microsoft 365 productivity suite.
    • availability: N/A

Specific Numbers (12)

Timestamp · Metric · Value · Context
01:52 · Developers · 4 million · Number of developers in the global NVIDIA ecosystem.
09:03 · Floating-point operations (total) · 262 quadrillion · Total operations required to train AlexNet (a count, not a per-second rate).
09:36 · Floating-point operations (total) · 323 zetta (10^21) · Total operations required to train GPT-3 (a count, not a per-second rate).
11:58 · Throughput · 9x · Throughput increase of A100 vs CPU servers for CFD.
16:16 · Moves per second · 30 billion · Moves analyzed per second by cuOpt for the traveling salesperson problem.
25:38 · Speedup · 40x · Acceleration of computational lithography using cuLitho.
26:14 · Servers · 500 · Number of DGX H100 systems needed to replace 40,000 CPU servers for TSMC.
28:20 · Cores · 72 · Number of Arm cores in a single Grace CPU.
36:24 · Users · 100 million · Number of users ChatGPT reached in just a few months.
53:18 · Performance · 10x · Performance of L40 compared to T4 for Omniverse and graphics.
56:18 · Performance · 10x · Performance of H100 NVL compared to HGX A100 for GPT-3 175B inference.
57:08 · Bandwidth · 7x · Data transfer speed of Grace-Hopper compared to PCIe.
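
The two training-compute figures above can be compared directly. The following is a minimal sketch, assuming "quadrillion" means 10^15 (short scale) and "zetta" means 10^21 (SI prefix); the constant names are illustrative and do not come from the keynote.

```python
# Total floating-point operations quoted for each training run.
# These are operation counts, not per-second rates.
ALEXNET_TRAIN_OPS = 262e15  # 262 quadrillion, assuming short scale (10**15)
GPT3_TRAIN_OPS = 323e21     # 323 zetta, assuming the SI prefix (10**21)

# How many times more compute GPT-3 needed than AlexNet.
ratio = GPT3_TRAIN_OPS / ALEXNET_TRAIN_OPS

print(f"GPT-3 needed roughly {ratio / 1e6:.1f} million times the operations of AlexNet")
```

Under these assumptions the gap is a little over six orders of magnitude, which is the scale of growth the "AlexNet to ChatGPT" segment describes.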

Benchmark Claims (8)

  • [11:58] High-fidelity CFD (Cadence): 9x throughput, 17x less energy
    • vs: CPU servers
    • gain: Significant cost and energy savings for fluid dynamics simulations.
  • [25:38] Computational Lithography: 40x speedup
    • vs: CPU-based processing
    • gain: Reduces processing time from weeks to hours.
  • [29:45] Microservices: 1.3x faster
    • vs: Next-gen x86 CPUs
    • gain: Higher performance for cloud microservices.
  • [29:54] Data Processing: 1.2x faster
    • vs: Next-gen x86 CPUs
    • gain: Higher performance for big data workloads.
  • [30:00] Energy Efficiency: 1.7x more efficient
    • vs: Next-gen x86 CPUs
    • gain: Significant power savings at the data center level.
  • [53:18] Omniverse and Graphics: 10x performance
    • vs: NVIDIA T4
    • gain: Massive leap in rendering and simulation capabilities.
  • [56:18] LLM Inference (GPT-3 175B): 10x faster
    • vs: HGX A100
    • gain: Drastic reduction in processing costs for large language models.
  • [57:08] CPU-GPU Bandwidth: 7x faster
    • vs: PCIe
    • gain: Eliminates data transfer bottlenecks for massive datasets.
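
An "N times" claim is easier to compare across benchmarks when converted into the fraction of the baseline that remains. The sketch below does that conversion for three of the claims above. The helper name `remaining_fraction` and the 14-day lithography baseline are illustrative assumptions, not figures taken from this summary.

```python
def remaining_fraction(factor: float) -> float:
    """Fraction of the baseline time or energy left after an N-fold improvement."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / factor


# Improvement factors as listed in the benchmark claims.
claims = {
    "CFD energy (17x less)": 17,
    "Computational lithography time (40x speedup)": 40,
    "Grace energy efficiency (1.7x)": 1.7,
}

for label, factor in claims.items():
    print(f"{label}: {remaining_fraction(factor):.1%} of baseline")

# Assumption: a 14-day CPU baseline for one lithography job (hypothetical value).
BASELINE_HOURS = 14 * 24
litho_hours = BASELINE_HOURS * remaining_fraction(40)
print(f"A {BASELINE_HOURS}-hour job at 40x takes about {litho_hours:.1f} hours")
```

With the assumed two-week baseline, a 40x speedup lands in the single-digit-hour range, which is consistent with the "weeks to hours" description.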

Customer Stories (8)

  • [16:26] AT&T
    • Used cuOpt to optimize dispatch routing for 30,000 technicians.
    • outcome: Found solutions 100x faster, enabling real-time dispatch updates.
  • [18:23] Uber
    • Used Triton inference server for ETA predictions.
    • outcome: Serves hundreds of thousands of predictions per second.
  • [19:49] Tencent
    • Used CV-CUDA and VPF for video processing.
    • outcome: Processes 300,000 videos per day.
  • [22:00] Medtronic
    • Built the GI Genius system for colon cancer detection on NVIDIA Holoscan.
    • outcome: Created a software-defined medical device platform.
  • [26:00] TSMC
    • Implemented cuLitho for computational lithography.
    • outcome: Reduced 40,000 CPU servers to 500 DGX systems, cutting power from 35MW to 5MW.
  • [41:59] Runway
    • Used CV-CUDA for cloud-based generative AI video editing.
    • outcome: Enabled features like object removal and background changes in minutes.
  • [58:56] Amazon Robotics
    • Used Omniverse and Isaac Sim to simulate the Proteus autonomous robot.
    • outcome: Generated synthetic data to improve marker detection success rate from 88.6% to 98%.
  • [01:06:00] BMW
    • Used Omniverse to build a digital twin of a new EV factory in Debrecen, Hungary.
    • outcome: Enabled global teams to collaborate virtually, optimizing layouts and resolving issues before physical construction.
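
The TSMC and Amazon Robotics outcomes above reduce to simple ratios. This is a minimal sketch that uses only the figures quoted in those two stories; the variable names are illustrative.

```python
# TSMC: server and power consolidation with cuLitho.
cpu_servers, dgx_systems = 40_000, 500
power_before_mw, power_after_mw = 35, 5

server_ratio = cpu_servers / dgx_systems        # CPU servers replaced per DGX system
power_ratio = power_before_mw / power_after_mw  # reduction factor in power draw

# Amazon Robotics: marker detection success rate before and after synthetic data.
success_before, success_after = 0.886, 0.98
failure_before = 1 - success_before
failure_after = 1 - success_after
failure_reduction = failure_before / failure_after  # how many times fewer misses

print(f"TSMC: {server_ratio:.0f} CPU servers per DGX system, {power_ratio:.0f}x less power")
print(f"Amazon Robotics: failure rate {failure_before:.1%} -> {failure_after:.1%} "
      f"({failure_reduction:.1f}x fewer failures)")
```

Expressing the detection improvement as a failure rate shows why a gain of under ten percentage points is significant: the share of missed markers falls by more than a factor of five.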

Key Technologies (12)

  • CUDA: Parallel computing platform and programming model that accelerates applications across various domains.
  • cuQuantum: Acceleration library for simulating quantum circuits on GPUs.
  • Spark RAPIDS: Accelerates Apache Spark data processing workloads on GPUs.
  • RAFT: Library for accelerating indexing and retrieval in vector databases.
  • cuOpt: Optimization engine for solving complex routing and logistics problems.
  • TensorRT & Triton: Software stack for optimizing and serving AI inference models in data centers.
  • CV-CUDA & VPF: Libraries for accelerating computer vision and video processing pipelines.
  • cuLitho: Software library that accelerates computational lithography for semiconductor manufacturing.
  • Grace CPU: Arm-based CPU designed for high-performance AI and cloud computing workloads.
  • BlueField DPU: Data Processing Unit that offloads networking, storage, and security tasks from the CPU.
  • AI Foundations (NeMo, Picasso, BioNeMo): Cloud services providing pre-trained models and frameworks for building custom generative AI.
  • Omniverse & USD: Platform based on Universal Scene Description for creating 3D digital twins and industrial simulations.

Demos Shown (7)

  • [16:10] Visualization of cuOpt solving a complex routing problem.
  • [24:40] Animation explaining computational lithography and the impact of cuLitho.
  • [38:40] Examples of generative AI applications (Tabnine, Omneky, Kore.ai, Jasper).
  • [42:00] Runway’s cloud-based video editing tools powered by CV-CUDA.
  • [48:24] BioNeMo predicting protein structures and generating molecules.
  • [58:56] Simulation of Amazon’s Proteus robot navigating a warehouse in Isaac Sim.
  • [01:06:00] A live, collaborative session inside the digital twin of BMW’s new EV factory using Omniverse.

Predictions / Commitments (5)

  • [07:50, Current/Ongoing] The iPhone moment of AI has started.
  • [26:32, June 2023] TSMC will be qualifying cuLitho for production starting in June.
  • [35:24, Near future] Oracle Cloud Infrastructure (OCI) will be the first cloud to host NVIDIA DGX Cloud.
  • [49:15, Long-term] Generative AI will reinvent nearly every industry.
  • [01:11:11, Near future] NVIDIA Omniverse Cloud will be hosted in Microsoft Azure.

Companies Mentioned (17)

  • Cadence, Ansys, Siemens
  • IBM, Google, Baidu, AWS
  • GCP, AWS, Databricks, Cloudera
  • Meta, Milvus, Redis
  • AT&T
  • Microsoft, Amazon, Amex, USPS, Uber, Roblox
  • PacBio, Oxford Nanopore, Ultima
  • Medtronic
  • ASML, TSMC, Synopsys
  • Check Point, Cisco, DDN, Dell, Juniper, Palo Alto, Red Hat, VMware
  • Baidu, CoreWeave, JD.com, Azure, OCI, Tencent
  • Microsoft Azure, Google GCP, Oracle OCI
  • Getty Images, Shutterstock, Adobe
  • Amgen, AstraZeneca, Insilico Medicine
  • Siemens, Bentley, Rockwell, Unity
  • BMW
  • Microsoft

Notable Quotes (4)

The iPhone moment of AI has started. — Jensen Huang @ 07:48

The chip industry is the foundation of nearly every industry. — Jensen Huang @ 22:34

Generative AI is a new kind of computer, one that we program in human language. — Jensen Huang @ 37:53

Together, we are helping the world do the impossible. — Jensen Huang @ 01:17:38

Key Topics

Accelerated Computing · Generative AI · Large Language Models (LLMs) · Quantum Computing Simulation · Computational Lithography · AI Inference · Digital Twins · Industrial Metaverse · Drug Discovery · Cloud Computing · Semiconductor Manufacturing · Robotics Simulation

Takeaways

  • Moore’s Law is slowing down, making accelerated computing essential for future performance gains and energy efficiency across all industries.
  • Generative AI represents a fundamental shift in computing, acting as a new platform that is programmable via human language.
  • NVIDIA is expanding its business model from selling hardware to offering full-stack cloud services, including DGX Cloud, AI Foundations, and Omniverse Cloud.
  • The introduction of cuLitho is a major breakthrough for semiconductor manufacturing, drastically reducing the time and energy that computational lithography requires when producing photomasks for next-generation chips.
  • NVIDIA is positioning Omniverse as the standard operating system for industrial digitalization, enabling companies to build and simulate digital twins of factories and products before physical construction.