GTC 2023 Keynote (Hopper / GPT-era inflection)
Category: Main Keynote · Year: 2023
Speakers: Jensen Huang - CEO, NVIDIA · Milan Nedeljković - Member of the Board of Management, BMW AG Production
Segments (15)
- 00:00 · Introduction & The AI Era
- Jensen Huang discusses the end of Moore’s Law and the rise of accelerated computing and AI.
- 04:40 · I am AI
- A video montage showcasing various applications of AI across different industries.
- 07:47 · Accelerated Computing & AI Milestones
- Reviewing the history of AI breakthroughs from AlexNet to ChatGPT.
- 10:10 · Acceleration Libraries & Quantum Computing
- Updates on CUDA libraries for CFD and the cuQuantum platform for quantum simulation.
- 14:05 · Data Processing & Optimization
- Introducing Spark RAPIDS, RAFT for vector databases, and cuOpt for logistics.
- 17:53 · AI Inference
- Highlighting TensorRT, Triton, and new libraries for computer vision and video processing.
- 20:30 · Healthcare & Genomics
- Advancements in genomics with Parabricks and medical devices with Clara Holoscan.
- 22:30 · Computational Lithography
- Unveiling cuLitho to accelerate semiconductor manufacturing with TSMC, ASML, and Synopsys.
- 26:43 · Grace CPU & BlueField-3
- Detailing the Grace CPU Superchip for AI workloads and the BlueField-3 DPU.
- 31:40 · DGX H100 & DGX Cloud
- Announcing the DGX H100 AI supercomputer and the new DGX Cloud service.
- 35:30 · Generative AI & AI Foundations
- Introducing NeMo, Picasso, and BioNeMo cloud services for building custom generative AI models.
- 50:00 · New Inference Platforms
- Launching L4, L40, H100 NVL, and Grace-Hopper platforms for diverse AI inference workloads.
- 53:30 · Omniverse & Industrial Digitalization
- Showcasing Omniverse as the platform for building digital twins and automating factories.
- 01:06:00 · BMW Virtual Factory Demo
- BMW demonstrates planning and operating a virtual EV factory using Omniverse.
- 01:11:15 · Omniverse Cloud on Azure & Conclusion
- Announcing Omniverse Cloud hosted on Microsoft Azure and closing remarks.
Product Announcements (10)
- [13:43] Quantum Control Link
- A link connecting NVIDIA GPUs to quantum computers for error correction.
- specs: Developed in partnership with Quantum Machines for high-speed error correction.
- availability: N/A
- [15:04] RAFT
- An acceleration library for indexing and retrieving data in vector databases.
- specs: Integrated into Meta’s FAISS, Milvus, and Redis.
- availability: Open source
- [21:21] Parabricks 4.1
- A suite of AI-accelerated libraries for genomics analysis.
- specs: Available in public clouds and genomic platforms.
- availability: Available now
- [25:21] cuLitho
- A library for computational lithography.
- specs: Accelerates the process by over 40x, reducing power and server requirements.
- availability: Qualifying for production starting in June
- [31:24] BlueField-3 DPU
- Data Processing Unit for offloading infrastructure software.
- specs: Adopted by major cloud service providers.
- availability: In production
- [34:18] NVIDIA DGX Cloud
- An AI supercomputing service accessed via a web browser.
- specs: Hosted on Azure, GCP, and OCI; includes NVIDIA AI Enterprise software.
- availability: Available soon
- [39:52] NVIDIA AI Foundations
- Cloud services for building custom generative AI models (NeMo, Picasso, BioNeMo).
- specs: Allows enterprises to train models with proprietary data with guardrails.
- availability: Available now/Early access
- [51:05] New Inference Platforms
- L4, L40, H100 NVL, and Grace-Hopper hardware platforms.
- specs: Optimized for video (L4), graphics (L40), LLMs (H100 NVL), and recommender systems (Grace-Hopper).
- availability: N/A
- [01:10:07] Omniverse Workstations and OVX Servers
- Hardware optimized for running NVIDIA Omniverse.
- specs: Powered by Ada RTX GPUs and BlueField-3.
- availability: Starting in March
- [01:11:11] NVIDIA Omniverse Cloud on Microsoft Azure
- A fully managed cloud service for Omniverse.
- specs: Connects to Microsoft 365 productivity suite.
- availability: N/A
Specific Numbers (12)
| Timestamp | Metric | Value | Context |
|---|---|---|---|
| 01:52 | developers | 4 million | Number of developers in the global NVIDIA ecosystem. |
| 09:03 | FLOPs (total) | 262 quadrillion (2.62 × 10^17) | Total floating-point operations required to train AlexNet. |
| 09:36 | FLOPs (total) | 323 sextillion (3.23 × 10^23) | Total floating-point operations required to train GPT-3. |
| 11:58 | throughput | 9x | Throughput increase of A100 vs CPU servers for CFD. |
| 16:16 | moves per second | 30 billion | Moves analyzed per second by cuOpt for the traveling salesperson problem. |
| 25:38 | speedup | 40x | Acceleration of computational lithography using cuLitho. |
| 26:14 | servers | 500 | Number of DGX H100 systems needed to replace 40,000 CPU servers for TSMC. |
| 28:20 | cores | 72 | Number of Arm cores in a single Grace CPU. |
| 36:24 | users | 100 million | Number of users ChatGPT reached in just a few months. |
| 53:18 | performance | 10x | Performance of L40 compared to T4 for Omniverse and graphics. |
| 56:18 | performance | 10x | Performance of H100 NVL compared to HGX A100 for GPT-3 175B inference. |
| 57:08 | bandwidth | 7x | Data transfer speed of Grace-Hopper compared to PCIe. |
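The AlexNet and GPT-3 rows in the table above imply a growth factor that is easy to check. The following is a minimal sketch, assuming both values are total operation counts rather than per-second rates, and reading the GPT-3 figure as 323 × 10^21. The constant and function names are illustrative and do not come from the keynote.

```python
# Training-compute figures from the table above, read as total operation counts.
ALEXNET_TRAINING_OPS = 262e15  # 262 quadrillion (10^15)
GPT3_TRAINING_OPS = 323e21     # 323 x 10^21


def compute_ratio(larger: float, smaller: float) -> float:
    """Return how many times larger the first workload is than the second."""
    return larger / smaller


ratio = compute_ratio(GPT3_TRAINING_OPS, ALEXNET_TRAINING_OPS)
print(f"GPT-3 vs. AlexNet training compute: {ratio:.2e}x")
```

Under these assumptions the ratio comes out on the order of one million.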
Benchmark Claims (8)
- [11:58] High-fidelity CFD (Cadence): 9x throughput, 17x less energy
- vs: CPU servers
- gain: Significant cost and energy savings for fluid dynamics simulations.
- [25:38] Computational Lithography: 40x speedup
- vs: CPU-based processing
- gain: Reduces processing time from weeks to hours.
- [29:45] Microservices (Grace CPU): 1.3x faster
- vs: Next-gen x86 CPUs
- gain: Higher performance for cloud microservices.
- [29:54] Data Processing (Grace CPU): 1.2x faster
- vs: Next-gen x86 CPUs
- gain: Higher performance for big data workloads.
- [30:00] Energy Efficiency (Grace CPU): 1.7x more efficient
- vs: Next-gen x86 CPUs
- gain: Significant power savings at the data center level.
- [53:18] Omniverse and Graphics (L40): 10x performance
- vs: NVIDIA T4
- gain: Massive leap in rendering and simulation capabilities.
- [56:18] LLM Inference, GPT-3 175B (H100 NVL): 10x faster
- vs: HGX A100
- gain: Drastic reduction in processing costs for large language models.
- [57:08] CPU-GPU Bandwidth (Grace-Hopper): 7x faster
- vs: PCIe
- gain: Eliminates data transfer bottlenecks for massive datasets.
Customer Stories (8)
- [16:26] AT&T
- Used cuOpt to optimize dispatch routing for 30,000 technicians.
- outcome: Found solutions 100x faster, enabling real-time dispatch updates.
- [18:23] Uber
- Used Triton inference server for ETA predictions.
- outcome: Serves hundreds of thousands of predictions per second.
- [19:49] Tencent
- Used CV-CUDA and VPF for video processing.
- outcome: Processes 300,000 videos per day.
- [22:00] Medtronic
- Built the GI Genius system for colon cancer detection on NVIDIA Holoscan.
- outcome: Created a software-defined medical device platform.
- [26:00] TSMC
- Implemented cuLitho for computational lithography.
- outcome: Reduced 40,000 CPU servers to 500 DGX systems, cutting power from 35 MW to 5 MW.
- [41:59] Runway
- Used CV-CUDA for cloud-based generative AI video editing.
- outcome: Enabled features like object removal and background changes in minutes.
- [58:56] Amazon Robotics
- Used Omniverse and Isaac Sim to simulate the Proteus autonomous robot.
- outcome: Generated synthetic data to improve marker detection success rate from 88.6% to 98%.
- [01:06:00] BMW
- Used Omniverse to build a digital twin of a new EV factory in Debrecen, Hungary.
- outcome: Enabled global teams to collaborate virtually, optimizing layouts and resolving issues before physical construction.
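The TSMC and Amazon Robotics outcomes above are stated as before/after pairs. As a rough sketch, the implied reduction factors and the accuracy gain can be derived as follows. The variable names are illustrative, and the inputs are only the figures quoted in those two entries.

```python
# Figures quoted in the TSMC and Amazon Robotics entries above.
cpu_servers, dgx_systems = 40_000, 500
power_before_mw, power_after_mw = 35.0, 5.0
detection_before_pct, detection_after_pct = 88.6, 98.0

# Consolidation factors: how many times fewer servers and how much less power.
server_reduction = cpu_servers / dgx_systems
power_reduction = power_before_mw / power_after_mw

# Accuracy gain in percentage points, rounded to avoid float noise.
detection_gain_points = round(detection_after_pct - detection_before_pct, 1)

print(f"Servers: {server_reduction:.0f}x fewer")
print(f"Power: {power_reduction:.0f}x lower")
print(f"Marker detection: +{detection_gain_points} percentage points")
```

This gives an 80x reduction in server count, a 7x reduction in power, and a gain of 9.4 percentage points in marker detection.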
Key Technologies (12)
- CUDA: Parallel computing platform and programming model that accelerates applications across various domains.
- cuQuantum: Acceleration library for simulating quantum circuits on GPUs.
- Spark RAPIDS: Accelerates Apache Spark data processing workloads on GPUs.
- RAFT: Library for accelerating indexing and retrieval in vector databases.
- cuOpt: Optimization engine for solving complex routing and logistics problems.
- TensorRT & Triton: Software stack for optimizing and serving AI inference models in data centers.
- CV-CUDA & VPF: Libraries for accelerating computer vision and video processing pipelines.
- cuLitho: Software library that accelerates computational lithography for semiconductor manufacturing.
- Grace CPU: Arm-based CPU designed for high-performance AI and cloud computing workloads.
- BlueField DPU: Data Processing Unit that offloads networking, storage, and security tasks from the CPU.
- AI Foundations (NeMo, Picasso, BioNeMo): Cloud services providing pre-trained models and frameworks for building custom generative AI.
- Omniverse & USD: Platform based on Universal Scene Description for creating 3D digital twins and industrial simulations.
Demos Shown (7)
- [16:10] Visualization of cuOpt solving a complex routing problem.
- [24:40] Animation explaining computational lithography and the impact of cuLitho.
- [38:40] Examples of generative AI applications (Tabnine, Omneky, Kore.ai, Jasper).
- [42:00] Runway’s cloud-based video editing tools powered by CV-CUDA.
- [48:24] BioNeMo predicting protein structures and generating molecules.
- [58:56] Simulation of Amazon’s Proteus robot navigating a warehouse in Isaac Sim.
- [01:06:00] A live, collaborative session inside the digital twin of BMW’s new EV factory using Omniverse.
Predictions / Commitments (5)
- [07:50, Current/Ongoing] The iPhone moment of AI has started.
- [26:32, June 2023] TSMC will be qualifying cuLitho for production starting in June.
- [35:24, Near future] Oracle Cloud Infrastructure (OCI) will be the first cloud provider to host DGX Cloud.
- [49:15, Long-term] Generative AI will reinvent nearly every industry.
- [01:11:11, Near future] NVIDIA Omniverse Cloud will be hosted in Microsoft Azure.
Companies Mentioned (17)
Cadence, Ansys, Siemens · IBM, Google, Baidu, AWS · GCP, AWS, Databricks, Cloudera · Meta, Milvus, Redis · AT&T · Microsoft, Amazon, Amex, USPS, Uber, Roblox · PacBio, Oxford Nanopore, Ultima · Medtronic · ASML, TSMC, Synopsys · Check Point, Cisco, DDN, Dell, Juniper, Palo Alto, Red Hat, VMware · Baidu, CoreWeave, JD.com, Azure, OCI, Tencent · Microsoft Azure, Google GCP, Oracle OCI · Getty Images, Shutterstock, Adobe · Amgen, AstraZeneca, Insilico Medicine · Siemens, Bentley, Rockwell, Unity · BMW · Microsoft
Notable Quotes (4)
The iPhone moment of AI has started. — Jensen Huang @ 07:48
The chip industry is the foundation of nearly every industry. — Jensen Huang @ 22:34
Generative AI is a new kind of computer, one that we program in human language. — Jensen Huang @ 37:53
Together, we are helping the world do the impossible. — Jensen Huang @ 01:17:38
Key Topics
Accelerated Computing · Generative AI · Large Language Models (LLMs) · Quantum Computing Simulation · Computational Lithography · AI Inference · Digital Twins · Industrial Metaverse · Drug Discovery · Cloud Computing · Semiconductor Manufacturing · Robotics Simulation
Takeaways
- Moore’s Law is slowing down, making accelerated computing essential for future performance gains and energy efficiency across all industries.
- Generative AI represents a fundamental shift in computing, acting as a new platform that is programmable via human language.
- NVIDIA is expanding its business model from selling hardware to offering full-stack cloud services, including DGX Cloud, AI Foundations, and Omniverse Cloud.
- The introduction of cuLitho is a major breakthrough for semiconductor manufacturing, drastically reducing the time and energy that computational lithography requires when producing next-generation chips.
- NVIDIA is positioning Omniverse as the operating system for industrial digitalization, enabling companies to build and simulate digital twins of factories and products before physical construction.