GTC 2023 Keynote (Hopper / GPT-era inflection)

Category: Main Keynote · Year: 2023

Speakers: Jensen Huang - CEO, NVIDIA · Milan Nedeljković - Member of the Board of Management of BMW AG, Production

Segments (15)

00:00 · Introduction & The AI Era
- Jensen Huang discusses the end of Moore’s Law and the rise of accelerated computing and AI.
04:40 · I am AI
- A video montage showcasing various applications of AI across different industries.
07:47 · Accelerated Computing & AI Milestones
- Reviewing the history of AI breakthroughs from AlexNet to ChatGPT.
10:10 · Acceleration Libraries & Quantum Computing
- Updates on CUDA libraries for CFD and the cuQuantum platform for quantum simulation.
14:05 · Data Processing & Optimization
- Introducing Spark RAPIDS, RAFT for vector databases, and cuOpt for logistics.
17:53 · AI Inference
- Highlighting TensorRT, Triton, and new libraries for computer vision and video processing.
20:30 · Healthcare & Genomics
- Advancements in genomics with Parabricks and medical devices with Clara Holoscan.
22:30 · Computational Lithography
- Unveiling cuLitho to accelerate semiconductor manufacturing with TSMC, ASML, and Synopsys.
26:43 · Grace CPU & BlueField-3
- Detailing the Grace CPU Superchip for AI workloads and the BlueField-3 DPU.
31:40 · DGX H100 & DGX Cloud
- Announcing the DGX H100 AI supercomputer and the new DGX Cloud service.
35:30 · Generative AI & AI Foundations
- Introducing NeMo, Picasso, and BioNeMo cloud services for building custom generative AI models.
50:00 · New Inference Platforms
- Launching L4, L40, H100 NVL, and Grace-Hopper platforms for diverse AI inference workloads.
53:30 · Omniverse & Industrial Digitalization
- Showcasing Omniverse as the platform for building digital twins and automating factories.
01:06:00 · BMW Virtual Factory Demo
- BMW demonstrates planning and operating a virtual EV factory using Omniverse.
01:11:15 · Omniverse Cloud on Azure & Conclusion
- Announcing Omniverse Cloud hosted on Microsoft Azure and closing remarks.

Product Announcements (10)

[13:43] Quantum Control Link
- A link connecting NVIDIA GPUs to quantum computers for error correction.
- specs: Developed in partnership with Quantum Machines for high-speed error correction.
- availability: N/A
[15:04] RAFT
- An acceleration library for indexing and retrieving data in vector databases.
- specs: Integrated into Meta’s FAISS, Milvus, and Redis.
- availability: Open source
[21:21] Parabricks 4.1
- A suite of AI-accelerated libraries for genomics analysis.
- specs: Available in public clouds and genomic platforms.
- availability: Available now
[25:21] cuLitho
- A library for computational lithography.
- specs: Accelerates the process by over 40x, reducing power and server requirements.
- availability: TSMC to begin qualifying it for production in June 2023
[31:24] BlueField-3 DPU
- Data Processing Unit for offloading infrastructure software.
- specs: Adopted by major cloud service providers.
- availability: In production
[34:18] NVIDIA DGX Cloud
- An AI supercomputing service accessed via a web browser.
- specs: Hosted on Azure, GCP, and OCI; includes NVIDIA AI Enterprise software.
- availability: Available soon
[39:52] NVIDIA AI Foundations
- Cloud services for building custom generative AI models (NeMo, Picasso, BioNeMo).
- specs: Allows enterprises to train models with proprietary data with guardrails.
- availability: Available now/Early access
[51:05] New Inference Platforms
- L4, L40, H100 NVL, and Grace-Hopper hardware platforms.
- specs: Optimized for video, graphics, LLMs, and recommender systems, respectively.
- availability: N/A
[01:10:07] Omniverse Workstations and OVX Servers
- Hardware optimized for running NVIDIA Omniverse.
- specs: Powered by Ada RTX GPUs and BlueField-3.
- availability: Starting in March
[01:11:11] NVIDIA Omniverse Cloud on Microsoft Azure
- A fully managed cloud service for Omniverse.
- specs: Connects to Microsoft 365 productivity suite.
- availability: N/A

Specific Numbers (12)

Timestamp	Metric	Value	Context
01:52	developers	4 million	Number of developers in the global NVIDIA ecosystem.
09:03	floating-point operations	262 quadrillion (2.62 × 10^17)	Total floating-point operations required to train AlexNet.
09:36	floating-point operations	323 sextillion (3.23 × 10^23)	Total floating-point operations required to train GPT-3.
11:58	throughput	9x	Throughput increase of A100 vs CPU servers for CFD.
16:16	moves per second	30 billion	Moves analyzed per second by cuOpt for the traveling salesperson problem.
25:38	speedup	40x	Acceleration of computational lithography using cuLitho.
26:14	servers	500	Number of DGX H100 systems needed to replace 40,000 CPU servers for TSMC.
28:20	cores	72	Number of Arm cores in a single Grace CPU.
36:24	users	100 million	Number of users ChatGPT reached in just a few months.
53:18	performance	10x	Performance of L40 compared to T4 for Omniverse and graphics.
56:18	performance	10x	Performance of H100 NVL compared to HGX A100 for GPT-3 175B inference.
57:08	bandwidth	7x	CPU-GPU interconnect bandwidth of Grace-Hopper compared to PCIe.
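
Two of the figures in the table imply ratios that are easy to check by hand. The following is a minimal Python sketch of that arithmetic, using only values from the table; the variable names are illustrative and not taken from the keynote.

```python
# Values copied from the "Specific Numbers" table; names are illustrative.
ALEXNET_TRAINING_OPS = 262e15  # 262 quadrillion floating-point operations
GPT3_TRAINING_OPS = 323e21     # 323 x 10^21 (zetta) floating-point operations

CPU_SERVERS = 40_000           # CPU servers cited for TSMC's lithography workload
DGX_H100_SYSTEMS = 500         # DGX H100 systems cited as their replacement

# How much more total training compute GPT-3 needed than AlexNet.
compute_growth = GPT3_TRAINING_OPS / ALEXNET_TRAINING_OPS

# How many CPU servers each DGX H100 system stands in for.
consolidation = CPU_SERVERS / DGX_H100_SYSTEMS

print(f"GPT-3 vs AlexNet training compute: about {compute_growth:.2e}x")
print(f"CPU servers replaced per DGX H100 system: {consolidation:.0f}")
```

Under these figures, GPT-3 training used roughly 1.2 million times the compute of AlexNet, and each DGX H100 system replaces 80 CPU servers.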

Benchmark Claims (8)

[11:58] High-fidelity CFD (Cadence): 9x throughput, 17x less energy
- vs: CPU servers
- gain: Significant cost and energy savings for fluid dynamics simulations.
[25:38] Computational Lithography: 40x speedup
- vs: CPU-based processing
- gain: Reduces processing time from weeks to hours.
[29:45] Microservices (Grace CPU): 1.3x faster
- vs: Next-gen x86 CPUs
- gain: Higher performance for cloud microservices.
[29:54] Data Processing (Grace CPU): 1.2x faster
- vs: Next-gen x86 CPUs
- gain: Higher performance for big data workloads.
[30:00] Energy Efficiency (Grace CPU): 1.7x more efficient
- vs: Next-gen x86 CPUs
- gain: Significant power savings at the data center level.
[53:18] Omniverse and Graphics (L40): 10x performance
- vs: NVIDIA T4
- gain: Massive leap in rendering and simulation capabilities.
[56:18] LLM Inference (GPT-3 175B, H100 NVL): 10x faster
- vs: HGX A100
- gain: Drastic reduction in processing costs for large language models.
[57:08] CPU-GPU Bandwidth (Grace-Hopper): 7x faster
- vs: PCIe
- gain: Eliminates data transfer bottlenecks for massive datasets.

Customer Stories (8)

[16:26] AT&T
- Used cuOpt to optimize dispatch routing for 30,000 technicians.
- outcome: Found solutions 100x faster, enabling real-time dispatch updates.
[18:23] Uber
- Used Triton inference server for ETA predictions.
- outcome: Serves hundreds of thousands of predictions per second.
[19:49] Tencent
- Used CV-CUDA and VPF for video processing.
- outcome: Processes 300,000 videos per day.
[22:00] Medtronic
- Built the GI Genius system for colon cancer detection on NVIDIA Holoscan.
- outcome: Created a software-defined medical device platform.
[26:00] TSMC
- Implemented cuLitho for computational lithography.
- outcome: Reduced 40,000 CPU servers to 500 DGX systems, cutting power from 35 MW to 5 MW.
[41:59] Runway
- Used CV-CUDA for cloud-based generative AI video editing.
- outcome: Enabled features like object removal and background changes in minutes.
[58:56] Amazon Robotics
- Used Omniverse and Isaac Sim to simulate the Proteus autonomous robot.
- outcome: Generated synthetic data to improve marker detection success rate from 88.6% to 98%.
[01:06:00] BMW
- Used Omniverse to build a digital twin of a new EV factory in Debrecen, Hungary.
- outcome: Enabled global teams to collaborate virtually, optimizing layouts and resolving issues before physical construction.

Key Technologies (12)

CUDA: Parallel computing platform and programming model that accelerates applications across various domains.
cuQuantum: Acceleration library for simulating quantum circuits on GPUs.
Spark RAPIDS: Accelerates Apache Spark data processing workloads on GPUs.
RAFT: Library for accelerating indexing and retrieval in vector databases.
cuOpt: Optimization engine for solving complex routing and logistics problems.
TensorRT & Triton: Software stack for optimizing and serving AI inference models in data centers.
CV-CUDA & VPF: Libraries for accelerating computer vision and video processing pipelines.
cuLitho: Software library that accelerates computational lithography for semiconductor manufacturing.
Grace CPU: Arm-based CPU designed for high-performance AI and cloud computing workloads.
BlueField DPU: Data Processing Unit that offloads networking, storage, and security tasks from the CPU.
AI Foundations (NeMo, Picasso, BioNeMo): Cloud services providing pre-trained models and frameworks for building custom generative AI.
Omniverse & USD: Platform based on Universal Scene Description for creating 3D digital twins and industrial simulations.

Demos Shown (7)

[16:10] Visualization of cuOpt solving a complex routing problem.
[24:40] Animation explaining computational lithography and the impact of cuLitho.
[38:40] Examples of generative AI applications (Tabnine, Omnekey, Kore.ai, Jasper).
[42:00] Runway’s cloud-based video editing tools powered by CV-CUDA.
[48:24] BioNeMo predicting protein structures and generating molecules.
[58:56] Simulation of Amazon’s Proteus robot navigating a warehouse in Isaac Sim.
[01:06:00] A live, collaborative session inside the digital twin of BMW’s new EV factory using Omniverse.

Predictions / Commitments (5)

[07:50, Current/Ongoing] The iPhone moment of AI has started.
[26:32, June 2023] TSMC will be qualifying cuLitho for production starting in June.
[35:24, Near future] Oracle Cloud Infrastructure (OCI) will be the first cloud provider to host DGX Cloud.
[49:15, Long-term] Generative AI will reinvent nearly every industry.
[01:11:11, Near future] NVIDIA Omniverse Cloud will be hosted in Microsoft Azure.

Companies Mentioned (17)

Cadence, Ansys, Siemens · IBM, Google, Baidu, AWS · GCP, AWS, Databricks, Cloudera · Meta, Milvus, Redis · AT&T · Microsoft, Amazon, Amex, USPS, Uber, Roblox · PacBio, Oxford Nanopore, Ultima · Medtronic · ASML, TSMC, Synopsys · Check Point, Cisco, DDN, Dell, Juniper, Palo Alto, Red Hat, VMware · Baidu, CoreWeave, JD.com, Azure, OCI, Tencent · Microsoft Azure, Google GCP, Oracle OCI · Getty Images, Shutterstock, Adobe · Amgen, AstraZeneca, Insilico Medicine · Siemens, Bentley, Rockwell, Unity · BMW · Microsoft

Notable Quotes (4)

The iPhone moment of AI has started. — Jensen Huang @ 07:48

The chip industry is the foundation of nearly every industry. — Jensen Huang @ 22:34

Generative AI is a new kind of computer, one that we program in human language. — Jensen Huang @ 37:53

Together, we are helping the world do the impossible. — Jensen Huang @ 01:17:38

Key Topics

Accelerated Computing · Generative AI · Large Language Models (LLMs) · Quantum Computing Simulation · Computational Lithography · AI Inference · Digital Twins · Industrial Metaverse · Drug Discovery · Cloud Computing · Semiconductor Manufacturing · Robotics Simulation

Takeaways

Moore’s Law is slowing down, making accelerated computing essential for future performance gains and energy efficiency across all industries.
Generative AI represents a fundamental shift in computing, acting as a new platform that is programmable via human language.
NVIDIA is expanding its business model from selling hardware to offering full-stack cloud services, including DGX Cloud, AI Foundations, and Omniverse Cloud.
The introduction of cuLitho is a major breakthrough for semiconductor manufacturing, drastically reducing the time and energy required for the computational lithography behind next-generation chips.
NVIDIA is positioning Omniverse as the standard operating system for industrial digitalization, enabling companies to build and simulate digital twins of factories and products before physical construction.