GTC China 2020 Keynote

Category: China Keynote · Year: 2020

Speakers: Ashok Pandey - VP, Operations & Partners, APAC, NVIDIA · Bill Dally - Chief Scientist and SVP of Research, NVIDIA · Greg Estes - VP, Corporate Marketing & Developer Programs, NVIDIA · Jay Puri - EVP, Worldwide Field Operations, NVIDIA · Kimberly Powell - VP, Healthcare, NVIDIA · Raymond Teh - VP, Sales & Marketing, APAC, NVIDIA

Segments (15)

  • 00:00 · Introduction
    • Opening video highlighting NVIDIA’s impact across various industries.
  • 03:14 · Keynote: Ampere Architecture and Software Stack
    • Bill Dally introduces the Ampere A100 GPU, new features such as TF32 and structured sparsity, and the CUDA software ecosystem.
  • 09:21 · Keynote: DGX Systems and Supercomputing
    • Overview of DGX A100, DGX SuperPOD, and the Selene supercomputer’s ranking on the Top500 and Green500 lists.
  • 11:56 · Keynote: Deep Learning Performance and MLPerf
    • Discussion of the evolution of Tensor Cores, Huang’s Law, and NVIDIA’s dominance in MLPerf training and inference benchmarks.
  • 17:28 · Keynote: Real-Time Graphics and Ray Tracing
    • Showcase of RTX DI, RTX GI, and DLSS 2.0 enabling photorealistic real-time rendering.
  • 25:56 · Keynote: AI Applications - GANs, NLP, and Recommenders
    • Exploration of Generative Adversarial Networks (GANs), conversational AI with Jarvis, Megatron NLP, and the Merlin recommender framework.
  • 35:10 · Keynote: AI in Healthcare
    • Introduction to Clara Discovery for drug discovery, genomics with Parabricks, and AI’s role in fighting COVID-19.
  • 42:59 · Keynote: Robotics and Autonomous Vehicles
    • Advancements in robotic manipulation, reinforcement learning in simulation, and the NVIDIA DRIVE platform for autonomous vehicles.
  • 50:18 · Keynote: NVIDIA Research Projects
    • Deep dive into future technologies including efficient inference accelerators (RC18, MAGNet), silicon photonics for interconnects, and the Legate programming system.
  • 01:01:00 · Executive Panel: Introduction
    • Raymond Teh introduces the executive panel to discuss NVIDIA’s business and strategy in China.
  • 01:10:59 · Panel: Importance of the China Market
    • Jay Puri and Greg Estes discuss the strategic importance of China, its massive developer base, and the gaming ecosystem.
  • 01:16:45 · Panel: AI in Healthcare and COVID-19 Response
    • Kimberly Powell explains how AI and accelerated computing are creating a ‘computational global defense system’ for healthcare.
  • 01:26:59 · Panel: Cloud Service Providers and Live Streaming
    • Ashok Pandey details collaborations with Chinese CSPs (Alibaba, Tencent, Baidu) and the use of GPUs in the booming live streaming industry.
  • 01:46:49 · Panel: Startups and the Inception Program
    • Greg Estes highlights NVIDIA’s support for over 800 AI startups in China through the Inception program.
  • 01:50:30 · Panel: DGX Strategy and Partner Ecosystem
    • Jay Puri clarifies the strategy behind NVIDIA’s DGX systems and how they enable OEM partners to build certified AI platforms.

Product Announcements (8)

  • [03:42] Ampere A100 GPU
    • Data center GPU architecture
    • specs: 7nm, 54 billion transistors, 3rd Gen Tensor Cores, TF32 support, Multi-Instance GPU (MIG), Structured Sparsity
    • availability: Available
  • [09:21] DGX A100
    • AI system
    • specs: 8x A100 GPUs, 9x Mellanox ConnectX-6 NICs, 160 Teraflops FP64
    • availability: Available
  • [19:14] RTX DI and RTX GI
    • Rendering technologies
    • specs: Direct Illumination using the ReSTIR algorithm, Global Illumination using light probes for real-time path tracing
    • availability: Available in NVIDIA graphics products
  • [21:29] DLSS 2.0
    • Deep Learning Super Sampling
    • specs: AI-driven upscaling, temporally stable, generalized network across games
    • availability: Available
  • [31:31] NVIDIA Jarvis
    • Multimodal conversational AI service
    • specs: Speech-to-text, NLP, text-to-speech pipeline
    • availability: Available
  • [35:50] Triton Inference Server
    • Open-source inference serving software
    • specs: Supports multiple frameworks (TensorFlow, PyTorch, ONNX), dynamic batching, concurrent model execution
    • availability: Available
  • [38:00] Clara Discovery
    • Computational drug discovery platform
    • specs: Genomics (Parabricks), Cryo-EM (CryoSPARC), molecular docking (AutoDock), NLP (BioMegatron)
    • availability: Available
  • [49:20] DRIVE AGX Orin
    • Autonomous vehicle compute platform
    • specs: Scalable from 10 TOPS (5W) for ADAS to 2,000 TOPS (800W) for Level 5 Robotaxi
    • availability: Announced
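
As a rough check on the DRIVE AGX Orin range listed above, the two endpoints can be converted to performance per watt. This is only a sketch that reuses the TOPS and wattage figures from the specs line; the dictionary and variable names are illustrative and not part of any NVIDIA tooling.

```python
# Endpoints of the DRIVE AGX Orin range as listed above: (TOPS, watts).
orin_configs = {
    "ADAS": (10, 5),
    "Level 5 Robotaxi": (2000, 800),
}

# Efficiency in TOPS per watt for each endpoint.
efficiency = {name: tops / watts for name, (tops, watts) in orin_configs.items()}

for name, tops_per_watt in efficiency.items():
    print(f"{name}: {tops_per_watt:.1f} TOPS/W")
```

Both endpoints fall between 2 and 2.5 TOPS/W, so the 200x span in throughput comes with a 160x span in power.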

Specific Numbers (9)

Timestamp · Metric · Value · Context
05:51 · Transistors · 54 billion · Number of transistors on the Ampere A100 chip.
07:28 · TFLOPS · 19.5 · FP64 Tensor Core performance on A100.
07:36 · TFLOPS · 156 · TF32 Tensor Core performance on A100 for deep learning training.
07:45 · PetaOPS · 1.25 · INT8 inference performance on A100 with sparsity.
10:26 · Ranking · #5 · Selene supercomputer ranking on the Top500 list.
14:18 · Performance Multiplier · 317x · Increase in single-chip inference performance over 8 years (Huang’s Law).
01:11:15 · Developers · 400,000+ · Registered NVIDIA developers in China.
01:11:59 · CPUs Sold · 22 billion · Number of ARM CPUs sold annually.
01:47:38 · Startups · 800+ · Number of startups in the NVIDIA Inception program in China.
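
Two of the figures above lend themselves to a quick arithmetic check. The sketch below converts the 317x gain over 8 years into an equivalent compound annual factor, and multiplies the per-GPU FP64 Tensor Core figure by the eight GPUs in a DGX A100. It assumes smooth compound growth and simple linear scaling across GPUs, neither of which the keynote states; the variable names are illustrative.

```python
# Huang's Law: 317x single-chip inference gain over 8 years (figures from the table above).
total_gain = 317.0
years = 8

# Equivalent compound annual growth factor, assuming smooth year-over-year growth.
annual_factor = total_gain ** (1.0 / years)
print(f"Annual factor: {annual_factor:.2f}x")

# Plain doubling every year would give 2**8 = 256x over the same period.
doubling_gain = 2 ** years
print(f"Doubling each year for {years} years: {doubling_gain}x")

# DGX A100: 8 GPUs at 19.5 FP64 Tensor Core TFLOPS each, assuming linear scaling.
dgx_fp64_tflops = 8 * 19.5
print(f"DGX A100 aggregate FP64: {dgx_fp64_tflops} TFLOPS")
```

The annual factor comes out slightly above 2, which matches the "more than doubling every year" wording in the keynote. The aggregate of 156 TFLOPS is close to the 160 Teraflops FP64 listed for DGX A100, which is presumably a rounded figure.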

Benchmark Claims (3)

  • [15:10] MLPerf Training: Up to 2.5x
    • vs: Volta V100
    • gain: A100 is up to 2.5x faster than V100 in training benchmarks, sweeping all categories.
  • [16:10] MLPerf Data Center Inference: Up to 237x
    • vs: CPU
    • gain: A100 is up to 237x faster than CPU and 6-8x faster than the previous generation T4.
  • [17:00] MLPerf Edge Inference: Leading
    • vs: Centaur
    • gain: Jetson AGX Xavier and T4 sweep categories, outperforming competitors like Centaur.

Customer Stories (4)

  • [01:18:18] Ping An, United Imaging, Infervision
    • Deployed Clara medical imaging COVID AI technology into thousands of hospitals across China.
    • outcome: Provided frontline workers with AI tools to make better choices and treat patients faster.
  • [01:28:40] Alibaba Cloud, Tencent Cloud, Baidu Cloud
    • Adopted the A100 GPU architecture for their cloud services.
    • outcome: Achieved significant performance-to-price improvements and supported complex AI models.
  • [01:44:50] Taobao
    • Used GPUs to accelerate computer vision and NLP during live streams.
    • outcome: Improved real-time content understanding and user experience.
  • [01:45:00] Bigo Live
    • Used GPUs to improve real-time content understanding and creation capabilities.
    • outcome: Enhanced live streaming features.

Key Technologies (6)

  • TensorFloat-32 (TF32): A new math format that provides the range of FP32 (8-bit exponent) and the precision of FP16 (10-bit mantissa), accelerating AI training without code changes.
  • Structured Sparsity: Prunes two of every four consecutive weights in a neural network to zero (a 2:4 pattern) so the Tensor Cores can skip them, doubling math throughput and reducing memory bandwidth requirements.
  • RTX Direct Illumination (RTX DI): Uses the ReSTIR algorithm to render millions of dynamic lights with physically accurate shadows in real time.
  • RTX Global Illumination (RTX GI): Computes infinite bounces of indirect light using light probes without light leaks, enabling real-time global illumination.
  • DLSS 2.0: Uses a deep neural network to upscale lower-resolution rendered images to higher resolutions (e.g., 4K) while maintaining temporal stability.
  • Silicon Photonics: Uses light instead of electrical signals for chip-to-chip communication, offering higher bandwidth and longer reach at lower power.
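
The TF32 and structured sparsity entries above can be illustrated in a few lines of plain Python. This is a minimal sketch under stated assumptions, not how the A100 hardware or NVIDIA's libraries implement either feature: `to_tf32` emulates the format by truncating an FP32 value to its top 10 mantissa bits (the hardware's rounding mode may differ), and `prune_2_of_4` chooses which weights to zero by magnitude, a common heuristic that the keynote summary does not specify. Both function names are hypothetical.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 by truncation: keep the sign, 8 exponent bits, top 10 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of FP32's 23 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def prune_2_of_4(weights):
    """Zero the two smallest-magnitude weights in every group of four (assumed heuristic)."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude weights in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return pruned


# Precision: a step of 2**-10 survives, a step of 2**-11 is dropped.
print(to_tf32(1.0 + 2 ** -10), to_tf32(1.0 + 2 ** -11))

# Range: 1e38 would overflow FP16 (max 65504) but stays finite in TF32.
print(to_tf32(1e38))

print(prune_2_of_4([0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.05]))
```

After pruning, every group of four holds exactly two zeros, which is the regular pattern that lets the hardware skip half of the multiplications.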

Demos Shown (6)

  • [18:19] Marbles RTX tech demo showcasing real-time path tracing, soft shadows, and reflections.
  • [23:00] DLSS 2.0 comparison in Death Stranding, showing native 4K vs DLSS 4K.
  • [30:09] Maxine video conferencing demo using GANs to animate a face from keypoints, including mapping to a cartoon avatar.
  • [32:20] GauGAN demo turning simple painted shapes into photorealistic landscapes.
  • [43:30] Robotic arm using Riemannian Motion Policies to avoid obstacles and grasp unknown objects.
  • [45:00] Four-legged robots learning to walk in simulation and transferring that skill to the real world.

Predictions / Commitments (3)

  • [25:11, Long term] In the long run, we expect computer graphics to be generated by AI… without ever having geometry.
  • [50:41, Future generations] We are looking at an alternative technology to actually signal out of our GPUs… using light, using photonics.
  • [54:23, Ongoing] We are continuing this evolution of Huang’s Law, continuing to more than double inference performance each year.

Companies Mentioned (6)

Google · Huawei · Intel · Xilinx · ARM · Alibaba, Tencent, Baidu

Notable Quotes (3)

This curve has come to be known as Huang’s Law, which is that inference performance doubles every year. Actually, we’re more than doubling it every year. — Bill Dally @ 14:23

The future of graphics is AI. In fact, the future of almost everything is AI. — Bill Dally @ 25:55

It’s absolutely what I call the perfect storm for a computational global defense system. — Kimberly Powell @ 01:17:34

Key Topics

Ampere Architecture · Deep Learning Inference · Ray Tracing · Generative AI · Healthcare & Drug Discovery · Robotics · Autonomous Vehicles · Silicon Photonics · China Market Strategy · ARM Acquisition · Cloud Computing · Startup Ecosystem

Takeaways

  • The Ampere A100 GPU delivers massive performance leaps for both AI training and inference, driven by TF32 and structured sparsity.
  • NVIDIA is outpacing Moore’s Law with ‘Huang’s Law’, achieving a 317x increase in inference performance over 8 years through architectural innovations.
  • AI is fundamentally transforming computer graphics, enabling real-time path tracing and AI-driven upscaling (DLSS), with a future where graphics are entirely AI-generated.
  • The COVID-19 pandemic has accelerated the adoption of AI in healthcare, creating a ‘computational global defense system’ for drug discovery and medical imaging.
  • NVIDIA is heavily investing in future technologies like silicon photonics to overcome electrical bandwidth limitations in data center interconnects.
  • The China market is highly strategic for NVIDIA, supported by deep partnerships with major Cloud Service Providers (Alibaba, Tencent, Baidu) and a massive developer base.
  • NVIDIA’s planned acquisition of ARM aims to bring ARM’s energy-efficient architecture into the data center, creating a viable alternative to x86.