Dwarkesh + Ilya Sutskever: Age of Research
Category: Expert Interviews · Duration: 96 min
Speakers: Dwarkesh Patel · Ilya Sutskever
Segments (8)
- 00:00 · The Impact of AI and RL vs Pre-training
- Discussion on the economic impact of AI and the differences between RL and pre-training.
- 05:00 · RL Scaling and Environments
- Exploring how RL scales and the importance of diverse environments.
- 12:00 · Value Functions and Emotions
- Comparing human emotions to value functions in reinforcement learning.
- 18:00 · The Age of Scaling vs The Age of Research
- Transitioning from an era dominated by scaling back to an era of research.
- 22:30 · Sponsor Break and RL Scaling Paper
- Dwarkesh discusses a paper on scaling RL compute and a toy experiment with Gemini.
- 26:00 · SSI and AI Alignment
- Ilya discusses Safe Superintelligence (SSI) and approaches to AI alignment.
- 38:00 · The Future of AI and AGI
- Predictions on the timeline to AGI and the societal impact of superintelligence.
- 48:00 · AI Competition and Convergence
- How different companies might converge on similar AI capabilities and the resulting dynamics.
Specific Prices (2)
| Timestamp | Item | Value | Context |
|---|---|---|---|
| 40:56 | SSI Funding | $3 billion | The amount of money raised by Safe Superintelligence (SSI). |
| 22:08 | OpenAI Research Spending | $5-6 billion a year | Estimated spending by OpenAI on research experiments. |
Bottleneck Claims (3)
- [17:38] Compute, rather than ideas, was the bottleneck in the 90s.
- Evidence: People had good ideas but lacked the compute to prove them.
- [18:08] Compute was the bottleneck for AlexNet.
- Evidence: AlexNet was trained on just 2 GPUs, which was all the compute it used.
- [18:43] Compute is no longer the primary bottleneck for proving new ideas.
- Evidence: Current compute is large enough that you don’t need massive scale to demonstrate a new concept’s viability.
Predictions (3)
- [22:24, 5-20 years] Superintelligence will be achieved in 5 to 20 years.
- [48:31, Long-term] As AI becomes more powerful, people will change their behaviors and society will adapt.
- [51:27, Near-term to Mid-term] Different companies will build comparably powerful AIs at roughly the same time.
Key Technologies (4)
- Reinforcement Learning (RL): Trains models by rewarding desired behaviors, but can leave them narrowly specialized.
- Pre-training: Trains models on vast amounts of data to build a broad foundation of knowledge.
- Value Functions: Estimate the expected long-term reward of a given state or action in RL.
- Transformers: The underlying architecture for modern LLMs, which required significant compute to prove effective.
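To make the value function entry above concrete, here is a minimal sketch of tabular TD(0) value estimation. It is an illustration only and is not taken from the interview: the 5-state chain environment, the reward scheme, and the `GAMMA` and `ALPHA` settings are all assumptions chosen for simplicity.

```python
# Hypothetical 5-state chain: states 0..4, the agent always steps right,
# and only arriving at the terminal state 4 pays a reward of 1.
GAMMA = 0.9   # discount factor (assumed for illustration)
ALPHA = 0.5   # learning rate (assumed for illustration)
N_STATES = 5
TERMINAL = N_STATES - 1


def step(state):
    """Move one state to the right; reward 1 only on reaching the terminal state."""
    next_state = state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def td0_values(episodes=200):
    """Estimate V(s), the expected discounted future reward from each state."""
    values = [0.0] * N_STATES  # the terminal state's value stays 0
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            next_state, reward = step(state)
            # Nudge V(state) toward: immediate reward + discounted value of next state.
            target = reward + GAMMA * values[next_state]
            values[state] += ALPHA * (target - values[state])
            state = next_state
    return values


values = td0_values()
print([round(v, 3) for v in values])
```

States closer to the reward end up with higher values, so the estimate tells the agent how promising a state is long before any reward actually arrives. That early signal is the property the interview compares to human emotions.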
Companies Mentioned (6)
Google / Gemini · OpenAI · Anthropic · Labelbox · Sardine · SSI (Safe Superintelligence)
Notable Quotes (3)
If ideas are so cheap, how come no one’s having any ideas? — Ilya Sutskever @ 17:14
The whole problem of AI and AGI is the power. — Ilya Sutskever @ 37:41
Change is the only constant. — Ilya Sutskever @ 49:08
Key Topics
Reinforcement Learning vs Pre-training · Scaling Laws in AI · AI Alignment and Safety · The Future of AGI · Compute Bottlenecks · Value Functions and Human Emotions
Takeaways
- The AI industry is transitioning from an era of pure scaling back to an era of research, as simple scaling of pre-training data hits limits.
- Reinforcement learning can make models highly capable in specific domains but may reduce their general adaptability compared to pre-training.
- Human emotions function similarly to value functions in RL, guiding long-term decision making.
- Safe Superintelligence (SSI) is focusing on research and alignment rather than just competing in the compute scaling race.
- The development of AGI will likely see multiple companies converging on similar capabilities around the same time.