Dwarkesh + Ilya Sutskever: Age of Research

Category: Expert Interviews · Duration: 96 min

Speakers: Dwarkesh Patel · Ilya Sutskever

Segments (8)

00:00 · The Impact of AI and RL vs Pre-training
- Discussion of the economic impact of AI and the differences between RL and pre-training.
05:00 · RL Scaling and Environments
- Exploring how RL scales and the importance of diverse environments.
12:00 · Value Functions and Emotions
- Comparing human emotions to value functions in reinforcement learning.
18:00 · The Age of Scaling vs The Age of Research
- Transitioning from an era dominated by scaling back to an era of research.
22:30 · Sponsor Break and RL Scaling Paper
- Dwarkesh discusses a paper on scaling RL compute and a toy experiment with Gemini.
26:00 · SSI and AI Alignment
- Ilya discusses Safe Superintelligence (SSI) and approaches to AI alignment.
38:00 · The Future of AI and AGI
- Predictions on the timeline to AGI and the societal impact of superintelligence.
48:00 · AI Competition and Convergence
- How different companies might converge on similar AI capabilities and the resulting dynamics.

Specific Prices (2)

Timestamp	Item	Value	Context
40:56	SSI Funding	$3 billion	The amount of money raised by Safe Superintelligence (SSI).
22:08	OpenAI Research Spending	$5-6 billion a year	Estimated spending by OpenAI on research experiments.

Bottleneck Claims (3)

[17:38] Compute, rather than ideas, was the bottleneck in the 90s.
- Evidence: People had good ideas but lacked the compute to prove them.
[18:08] Compute was the bottleneck for AlexNet.
- Evidence: AlexNet was trained on just 2 GPUs, which was all the compute used for it.
[18:43] Compute is no longer the primary bottleneck for proving new ideas.
- Evidence: Current compute is large enough that you don’t need massive scale to demonstrate a new concept’s viability.

Predictions (3)

[22:24, 5-20 years] Superintelligence will be achieved in 5 to 20 years.
[48:31, Long-term] As AI becomes more powerful, people will change their behaviors and society will adapt.
[51:27, Near-term to Mid-term] Multiple AIs will be created roughly at the same time by different companies.

Key Technologies (4)

Reinforcement Learning (RL): Trains models by rewarding desired behaviors, but can make them narrow.
Pre-training: Trains models on vast amounts of data to build a broad foundation of knowledge.
Value Functions: Estimate the expected long-term reward of a given state or action in RL.
Transformers: The underlying architecture for modern LLMs, cited as an idea that was first demonstrated on a modest number of GPUs rather than at massive scale.
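
As a minimal illustration of the value-function idea above (a sketch, not something shown in the interview), the snippet below computes the discounted return from each step of a single episode. The function name `discounted_returns`, the reward list, and the discount factor `gamma` are all hypothetical choices made for this example.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one episode.

    Hypothetical helper for illustration: `rewards` is the list of
    per-step rewards, `gamma` is the discount factor in [0, 1].
    """
    returns = []
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Reward arrives only at the final step; earlier steps still get credit,
# discounted by how far they are from the reward.
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

A learned value function approximates these numbers without waiting for the episode to finish, which is why it can guide decisions long before the final reward arrives.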

Companies Mentioned (6)

Google / Gemini · OpenAI · Anthropic · Labelbox · Sardine · SSI (Safe Superintelligence)

Notable Quotes (3)

If ideas are so cheap, how come no one’s having any ideas? — Ilya Sutskever @ 17:14

The whole problem of AI and AGI is the power. — Ilya Sutskever @ 37:41

Change is the only constant. — Ilya Sutskever @ 49:08

Key Topics

Reinforcement Learning vs Pre-training · Scaling Laws in AI · AI Alignment and Safety · The Future of AGI · Compute Bottlenecks · Value Functions and Human Emotions

Takeaways

The AI industry is transitioning from an era of pure scaling back to an era of research, as simple scaling of pre-training data hits limits.
Reinforcement learning can make models highly capable in specific domains but may reduce their general adaptability compared to pre-training.
Human emotions function similarly to value functions in RL, guiding long-term decision making.
Safe Superintelligence (SSI) is focusing on research and alignment rather than just competing in the compute scaling race.
The development of AGI will likely see multiple companies converging on similar capabilities around the same time.