Kimi Founder Yang Zhilin: K2, Agentic LLMs, Beginning of Infinity
Duration: 100 min
Guest: Yang Zhilin (杨植麟) · Founder and CEO of Moonshot AI
Chapters (10)
- 00:00 · Reflections on the First Year of Entrepreneurship
- Yang Zhilin reflects on the rapid progress of AI models and compares the journey to climbing an endless, unknown snow-capped mountain.
- 07:25 · Defining AGI and the Evolution of Work
- Discussion of AGI as a direction rather than a fixed point, and of how it will eventually make everyone a ‘superman’.
- 09:15 · Reasoning Models and Test-Time Scaling
- Exploration of long-thinking reasoning models, the process of proposing and verifying conjectures, and the importance of test-time scaling.
- 15:00 · Scaffolding and Context Engineering
- The role of scaffolding and context engineering in maximizing a model’s capabilities and solving complex tasks.
- 23:55 · Key Decisions: From SFT to RL
- Yang outlines the shift in research focus from Supervised Fine-Tuning (SFT) in 2023-2024 to Reinforcement Learning (RL) in 2024-2025.
- 27:15 · Optimizers and Token Efficiency
- The introduction of the Muon optimizer to improve token efficiency compared to the traditional Adam optimizer.
- 40:15 · Agent Generalization and Tool Use
- Challenges in agent generalization, the transition from specific tool environments to general-purpose tool use, and the limitations of current benchmarks.
- 53:15 · Open Source vs. Closed Source and Commercialization
- Perspectives on the open-source ecosystem, market integration, and the commercialization potential of AI products.
- 70:45 · The K2 Model and Future Challenges
- Discussion of the K2 model, the necessity of continuous iteration, and the ultimate goal of building a world model.
- 89:35 · AI as a Meta-Science and Personal Philosophy
- Yang shares his view of AI becoming a meta-science, the inevitable progress of technology, and his personal reflections on creation and meaning.
Specific Numbers (5)
| Time | Fact | Value | Context |
|---|---|---|---|
| 01:45 | Model progress | Two years ago | Two years ago, the current capabilities of models were hard to imagine. |
| 07:45 | AGI capability | 99% | AGI might do things better than 99% of humans. |
| 23:55 | Key decisions timeline | 2023-2024 | SFT was the focus of the research paradigm. |
| 24:05 | Key decisions timeline | 2024-2025 | The focus shifted to Reinforcement Learning (RL). |
| 27:25 | Optimizer usage | 10 years | The Adam optimizer has been used for ten years. |
Research Claims & Predictions (5)
- [13:15] Test-time scaling is crucial for effective reasoning.
- evidence: It allows models to propose conjectures, verify them, and fix bugs iteratively, leading to better answers.
- [27:15] The Muon optimizer significantly improves token efficiency.
- evidence: It allows models to absorb data faster, so that one data point is worth roughly as much as two would be under other optimizers, though it presents stability challenges during training.
- [43:45] Agent generalization is the next major challenge.
- evidence: Current agents struggle with out-of-distribution (OOD) scenarios and require better on-policy sampling and RL to improve.
- [82:35] Building a world model is equivalent to creating a world.
- evidence: A good world model will have a higher ceiling and more knowledge, acting similarly to a reinforcement learning process.
- [91:05] AI will become a meta-science.
- evidence: It will take decades, but AI will eventually drive the progress of other scientific fields.
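The propose, verify, and fix loop described for test-time scaling at [13:15] can be illustrated with a toy example. This is a minimal sketch and not anything shown in the interview: the names `solve_with_budget`, `make_bisecting_proposer`, and `make_verifier` are hypothetical, and a number-guessing task stands in for a real reasoning problem. The point is only that a larger inference-time step budget lets the same proposer reach an answer it cannot reach with a smaller one.

```python
from typing import Callable, Optional, Tuple

Feedback = Optional[str]


def make_bisecting_proposer(lo: int, hi: int) -> Callable[[Feedback], int]:
    """Toy 'model': proposes a guess and narrows its range using verifier feedback."""
    state = {"lo": lo, "hi": hi, "last": None}

    def propose(feedback: Feedback) -> int:
        # Use the verifier's feedback on the previous guess to fix the next one.
        if feedback == "too low":
            state["lo"] = state["last"] + 1
        elif feedback == "too high":
            state["hi"] = state["last"] - 1
        state["last"] = (state["lo"] + state["hi"]) // 2
        return state["last"]

    return propose


def make_verifier(target: int) -> Callable[[int], Tuple[bool, Feedback]]:
    """Toy verifier: checks a candidate and returns a hint when it is wrong."""

    def verify(candidate: int) -> Tuple[bool, Feedback]:
        if candidate == target:
            return True, None
        return False, ("too low" if candidate < target else "too high")

    return verify


def solve_with_budget(
    propose: Callable[[Feedback], int],
    verify: Callable[[int], Tuple[bool, Feedback]],
    max_steps: int,
) -> Tuple[Optional[int], int]:
    """Run propose/verify rounds until verified or the step budget is spent."""
    feedback: Feedback = None
    for step in range(1, max_steps + 1):
        candidate = propose(feedback)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, step
    return None, max_steps


# Small budget: the loop stops before a verified answer is found.
print(solve_with_budget(make_bisecting_proposer(0, 100), make_verifier(73), 3))
# Larger budget: the same proposer reaches the verified answer.
print(solve_with_budget(make_bisecting_proposer(0, 100), make_verifier(73), 10))
```

With a budget of 3 steps the call returns `(None, 3)`; with a budget of 10 it returns `(73, 6)`, since the sixth proposal is the first one the verifier accepts.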
Key Concepts (7)
- [07:25] AGI (Artificial General Intelligence)
- Described not as a specific endpoint, but as a continuous direction of improvement where AI eventually surpasses most human capabilities.
- [13:15] Test-time scaling
- The process of allowing a model to spend more compute at inference time to iteratively refine, verify, and polish its answers.
- [15:05] Scaffolding
- External structures or tools built around a model to help it perform complex tasks or utilize tools it couldn’t handle natively.
- [15:45] Context Engineering
- Designing the input context and methods to guide the model’s logic and behavior effectively.
- [27:15] Muon Optimizer
- An alternative to the Adam optimizer that improves token efficiency and learning speed, though it is harder to stabilize.
- [82:35] World Model
- A model that understands and simulates the rules of the world, providing a higher ceiling for intelligence and knowledge.
- [91:05] Meta-science
- The concept that AI will become a foundational science that accelerates and enables discoveries in all other scientific domains.
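For readers unfamiliar with the Muon optimizer listed above, the following is a rough sketch of the general idea as it is publicly described: accumulate momentum for a weight matrix, then replace that momentum with an approximately orthogonal matrix before applying the update. The interview summary does not describe the algorithm, so everything here is an assumption made for illustration. In particular, the sketch uses plain (non-Nesterov) momentum and the textbook cubic Newton-Schulz iteration rather than the tuned iteration used in practice, and the names `orthogonalize` and `muon_like_step`, as well as the default `lr` and `beta` values, are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def orthogonalize(g: Matrix, steps: int = 15) -> Matrix:
    """Approximate the nearest (semi-)orthogonal matrix to g.

    Uses the cubic Newton-Schulz iteration X <- 1.5*X - 0.5*X*X^T*X,
    which pushes every nonzero singular value of X toward 1.
    """
    # Scale by the Frobenius norm so all singular values start at or below 1.
    norm = sum(v * v for row in g for v in row) ** 0.5 + 1e-12
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(r1, r2)] for r1, r2 in zip(x, xxt_x)]
    return x


def muon_like_step(
    weight: Matrix, grad: Matrix, momentum: Matrix, lr: float = 0.02, beta: float = 0.95
) -> Tuple[Matrix, Matrix]:
    """One simplified update: momentum accumulation, orthogonalization, then descent."""
    momentum = [[beta * m + g for m, g in zip(mr, gr)] for mr, gr in zip(momentum, grad)]
    update = orthogonalize(momentum)
    weight = [[w - lr * u for w, u in zip(wr, ur)] for wr, ur in zip(weight, update)]
    return weight, momentum


grad = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
new_weight, new_momentum = muon_like_step(zeros, grad, zeros)
print(new_weight)
```

In this sketch the rows of the update have unit length even though the rows of the raw gradient differ in scale, so every direction in the matrix receives an update of comparable size. Nothing in the sketch addresses the training-stability issues mentioned in the interview.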
People Mentioned (3)
- Yang Zhilin — Founder and CEO of Moonshot AI, the interviewee.
- Xiaojun — The host conducting the interview.
- Isaac Newton — Mentioned metaphorically to explain how theories and models need continuous adjustment and explanation.
Companies Mentioned (4)
Moonshot AI · Anthropic · OpenAI · ByteDance
Notable Quotes (4)
A place full of snow… Like the paradigm of reinforcement learning. — Yang Zhilin @ 02:20
But AGI is what you keep doing. — Yang Zhilin @ 08:25
AI will become a meta-science. — Yang Zhilin @ 91:05
In the end, the progress of technology is inevitable. — Yang Zhilin @ 91:45
Career Arc & Personal Stories (2)
- [00:00] Yang Zhilin reflects on his first year of entrepreneurship with Moonshot AI, describing the journey as climbing an endless, unknown snow-capped mountain where new problems constantly arise but are ultimately solvable.
- [92:25] Yang discusses his personal philosophy, emphasizing the importance of human experience and creative work, and how AI’s progress, while inevitable, should aim to help people live better lives.
Tools & Models Discussed (8)
- Kimi: Moonshot AI’s primary conversational AI product.
- Claude: Anthropic’s AI model, referenced for its reasoning capabilities and coding tools.
- K1.5: Moonshot AI’s earlier reasoning model (publicly released as Kimi k1.5), mentioned as following OpenAI’s trajectory.
- K2: Moonshot AI’s base model, noted for performing very well and serving as a foundation for further scaling and multimodal capabilities.
- Cursor: An AI-powered code editor mentioned in the context of coding agents.
- Claude Code: An agentic coding tool by Anthropic.
- Adam: A widely used optimization algorithm for training deep learning models.
- Muon: A newer optimizer used by Moonshot to improve token efficiency during model training.
Topics
AGI Development · Reinforcement Learning (RL) · Test-Time Scaling · Agent Generalization · Model Optimization (Muon vs. Adam) · Open Source vs. Closed Source AI · World Models · AI Commercialization
Takeaways
- The journey to AGI is like climbing an endless mountain; it’s a continuous process of solving new, complex problems.
- Test-time scaling and reinforcement learning are currently the most critical areas for advancing model reasoning capabilities.
- Improving token efficiency through new optimizers like Muon is essential, despite the engineering challenges they present.
- Generalizing AI agents to handle out-of-distribution tasks without relying heavily on scaffolding is the next major hurdle.
- AI is on a trajectory to become a ‘meta-science’ that will fundamentally accelerate all other fields of human knowledge.