Episode 140 — Shunyu Yao

Host: Xiaojun · Duration: 230 min

Let Me Go a Little Crazy! A 4-hour interview with Shunyu Yao on training models at Anthropic and Gemini, technical predictions, and the end of heroism

Chapters (40)

  • 00:00:00 · Guest’s Self-Introduction and Telling the Two Shunyu Yaos Apart
    • Guest Shunyu Yao introduced his physics background and his transition to AI, and clarified how he differs from another AI researcher of the same name, mainly in academic background (physics vs. computer science).
  • 00:06:50 · Shift in AI Development Stages: From ‘Can it be done’ to ‘What should be done’
    • The guest believes AI development has entered a new phase where people are no longer concerned about whether AI can achieve something, but rather how to properly define problems and determine AI’s application directions, which requires more human insight.
  • 00:08:10 · Homogenization and Differentiation of Model Capabilities
    • The guest pointed out that while AI models show convergence in public benchmark tests, users can still perceive differences in actual experience, with different models having advantages in specific areas such as tool use, coding, and reasoning.
  • 00:12:28 · The OpenClaw Phenomenon and the Evolution of Product Forms
    • The guest believes that the emergence of OpenClaw is not a surprising technological breakthrough, but rather a natural overflow of model capabilities. It demonstrates a possibility and prompts the industry to consider how to productize model capabilities.
  • 00:15:35 · Survival Strategies for AI Startups and the ‘Data Flywheel’
    • The guest discussed why AI startups (such as Manus and OpenClaw) get acquired, arguing that long-term survival requires a moat, which today sits mainly at the model level and may move to the product level in the future. He noted that among AI-native applications, code generation is so far the only one that has truly formed a ‘data flywheel’.
  • 00:20:56 · Improvement in Model Learning Capabilities and Continuous Progress in Pre-training
    • The guest believes that the pace of model capability improvement has not slowed down; instead, learning capabilities are becoming stronger. He pointed out that the Scaling Law for pre-training has not yet peaked, and significant progress will continue in the coming months, unlocking more application scenarios, such as realizing the dream of a personal assistant.
  • 00:26:37 · Drivers of AI Development: Data, Computing Power, and Algorithms
    • The guest analyzed the key factors driving the improvement of AI model capabilities. He believes that under the current clear pre-training and post-training framework, data and computing power are the main drivers, and they are interconnected. The role of algorithms is to break through bottlenecks, achieving a transformation from ‘cannot do’ to ‘can do’.
  • 00:36:23 · Rapid Development and Advantages of the Code Generation Field
    • The guest pointed out that code generation is one of the fastest-developing areas in AI, with its advantages lying in clear feedback signals and naturally high-quality data sources (GitHub), which enables models to learn and iterate efficiently.
  • 00:38:26 · Impact of AI on Programming Work Efficiency
    • The guest described how AI is used in programming, estimating that 90% or more of code is now generated by models, which greatly simplifies programming products. AI can also significantly speed up understanding and working with complex code, boosting experimental efficiency by 20 to 50 times.
  • 00:41:13 · Impact of AI on Work Patterns and Time
    • Although AI has improved work efficiency, the guest found that his working hours have actually increased because faster development allows for trying more ideas. AI has also led to higher work intensity; Google is no longer a ‘retirement home’ in the AI field.
  • 00:42:53 · Impact and Challenges of AI in Other Fields
    • AI has begun to influence basic scientific research (e.g., mathematics, physics), accelerating derivations and experiments. AI excels at tasks with clear logic and objective evaluation criteria but struggles with areas lacking clear standards, such as product management.
  • 00:46:20 · Impact and Advice of AI on the Future of Programmers’ Careers
    • The guest believes AI will gradually replace some programming jobs. In the future, a small number of top programmers will command more value. He advises programmers to embrace new technologies, learn to collaborate effectively with AI, and cultivate technical strength, organizational understanding, and planning skills.
  • 00:49:50 · Views on Gemini and China’s AI Development
    • The guest believes Gemini’s release is more about excellent technical execution rather than a paradigm shift, but it puts pressure on multimodal teams. He points out that the AI gap between China and the US is narrowing, and China’s computational disadvantage has spurred technological innovations like distillation.
  • 01:03:16 · Views on Robotics and Personal Growth Experiences
    • The guest believes robot AI is still in its early stages and has not yet achieved generalized expansion. He shared his experience of transitioning from physics to AI, emphasizing the importance of daring to try and seize opportunities, and finds the work in robotics labs very interesting.
  • 01:17:44 · Striving to Enter Tsinghua: A Text Message That Changed Destiny
    • The guest recalls gaining admission to Tsinghua University through its independent admissions track on the strength of a high school physics competition. Despite a last-minute policy change, he took the initiative to text a teacher in the admissions office to plead his case, and eventually obtained the qualification to sit the exam and was admitted. He believes this reflects Tsinghua’s willingness to give students equal opportunities.
  • 01:20:05 · Personality and Family: Rebellious, Competitive, and Parents’ ‘Laissez-faire’ Approach
    • The guest talks about his personality, believing he is very opinionated and, once he sets his mind on something, will do his best. He is also competitive, but mainly with himself. He describes his parents’ educational approach as ‘laissez-faire’ (无为而治), because they couldn’t control him, so they chose to let go and let him make his own decisions.
  • 01:21:55 · Undergraduate Research’s ‘Fortuitous Coincidence’: Entering the Field of Condensed Matter Theory
    • The guest recounts how he ‘fortuitously’ chose condensed matter theory as his research direction during his undergraduate studies. Following the tradition of Tsinghua’s basic science program, he entered the Institute for Advanced Study early on, began theoretical research with Professor Wang Zhong, and believed it was a very suitable direction for undergraduates to get started in.
  • 01:25:08 · Non-Hermitian System Research: A Paradigm Breakthrough
    • The guest elaborates on his important research work on open quantum systems (non-Hermitian systems) during his undergraduate studies. They discovered that the traditional Bloch wave theory used to describe Hermitian systems failed, and systematically established a new set of theoretical methods to describe the behavior of such systems. This work was later proven to be a significant advance in the field.
  • 01:28:33 · Human Weakness: Why Abandon a Successful Research Direction
    • After achieving significant research results in his undergraduate studies, the guest chose to switch directions during his Ph.D. He attributes this to ‘human weakness,’ meaning he always wants to challenge himself with unfamiliar and more difficult things, feeling that the most core work in his original direction had been completed, and subsequent exploration was no longer as exciting.
  • 01:38:34 · Ph.D. Lessons: High-Energy Theory and the Dilemma of ‘Serving the Old Guard’
    • During his Ph.D., the guest chose high-energy theory, which is extremely difficult and cannot be verified by experiments. He reflects that although this experience helped him grow, it contributed almost nothing to the world, and he drew an important lesson: work on things that have objective evaluation standards and can have a real impact, rather than wasting time ‘serving the old guard’ (flattering authorities in the field).
  • 01:44:27 · Turning to AI: The Choice Between Quantum Computing and Artificial Intelligence
    • After his postdoctoral studies, the guest chose AI over quantum computing. He believed that the bottleneck for quantum computing at the time lay in experimental physics, while AI’s research paradigm (proposing ideas, verifying with numerical experiments) was more similar to the theoretical physics research he was good at and enjoyed, more like the golden age of physics in the 18th century when theory and experiment were inseparable.
  • 01:54:42 · Joining Anthropic: The Physics Community’s Network
    • The guest shared the process of joining Anthropic, mainly benefiting from his network. Many members of Anthropic’s founding team have theoretical physics backgrounds, and he got an interview opportunity through a recommendation from a former colleague. He believes that this continuation of cross-disciplinary networks is a characteristic of early AI companies.
  • 01:55:18 · Choosing Anthropic: Embracing the Uncertainty of Reinforcement Learning
    • The guest recounted his interview choices between OpenAI and Anthropic. He was attracted by Anthropic’s exploration of large-scale reinforcement learning, and although he knew little about the field at the time, he saw it as a good opportunity full of uncertainty. To prepare, he studied courses by Andrej Karpathy, among other methods.
  • 01:57:11 · Early Impressions of Anthropic: A Top-Down Workshop with Strong Execution
    • The guest recalled that when he first joined, Anthropic was still a small company of 700-800 people, and his Horizon team had only about 10 members. His first impression of the company was its extremely strong execution, unique top-down culture, and open internal atmosphere, making it an excellent learning environment.
  • 02:00:10 · Cultural Origins: Tech Leaders as Founders, Empowering Efficient Decision-Making
    • The guest deeply analyzed the source of Anthropic’s highly efficient execution. He believes the key lies in the company’s technical leaders also being co-founders (e.g., Jared Kaplan), who possess decision-making power and are accountable for it, thereby achieving an efficient top-down decision-making mechanism. This is crucial for startups that need to ‘make bets’.
  • 02:07:35 · The Collectivist Era of AI R&D: Individual Contributions and Systemic Achievements
    • After participating in the Claude 3.7 project, the guest believes that AI R&D has entered a collectivist era, no longer a stage for individual heroism. He attributes his achievements to joining important projects at the right time and emphasizes that AI’s progress is the result of the entire organization and system working together, rather than the contribution of a single individual.
  • 02:16:16 · Paradigm Bottleneck: Technology Not Peaked, Application Imagination Leads
    • The guest corrected his earlier view that ‘pre-training has reached its end,’ believing that neither pre-training nor post-training has reached a technological plateau. He pointed out that the current bottleneck lies more in our imagination for AI applications, being limited to known scenarios like chatbots and code assistants, and not knowing what to teach the model next.
  • 02:20:22 · Why Leave: Personal Philosophy, Cultural Changes, and Learning Aspirations
    • The guest elaborated on three main reasons for leaving Anthropic: first, disagreement with some of the CEO’s politicized statements; second, cultural dilution caused by the company’s rapid expansion and a dislike for the ‘boasting’ atmosphere; and third, a desire to broaden his learning scope and explore areas Anthropic had not yet ventured into, such as multimodal AI and low-level engineering.
  • 02:33:44 · Anthropic’s Original Intention for AI Safety and Naive Implementation Path
    • The guest discussed the founding background of Anthropic, which was driven by the original intention of AI safety. He believes Anthropic’s initial idea of gaining a voice in AI safety by building the strongest model was ‘very naive,’ and, drawing an analogy to the multi-party checks and balances of nuclear weapons, suggested that AI safety might ultimately require a similar decentralized mechanism.
  • 02:36:18 · The Essence of AI: Experimentability and Self-Evolution
    • The guest proposed a personal view: the essence of AI is simple, and its simplicity lies in its ‘ability to conduct experiments.’ Unlike disciplines such as physics, which are limited by experimental conditions, any idea in the AI field can be verified through experiments. The current bottleneck is ‘having too many ideas that need to be tested one by one.’
  • 02:38:45 · Looking Back at Anthropic: Misjudgment of Business Model and Individual Power of Product Innovation
    • The guest admitted that he was pessimistic about Anthropic’s business model when he left, believing that a pure API model was a ‘bad business’ that would eventually lead to a price war. However, he acknowledged that he was ‘overly pessimistic,’ as Anthropic later stabilized its position through clever product innovations (such as Claude Code), with the ‘individual heroism’ of people like Boris playing a crucial role.
  • 02:42:25 · New Journey at DeepMind: ML Coding and Long Horizon
    • The guest introduced his two main research directions at Google DeepMind: ‘ML Coding’ aims to achieve a closed loop of AI self-research; ‘Long Horizon’ explores how to make models ‘train with finite, but use as infinite,’ meaning training with limited context but being able to handle infinitely long tasks.
  • 02:47:48 · Reversal of the Competitive Landscape: How Gemini Fought Back
    • The guest analyzed how Google attracted users through the viral ‘Nano Banana’ launch and then retained them with the powerful Gemini 1.5 model, a ‘punching back’ strategy to regain market share. He believes this combination of moves allowed Google to shift from a passive to an active position, becoming a significant player in the market.
  • 02:58:30 · The Far-Off Endgame: New Forms of AI Beyond the Chatbox
    • The guest believes that ‘no one’s position is stable’ in the current AI field, and the endgame is far from here. He finds the domestic ‘Super App’ all-in approach ‘perplexing’ and frankly states that the current Chatbot interaction form is ‘stupid,’ failing to fully unleash the model’s capabilities, and looks forward to the emergence of new product forms.
  • 03:12:10 · World Models and Long Horizon
    • The guest discussed two methods to solve the ‘long horizon’ problem: fine-tuning model weights and context management. He believes both are essentially for achieving long-term tasks and questioned the ambiguity of the ‘world model’ concept, noting that different researchers have different definitions for it.
  • 03:14:39 · Leadership and Culture in AI Labs
    • The guest talked about the decision-makers at Google and DeepMind, pointing out that Sergey Brin is the ultimate decision-maker, while Koray Kavukcuoglu is more active at the execution level. He believes a good technical leader needs the ability to solve problems personally and the magnanimity to accommodate others’ ideas.
  • 03:16:01 · The Importance of Systems Thinking and Reliability
    • The guest emphasized that in the current era of large models, AI is systems engineering. The most important quality is not intelligence, but ‘reliability’ – being meticulous and responsible in work, and having the ability to think about problems from a holistic perspective, rather than ‘hacking’ metrics just to make personal project data look good.
  • 03:22:32 · Hardware and Architectural Choices: TPU vs. GPU
    • The guest compared the architectural design philosophies of TPU and GPU. GPUs achieve high-speed interconnection within small clusters via NVLink, while TPUs adopt a massively scalable 3D Torus topology. He believes there is no absolute superiority or inferiority between the two in large-scale commercial scenarios; the key lies in whether the accompanying compilers and software stacks can leverage the hardware advantages.
  • 03:25:30 · Comparing US and Chinese AI Product Strategies
    • The guest analyzed the differences in US and Chinese AI product strategies. The US market excels in direct, clear efficiency and enterprise-level software, while China is adept at creating complex consumer-facing products with indirect business models, such as Douyin. He believes that US consumer product managers are not as capable as their Chinese counterparts, partly because it’s ‘too easy’ to make money in the US market.
  • 03:29:51 · Personal Philosophy and Career Advice
    • The guest believes that the era of ‘individual heroism’ in AI has passed, and it is now an era of ‘collectivism’. He advises young people not to flock to the hottest language model field, but to explore newer ‘blue oceans’ such as multimodal AI and robotics. He also stated that he would not stay at one company for long and would continue to seek challenges that ‘torment’ him.
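
The 3D torus topology mentioned in the 03:22:32 chapter can be illustrated with a small sketch. It is not drawn from the interview or from any real TPU software; the function names and the 4×4×4 grid size are assumptions chosen for illustration. In a 3D torus, every node links to six neighbors (two per axis), and each axis wraps around, so the farthest node along an axis is at most half the ring away.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(coord: Coord, dims: Coord) -> List[Coord]:
    """Return the six wraparound neighbors of a node in a 3D torus."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            nxt = list(coord)
            # Modulo arithmetic provides the wraparound link on each axis.
            nxt[axis] = (nxt[axis] + step) % dims[axis]
            neighbors.append(tuple(nxt))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count between two nodes, going the shorter way around each ring."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


if __name__ == "__main__":
    dims = (4, 4, 4)  # hypothetical 64-node grid
    print(torus_neighbors((0, 0, 0), dims))
    print(torus_hops((0, 0, 0), (3, 3, 3), dims))
```

In this hypothetical 4×4×4 grid, the nodes (0, 0, 0) and (3, 3, 3) are only three hops apart because each axis wraps around, which is one reason this kind of topology can scale to large node counts without long cables between distant nodes.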

Notable Quotes (43)

  • 00:02:27 — 姚顺宇:

    Original (中文): 我觉得这个行业就是最重要的特质就是靠谱。就是做事细,然后对自己做的事负责任,这是最重要的特质。 I think the most important characteristic of this industry is reliability. Being meticulous in work and taking responsibility for what you do, that’s the most important characteristic.

    • The guest emphasized that reliability and responsibility are the most important qualities for AI professionals, which is particularly crucial in the rapidly changing AI field.
  • 00:07:03 — 姚顺宇:

    Original (中文): 对我来说确实现在AI进入了一个阶段就是我觉得大家都已经开始不那么担心一件事AI是不是能够做得到,而是担心这件是不是被良好定义。 For me, AI has indeed entered a stage where I think everyone is no longer so worried about whether AI can do something, but rather whether that thing is well-defined.

    • The guest pointed out a fundamental shift in AI development stages, moving from technical feasibility to thinking about problem definition and application directions.
  • 00:07:57 — 姚顺宇:

    Original (中文): 我觉得现在对大家更难的事情是是想明白要做什么。 I think what’s harder for everyone now is figuring out what to do.

    • The guest summarized the core challenge currently facing the AI field, which is how to clarify goals and application scenarios, rather than merely pursuing technological breakthroughs.
  • 00:22:15 — 姚顺宇:

    Original (中文): 我觉得对它来说最大的用处就是,如果不抛掉花了多少钱之外,它最大的用处是获得了一批很好的在亚洲的产品团队。 I think its biggest use, if you disregard how much money was spent, is acquiring a very good product team in Asia.

    • The guest analyzed the strategic value of Meta’s acquisition of Manus, believing its main purpose was to acquire talent and teams, rather than the product itself.
  • 00:25:13 — 姚顺宇:

    Original (中文): 我觉得模型做到了train with finite context, use as infinite context。就是换句话说就是你用有限的这个context length去训练它,但是可以在使用的时候用非常非常长甚至接近无限的context length。 I think the model has achieved ‘train with finite context, use as infinite context.’ In other words, you train it with a limited context length, but you can use a very, very long, even nearly infinite, context length during inference.

    • The guest elaborated on a key technological breakthrough in AI models’ context processing capabilities, which will greatly expand the models’ application scope.
  • 00:27:51 — 姚顺宇:

    Original (中文): 我觉得模型学习能力越来越强了。以前可能让模型学会干一件事,需要动很多脑筋。但现在可能不需要动那么多脑筋了。最重要的事是你要把这个问题定义清楚,然后想清楚怎么去构建合适的数据。 I think models’ learning capabilities are getting stronger and stronger. In the past, teaching a model to do something might have required a lot of thought. But now, it might not require as much thought. The most important thing is to clearly define the problem and figure out how to construct appropriate data.

    • The guest emphasized the significant improvement in model learning capabilities, shifting the key to AI development towards problem definition and data construction, rather than complex model tuning.
  • 00:35:04 — 姚顺宇:

    Original (中文): 人就是这样,就是当你没有撞到头的时候,你其实不知道这路有多长。我能我能看到的就是现在还没撞到头。我也不知道哪天会撞到头。 That’s how people are: until you hit the end of the road, you don’t really know how long it is. All I can see is that we haven’t hit the end yet. I also don’t know when we will.

    • The guest used a vivid metaphor to express the unpredictability of AI technology’s development path, emphasizing that the current stage is still one of rapid exploration.
  • 00:36:40 — 姚顺宇:

    Original (中文): Coding这个事,其实从Claude 3.5 new,或者外界有人管它叫Claude 3.6,从那个之后一直都处于高速发展的状态。 Coding has actually been developing rapidly ever since Claude 3.5 (new), which some people outside call Claude 3.6.

    • The guest pointed out that code generation is an application scenario in the AI field that continues to develop rapidly, and provided a specific timeframe.
  • 00:37:12 — 姚顺宇:

    Original (中文): Coding的优势就是它的reward signal,就是它的那个反馈信号是很好定义的。 The advantage of coding is that its reward signal, its feedback signal, is very well-defined.

    • The guest explained the unique advantage of code generation as an AI application scenario, namely that its clear feedback mechanism helps models learn efficiently.
  • 00:38:07 — 姚顺宇:

    Original (中文): Coding的数据有一个非常天然的基础,这个基础就是GitHub。 Coding data has a very natural foundation, and that foundation is GitHub.

    • The guest pointed out the natural advantage of the code generation field having high-quality, large-scale datasets, which is an important reason for its rapid development.
  • 00:45:51 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我个人觉得,做产品经理,是我现在想不明白该怎么训练AI去做的事。 Personally, I think product management is something I currently can’t figure out how to train AI to do.

    • The guest pointed out areas where AI is currently difficult to replace, namely tasks lacking clear standards and objective evaluation.
  • 00:47:02 — 姚顺宇 (Shunyu Yao):

    Original (中文): AI是一个很centralized的technology,它会让少部分人变得更强,但会让大部分人失去他们的独特价值。 AI is a very centralized technology; it will make a small number of people stronger, but it will cause most people to lose their unique value.

    • The guest offered profound insights into the potential social stratification and career impact brought by AI technology.
  • 00:47:45 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我觉得对程序员来说,最重要的事是怎么样和AI去有效地协作。 I think for programmers, the most important thing is how to effectively collaborate with AI.

    • The guest emphasized that in the age of AI, programmers need to change their roles and make collaboration with AI a core competency.
  • 01:01:59 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我个人觉得,它(Doubao)的语音生成可能是全世界最好的之一,我客气地说可能是全世界最好的。 Personally, I think its (Doubao’s) speech generation might be one of the best in the world; politely speaking, it might be the best in the world.

    • The guest highly praised the speech generation capabilities of China’s AI model, Doubao.
  • 01:10:03 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我个人觉得,机器人模型目前还处于GPT-1之前的时刻,还没有到GPT-1的时刻。 Personally, I think robot models are currently still in the pre-GPT-1 era, not yet at the GPT-1 moment.

    • The guest assessed the development stage of robot AI, believing it has not yet reached the breakthrough level that GPT-1 marked for language models.
  • 01:13:21 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我这个人,我个人的个性就是,总是爱干一些自己不太会的事。 My personal characteristic is that I always like to do things I’m not very good at.

    • The guest shared his personal trait of daring to challenge the unknown and try new things.
  • 01:16:08 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我从那件事(争取入学机会)得到的人生最重要的道理,就是胆子要大。 The most important lesson I learned from that incident (fighting for admission) is to be bold.

    • The guest summarized his experience of changing his destiny by seizing opportunities, emphasizing the importance of courage.
  • 01:18:30 — guest:

    Original (中文): 我感觉这个学校是愿意给大家提供机会, 给大家提供平等机会的。 I feel that this university is willing to provide opportunities to everyone, to provide equal opportunities to everyone.

    • Praises the spirit of Tsinghua University and expresses a gratitude rooted in his own experience.
  • 01:19:06 — guest:

    Original (中文): 难道不是没有干到最好, 就是很菜吗?然后我显然没有干到最好, 所以就是很菜。 Isn’t anything short of the best just bad? And I clearly wasn’t the best, so I was just bad.

    • Evaluates his performance in the physics competition with an extreme standard and a self-deprecating tone, reflecting his personality.
  • 01:20:56 — guest:

    Original (中文): 当你没有办法理解别人在干什么的时候, 别指手画脚就是最好的。我觉得我爸妈这个道理懂得很好。 When you can’t understand what someone else is doing, the best thing is not to tell them how to do it. I think my parents understood this very well.

    • Succinctly summarizes his parents’ educational approach and elevates it to a universal wisdom.
  • 01:28:43 — guest:

    Original (中文): 这就是人性的弱点, 就是我感觉我总爱挑战一些自己不会的事。 This is human weakness; I feel like I always love to challenge things I’m not good at.

    • Provides a profound self-analysis of his motivation to constantly cross boundaries and challenge new fields.
  • 01:40:57 — guest:

    Original (中文): 这个大教训就是要去做有比较客观评价标准的事。 This big lesson is to do things that have relatively objective evaluation standards.

    • Summarizes his core reflection from his Ph.D. research in high-energy theory; this lesson directly influenced his subsequent career choice to switch to AI.
  • 01:46:41 — guest:

    Original (中文): 这个世界上所有东西都是黑盒…科学其实也不是真的有一个从它微观的行为一路演化到宏观的体现的这种理解。 Everything in this world is a black box… Science doesn’t really have an understanding that evolves from its microscopic behavior all the way to its macroscopic manifestation.

    • Provides a grander and deeper analogy for the ‘black box’ problem from a physics perspective, suggesting that understanding any complex system is based on effective theories at specific scales, rather than ultimate truth.
  • 01:50:42 — guest:

    Original (中文): 为什么要把自己的时间浪费在伺候老登身上。 Why waste your time waiting on the old guard?

    • Uses very direct and sharp language to express his aversion to an academic environment that lacks objective standards and relies on subjective evaluations from authorities.
  • 01:59:48 — guest:

    Original (中文): 我觉得公司的印象就是执行力非常强…它其实是一个比较top-down的公司。 My impression of the company is that its execution is very strong… it’s actually a fairly top-down company.

    • Accurately summarizes the distinctive organizational cultural characteristics of Anthropic.
  • 02:01:26 — guest:

    Original (中文): 我觉得这个公司很强的一点,就是它execution,执行力非常非常强。一旦给它一个信号,让它觉得是很reasonable,这个公司该做的事,那就会扑上去。 I think one very strong point about this company is its execution, its execution ability is extremely strong. Once it’s given a signal that it deems reasonable, something the company should do, it will pounce on it.

    • Vividly describes Anthropic’s rapid response capability to market and technological opportunities.
  • 02:03:21 — guest:

    Original (中文): Anthropic有这个条件,就是说它的技术上的leader,它的领导人,其实是公司的co-founder。 Anthropic is in a position to do this because its technical leaders are in fact the company’s co-founders.

    • Points out the organizational structure root that enables Anthropic to achieve efficient top-down decision-making.
  • 02:19:11 — guest:

    Original (中文): 我觉得我个人对任何一个模型的贡献,我的阐述都是,我觉得我自己对那个事没那么重要,我觉得我更多的是我很幸运,有机会在那个时候加入了一个重要的项目,做了一些事。 Whenever I describe my personal contribution to any model, I say that I wasn’t that important to it; I was mostly lucky to have the chance to join an important project at that time and do a few things.

    • Expresses the view that in the current era of large models, individual contributions are relatively small, and platforms and timing are more important.
  • 02:19:32 — guest:

    Original (中文): 它不在于你这个人去干或者不干,你不干自有别人一样能干出来的。 It’s not about whether you do it or not; if you don’t, someone else will do it just as well.

    • Emphasizes the inevitability and unstoppable trend of AI technology development, downplaying individual heroism.
  • 02:24:32 — guest:

    Original (中文): Idea is cheap. 想法是是是便宜的,很多想法其实很显然,所有人也都知道,难的是怎么把实现,怎么把它变成一个一个小的可实现的步骤,把它做出来。 Idea is cheap. Ideas are cheap, many ideas are actually obvious, everyone knows them. The difficult part is how to implement them, how to turn them into small, achievable steps and make them happen.

    • Sharply points out that in the complex field of AI engineering, execution is far more important than mere ideas.
  • 02:29:39 — guest:

    Original (中文): 我觉得本质上还是这个组织做了这样一件事情,或者这个世界需要这样。 I think essentially, it’s still this organization that did such a thing, or the world needed it this way.

    • Summarizes the driving forces behind major AI breakthroughs from a higher dimension, as a combination of organizational capability and societal needs.
  • 02:34:38 — 姚顺宇 (Shunyu Yao):

    Original (中文): 从我个人角度来说我觉得这个想法是非常幼稚的…更可能发生就是大家都有很好的前沿模型,而你没有办法阻止这个事,任何事发生。 From my personal perspective, I think this idea is very naive… What’s more likely to happen is that everyone will have excellent frontier models, and you won’t be able to stop anything from happening.

    • Fundamentally questions the strategy of leading companies attempting to dominate AI safety rules through technological superiority.
  • 02:36:41 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我觉得它本质上简单的点在于它能做实验。它和本质上难的东西,比如说物理,它的区别在于,那个东西你没有那个能标下的实验数据,你就是理解不了那个能标下的理论。 I think its essential simplicity lies in its ability to conduct experiments. The difference between it and something inherently difficult, like physics, is that in physics, if you don’t have experimental data at a certain energy scale, you simply cannot understand the theory at that energy scale.

    • Proposed a counter-intuitive view that the essence of AI is ‘simple,’ with its core being infinite experimentability, which explains the fundamental reason for rapid iteration in this field.
  • 02:39:22 — 姚顺宇 (Shunyu Yao):

    Original (中文): 这个生意只有对一个公司是好生意,就是 Google。因为这个生意最后就是要打价格战。 This business is only a good business for one company, and that’s Google. Because this business will ultimately lead to a price war.

    • Precisely pointed out the fragility of the pure API business model and predicted that it would eventually evolve into a price war where only full-stack giants could survive.
  • 02:43:33 — 姚顺宇 (Shunyu Yao):

    Original (中文): Train with finite, but use as infinite. 我觉得想要把这个训练的长度一直一直一直变长,可能并不是一个很现实的方案。 Train with finite, but use as infinite. I think trying to make the training length longer and longer and longer might not be a very realistic solution.

    • Summarized the core philosophy for solving long-context problems, which is not to blindly pursue infinitely long training contexts, but to explore more efficient ways of utilization during inference.
  • 02:59:41 — 姚顺宇 (Shunyu Yao):

    Original (中文): 我觉得这事很蠢,就是这个模型明明有那么多能力,但居然用的方法是 chatbot。很不 make sense。 I think this is stupid; this model clearly has so many capabilities, but the method used is a chatbot. It doesn’t make sense at all.

    • Sharply criticized the current mainstream Chatbot interaction form, believing it greatly limits the potential of AI models, and called for fundamental innovation at the product level.
  • 03:12:42 — 姚顺宇 (Shunyu Yao):

    Original (中文): 一万个人有一万个世界模型…首先我不知道什么叫做一个世界模型,其次就是每个人在说他们做的世界模型的时候,可能也在说不一样的事。 Ten thousand people have ten thousand world models… First, I don’t know what a world model is, and second, when everyone talks about the world models they’re building, they might be talking about different things.

    • Points out the current situation where the popular concept of ‘world model’ is vaguely defined and lacks consensus.
  • 03:19:06 — 姚顺宇 (Shunyu Yao):

    Original (中文): 在现在这个时代,一个研究员如果做不到对全局去考虑的话,他就不是一个好的研究员。这个和你在学术界做research是很不一样的事。 In this era, if a researcher cannot consider the big picture, they are not a good researcher. This is very different from doing research in academia.

    • Clearly defines the core difference between AI researchers in industry and academia: whether they possess systemic, holistic thinking and a sense of responsibility towards the company.
  • 03:31:02 — 姚顺宇 (Shunyu Yao):

    Original (中文): 这个行业最重要的特质,就是靠谱。就是做事细,然后对自己做的事负责任,这是最重要的特质。你说那些东西有多需要脑子,我觉得都是一些本科生就能干的活。 The most important quality in this industry is reliability. That is, being meticulous in work and taking responsibility for what you do. This is the most important quality. As for how much brainpower those things require, I think they are all tasks that undergraduates can handle.

    • Counter-intuitively emphasizes that in the field of AI, ‘reliability’ and ‘responsibility’ are more important than ‘intelligence’.
  • 03:32:24 — 姚顺宇 (Shunyu Yao):

    Original (中文): 这是一个集体主义的事。 This is a collectivist endeavor.

    • Provides a precise summary of the nature of the current large model development stage, echoing his earlier statement that ‘the era of individual heroism has passed’.
  • 03:35:00 — 姚顺宇 (Shunyu Yao):

    Original (中文): 纯做语言模型已经不是一个蓝海了,我觉得末班车已经发车了。 Pure language model development is no longer a blue ocean; I think the last train has already departed.

    • Gives clear judgment and advice on career development in the AI field.
  • 03:41:19 — 姚顺宇 (Shunyu Yao):

    Original (中文): 短期一定会有人恨你,但长期大家会欣赏这件事情。 In the short term, some people will definitely hate you, but in the long term, everyone will appreciate it.

    • Expresses a workplace philosophy about the long-term value of direct communication and sticking to one’s convictions.
  • 03:46:20 — 姚顺宇 (Shunyu Yao):

    Original (中文): 别相信老登算吗? Does ‘don’t trust the old-timers’ count?

    • Expresses a critical attitude towards authority and experience in a playful yet sharp manner.

Predictions (10)

  • 00:25:13 (This year) — 姚顺宇 (Shunyu Yao): Models will be trained on finite context yet use very long, even nearly infinite, context at inference time. This has a chance of being realized this year.
  • 00:25:32 (After the technology is realized) — 姚顺宇 (Shunyu Yao): Once this technology is realized, it will unlock many new applications, such as the personal assistant everyone dreams of.
  • 00:25:58 (This year) — 姚顺宇 (Shunyu Yao): This breakthrough in how models handle context will be realized this year, no matter what.
  • 00:28:54 (Next four months) — 姚顺宇 (Shunyu Yao): There are no signs that the Scaling Law for pre-training will reach its limit in the next four months.
  • 00:46:32 (Future) — 姚顺宇 (Shunyu Yao): The day when programmers are completely replaced will come, but it won’t happen in an instant; it will definitely be a gradual process.
  • 00:47:02 (Future) — 姚顺宇 (Shunyu Yao): AI is a very centralized technology; it will make a small number of people stronger, but it will cause most people to lose their unique value.
  • 00:47:13 (Future) — 姚顺宇 (Shunyu Yao): The eventual outcome might be that one-thousandth of the current population does the work of everyone in the past, earning a hundred times the current salary.
  • 01:07:13 (Future) — 姚顺宇 (Shunyu Yao): Personally, I feel that they (robotics teams) will become very important in the future, but they haven’t found their way yet.
  • 02:37:50 (Next 6 to 12 months) — 姚顺宇 (Shunyu Yao): AI will be able to conduct its own experiments, forming a complete closed loop from writing code, running experiments, analyzing results, to proposing new hypotheses.
  • 03:25:00 (Unspecified) — 姚顺宇 (Shunyu Yao): Most newly established AI Labs will fail.
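
The 00:47:13 prediction carries a small piece of arithmetic that is easy to miss. The sketch below is only an illustration under loudly simplified assumptions that are not in the interview: everyone today earns the same average salary, and wages are the only income counted. The function name wage_bill_ratio is hypothetical.

```python
def wage_bill_ratio(worker_fraction: float, salary_multiple: float) -> float:
    """Total wages paid in the predicted scenario, relative to today.

    Assumption (not from the interview): everyone today earns the same
    average salary, so today's wage bill is normalized to 1.0.
    """
    return worker_fraction * salary_multiple


# One-thousandth of the population, each earning a hundred times the salary.
ratio = wage_bill_ratio(1 / 1000, 100)
print(f"Wage bill relative to today: {ratio:.0%}")
```

Under these assumptions, the remaining workers would collectively earn about a tenth of today's total wages, which is one way to read the claim that AI is a very centralizing technology.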

Visual Signals (Beyond the Transcript)

Production setting: A casual indoor setting, likely an office or a relaxed studio space, with a prominent potted plant and a wooden wall in the background. · production: Casual and authentic. It appears to be a single-camera setup focused on the guest, with soft, natural lighting. The style is typical of a modern podcast interview rather than a high-budget studio production.

  • props: Large potted plant in a white container next to the guest; guest’s white t-shirt with a small black label reading ‘WETIDONE’; guest’s distinctive gold-rimmed glasses.

Energy Shifts (10)

  • 📈 01:17:49 — Being asked what percentage of his work at Google uses competitor AI coding tools.
    • A sudden burst of genuine, hearty laughter. He leans back, his eyes crinkle, and his body language becomes much more open and amused as he jokes about the question potentially getting him fired.
  • 📈 01:42:43 — Recounting his personal story of choosing a high school to get into the competition class.
    • The speaker’s energy lifts noticeably. He becomes more expressive, with a broader smile and more animated facial expressions. He leans into the story, visually signaling that this is a topic he enjoys recounting and that is core to his identity.
  • 📉 01:49:20 — Discussing the gradual replacement of programmers by AI.
    • His expression becomes more serious and his smile fades. He adopts a more measured, thoughtful posture, with less movement, reflecting the gravity of the topic.
  • 📈 01:55:04 — Recalling his interview experiences at Gemini and Anthropic.
    • The speaker’s smile broadens and he becomes more animated, laughing as he recounts the story. The energy is light and nostalgic.
  • 📈 02:14:00 — Discussing the fundamental differences between startups and large companies in the AI space.
    • His speech becomes more animated and he begins to use hand gestures to emphasize his points. He smiles broadly when making the provocative claim that AI work doesn’t require much ‘brainpower’ but rather reliability.
  • 📉 02:23:30 — Discussing his departure from Anthropic and the cultural shifts within the company.
    • His smiling subsides, and his expression becomes more neutral and thoughtful. The pace of his speech slows slightly, and he appears more introspective.
  • 📈 02:33:45 — Recalling his proactive effort to get into Tsinghua’s special program.
    • His smile widens and he becomes more animated, using a small hand gesture to emphasize the urgency he felt at the time. His energy is high and positive.
  • 📉 02:44:21 — Explaining why he didn’t continue with his paradigm-shifting undergraduate research.
    • His smile fades, his gaze becomes more distant and contemplative, and his posture becomes slightly more still as he discusses the ‘weakness of human nature’ and his desire to tackle new challenges.
  • 📈 02:49:00 — Being asked to differentiate himself from the other famous ‘Yao Shunyu’ in the AI field.
    • He breaks into a genuine, hearty laugh, shaking his head in amusement before explaining their different backgrounds (Physics vs. Computer Science). His demeanor is light and self-aware.
  • 📉 02:57:17 — Reflecting on his PhD experience being less impactful on the world.
    • He looks down, his expression turns more serious and introspective, and he speaks in a more measured tone, conveying a sense of disappointment despite his personal growth.

Gestures Emphasizing Claims (11)

  • 01:18:35 — “Good code has common standards like being concise and having a clear structure.”
    • He makes small, precise chopping motions with his right hand, as if delineating separate, clean concepts. · The gesture visually reinforces the idea of structure, clarity, and the separation of distinct, well-defined components in code.
  • 01:55:45 — “The gap between US and Chinese models is getting smaller.”
    • He brings his thumb and index finger close together, leaving a small space between them, and then moves them even closer. · A direct and universally understood visual metaphor for a gap shrinking, making the abstract concept of capability difference tangible.
  • 01:57:59 — “He chose the reinforcement learning team at Anthropic because it was more uncert”
    • He smiles widely and nods slightly. · The smile visually communicates his attraction to and excitement for tackling unknown, challenging problems, a key insight into his motivations that goes beyond the words themselves.
  • 01:58:26 — “There are two types of distillation: ‘hard distillation’ and ‘smart distillation”
    • He holds up two fingers (index and middle) on his right hand to visually separate the two concepts he is about to explain. · A simple enumerating gesture that primes the listener to expect two distinct categories, adding structure to his explanation.
  • 02:09:00 — “The important thing for a startup is to ‘make a bet’.”
    • He makes a small, decisive chopping motion with his right hand. · The gesture visually underscores the idea of making a firm, committed, and singular strategic decision.
  • 02:30:00 — “The most important trait for an AI practitioner is being ‘reliable’ (靠谱).”
    • He brings his hands together and interlaces his fingers in front of him, holding a stable posture. · This grounded gesture visually reinforces the concepts of reliability, meticulousness, and taking responsibility for one’s work.
  • 02:33:47 — “The need to seize the opportunity immediately (‘现在就得争取’).”
    • Makes a small, decisive chopping motion with his right hand. · The gesture physically underscores the urgency and finality of the decision he made in that moment.
  • 02:43:38 — “The hard part is not the idea, but breaking it down into executable steps and ac”
    • He uses a subtle chopping motion with his right hand. · The gesture visually represents the act of ‘breaking down’ a large idea into smaller, manageable pieces, reinforcing his point about the importance of execution over abstract concepts.
  • 02:52:20 — “Describing how a small initial perturbation can lead to an exponentially large d”
    • Spreads his hands apart quickly and widely, from a close position to a far one. · A clear visual metaphor for exponential growth, making the abstract concept of the butterfly effect more tangible.
  • 02:58:09 — “Comparing achieving external standards to training a model (‘就像训练模型一样’).”
    • Makes a circular, repetitive motion with his right index finger. · This gesture illustrates the iterative, mechanical, and somewhat predictable process of optimizing for a known evaluation metric.
  • 00:31:38 — “The importance of being systematic (‘做事系统’) when debugging or analyzing unexpect”
    • He uses his right hand to draw a structured, box-like shape in the air while speaking. · The gesture creates a visual metaphor for a system, a framework, or a structured process for problem-solving.

Authenticity Tells (10)

  • 01:18:13 — A full-bodied, uninhibited laugh.: Reacting to the host’s question about using competitor AI tools at Google, which he jokes could get him fired.
    • The laugh appears completely genuine and spontaneous, not forced. It shows he is comfortable with the interviewer, finds the situation genuinely funny, and is not actually worried, using humor to deflect a tricky question.
  • 01:47:00 — A broad, nostalgic smile and increased animation.: When telling the personal story of how he chose his high school based on an ‘underdog’ strategy to get into the competition class.
    • His visible enjoyment in telling this story suggests it’s a formative memory he is proud of. The shift from a professional, analytical demeanor to a more personal, storytelling mode feels authentic and reveals a key part of his personality and motivation.
  • 01:51:40 — Slight hesitation and more deliberate, measured speech.: When asked to name which Chinese companies are engaging in ‘hard’ vs. ‘smart’ distillation of other models.
    • He visually and audibly slows down, choosing his words carefully. This signals he is navigating a sensitive topic and is consciously avoiding making direct accusations while still conveying his opinion, which he eventually does after being prompted.
  • 02:01:28 — Smiling while declining to answer.: When asked for specific technical details about Anthropic’s models, which are under NDA, he smiles, shakes his head slightly, and says ‘不能说’ (can’t say).
    • This happens multiple times (e.g., 00:13:54). The friendly, non-confrontational refusal signals that he is bound by confidentiality but not being evasive or difficult. It reinforces his position as an insider with valuable knowledge while respecting his legal obligations, which adds to his credibility.
  • 02:24:14 — A brief, thoughtful pause before answering.: Before explaining his view on the AI field being an ‘unstoppable’ force, he pauses for a moment, looking slightly down and to the side.
    • This pause indicates he is not giving a rehearsed soundbite but is genuinely formulating a complex thought, lending weight and sincerity to the philosophical point he is about to make.
  • 02:35:43 — He laughs and calls his own impressive competition record ‘挺菜的’ (quite lame).: When asked about his performance in academic competitions, which was actually at a very high level (provincial team).
    • This is a form of humblebragging (‘凡尔赛’). The self-deprecating humor, combined with his relaxed smile, shows he is aware of his achievements but chooses to downplay them, a common cultural trait among high-achievers that feels authentic and relatable.
  • 02:45:36 — He laughs and says his personality is ‘爱折磨自己’ (likes to torture himself).: When asked about his tendency to repeatedly switch to more difficult, unfamiliar fields.
    • This candid and humorous self-assessment reveals a high degree of self-awareness. He is able to laugh at his own intense drive, making his ambitious nature seem more human and less intimidating.
  • 03:08:35 — He immediately shakes his head, laughs, and says ‘我就是没搞明白啊’ (I just couldn’t figure it out).: When asked to explain the intricacies of building an optical experiment setup.
    • This is a moment of genuine intellectual honesty. By openly and cheerfully admitting his limitations in experimental physics, he reinforces his credibility as an expert in theoretical domains. It shows he is not afraid to be vulnerable about what he doesn’t know.
  • 00:22:02 — A long, thoughtful pause, looking up and away from the host before answering.: When asked to explain the strategic rationale behind recent major acquisitions like Meta/Minus, he honestly replies, ‘I don’t understand’.
    • The pause and his candid admission of not fully understanding the situation, despite being an expert, project a high degree of intellectual honesty and authenticity. It shows he is thinking through the question rather than giving a prepared answer.
  • 00:26:44 — A quick, subtle shake of the head and a slight smile.: In response to the host’s suggestion that the pace of model improvement is slowing down.
    • His immediate, non-verbal disagreement (‘I think not at all’) precedes his verbal explanation, revealing a genuine, uncoached conviction that progress is not slowing down. It feels like a gut reaction from someone on the front lines.

Facts the Transcript Loses

  • The guest’s consistently calm, friendly, and confident demeanor. He smiles frequently, which makes complex and high-stakes topics feel more accessible and less intimidating.
  • The casual, intimate nature of the interview. The guest is clearly speaking to a host who is physically present just off-camera, creating a natural conversational dynamic that a transcript would miss.
  • The contrast between the relaxed, almost home-like setting (with the large plant) and the discussion of the hyper-competitive, fast-paced global AI industry.
  • The overall tone of the interview is one of intellectual curiosity and candid sharing, rather than a formal or confrontational exchange. The guest’s body language is consistently open and relaxed.
  • The speaker’s consistently cheerful and relaxed demeanor, frequently smiling even when discussing complex or serious topics, which creates a very approachable and engaging tone.
  • The stark visual contrast between his broad, easy laugh when joking (e.g., about getting fired) and his serious, focused expression when discussing the societal impact of AI.
  • The subtle but clear shift in body language and expression when he moves from technical analysis to personal anecdotes, becoming more animated and revealing more of his personality.
  • The warm, professional-but-not-corporate visual setting, which contributes to the interview’s intimate and conversational feel.
  • The guest’s pervasive and easy-going smile, which conveys a sense of comfort, confidence, and genuine enjoyment throughout the entire conversation, a quality not fully captured by the text alone.
  • The subtle, constant nodding and affirmative facial expressions he makes while the host is speaking, indicating active and engaged listening.
  • The contrast in his demeanor when discussing personal history (relaxed, broadly smiling) versus explaining scientific concepts (more focused, leaning in slightly, using precise hand gestures).
  • The way he often looks slightly down and to the side with a small smile before answering a complex or personal question, a visual cue that he is taking a moment to formulate a thoughtful and precise response.
  • The consistent visual theme of the guest’s relaxed, smiling demeanor, which creates a stark contrast with the highly technical and competitive nature of the topics being discussed (e.g., scaling laws, corporate strategy).
  • The subtle but noticeable shift in his body language from open and amused when talking about his past, to more guarded and serious when discussing his reasons for leaving Anthropic.
  • The visual branding of the podcast, including the ‘ZHANG XIAOJUN’ and ‘PODCAST #140’ text overlay, which frames the conversation.
  • The way he good-naturedly ‘stonewalls’ questions about trade secrets, using a smile and a simple ‘can’t say’ to navigate NDAs without creating tension.

Named Entities

People (20): Andrej Karpathy, Ben Mann, Boris, Dario Amodei, Demis Hassabis, F. Duncan Haldane, Fei-Fei Li, Geoffrey Hinton, Ilya Sutskever, Jared Kaplan, Koray Kavukcuoglu, Sam McCandlish, Sergey Brin, Tom Brown, Wolfgang Pauli, 吴泳辉, 姚顺宇, 张首晟, 杨振宁, 王中

Companies / Institutions (25): Anthropic, Apple, ByteDance, Cursor, DeepMind, DeepSeek, Dexterity, Google, Google DeepMind, Isomorphic Labs, Meta, Midjourney, Minus, OpenAI, OpenCloud, Sakana AI, SpaceX, Tencent, WinSurf, Zhipu AI, xAI, 伯克利 (Berkeley), 斯坦福大学 (Stanford University), 格致中学 (Gezhi High School), 清华大学 (Tsinghua University)

Papers / Methods / Datasets (44): 3D Torus, AlphaFold, AlphaTensor, Claude Code, Cloud Code, Distillation, GPT-1, Gemini, Gemini 1.5, Long Horizon, ML Coding, Multi-agent training, NVLink, Nano-Bard, Policy Gradient, Post-training, Pre-training, RL, Reinforcement Learning (RL), SFT, Scaling Law, Scaling Laws, Sparse Attention, Transformer, VLA, World Models, 凝聚态理论 (Condensed Matter Theory), 布洛赫波 (Bloch wave), 弦论 (String Theory), 强化学习 (Reinforcement Learning), 拓扑现象 (Topological Phenomena), 智能涌现 (Emergent Abilities), 物理竞赛 (Physics Competition), 监督学习 (Supervised Learning), 自主招生 (Independent University Admissions), 薛定谔的猫 (Schrödinger’s cat), 蝴蝶效应 (Butterfly Effect), 重整化群 (Renormalization Group), 量子物理 (Quantum Physics), 量子纠缠 (Quantum Entanglement), 量子计算 (Quantum Computing), 非厄米系统 (Non-Hermitian System), 预训练 (Pre-training), 高能理论 (High-Energy Theory)

Takeaways

  • AI development has shifted from asking whether something is technically feasible to asking how to define problems and where to apply AI, making human insight more critical.
  • Models converge on benchmark tests, but actual user experience and specific capabilities (e.g., tool use, coding, reasoning) still show differences, which are key areas of competition for various companies.
  • AI models are expected to achieve significant breakthroughs in context processing capabilities, potentially realizing ‘finite training, infinite use’ context ability, which will be the foundation for new applications like personal assistants.
  • AI startups face severe challenges: to survive, they need to build strong model-level barriers or find niche markets small enough to be left alone; otherwise they are easily absorbed by giants.
  • Model learning capabilities have significantly strengthened, and the Scaling Law for pre-training has not yet peaked, with rapid progress expected to continue in the coming months.
  • Code generation is one of the fastest-developing scenarios in the AI field, benefiting from clear feedback signals and high-quality data sources like GitHub.
  • Under the current framework, the core drivers of AI development primarily come from computing power and data, while algorithms play a crucial role in breaking through bottlenecks.
  • AI has greatly improved programming efficiency, especially in experimentation and idea validation, with efficiency gains of 20-50 times.
  • The widespread adoption of AI has led to increased work intensity and hours, as improved efficiency stimulates a desire to try more new ideas.
  • AI currently excels in tasks with strong logic and objectivity, but it remains difficult to replace humans in areas lacking clear standards and objective evaluation (e.g., product managers).
  • The impact of AI on programmers’ careers is gradual; in the future, a small number of top programmers will command more resources and value, while most will need to learn to collaborate effectively with AI.
  • Despite a disadvantage in computing resources, China’s AI development has spurred technological innovations such as distillation.
  • Robot AI is still in its early stages, yet to find generalizable capabilities that can scale horizontally like LLMs, but its future potential is immense.
  • In personal growth experiences, daring to challenge unfamiliar things and boldly seizing opportunities are important driving forces.
  • Actively striving for opportunities is crucial; even if rules seem unfavorable, proactive action can create turning points.
  • Constantly challenging the unknown and difficulties is a powerful driver for growth, but it may also mean giving up existing achievements, which is a ‘human weakness’ and also a choice.
  • When choosing a career direction, one should lean towards fields with objective evaluation standards that can have a real impact on the world, avoiding purely subjective and abstract dilemmas.
  • The cross-disciplinary shift from physics to AI is fundamentally driven by the similarity in research paradigms—that is, the combination of theoretical conception and numerical experimentation, rather than the transfer of specific knowledge or skills.
  • In the early stages of a career, especially in emerging fields, networking and mutual recommendations within the community play a crucial and indispensable role.
  • Anthropic’s success is largely attributed to its unique top-down culture, which is ensured by an organizational structure where technical leaders also serve as co-founders and possess ultimate decision-making power.
  • This structure enables Anthropic, as a startup, to make high-risk ‘tech bets’ and execute quickly, forming its core advantage over larger companies like Google or later-stage OpenAI.
  • Large model R&D has moved beyond the era of ‘individual heroism’; breakthroughs now are more the product of large-scale systems engineering and collective collaboration. The role of individuals is relatively limited, and platforms and timing are crucial.
  • The current bottleneck in AI development may not be technology itself (neither pre-training nor post-training has peaked), but rather our insufficient imagination for new application scenarios, leading to the inability to fully exploit the value of technological progress.
  • For individual researchers, when choosing a platform, it’s important to clarify personal goals: if pursuing direct, clear impact on the final product, a startup is better; if seeking broader learning opportunities and research freedom, a large company platform offers more advantages.
  • The ultimate solution for AI safety may not be dominated by a single giant, but rather a multi-party checks and balances mechanism similar to nuclear deterrence.
  • The core driving force behind AI development is its ‘experimentability’; any idea can be quickly verified, so the bottleneck for innovation is not a lack of ideas, but the speed of verifying them.
  • A business model purely reliant on API sales is fragile and prone to price wars; only companies with full-stack capabilities (from chips to applications) can gain an advantage in it.
  • The current mainstream Chatbot interaction form is far from the ultimate form of AI; it greatly limits the model’s capabilities, and a paradigm shift in products and interactions represents a significant opportunity for the future.
  • The competitive landscape in the AI field is far from stable; rapid iteration of technology and products means that any company’s leading position can be quickly challenged, and there are no absolutely solid moats.
  • In the current era of large models, AI development is systems engineering; ‘reliability’ and ‘systems thinking’ are more important than individual talent.
  • The era of ‘individual heroism’ in AI has passed; it is now a ‘collectivist’ stage requiring close collaboration.
  • Excellent technical leaders need the ability to personally solve difficult problems and the magnanimity to accommodate different opinions.
  • There are significant differences in product strategies between the US and Chinese AI ecosystems: the US excels in enterprise software with direct monetization, while China is adept at building complex consumer-facing ecosystems and indirect business models.
  • For newcomers, the window of opportunity in pure language model development is closing, while directions such as multimodal AI, robotics, and AI for Science may be new ‘blue oceans’.
  • In AI, a field with objective standards, thinking critically and daring to state coherent views directly is more valuable in the long run than simply ‘converging’.
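
The takeaway about ‘experimentability’ and the 02:37:50 prediction of a closed loop (propose, run, analyze, propose again) can be made concrete with a toy sketch. Everything here is a hypothetical illustration, not the guest's method or any lab's system: the "experiment" is 20 gradient-descent steps on f(x) = x², the "hypothesis" is a learning rate, and the names run_experiment and closed_loop are invented for this example.

```python
import random


def run_experiment(lr: float) -> float:
    """Toy stand-in for an experiment: 20 gradient-descent steps on f(x) = x^2.

    Returns the final loss; lower is better.
    """
    x = 5.0
    for _ in range(20):
        x -= lr * 2 * x  # the gradient of x^2 is 2x
    return x * x


def closed_loop(rounds: int = 30, seed: int = 0) -> tuple[float, float]:
    """Propose a hypothesis, run it, analyze the result, and repeat."""
    rng = random.Random(seed)
    best_lr = 0.01
    best_loss = run_experiment(best_lr)
    for _ in range(rounds):
        # Propose: perturb the best-known learning rate.
        candidate = best_lr * rng.uniform(0.5, 2.0)
        # Run: every idea can be checked directly.
        loss = run_experiment(candidate)
        # Analyze: keep the candidate only if it measurably improved.
        if loss < best_loss:
            best_lr, best_loss = candidate, loss
    return best_lr, best_loss


lr, loss = closed_loop()
print(f"best lr = {lr:.4f}, final loss = {loss:.3e}")
```

In this toy setting the number of rounds, that is, how many ideas can be verified, is the only thing limiting progress, which mirrors the point that the bottleneck is the speed of verification rather than a shortage of ideas.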