Automating SDLC with LangChain, LangSmith, Gemini
Year: 2026
Stephanie Wong (Global Lead, Developer Programs) · Harrison Chase (Co-Founder and CEO)
Segments (6)
- 00:00:00 · Introduction — Stephanie Wong
- Introduction of Harrison Chase, CEO and Co-founder of LangChain, to discuss building applications with LLMs.
- 00:00:20 · The Agent Harness Layer — Harrison Chase
- An agent harness is the scaffold around an LLM that connects it to tools and the environment. Engineering this layer can be just as effective as changing the model's weights, and is often much easier.
- 00:03:47 · Combining Open Source with Managed Infrastructure — Harrison Chase
- Combining open-source frameworks like LangChain with managed runtimes like Google’s Reasoning Engine solves the major challenges of scaling, state management, and reliability when moving agents from prototype to production.
- 00:05:48 · Improving Harness Code with Traces and Evals — Harrison Chase
- Using traces and evals (both explicit and inferred from user feedback) is crucial for identifying when to optimize agent code, a process facilitated by tools like LangSmith.
- 00:09:18 · How Foundational Model Capabilities Impact Harness Engineering — Harrison Chase
- Advances in foundational models (e.g., long context, multimodality) simplify or change the nature of harness engineering, but the core need for observability and evaluation remains constant.
- 00:11:29 · The Future: Meta-Harnesses and the ‘AI AI Engineer’ — Harrison Chase
- The future involves a ‘meta-harness’ or an ‘AI AI Engineer’—an automated loop where agents analyze their own performance traces and use tools like Gemini Code Assist to rewrite and improve their own code.
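To make the "agent harness" idea from the segments above concrete, here is a minimal sketch in plain Python. It is not the LangChain or LangGraph API: the `Harness` class, the `CALL <tool> <arg>` convention, and `stub_model` (a stand-in for a real LLM call) are all assumptions made for illustration. The point is only that the prompt, the tool registry, the step limit, and the trace all live outside the model and can be changed without touching its weights.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not a real library's API.
Tool = Callable[[str], str]
Model = Callable[[List[str]], str]


@dataclass
class Harness:
    """Scaffold around a model: prompt, tools, step limit, and a trace."""
    model: Model
    tools: Dict[str, Tool]
    system_prompt: str = "You are a helpful agent."
    max_steps: int = 5
    trace: List[str] = field(default_factory=list)

    def run(self, task: str) -> str:
        messages = [self.system_prompt, f"task: {task}"]
        for _ in range(self.max_steps):
            reply = self.model(messages)
            self.trace.append(reply)
            # Assumed convention: "CALL <tool> <arg>" requests a tool;
            # any other reply is treated as the final answer.
            if not reply.startswith("CALL "):
                return reply
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            tool = self.tools.get(name)
            result = tool(arg) if tool else f"unknown tool: {name}"
            observation = f"tool {name} -> {result}"
            messages.append(observation)
            self.trace.append(observation)
        return "stopped: step limit reached"


def stub_model(messages: List[str]) -> str:
    """Stand-in for an LLM: requests the tool once, then answers."""
    last = messages[-1]
    if last.startswith("tool add -> "):
        return "The answer is " + last.split("-> ", 1)[1]
    return "CALL add 2 3"


def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


harness = Harness(model=stub_model, tools={"add": add})
answer = harness.run("What is 2 + 3?")
print(answer)
print(harness.trace)
```

The `trace` list is a deliberately crude stand-in for the kind of trace an observability tool would record; swapping the system prompt, tools, or step limit changes agent behavior while `stub_model` stays fixed.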
Products Announced (4)
- 00:03:54 · Reasoning Engine on Google Cloud (Discussed): Secure, managed environment for deploying LangChain and LangGraph applications · Handles scaling, state management, and reliability for agentic workflows · Part of the Gemini Enterprise agent platform
- Available on Google Cloud
- 01:06:40 · LangSmith (Discussed): Observability and tracing for LLM applications · Evaluation (Evals) framework for testing agent performance · Supports online evals and custom evaluators
- Available from LangChain
- 01:06:09 · Gemini Code Assist (Discussed): AI-powered code assistance · Can be used to rewrite agent code as part of an automated improvement loop · Integrated across the SDLC
- Available on Google Cloud
- 01:08:48 · LangGraph (Discussed): Library for building stateful, multi-actor applications with LLMs · Allows for creating more deterministic workflows and cycles · Used for building complex harnesses
- Open Source
Competitor Mentions / Comparisons (3)
- 00:02:37 · vs ChatGPT — Mentioned as the general-purpose baseline from which a specialized agent differentiates itself through its specific context and tools.
- 00:11:46 · vs OpenAI — Mentioned in the context of researching how agent harnesses might need to be adapted for different foundational models (OpenAI vs. Anthropic vs. Google).
- 00:11:47 · vs Anthropic — Mentioned in the context of researching how agent harnesses might need to be adapted for different foundational models (OpenAI vs. Anthropic vs. Google).
Benchmarks Shown (1)
- 00:01:51 · Terminal-Bench: 5th place. Improved from 30th to 5th place by tuning only the DevAgents harness, with no changes to the underlying model.
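The benchmark claim above, that the harness changed while the model did not, can be illustrated with a toy offline eval. This sketch does not reproduce Terminal-Bench or the actual harness; `fixed_model`, `harness_v1`, `harness_v2`, and the three test cases are hypothetical and exist only to show the same model scoring differently under two harnesses.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, str]  # (task, expected answer)


def fixed_model(prompt: str) -> str:
    """Toy stand-in for an unchanged model.

    It always computes the sum correctly, but only returns a bare number
    when the prompt explicitly asks for one.
    """
    task = prompt.splitlines()[-1]
    a, b = (int(x) for x in task.split("+"))
    if "Reply with the number only." in prompt:
        return str(a + b)
    return f"I think the sum is {a + b}."


def harness_v1(task: str) -> str:
    # Passes the task straight through with no extra instructions.
    return fixed_model(task)


def harness_v2(task: str) -> str:
    # Same model; the harness adds an output-format instruction.
    return fixed_model("Reply with the number only.\n" + task)


def pass_rate(harness: Callable[[str], str], cases: List[Case]) -> float:
    """Fraction of cases where the harness output matches exactly."""
    passed = sum(1 for task, expected in cases if harness(task) == expected)
    return passed / len(cases)


CASES: List[Case] = [("2+3", "5"), ("10+4", "14"), ("7+8", "15")]

print("v1:", pass_rate(harness_v1, CASES))
print("v2:", pass_rate(harness_v2, CASES))
```

Because `fixed_model` is identical in both runs, any difference in pass rate is attributable to the harness alone, which is the shape of the comparison described in the session.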
Notable Quotes (4)
- 00:01:39 — Harrison Chase:
Changing that harness can be just as effective, and often times way easier, than changing the weights of the underlying model.
- 00:11:29 — Harrison Chase:
Everything in the SDLC is getting automated, and so is that like, turning of the flywheel.
- 00:12:21 — Harrison Chase:
We’re really creating this like, AI AI engineer.
- 00:13:50 — Harrison Chase:
You can’t really improve what you don’t know what happened, and that’s where observability comes in.
Visual Signals
On-screen (3)
- 00:00:05 · Lower third: 'Stephanie Wong, Global Lead, Developer Programs, Google Cloud'. Identifies the host and her role.
- 00:00:48 · Lower third: 'Harrison Chase, Co-Founder and CEO, LangChain'. Identifies the guest speaker and his role.
- 00:20:37 · Google Cloud Next '26 logo. End card for the session video.
Stage (1)
- 00:00:00 · The interview takes place in a studio setting at the Google Cloud Next '26 event, with two speakers sitting at a desk with microphones.
Key Topics
AI Agents · LangChain · Agent Harness · Harness Engineering · LLMs · Foundational Models · Observability · Evals (Evaluation) · LangSmith · LangGraph · Google Cloud · Reasoning Engine · Gemini · SDLC for AI · Meta-Harness
Takeaways
- The ‘agent harness’, the scaffolding of prompts, tools, and memory around an LLM, is a critical layer for building effective AI agents; engineering it can be just as effective as changing model weights, and is often much easier.
- Moving AI agents from prototype to production requires solving for reliability, state management, and scalability, which is where managed infrastructure like Google’s Reasoning Engine adds significant value to open-source frameworks like LangChain.
- The development lifecycle for agents is an iterative loop: build the agent, observe its behavior with traces (e.g., in LangSmith), evaluate its performance, and then use those insights to improve the agent’s code or harness.
- User feedback, both explicit (thumbs up/down) and implicit (corrective language), is a key signal for evaluating agent performance and can be automated with ‘online evals’.
- The evolution of foundational models (e.g., longer context windows, multimodality) directly impacts harness design, often simplifying it but reinforcing the need for robust observability and evaluation.
- The future of agent development points towards a ‘meta-harness’ or an ‘AI AI Engineer’—a self-improving system where an agent analyzes its own performance and automatically suggests or applies code changes to its own logic.
- While models are becoming more capable, the core challenges of observability and evaluation are constant and essential for building reliable, production-grade agentic systems.
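The takeaways on user feedback and ‘online evals’ can be sketched as a small scoring pass over recorded traces. This is a minimal illustration in plain Python, not the LangSmith API: the `Trace` record, the `CORRECTIVE_PHRASES` list, the `score` and `needs_attention` helpers, and the 0.7 threshold are all assumptions chosen for the example. A real system would more likely use a model-based evaluator than substring matching to detect corrective language.

```python
from dataclasses import dataclass
from typing import List, Optional

# Phrases treated as implicit negative feedback; an illustrative list only.
CORRECTIVE_PHRASES = (
    "no, i meant",
    "that's wrong",
    "that is wrong",
    "not what i asked",
)


@dataclass
class Trace:
    """Hypothetical record of one agent interaction."""
    user_input: str
    agent_output: str
    thumbs: Optional[bool] = None  # explicit feedback; None if not given
    follow_up: str = ""            # the user's next message, if any


def score(trace: Trace) -> Optional[float]:
    """Return 1.0 or 0.0 when there is a feedback signal, else None."""
    if trace.thumbs is not None:
        return 1.0 if trace.thumbs else 0.0
    text = trace.follow_up.lower()
    if any(phrase in text for phrase in CORRECTIVE_PHRASES):
        return 0.0
    return None  # no explicit or implicit signal


def needs_attention(traces: List[Trace], threshold: float = 0.7) -> bool:
    """Flag the agent for review when the mean feedback score is low."""
    scores = [s for s in (score(t) for t in traces) if s is not None]
    if not scores:
        return False
    return sum(scores) / len(scores) < threshold


traces = [
    Trace("Deploy the service", "Deployed to prod.", thumbs=True),
    Trace("Restart the cluster", "Restarted prod.",
          follow_up="No, I meant the staging cluster"),
    Trace("Show logs", "Here are the logs."),  # no feedback at all
]

print([score(t) for t in traces])
print(needs_attention(traces))
```

Traces with no signal are excluded from the average rather than counted as successes, so silence from users does not mask a low score; whether that is the right default depends on the application.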