Automating SDLC with LangChain, LangSmith, Gemini

Year: 2026

Stephanie Wong (Global Lead, Developer Programs) · Harrison Chase (Co-Founder and CEO)

Segments (6)

00:00:00 · Introduction — Stephanie Wong
- Introduction of Harrison Chase, CEO and Co-founder of LangChain, to discuss building applications with LLMs.
00:00:20 · The Agent Harness Layer — Harrison Chase
- An agent harness is the scaffold around an LLM that connects it to tools and the environment; engineering this layer can be just as effective as changing the model's weights, and is often much easier.
00:03:47 · Combining Open Source with Managed Infrastructure — Harrison Chase
- Combining open-source frameworks like LangChain with managed runtimes like Google’s Reasoning Engine solves the major challenges of scaling, state management, and reliability when moving agents from prototype to production.
00:05:48 · Improving Harness Code with Traces and Evals — Harrison Chase
- Traces and evals (both explicit evals and signals inferred from user feedback) are crucial for identifying when to optimize agent code, and tools like LangSmith facilitate this process.
00:09:18 · How Foundational Model Capabilities Impact Harness Engineering — Harrison Chase
- Advances in foundational models (e.g., long context, multimodality) simplify or change the nature of harness engineering, but the core need for observability and evaluation remains constant.
00:11:29 · The Future: Meta-Harnesses and the ‘AI AI Engineer’ — Harrison Chase
- The future involves a ‘meta-harness’ or an ‘AI AI Engineer’—an automated loop where agents analyze their own performance traces and use tools like Gemini Code Assist to rewrite and improve their own code.
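
As a rough illustration of the harness idea described in the segments above, the Python sketch below wires a stand-in model to a small tool registry and a running memory. Everything in it is hypothetical: `fake_model`, `TOOLS`, `run_harness`, and the `CALL`/`FINAL` text protocol are assumptions made for illustration, not LangChain, LangGraph, or Gemini APIs. The point is only that the loop, the tools, and the prompt can be changed without touching the model.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: Dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "count_words": lambda text: str(len(text.split())),
}


def fake_model(messages: List[str]) -> str:
    """Stand-in for an LLM call (assumption): requests one tool, then answers."""
    last = messages[-1]
    if last.startswith("user:"):
        return "CALL count_words " + last[len("user:"):].strip()
    if last.startswith("tool:"):
        return "FINAL The input has " + last[len("tool:"):].strip() + " words."
    return "FINAL I am not sure."


def run_harness(user_input: str, model=fake_model, max_steps: int = 5) -> str:
    """Send memory to the model, run any requested tool, feed the result back."""
    memory = [
        "system: You may call a tool with 'CALL <name> <arg>'.",
        "user: " + user_input,
    ]
    for _ in range(max_steps):
        reply = model(memory)
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):]
        if reply.startswith("CALL "):
            _, name, arg = reply.split(" ", 2)
            tool = TOOLS.get(name)
            result = tool(arg) if tool else "unknown tool " + name
            memory.append("tool: " + result)
    return "Stopped after reaching max_steps."


print(run_harness("tune the harness first"))
```

Swapping the system line, the tool set, or the step limit changes the agent's behavior while `fake_model` stays fixed, which is the kind of change the talk calls harness engineering.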

Products Announced (4)

00:03:54 · Reasoning Engine on Google Cloud (Discussed)
- Secure, managed environment for deploying LangChain and LangGraph applications · Handles scaling, state management, and reliability for agentic workflows · Part of the Gemini Enterprise agent platform
- Available on Google Cloud
01:06:40 · LangSmith (Discussed)
- Observability and tracing for LLM applications · Evaluation (Evals) framework for testing agent performance · Supports online evals and custom evaluators
- Available from LangChain
01:06:09 · Gemini Code Assist (Discussed)
- AI-powered code assistance · Can be used to rewrite agent code as part of an automated improvement loop · Integrated across the SDLC
- Available on Google Cloud
01:08:48 · LangGraph (Discussed)
- Library for building stateful, multi-actor applications with LLMs · Allows for creating more deterministic workflows and cycles · Used for building complex harnesses
- Open Source

Competitor Mentions / Comparisons (3)

00:02:37 · vs ChatGPT — Mentioned as a general-purpose baseline that a specialized agent, with its specific context and tools, differentiates itself from.
00:11:46 · vs OpenAI — Mentioned in the context of researching how agent harnesses might need to be adapted for different foundational models (OpenAI vs. Anthropic vs. Google).
00:11:47 · vs Anthropic — Mentioned in the context of researching how agent harnesses might need to be adapted for different foundational models (OpenAI vs. Anthropic vs. Google).

Benchmarks Shown (1)

00:01:51 · Terminal-Bench: 5th place
- Improved from 30th to 5th place by tuning only the Deep Agents harness, with no changes to the underlying model.

Notable Quotes (4)

00:01:39 — Harrison Chase:

Changing that harness can be just as effective, and often times way easier, than changing the weights of the underlying model.
00:11:29 — Harrison Chase:

Everything in the SDLC is getting automated, and so is that like, turning of the flywheel.
00:12:21 — Harrison Chase:

We’re really creating this like, AI AI engineer.
00:13:50 — Harrison Chase:

You can’t really improve what you don’t know what happened, and that’s where observability comes in.

Visual Signals

On-screen (3)

00:00:05 · Lower third: 'Stephanie Wong, Global Lead, Developer Programs, Google Cloud'
- Identifies the host and her role.
00:00:48 · Lower third: 'Harrison Chase, Co-Founder and CEO, LangChain'
- Identifies the guest speaker and his role.
00:20:37 · Google Cloud Next '26 logo
- End card for the session video.

Stage (1)

00:00:00 · The interview takes place in a studio setting at the Google Cloud Next '26 event, with two speakers sitting at a desk with microphones.

Key Topics

AI Agents · LangChain · Agent Harness · Harness Engineering · LLMs · Foundational Models · Observability · Evals (Evaluation) · LangSmith · LangGraph · Google Cloud · Reasoning Engine · Gemini · SDLC for AI · Meta-Harness

Takeaways

The ‘agent harness’ (the scaffolding of prompts, tools, and memory around an LLM) is a critical layer for building effective AI agents, and engineering it can be just as effective as changing model weights while often being much easier.
Moving AI agents from prototype to production requires solving for reliability, state management, and scalability, which is where managed infrastructure like Google’s Reasoning Engine adds significant value to open-source frameworks like LangChain.
The development lifecycle for agents is an iterative loop: build the agent, observe its behavior with traces (e.g., in LangSmith), evaluate its performance, and then use those insights to improve the agent’s code or harness.
User feedback, both explicit (thumbs up/down) and implicit (corrective language), is a key signal for evaluating agent performance, and scoring that feedback can be automated with ‘online evals’.
The evolution of foundational models (e.g., longer context windows, multimodality) directly impacts harness design, often simplifying it but reinforcing the need for robust observability and evaluation.
The future of agent development points towards a ‘meta-harness’ or an ‘AI AI Engineer’—a self-improving system where an agent analyzes its own performance and automatically suggests or applies code changes to its own logic.
While models are becoming more capable, the core challenges of observability and evaluation are constant and essential for building reliable, production-grade agentic systems.
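
The observe-and-evaluate part of the loop in the takeaways above could be sketched roughly as follows. This is a minimal, hypothetical example: the `Trace` record, the `CORRECTIVE_PHRASES` list, and the scoring rules are assumptions chosen for illustration, and none of it reflects LangSmith's actual data model or API. It scores each recorded run from explicit feedback when present, falls back to a crude implicit signal, and lists the runs worth inspecting before changing the harness.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical phrases suggesting the user is correcting the agent (assumption).
CORRECTIVE_PHRASES = ("that's wrong", "no, i meant", "not what i asked", "try again")


@dataclass
class Trace:
    """One recorded agent turn plus whatever feedback followed it."""
    run_id: str
    agent_output: str
    next_user_message: str = ""
    thumbs_up: Optional[bool] = None  # explicit feedback, if the user gave any


def score_trace(trace: Trace) -> float:
    """Return 1.0 (good), 0.0 (bad), or 0.5 (no signal) for a single trace."""
    if trace.thumbs_up is not None:  # explicit feedback takes priority
        return 1.0 if trace.thumbs_up else 0.0
    follow_up = trace.next_user_message.lower()
    if any(phrase in follow_up for phrase in CORRECTIVE_PHRASES):  # implicit signal
        return 0.0
    return 0.5


def runs_to_review(traces: List[Trace], threshold: float = 0.5) -> List[str]:
    """List run IDs scoring below the threshold: candidates for a harness change."""
    return [t.run_id for t in traces if score_trace(t) < threshold]


traces = [
    Trace("run-1", "Here is the report.", thumbs_up=True),
    Trace("run-2", "Deployed to staging.", next_user_message="No, I meant production."),
    Trace("run-3", "Done.", next_user_message="Thanks!"),
]
print(runs_to_review(traces))
```

A keyword match is a deliberately simple stand-in for the implicit-feedback evaluator; in practice that step might be another model call, but the shape of the loop (record, score, select, then revise the harness) stays the same.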