LangChain, Inc.

LangSmith

Observability and evals for LLM applications.

LangSmith is a platform for tracing, debugging, evaluating, and monitoring LLM apps. It works with LangChain, LangGraph, and any other framework through its OpenTelemetry-compatible SDK, and it captures every prompt and tool call along with token usage.

Install

bash
# Python
pip install langsmith

bash
# JavaScript / TypeScript
npm install langsmith

Quickstart

A minimal example to verify your setup.

python
import os
from langsmith import traceable
from openai import OpenAI

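# Enable tracing and authenticate with LangSmith.
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "<your-key>"

# OpenAI() reads OPENAI_API_KEY from the environment.
client = OpenAI()

# Record each call to answer() as a run in LangSmith.
@traceable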
def answer(question: str) -> str:
    res = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return res.choices[0].message.content

print(answer("What is observability?"))

Core concepts

Tracing

Capture nested runs with inputs, outputs, latency, token counts, and errors. Tracing works automatically with LangChain; for any other code, decorate functions with @traceable.
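
To make the shape of a trace concrete, the sketch below records nested runs using only the Python standard library. It is a toy illustration, not LangSmith's implementation: the `traced` decorator, the `Run` dataclass, and the `ROOTS` list are hypothetical names invented here. In real code, `@traceable` handles this capture and sends the result to LangSmith.

```python
import contextvars
import functools
import time
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Run:
    """One node in a trace tree (hypothetical structure, for illustration)."""
    name: str
    inputs: dict
    outputs: Any = None
    error: Optional[str] = None
    latency_s: float = 0.0
    children: list = field(default_factory=list)


# The run currently executing, so nested calls can attach to their parent.
_current = contextvars.ContextVar("current_run", default=None)
ROOTS: list = []  # top-level runs, one entry per trace


def traced(fn):
    """Record inputs, output, error, and latency of each call as a Run."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        run = Run(name=fn.__name__, inputs={"args": args, "kwargs": kwargs})
        parent = _current.get()
        (parent.children if parent is not None else ROOTS).append(run)
        token = _current.set(run)
        start = time.perf_counter()
        try:
            run.outputs = fn(*args, **kwargs)
            return run.outputs
        except Exception as exc:
            run.error = repr(exc)
            raise
        finally:
            run.latency_s = time.perf_counter() - start
            _current.reset(token)
    return wrapper


@traced
def retrieve(query: str) -> list:
    return [f"doc about {query}"]


@traced
def answer(question: str) -> str:
    docs = retrieve(question)
    return f"Answer based on {len(docs)} document(s)"


answer("observability")
root = ROOTS[0]
print(root.name, "->", [child.name for child in root.children])
```

Because the parent run is tracked in a context variable, `retrieve` attaches itself under `answer` without either function passing a trace object around, which is the same ergonomic goal a decorator-based tracer aims for.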

Datasets & evals

Build evaluation datasets from production traces. Run LLM-as-judge, heuristic, and pairwise evaluators on every commit.
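
A heuristic evaluator is, at its core, a plain function that scores an output against a reference. The sketch below runs two such functions over a tiny in-memory dataset using only the standard library. The dataset, `target`, `exact_match`, `case_insensitive_match`, and `run_evals` are all hypothetical names for illustration; wiring evaluators into LangSmith's own evaluation runner is not shown here.

```python
from statistics import mean

# Hypothetical in-memory dataset: each example pairs inputs with a reference output.
dataset = [
    {"inputs": {"question": "2 + 2?"}, "reference": "4"},
    {"inputs": {"question": "Capital of France?"}, "reference": "Paris"},
]


def target(inputs: dict) -> str:
    """Stand-in for the application under test (normally an LLM call)."""
    canned = {"2 + 2?": "4", "Capital of France?": "paris"}
    return canned[inputs["question"]]


def exact_match(output: str, reference: str) -> float:
    """Heuristic evaluator: 1.0 only if the strings are identical."""
    return 1.0 if output == reference else 0.0


def case_insensitive_match(output: str, reference: str) -> float:
    """Looser heuristic: ignore case and surrounding whitespace."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def run_evals(evaluators: list) -> dict:
    """Score every example with every evaluator and average per evaluator."""
    scores = {ev.__name__: [] for ev in evaluators}
    for example in dataset:
        output = target(example["inputs"])
        for ev in evaluators:
            scores[ev.__name__].append(ev(output, example["reference"]))
    return {name: mean(values) for name, values in scores.items()}


results = run_evals([exact_match, case_insensitive_match])
print(results)  # {'exact_match': 0.5, 'case_insensitive_match': 1.0}
```

Running the same evaluators on every commit turns the averaged scores into a regression signal: a drop in either number points at the prompt or model change that caused it.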

Prompt hub

Version, share, and A/B test prompts across teams. Pull prompts from code with a single SDK call.

Monitoring

Dashboards for latency, cost, error rates, and user feedback. Set alerts on regressions and drift.
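
An alert on a regression amounts to comparing a recent window of runs against a baseline. The sketch below shows that arithmetic with the standard library only. The record fields (`latency_s`, `error`, `cost_usd`), the 20% tolerance, and the function names are assumptions made for illustration; they are not LangSmith's alert configuration.

```python
def p95(values: list) -> float:
    """Nearest-rank 95th percentile, using integer arithmetic for the rank."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def summarize(runs: list) -> dict:
    """Aggregate a window of run records into dashboard-style metrics."""
    return {
        "p95_latency_s": p95([r["latency_s"] for r in runs]),
        "error_rate": sum(1 for r in runs if r["error"]) / len(runs),
        "cost_usd": sum(r["cost_usd"] for r in runs),
    }


def regressions(baseline: dict, current: dict, tolerance: float = 0.2) -> list:
    """Return the metrics that worsened by more than `tolerance` (20% by default)."""
    watched = ("p95_latency_s", "error_rate")
    return [m for m in watched if current[m] > baseline[m] * (1 + tolerance)]


# Hypothetical run records: 20 runs per window.
ok = {"latency_s": 1.0, "error": False, "cost_usd": 0.25}
failed_fast = {"latency_s": 1.0, "error": True, "cost_usd": 0.25}
failed_slow = {"latency_s": 3.0, "error": True, "cost_usd": 0.25}

baseline = summarize([ok] * 19 + [failed_fast])
current = summarize([ok] * 18 + [failed_slow] * 2)

print(regressions(baseline, current))  # ['p95_latency_s', 'error_rate']
```

Comparing against a relative tolerance rather than a fixed threshold keeps the check usable across endpoints with very different normal latencies.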

Common use cases

  • Debugging failed agent runs
  • Regression testing prompt and model changes
  • Tracking cost and latency in production
  • Collecting human feedback at scale

Resources