LangChain, Inc.

LangSmith

Observability and evals for LLM applications.

LangSmith is a platform for tracing, debugging, evaluating, and monitoring LLM apps. It works with LangChain, LangGraph, and any other framework through its OpenTelemetry-compatible SDK, capturing prompts, tool calls, and token usage for each run.

Install

bash (Python SDK)

pip install langsmith

bash (JavaScript/TypeScript SDK)

npm install langsmith

Quickstart

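A minimal example to verify your setup: it traces a single OpenAI chat completion. It also needs the openai package (pip install openai) and an OPENAI_API_KEY in your environment.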

python

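import os

from langsmith import traceable
from langsmith.wrappers import wrap_openai
from openai import OpenAI

# Enable tracing and authenticate with LangSmith.
# Outside of a quick test, set these in your shell rather than in code.
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = "<your-key>"

# OpenAI() reads OPENAI_API_KEY from the environment.
# wrap_openai logs each model call as a child run of the traced function.
client = wrap_openai(OpenAI())

@traceable
def answer(question: str) -> str:
    res = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    # content is Optional in the OpenAI SDK; fall back to an empty string.
    return res.choices[0].message.content or ""

print(answer("What is observability?"))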

Core concepts

Tracing

Capture nested runs with their inputs, outputs, latency, token counts, and errors. LangChain code is traced automatically once tracing is enabled (LANGSMITH_TRACING=true); any other code can be traced by decorating functions with @traceable.
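To make "nested runs" concrete, here is a minimal standard-library sketch of what a tracing decorator records. It is an illustration only, not how LangSmith or @traceable is implemented: the names Run, trace, and root_runs are hypothetical, and a real tracer would send runs to a backend instead of keeping them in a list.

```python
import functools
import time
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Run:
    """One traced call: a node in the run tree (hypothetical structure)."""
    name: str
    inputs: dict
    output: Any = None
    error: Optional[str] = None
    latency_s: float = 0.0
    children: list = field(default_factory=list)


# The run currently executing, so nested calls can attach to their parent.
_current_run: ContextVar = ContextVar("current_run", default=None)
# Top-level runs are collected here for inspection.
root_runs: list = []


def trace(fn: Callable) -> Callable:
    """Hypothetical stand-in for a tracing decorator (not langsmith.traceable)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        run = Run(name=fn.__name__, inputs={"args": args, "kwargs": kwargs})
        parent = _current_run.get()
        # Attach to the enclosing run if there is one, else start a new tree.
        (parent.children if parent is not None else root_runs).append(run)
        token = _current_run.set(run)
        start = time.perf_counter()
        try:
            run.output = fn(*args, **kwargs)
            return run.output
        except Exception as exc:
            run.error = repr(exc)  # record the failure, then re-raise
            raise
        finally:
            run.latency_s = time.perf_counter() - start
            _current_run.reset(token)
    return wrapper


@trace
def retrieve(query: str) -> list:
    return [f"doc about {query}"]


@trace
def answer(question: str) -> str:
    docs = retrieve(question)
    return f"Answer based on {len(docs)} document(s)"


answer("observability")
```

After the call, root_runs holds one "answer" run whose children list contains the "retrieve" run, each with its own inputs, output, and latency. That parent/child shape is what a trace view displays.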

Datasets & evals

Build evaluation datasets from production traces. Run LLM-as-judge, heuristic, and pairwise evaluators on every commit.
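As a rough illustration of heuristic and pairwise evaluation, the sketch below scores two toy "model versions" against a small dataset using only the standard library. Everything here is assumed for the example: the dataset contents, exact_match, run_eval, pairwise, and the two model functions are hypothetical and are not part of the LangSmith SDK.

```python
from typing import Callable

# Hypothetical dataset: each example pairs an input with a reference output.
dataset = [
    {"input": "2 + 2", "reference": "4"},
    {"input": "capital of France", "reference": "Paris"},
    {"input": "3 * 3", "reference": "9"},
]


def exact_match(output: str, reference: str) -> float:
    """Heuristic evaluator: 1.0 if the normalized strings match, else 0.0."""
    return float(output.strip().lower() == reference.strip().lower())


def run_eval(target: Callable[[str], str],
             evaluator: Callable[[str, str], float]) -> float:
    """Run target on every example and return the mean evaluator score."""
    scores = [evaluator(target(ex["input"]), ex["reference"]) for ex in dataset]
    return sum(scores) / len(scores)


def pairwise(a: Callable[[str], str], b: Callable[[str], str],
             evaluator: Callable[[str, str], float]) -> dict:
    """Pairwise comparison: count per-example wins for a, for b, and ties."""
    tally = {"a": 0, "b": 0, "tie": 0}
    for ex in dataset:
        score_a = evaluator(a(ex["input"]), ex["reference"])
        score_b = evaluator(b(ex["input"]), ex["reference"])
        tally["a" if score_a > score_b else "b" if score_b > score_a else "tie"] += 1
    return tally


# Two toy "model versions" standing in for real LLM calls.
def model_v1(question: str) -> str:
    return {"2 + 2": "4", "capital of France": "paris"}.get(question, "unknown")


def model_v2(question: str) -> str:
    return {"2 + 2": "4", "capital of France": "Paris", "3 * 3": "9"}.get(question, "unknown")


score_v1 = run_eval(model_v1, exact_match)
score_v2 = run_eval(model_v2, exact_match)
comparison = pairwise(model_v1, model_v2, exact_match)

# A per-commit regression gate could be as simple as this check.
assert score_v2 >= score_v1, "new version scores below the previous one"
```

An LLM-as-judge evaluator would fit the same evaluator signature, with a model call producing the score instead of a string comparison.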

Prompt hub

Version, share, and A/B test prompts across teams. Pull prompts from code with a single SDK call.

Monitoring

Dashboards for latency, cost, error rates, and user feedback. Set alerts on regressions and drift.
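The sketch below shows one way the dashboard metrics and a regression alert could be computed from raw run records. It is a simplified assumption rather than LangSmith's actual alerting logic: RunRecord, summarize, regressions, the sample data, and the 20% tolerance are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical per-run record with the fields a dashboard aggregates."""
    latency_s: float
    cost_usd: float
    error: bool


def p95(values: list) -> float:
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def summarize(runs: list) -> dict:
    """Aggregate a window of runs into latency, cost, and error metrics."""
    return {
        "p95_latency_s": p95([r.latency_s for r in runs]),
        "total_cost_usd": sum(r.cost_usd for r in runs),
        "error_rate": sum(r.error for r in runs) / len(runs),
    }


def regressions(baseline: dict, current: dict, tolerance: float = 0.2) -> list:
    """Return the metrics that grew by more than `tolerance` over baseline."""
    return [k for k in baseline if current[k] > baseline[k] * (1 + tolerance)]


# Made-up data: last week's window versus today's window.
baseline_runs = [RunRecord(1.0, 0.01, False)] * 9 + [RunRecord(2.0, 0.01, True)]
current_runs = [RunRecord(1.0, 0.01, False)] * 7 + [RunRecord(3.0, 0.01, True)] * 3

baseline = summarize(baseline_runs)
current = summarize(current_runs)
alerts = regressions(baseline, current)
print(alerts)
```

With this sample data, p95 latency and error rate exceed the tolerance while total cost does not, so only those two metrics would raise an alert.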

Common use cases

›Debugging failed agent runs
›Regression testing prompt and model changes
›Tracking cost and latency in production
›Collecting human feedback at scale
