Reference

Compare the stack

These tools overlap in places but fill different roles, so production teams often use several together.

Feature	LangChain	LangGraph	LangSmith	LlamaIndex	Haystack	DSPy	CrewAI	AutoGen	Pydantic AI	Semantic Kernel
Primary purpose	Compose LLM building blocks	Orchestrate stateful agents	Observe, evaluate, monitor	Connect LLMs to your data	Production search & RAG pipelines	Optimize prompts programmatically	Role-based multi-agent teams	Conversational multi-agent systems	Type-safe structured agents	Enterprise polyglot AI SDK
Best for	RAG, chains, prompt pipelines	Multi-step / multi-agent flows	Debugging & regression tests	Enterprise document Q&A	Hybrid search + RAG at scale	Measurable prompt iteration	Specialist agent crews	Code-writing agent teams	Validated structured outputs	.NET / Java AI integration
Runtime	Library (Py & JS)	Library + hosted Platform	Hosted SaaS + self-host	Library (Py & JS)	Library (Py)	Library (Py)	Library (Py)	Library (Py & .NET)	Library (Py)	SDK (.NET, Py, Java)
Plays well with	Any model, any vector DB	LangChain runnables	LangChain, LlamaIndex, OTel	LangChain, agents, LangSmith	Elasticsearch, Weaviate, Qdrant	Any LM via LiteLLM	LangChain tools, LiteLLM	Azure OpenAI, MCP, Docker	FastAPI, Logfire, LangSmith	Azure AI, OpenAI, Hugging Face

A typical stack

1. LlamaIndex — ingest and index your private documents.
2. LangChain — wrap retrievers, models, and tools as composable runnables.
3. LangGraph — orchestrate the agent loop with state, retries, and human review.
4. LangSmith — trace every run, score quality, and catch regressions.
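
The four layers above can be sketched without installing any of the libraries. What follows is a minimal illustration of the division of labor, not the real API of LlamaIndex, LangChain, LangGraph, or LangSmith. Every name in it (build_index, retrieve, compose, Tracer, run_with_retries, fake_model) is a hypothetical stand-in, and the keyword-overlap retriever and canned model are assumptions chosen only so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# 1. Indexing role (LlamaIndex in the stack above).
#    Hypothetical stand-in: a toy keyword-overlap index, not a vector store.
def build_index(docs: list[str]) -> list[tuple[set[str], str]]:
    return [(set(doc.lower().split()), doc) for doc in docs]

def retrieve(index: list[tuple[set[str], str]], query: str, k: int = 1) -> list[str]:
    terms = set(query.lower().split())
    ranked = sorted(index, key=lambda entry: len(entry[0] & terms), reverse=True)
    return [text for _, text in ranked[:k]]

# 2. Composition role (LangChain): chain steps so each output feeds the next.
def compose(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    def run(value: Any) -> Any:
        for step in steps:
            value = step(value)
        return value
    return run

# 4. Observability role (LangSmith): record the input and output of each step.
@dataclass
class Tracer:
    runs: list[dict] = field(default_factory=list)

    def wrap(self, name: str, fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
        def traced(value: Any) -> Any:
            output = fn(value)
            self.runs.append({"step": name, "input": value, "output": output})
            return output
        return traced

# 3. Orchestration role (LangGraph): carry state, retry on failure,
#    and flag the run for human review once retries are exhausted.
def run_with_retries(step: Callable[[str], Any], state: dict, max_attempts: int = 3) -> dict:
    for attempt in range(1, max_attempts + 1):
        try:
            state["answer"] = step(state["question"])
            state["attempts"] = attempt
            return state
        except RuntimeError:
            continue
    state["answer"] = None
    state["attempts"] = max_attempts
    state["needs_human_review"] = True
    return state

# Assumption: a canned "model" that echoes the top retrieved passage.
def fake_model(context: list[str]) -> str:
    return f"Answer based on: {context[0]}"

if __name__ == "__main__":
    index = build_index([
        "Refunds are issued within 14 days.",
        "Shipping takes 3 business days.",
    ])
    tracer = Tracer()
    pipeline = compose(
        tracer.wrap("retrieve", lambda q: retrieve(index, q)),
        tracer.wrap("generate", fake_model),
    )
    result = run_with_retries(pipeline, {"question": "How long do refunds take?"})
    print(result["answer"])
    print([run["step"] for run in tracer.runs])
```

In a real stack each stand-in would be swapped for the corresponding library, while the overall shape (index, compose, orchestrate with state, trace) would likely stay the same.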