Reference
Compare the stack
These tools overlap, but they mostly address different layers of an LLM application, so production teams often combine several rather than choosing one.
| Feature | LangChain | LangGraph | LangSmith | LlamaIndex | Haystack | DSPy | CrewAI | AutoGen | Pydantic AI | Semantic Kernel |
|---|---|---|---|---|---|---|---|---|---|---|
| Primary purpose | Compose LLM building blocks | Orchestrate stateful agents | Observe, evaluate, monitor | Connect LLMs to your data | Production search & RAG pipelines | Optimize prompts programmatically | Role-based multi-agent teams | Conversational multi-agent systems | Type-safe structured agents | Enterprise polyglot AI SDK |
| Best for | RAG, chains, prompt pipelines | Multi-step / multi-agent flows | Debugging & regression tests | Enterprise document Q&A | Hybrid search + RAG at scale | Measurable prompt iteration | Specialist agent crews | Code-writing agent teams | Validated structured outputs | .NET / Java AI integration |
| Runtime | Library (Py & JS) | Library + hosted Platform | Hosted SaaS + self-host | Library (Py & JS) | Library (Py) | Library (Py) | Library (Py) | Library (Py & .NET) | Library (Py) | SDK (.NET, Py, Java) |
| Plays well with | Any model, any vector DB | LangChain runnables | LangChain, LlamaIndex, OTel | LangChain, agents, LangSmith | Elasticsearch, Weaviate, Qdrant | Any LM via LiteLLM | LangChain tools, LiteLLM | Azure OpenAI, MCP, Docker | FastAPI, Logfire, LangSmith | Azure AI, OpenAI, Hugging Face |
A typical stack
1. LlamaIndex: ingest and index your private documents.
2. LangChain: wrap retrievers, models, and tools as composable runnables.
3. LangGraph: orchestrate the agent loop with state, retries, and human review.
4. LangSmith: trace every run, score quality, and catch regressions.
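
To make the division of labor concrete, here is a minimal standard-library sketch of how the four layers hand work to one another. It does not use any of the libraries' real APIs: `build_index`, `make_retriever`, `fake_model`, `Trace`, `AgentState`, and `run_agent` are hypothetical stand-ins for the role each tool plays, the "model" is a placeholder function rather than an LLM call, and human review is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Stage 1 (the role LlamaIndex plays): ingest and index documents.
# Hypothetical stand-in: a tiny keyword index, not LlamaIndex's API.
def build_index(docs: dict) -> dict:
    index: dict = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word.strip("?.,"), set()).add(doc_id)
    return index

# Stage 2 (the role LangChain plays): wrap retrieval and the model
# as plain callables that can be composed.
def make_retriever(index: dict, docs: dict) -> Callable[[str], list]:
    def retrieve(query: str) -> list:
        hits: set = set()
        for word in query.lower().split():
            hits |= index.get(word.strip("?.,"), set())
        return [docs[doc_id] for doc_id in sorted(hits)]
    return retrieve

def fake_model(question: str, context: list) -> str:
    # Placeholder for an LLM call; fails when nothing was retrieved.
    if not context:
        raise ValueError("no context retrieved")
    return f"Answer based on {len(context)} document(s): {context[0]}"

# Stage 4 (the role LangSmith plays): record what happened in each run.
@dataclass
class Trace:
    events: list = field(default_factory=list)

    def log(self, step: str, detail: str) -> None:
        self.events.append((step, detail))

# Stage 3 (the role LangGraph plays): drive the loop with explicit
# state and a bounded number of retries.
@dataclass
class AgentState:
    question: str
    context: list = field(default_factory=list)
    answer: Optional[str] = None
    attempts: int = 0

def run_agent(question, retrieve, model, trace, max_attempts=2) -> AgentState:
    state = AgentState(question=question)
    while state.answer is None and state.attempts < max_attempts:
        state.attempts += 1
        state.context = retrieve(state.question)
        trace.log("retrieve", f"{len(state.context)} hit(s)")
        try:
            state.answer = model(state.question, state.context)
            trace.log("generate", "ok")
        except ValueError as exc:
            trace.log("generate", f"retry: {exc}")
    return state

docs = {
    "a": "Refunds are processed within five days.",
    "b": "Shipping is free over fifty dollars.",
}
index = build_index(docs)
trace = Trace()
state = run_agent("How are refunds processed?",
                  make_retriever(index, docs), fake_model, trace)
print(state.answer)
print(trace.events)
```

Because each stage only depends on the one before it through a plain function or data object, any single layer can in principle be swapped out without touching the others, which is the property that lets the real tools be combined. In this deterministic sketch a retry repeats the same retrieval; a real agent loop would typically rewrite the query or escalate to a human between attempts.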