// the find
langchain-ai/langgraph
Build resilient agents.
LangGraph is a Python framework for building stateful, long-running agent workflows modeled as directed graphs. It handles checkpointing, human-in-the-loop interrupts, and multi-agent coordination, all things you'd otherwise have to wire up yourself. It's the go-to choice once an agent needs more than a single-shot LLM call.
First-class durable execution: agents can survive crashes and resume from exactly the last checkpoint, which is genuinely hard to get right. The checkpoint abstraction is well-designed — swappable backends (Postgres, SQLite, Redis) with a conformance test suite that actually validates them. Human-in-the-loop is a real feature, not an afterthought — you can inspect and modify graph state mid-execution. The graph model (Pregel-inspired) maps naturally to the retry/branch/merge patterns that multi-agent systems actually need.
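To make the checkpoint-and-resume idea concrete, here is a toy sketch in plain Python. It is not LangGraph's API: `DictCheckpointer`, `run`, and the two nodes are hypothetical names invented for illustration, and the "graph" is just a linear list of steps. It only shows why saving state after every node lets a second run skip work that already finished.

```python
import json
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Node = Callable[[State], State]


class DictCheckpointer:
    """Hypothetical in-memory backend: keeps the latest checkpoint per thread as JSON."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def put(self, thread_id: str, next_step: int, state: State) -> None:
        self._store[thread_id] = json.dumps({"next_step": next_step, "state": state})

    def get(self, thread_id: str) -> Optional[dict]:
        raw = self._store.get(thread_id)
        return json.loads(raw) if raw is not None else None


def run(nodes: List[Node], saver: DictCheckpointer, thread_id: str, initial: State) -> State:
    """Run nodes in order, checkpointing after each one.

    If a checkpoint already exists for thread_id, start from it instead of `initial`.
    """
    saved = saver.get(thread_id)
    step = saved["next_step"] if saved else 0
    state = saved["state"] if saved else dict(initial)
    while step < len(nodes):
        state = nodes[step](state)
        step += 1
        saver.put(thread_id, step, state)  # durable point: this node never reruns
    return state


calls = {"fetch": 0, "flaky": 0}


def fetch(state: State) -> State:
    calls["fetch"] += 1
    return {**state, "value": state["value"] + 1}


def flaky(state: State) -> State:
    calls["flaky"] += 1
    if calls["flaky"] == 1:
        raise RuntimeError("simulated crash")  # fails on the first attempt only
    return {**state, "value": state["value"] * 10}


saver = DictCheckpointer()
nodes = [fetch, flaky]

try:
    run(nodes, saver, "thread-1", {"value": 1})
except RuntimeError:
    pass  # the "process" died after fetch had been checkpointed

# Second run resumes at flaky; fetch is not called again.
result = run(nodes, saver, "thread-1", {"value": 1})
print(result, calls)
```

In this sketch, editing the stored state between the two runs would be the rough analogue of a human-in-the-loop correction. A real backend also has to handle branching, concurrent writes, and serialization of arbitrary state, which is where the hard part lives.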
The LangChain ecosystem coupling is a recurring papercut: docs and examples keep pushing you toward LangChain abstractions even though standalone use is technically supported. Debugging graph state across checkpoints in practice pushes you toward LangSmith, a commercial product; the open-source observability story is thin. The API surface has grown fast and shows it: multiple overlapping ways to define nodes and edges, with older patterns still in the docs alongside newer ones. The main framework is Python-only. A JS port exists, but it lags and has a different mental model, so you can't share logic across stacks.