
// the find

vstorm-co/pydantic-deepagents

★ 889 · Python · MIT · updated Jun 2026

Build Claude Code–style deep agents in Python: tool-calling, sandboxed execution, multi-agent teams, skills, checkpoints, and unlimited context — all on Pydantic AI.

A Python agent harness built on pydantic-ai that adds the infrastructure layer most people wire by hand: tool-calling, multi-agent delegation, persistent memory, Docker sandboxing, MCP, and a TUI. The headline feature is live run forking: an in-flight agent.run() is split into parallel branches with copy-on-write filesystem isolation and per-branch budgets, and an AI judge picks the winner. Aimed at Python developers who want Claude Code–style capabilities without building the plumbing themselves.

The forking model is genuinely novel and well designed: copy-on-write overlay filesystems, per-branch budget caps, four merge modes, and a test-runner hook that feeds exit codes into the judge score formula are all concrete and thought through. Type safety is taken seriously: the project runs Pyright strict and MyPy strict and claims 100% test coverage, and structured output via output_type gives you real Pydantic models instead of dict parsing. The modular package split (subagents, summarization, shields, and backend as separate PyPI packages) means you can pull in just what you need rather than taking the whole thing. MCP support with OAuth-handled auth and the ability to import servers straight from a Claude Code config is a nice integration point for teams already using Claude Code.
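To make the forking idea concrete, here is a minimal stdlib-only sketch of copy-on-write branches with a budget cap and a test-driven score. It is not the project's API: `OverlayFS`, `Branch`, `fork`, and `score` are hypothetical names, the in-memory dict stands in for a real filesystem, and the scoring formula is invented for illustration, since the review does not state the real one.

```python
from dataclasses import dataclass, field

_DELETED = object()  # tombstone: marks a path removed inside one branch


@dataclass
class OverlayFS:
    """Hypothetical copy-on-write view: reads fall through to a shared base,
    writes land in a private layer and never touch the base."""
    base: dict
    layer: dict = field(default_factory=dict)

    def read(self, path):
        if path in self.layer:
            value = self.layer[path]
            if value is _DELETED:
                raise FileNotFoundError(path)
            return value
        if path in self.base:
            return self.base[path]
        raise FileNotFoundError(path)

    def write(self, path, content):
        self.layer[path] = content

    def delete(self, path):
        self.layer[path] = _DELETED


@dataclass
class Branch:
    name: str
    fs: OverlayFS
    budget: int  # maximum number of steps this branch may take
    spent: int = 0

    def step(self, cost=1):
        # Enforce the per-branch cap before recording the work.
        if self.spent + cost > self.budget:
            raise RuntimeError(f"branch {self.name!r} exceeded its budget")
        self.spent += cost


def fork(base, names, budget):
    """Create one branch per name; all share the same read-only base."""
    return [Branch(name=n, fs=OverlayFS(base), budget=budget) for n in names]


def score(branch, exit_code):
    # Invented formula (an assumption): a passing test run dominates,
    # and each step spent costs a small penalty.
    passed = 1.0 if exit_code == 0 else 0.0
    return passed - 0.01 * branch.spent


base = {"app.py": "print('v1')"}
fast, careful = fork(base, ["fast", "careful"], budget=3)

fast.step()
fast.fs.write("app.py", "print('fast')")

careful.step()
careful.step()
careful.fs.write("app.py", "print('careful')")

# Pretend a test runner produced these exit codes for each branch.
exit_codes = {"fast": 1, "careful": 0}
winner = max((fast, careful), key=lambda b: score(b, exit_codes[b.name]))
```

Each branch sees its own edits while the shared base stays untouched, so a losing branch can be discarded by dropping its layer. A real implementation would also need a merge step to promote the winner's layer into the base.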

The README is 40% feature table and comparison grid, which makes it hard to see where the abstraction boundary actually lies. It's never clear what pydantic-ai does versus what pydantic-deep adds, and that matters a lot for debugging when something goes wrong. The forking feature depends on a copy-on-write filesystem overlay that only covers the local filesystem; if your agent touches a real database, external APIs, or other shared state, branches are no longer isolated and the judge's verdict becomes meaningless. Being at v0.3.x with a very active changelog (three releases on the same day in June) suggests the API surface is still moving, so adopting this in production means absorbing churn. The consultancy framing in the README footer is a yellow flag: the OSS project exists partly as a sales vehicle for Vstorm, which affects how you should read the roadmap and whether issues get prioritized based on user needs versus what clients ask for.
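The isolation caveat is easy to demonstrate with a toy model. In this sketch, which is purely illustrative and uses no project code, each branch gets a private copy of the files but both share one `FakeDatabase`, a hypothetical stand-in for any external store or API.

```python
class FakeDatabase:
    """Hypothetical stand-in for external state shared by every branch."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


def run_branch(name, files, db):
    # The file write is private to the branch; the database write is not.
    files["notes.txt"] = f"written by {name}"
    db.insert(f"row from {name}")


base_files = {"notes.txt": "original"}
db = FakeDatabase()

branch_a = dict(base_files)  # private copy, standing in for an overlay
branch_b = dict(base_files)

run_branch("a", branch_a, db)
run_branch("b", branch_b, db)

# Discarding the losing branch drops its files but not its side effects.
del branch_b
leaked_rows = list(db.rows)
```

After branch `b` is discarded, its row is still in the shared store, so the surviving state no longer matches what the winning branch alone produced.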

