// the find

ghostwright/phantom

★ 1,432 · TypeScript · Apache-2.0 · updated May 2026

An AI co-worker with its own computer. Self-evolving, persistent memory, MCP server, secure credential collection, email identity. Built on the Claude Agent SDK.

Phantom gives an AI agent its own dedicated VM: it installs software, spins up databases, builds dashboards, and persists memory across sessions, rather than running ephemerally on your laptop. It targets anyone who wants an AI that accumulates domain knowledge over weeks, exposes results at a public URL, and extends itself by creating new MCP tools at runtime. The default backend is Claude via the official SDK, but the provider abstraction lets you swap in Ollama or Z.AI with two lines of YAML.

- The self-evolution pipeline is architecturally serious: Opus runs the session, Sonnet is the default cross-model judge (explicitly to avoid self-enhancement bias), changes require 5-gate validation plus triple-judge voting with minority veto, and every version is stored with rollback. Most 'self-improving' agents handwave this; here the design is documented and the tradeoffs are acknowledged.

- MCP server exposure turns Phantom into a first-class tool provider — Claude Code and other agents can call into a running Phantom instance, query its memory, and use dynamic tools the agent built for itself. That's a real integration path, not just a Slack chatbot.

- Test coverage is substantial for a project this size: 1,819 tests per the README badge, separate vitest configs for the chat UI, and Biome + tsc --noEmit in CI. The src/ directory has __tests__ co-located with every module. This is not a demo repo.

- Multi-provider support is done right — a single YAML block routes both the main agent and the evolution judges through the chosen provider. Existing deployments require no config changes when new providers are added.

- The Docker socket mount grants the container root-equivalent access to the host Docker daemon. The README acknowledges this and says 'run on a dedicated machine' — but that advice will be ignored by most people who try this on their dev box first. There's no sandboxing alternative offered, no seccomp profile, no mention of rootless Docker.

- The phantom-config/ directory is committed to the repo and includes session-log.jsonl, evolution-log.jsonl, corrections.md, and agent-notes.md. Real operational state — including potentially sensitive conversation artifacts — lives in git. Anyone who forks or pushes to a public remote ships their agent's memory history with it.

- The research/ directory contains ~30 internal phase-10X planning documents committed to the public repo. These are development artifacts, not documentation. They make the repo harder to navigate and suggest the team is working fast without a clean separation between published and internal work.

- The claim that GLM-5.1 delivers 'comparable coding quality' to Claude Opus at 15x lower cost comes from the README, not from a benchmark. Defaulting the self-evolution judges to Sonnet when the main agent runs Opus is sound reasoning, but the behavior when both the agent and the judges are swapped to a third-party model is undocumented and likely untested.
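
The triple-judge vote with minority veto noted in the first bullet can be sketched roughly as follows. This is a minimal TypeScript illustration, not Phantom's code: the type names, the function `decideWithMinorityVeto`, and the reading of 'minority veto' as 'any single rejecting judge blocks the change' are all assumptions made for the example.

```typescript
// Hypothetical types; none of these names come from the Phantom codebase.
type Verdict = "approve" | "reject";

interface JudgeVote {
  judge: string; // label for the judge model, e.g. "judge-1"
  verdict: Verdict;
}

interface Decision {
  accepted: boolean;
  reason: string;
}

// Assumed rule: under minority veto, one dissenting judge is enough to block
// a proposed change, so acceptance requires unanimity, not a simple majority.
function decideWithMinorityVeto(votes: JudgeVote[]): Decision {
  if (votes.length === 0) {
    // With no votes there is nothing to approve the change, so reject it.
    return { accepted: false, reason: "no votes cast" };
  }
  const rejecting = votes.filter((v) => v.verdict === "reject");
  if (rejecting.length > 0) {
    const names = rejecting.map((v) => v.judge).join(", ");
    return { accepted: false, reason: `vetoed by ${names}` };
  }
  return { accepted: true, reason: "all judges approved" };
}
```

Under this reading, two approvals and one rejection still block the change, which is stricter than majority voting and fits the review's point that the pipeline is conservative about accepting self-modifications.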
