// the find
thedotmack/claude-mem
Persistent Context Across Sessions for Every Agent – Captures everything your agent does during sessions, compresses it with AI, and injects relevant context back into future sessions. Works with Claude Code, OpenClaw, Codex, Gemini, Hermes, Copilot, OpenCode + More
A Claude Code plugin that hooks into the agent lifecycle to capture tool observations, compress them with an LLM, and inject relevant context back into future sessions. The hook-based approach is clever: it requires no changes to Claude Code itself, just lifecycle scripts that talk to a local worker service backed by SQLite and Chroma. Aimed at developers who want their AI coding assistant to remember what happened last week.
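To make the hook idea concrete, here is a minimal sketch of what such a lifecycle script might do. It is not claude-mem's actual code: the worker URL, the route, and the helper names `build_observation` and `send` are hypothetical, and the event field names (`session_id`, `tool_name`, `tool_response`) are assumed to match what the agent passes to a post-tool hook.

```python
import http.client
import json
import urllib.request

# Hypothetical endpoint: the real worker's port and routes may differ.
WORKER_URL = "http://127.0.0.1:8765/observations"


def build_observation(event: dict, max_chars: int = 2000) -> dict:
    """Reduce a raw hook event to the few fields worth storing."""
    raw = event.get("tool_response", "")
    # Tool output may be a string or structured JSON; normalise to text.
    text = raw if isinstance(raw, str) else json.dumps(raw, default=str)
    return {
        "session_id": event.get("session_id", "unknown"),
        "tool": event.get("tool_name", "unknown"),
        # Truncate here so the hook stays cheap; compression happens later.
        "output": text[:max_chars],
    }


def send(observation: dict, url: str = WORKER_URL, timeout: float = 2.0) -> bool:
    """POST one observation; return False instead of raising on failure."""
    request = urllib.request.Request(
        url,
        data=json.dumps(observation).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # A stopped or slow worker should never break the agent's tool call.
        return False
```

A hook script would then read the event from stdin and call `send(build_observation(json.load(sys.stdin)))`. Swallowing network errors is a deliberate choice in this sketch: memory capture is best-effort, so a dead worker costs you an observation rather than a failed tool call.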
The 3-layer MCP search pattern (search index → timeline → full fetch) is genuinely good thinking: the LLM gets a cheap list of candidates first and pulls full records only for the ones it picks, which matters when you're burning tokens on context injection. A local HTTP worker managing both SQLite (fast keyword search via FTS5) and Chroma (semantic search) gives you hybrid retrieval without needing a cloud vector DB. The `<private>` tag for excluding sensitive content from storage is a small but important escape hatch that most memory tools skip. Treating progressive disclosure as a first-class concept rather than an afterthought is the right framing for this problem.
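The layered search and the `<private>` escape hatch can be sketched together in a few lines. This is an illustration under stated assumptions, not the project's implementation: `MemoryStore` and its method names are hypothetical, and a plain substring match stands in for the FTS5 plus Chroma hybrid retrieval.

```python
import re
from dataclasses import dataclass

PRIVATE_RE = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text: str) -> str:
    """Drop <private>...</private> spans before anything is stored."""
    return PRIVATE_RE.sub("", text)


@dataclass
class Observation:
    id: int
    ts: int      # timestamp used for ordering
    title: str   # short summary line, cheap to inject
    body: str    # full text, expensive to inject


class MemoryStore:
    """Hypothetical in-memory stand-in for the local worker's storage."""

    def __init__(self) -> None:
        self._rows: dict[int, Observation] = {}

    def add(self, id: int, ts: int, title: str, body: str) -> None:
        # Private spans are removed at write time, so they never reach storage.
        self._rows[id] = Observation(id, ts, strip_private(title), strip_private(body))

    def search(self, query: str, limit: int = 5) -> list[tuple[int, str]]:
        """Layer 1: ids and titles only, newest first."""
        q = query.lower()
        hits = [o for o in self._rows.values()
                if q in o.title.lower() or q in o.body.lower()]
        hits.sort(key=lambda o: o.ts, reverse=True)
        return [(o.id, o.title) for o in hits[:limit]]

    def timeline(self, anchor_id: int, window: int = 1) -> list[tuple[int, int, str]]:
        """Layer 2: what happened just before and after one hit, still titles only."""
        ordered = sorted(self._rows.values(), key=lambda o: o.ts)
        idx = next(i for i, o in enumerate(ordered) if o.id == anchor_id)
        start = max(0, idx - window)
        return [(o.id, o.ts, o.title) for o in ordered[start: idx + window + 1]]

    def fetch(self, ids: list[int]) -> list[str]:
        """Layer 3: full bodies, only for the ids the caller chose."""
        return [self._rows[i].body for i in ids if i in self._rows]
```

The point of the split is that the first two layers return only titles, so the model pays for full bodies just once, in `fetch`, and only for the records it decided were relevant.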
The dependency chain is awkward: Node.js ≥ 20, Bun (auto-installed), uv for Python (auto-installed), SQLite, and Chroma. That's a lot of moving parts for what should be a local cache. The 'CMEM token' section at the bottom of the README is a red flag; mixing a crypto token into a developer tool's README destroys trust with exactly the audience it is trying to reach. The plugin currently runs as a local-only service, so there's no story for syncing memory across machines or team members, which limits its value for anyone working across multiple boxes. Token cost for the compression LLM call is opaque: the README doesn't say how much each session costs to summarize.