finds.dev

// the find

vxcontrol/pentagi

★ 17,701 · Go · MIT · updated Jun 2026

Fully autonomous AI Agents system capable of performing complex penetration testing tasks

PentAGI is a self-hosted, multi-agent system for autonomous penetration testing. It spins up specialized AI agents (researcher, developer, executor) inside isolated Docker containers, alongside a full suite of security tools such as nmap, Metasploit, and sqlmap. It is aimed at security professionals who want to automate recon and exploitation workflows without sending data to a cloud service.

The multi-agent supervision design is thoughtful: separate agents for research, planning, and execution limit the scope creep that plagues single-agent pen-test tools. The chain summarization system for managing LLM context windows is well documented and tunable, which matters for long-running attack flows that would otherwise overflow the context. Provider flexibility holds up in practice: you can swap OpenAI for Anthropic, Bedrock, or a local vLLM instance without changing agent code. The monitoring stack (OpenTelemetry + Langfuse + Jaeger) is production-grade and shows you what the agent did and why, which most AI security tools skip entirely.
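To make the chain summarization idea concrete, here is a minimal sketch of the general technique, not PentAGI's actual implementation. The names `Message`, `compact_chain`, `max_chars`, and `keep_recent` are hypothetical, the budget is counted in characters rather than tokens, and `naive_summarizer` is a stand-in for what would be an LLM call in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def naive_summarizer(messages: List[Message]) -> str:
    # Stand-in for an LLM call (assumption): keep the first 40 characters
    # of each message and join them into one line.
    return " | ".join(m.text[:40] for m in messages)


def compact_chain(
    chain: List[Message],
    max_chars: int,
    keep_recent: int,
    summarize: Callable[[List[Message]], str] = naive_summarizer,
) -> List[Message]:
    """Return the chain unchanged if it fits the budget; otherwise replace
    everything except the last `keep_recent` messages with one summary."""
    total = sum(len(m.text) for m in chain)
    split = len(chain) - keep_recent
    if total <= max_chars or split <= 0:
        return chain
    older, recent = chain[:split], chain[split:]
    summary = Message(role="summary", text=summarize(older))
    return [summary] + recent
```

The two knobs illustrate the kind of trade-off a tunable summarizer exposes: a larger `keep_recent` preserves more verbatim tool output for the next step, while a smaller `max_chars` triggers compaction earlier and keeps requests cheaper.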

The Docker stack complexity is brutal: a full deployment pulls in PostgreSQL with pgvector, Neo4j, Redis, ClickHouse, MinIO, Loki, VictoriaMetrics, Jaeger, Grafana, a custom scraper image, and the main app, and debugging failures across that many containers is painful. The agent runs as root to access docker.sock, and the README buries the security note after several pages of setup steps, where it is easy to miss in a rush. Execution Monitoring and Intelligent Task Planning, the two features that make smaller models useful, are both tagged 'beta', so the core value proposition isn't stable yet. The knowledge graph (Graphiti + Neo4j) is optional, but the README presents it as central, so you'll spend time setting it up before realizing you didn't need it for basic flows.

