// the find

bytebot-ai/bytebot

★ 11,053 · TypeScript · Apache-2.0 · updated Sep 2025

Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.

Bytebot gives an AI its own containerized Ubuntu desktop (XFCE + Firefox + VS Code) and lets the model drive that desktop through the computer-use APIs from Claude, GPT, or Gemini. The practical use case is automating tasks that don't have APIs: scraping portals, processing PDFs, clicking through legacy web UIs. It is aimed at developers and small ops teams who want browser/desktop automation without writing Playwright scripts.

The computer-use daemon (bytebotd) exposes a clean REST API for screenshots, clicks, and keyboard input, which means you can drive the desktop programmatically, independent of the AI layer. That is useful for testing or custom integrations. Two separate agent packages (bytebot-agent and bytebot-agent-cc) give you a choice between the standard multi-provider approach and a Claude-Code-specific variant, with LiteLLM sitting in the middle so you can swap models without rewriting tool definitions. The Prisma migration history is tidy and shows real iteration (task scheduling, file uploads, auth added then removed), which signals active maintenance rather than a demo that got open-sourced. Railway one-click deploy and Helm charts both exist and appear complete, so you're not stuck assembling a deployment by hand.
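To make the "drive the desktop without the AI layer" point concrete, here is a minimal TypeScript sketch of a client for the daemon. The `/computer-use` path, the action names (`screenshot`, `click_mouse`, `type_text`) and their field names are assumptions based on how such daemons are usually shaped, not details confirmed by this write-up, and `buildRequest`, `sendAction` and `Transport` are hypothetical helper names. Check the bytebotd documentation before relying on any of them.

```typescript
// Hypothetical sketch. The endpoint path and action payloads are ASSUMED,
// not taken from the review above; verify them against the bytebotd docs.

type DesktopAction =
  | { action: "screenshot" }
  | {
      action: "click_mouse";
      coordinates: { x: number; y: number };
      button: "left" | "right" | "middle";
      clickCount: number;
    }
  | { action: "type_text"; text: string };

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// The transport is injected so the client can be exercised without a network.
type Transport = (req: HttpRequest) => Promise<HttpResponse>;

// Build the request for one action without sending it, so the payload
// can be inspected or unit-tested on its own.
function buildRequest(baseUrl: string, action: DesktopAction): HttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/computer-use`, // assumed path
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(action),
  };
}

// Send one action through the supplied transport and return the parsed body.
async function sendAction(
  baseUrl: string,
  action: DesktopAction,
  transport: Transport
): Promise<unknown> {
  const res = await transport(buildRequest(baseUrl, action));
  if (!res.ok) {
    throw new Error(`desktop daemon returned HTTP ${res.status}`);
  }
  return res.json();
}
```

In a real run the transport could be a thin wrapper around the global `fetch`, for example `(req) => fetch(req.url, req)`. In tests it can be a stub that records the requests it receives, which is the kind of AI-free integration testing the paragraph above has in mind.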

There are two nearly identical agent packages (bytebot-agent and bytebot-agent-cc) with duplicated Prisma schemas and migration histories, and no clear documentation on when to pick one over the other. This will confuse anyone trying to contribute or to debug behavior that has diverged between them. The desktop environment is a full Ubuntu image with XFCE, which means your container is heavy by default; there's no slim or headless variant for tasks that don't need a GUI. Task isolation is weak: each task runs in the same persistent desktop environment, so a failed task can leave browser tabs open, files scattered, or application state that poisons the next task. The lack of sandboxing or snapshot/restore between tasks is a real operational risk if you run this on anything sensitive.
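One way to picture the missing isolation step is a wrapper that always resets the desktop after a task, whether the task succeeded or failed. This is only a sketch of the pattern, not something Bytebot ships. `runIsolated` and `resetDesktop` are hypothetical names, and what the reset actually does (recreating the container, restoring a volume snapshot) is left to whoever operates the deployment.

```typescript
// Hypothetical sketch of per-task cleanup. Nothing here is a Bytebot API;
// `resetDesktop` is a placeholder the operator would have to implement.

type Task<T> = () => Promise<T>;
type Reset = () => Promise<void>;

// Run one task, then reset the desktop regardless of the outcome.
async function runIsolated<T>(task: Task<T>, resetDesktop: Reset): Promise<T> {
  try {
    return await task();
  } finally {
    // `finally` runs on success and on failure, so a crashed task still
    // triggers the reset before the next task starts.
    await resetDesktop();
  }
}
```

The wrapper only guarantees that the reset is attempted. How well it prevents state from leaking into the next task depends entirely on how thorough the supplied reset is.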
