// the find
callstack/agent-device
CLI to control iOS and Android devices for AI agents
agent-device is a CLI (and MCP server) that gives AI coding agents a live feedback loop against real mobile apps: take an accessibility snapshot, get semantic refs like @e1/@e2, interact, capture evidence. It sits between 'read the code and guess' and 'write a Detox test first' by letting agents verify what actually happens on a running device. It is built by Callstack, which ships React Native at scale, so the React Native-specific features (component trees, Metro integration, React profiler) are unusually deep.
- The semantic ref system is well-designed for token efficiency. Agents get a flat list of interactive elements tied to the current snapshot, not a full accessibility tree dump on every command. Refs invalidate on UI change, forcing a re-snapshot — annoying in practice but correct in principle.
- Native platform backends (XCTest for iOS, ADB + a custom snapshot instrumentation APK for Android) rather than routing through Appium's WebDriver HTTP stack. This gives faster round-trips and access to capabilities that Appium can't expose cleanly, like RN component trees and XCTest network recording.
- Evidence capture primitives are genuinely useful for a debugging agent: video, logcat/syslog, network traffic, React profiler output, and crash context are all first-class commands, not afterthoughts bolted on.
- The .ad replay format turns exploratory agent sessions into repeatable CI scripts, and Maestro YAML export means you're not fully locked into a proprietary format if you need to hand off to a different toolchain.
- iOS automation requires macOS with Xcode — there's no path around this. macOS GitHub Actions runners cost roughly 10x Linux runners, and the README mentions a GitHub Actions template as 'coming soon.' The CI story for iOS is currently more expensive and less mature than the Android side.
- Physical device setup complexity is understated in the docs. Provisioning profiles, device trust dialogs, and entitlements for XCTest on a real device are non-trivial, especially in a CI environment. Teams with physical device labs will hit real friction before this works.
- Remote and cloud execution routes through Callstack's paid Agent Device Cloud product. There's no clear self-hosted path for running agents against remote devices at scale; the open-source story ends at local dev and expensive macOS CI runners.
- WebView and hybrid app interactions are a gap. If your React Native or Flutter app renders significant UI in a WebView (common for auth flows, in-app browsers, or embedded web content), the accessibility snapshot won't see inside it and the agent hits a dead end.
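
The snapshot-scoped refs praised in the first bullet can be modeled in a few lines. What follows is a toy Python sketch of the idea, not agent-device's implementation or API: `FakeDevice`, `take_snapshot`, `resolve`, and `StaleRefError` are all hypothetical names, and the generation counter is an assumed mechanism for detecting that the UI changed.

```python
from dataclasses import dataclass, field


class StaleRefError(Exception):
    """Raised when a ref from an outdated snapshot is used."""


@dataclass
class FakeDevice:
    """Hypothetical stand-in for a running app; not a real agent-device type."""
    elements: list = field(default_factory=list)
    generation: int = 0  # assumed: bumped every time the UI changes

    def set_ui(self, elements):
        self.elements = list(elements)
        self.generation += 1


@dataclass
class Snapshot:
    generation: int  # the UI generation this snapshot was taken at
    refs: dict       # "@e1" -> element, interactive elements only


def take_snapshot(device):
    # Flat list: drop non-interactive elements, number the rest @e1, @e2, ...
    interactive = [e for e in device.elements if e.get("interactive")]
    refs = {f"@e{i}": e for i, e in enumerate(interactive, start=1)}
    return Snapshot(generation=device.generation, refs=refs)


def resolve(device, snapshot, ref):
    # A ref is only valid against the UI state it was captured from.
    if snapshot.generation != device.generation:
        raise StaleRefError(f"{ref} is stale; take a new snapshot")
    return snapshot.refs[ref]


device = FakeDevice()
device.set_ui([
    {"role": "text", "label": "Welcome", "interactive": False},
    {"role": "button", "label": "Sign in", "interactive": True},
    {"role": "textfield", "label": "Email", "interactive": True},
])

snap = take_snapshot(device)
first_label = resolve(device, snap, "@e1")["label"]

# Simulate a navigation: the old refs no longer describe the screen.
device.set_ui([{"role": "button", "label": "Log out", "interactive": True}])

try:
    resolve(device, snap, "@e1")
    stale_detected = False
except StaleRefError:
    stale_detected = True
    snap = take_snapshot(device)  # re-snapshot, as the agent would

fresh_label = resolve(device, snap, "@e1")["label"]
```

In this model the same name `@e1` points at "Sign in" before the navigation and "Log out" after it, which is why failing loudly on a stale ref is the safer design even though it costs the agent an extra snapshot.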