// the find

wanshuiyin/Auto-claude-code-research-in-sleep

★ 12,967 · Python · MIT · updated Jul 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works with Claude Code, Codex, OpenClaw, or any LLM agent.

ARIS is a collection of ~77 Markdown skill files that script ML research workflows inside Claude Code, Codex, or Cursor: literature search, cross-model review loops, paper writing, and experiment tracking. There's also a standalone Rust CLI (ARIS-Code) that packages these skills with multi-provider routing and MCP support. The primary audience is ML researchers who want to automate the boilerplate of academic research using LLM agents.

The skill-as-Markdown approach is genuinely portable: each SKILL.md is just instructions the agent reads, with no framework coupling, so it works anywhere you can supply a context file. The cross-model review loop (GPT reviews Claude's draft, and vice versa) is the right answer to single-model sycophancy; the changelog documents real reviewer catches that blocked bad code before GO, including an off-by-one in grep line-mapping and missing stream-level tests. The ARIS-Code Rust CLI has real engineering depth: MCP stdio transport was corrected from LSP Content-Length framing to NDJSON after real-machine testing against Codex exposed a bug that every fake-server test missed, and the streaming robustness work (UTF-8 chunk boundary corruption fix, premature-EOF retry logic, idle timeout) is the kind of thing that matters when you're running overnight. The Anti-Autoresearch companion project is self-aware in a useful way: cataloging 61 integrity-failure patterns addresses the obvious criticism that autonomous research tools produce unverifiable output.
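To make the framing fix concrete, here is a minimal Python sketch of the two wire formats named above, plus a reader that tolerates arbitrary chunk splits. It is an illustration only, not ARIS-Code's implementation (which is Rust); the function names `frame_content_length`, `frame_ndjson`, and `read_ndjson` and the sample message are hypothetical. A peer that expects one JSON object per line will not parse a message wrapped in a Content-Length header, which is presumably why only a real peer exposed the mismatch.

```python
import json
from typing import Iterable, Iterator


def frame_content_length(message: dict) -> bytes:
    """LSP-style framing: a Content-Length header, a blank line, then the body."""
    body = json.dumps(message, ensure_ascii=False).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def frame_ndjson(message: dict) -> bytes:
    """NDJSON framing: one JSON object per line, terminated by a newline."""
    # json.dumps escapes newlines inside strings, so the body has no raw "\n".
    return json.dumps(message, ensure_ascii=False).encode("utf-8") + b"\n"


def read_ndjson(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield messages from byte chunks that may be split at any offset."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Split on the newline byte first and decode only complete lines, so a
        # multi-byte UTF-8 character cut across two chunks is never decoded in halves.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line.decode("utf-8"))


# Hypothetical message containing multi-byte characters, sent twice.
msg = {"jsonrpc": "2.0", "id": 1, "method": "ping", "params": {"note": "睡眠"}}
wire = frame_ndjson(msg) * 2

# Simulate a stream that delivers three bytes at a time.
chunks = [wire[i:i + 3] for i in range(0, len(wire), 3)]
assert list(read_ndjson(chunks)) == [msg, msg]
```

Buffering raw bytes and decoding per line is one simple way to avoid chunk-boundary corruption; decoding each chunk as it arrives would fail whenever a character straddles two reads.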

The README is unusable for orientation: it's a 5,000-word product launch crossed with a full changelog, WeChat group links, Chinese-only announcements, and 20 spin-off project plugs. There's no clear 'here's how to run your first skill' quickstart that survives the scroll. The skills are prompt templates with no ground-truth evaluation; all the cross-model review ceremony can't tell you whether the underlying workflow prompt is any good, and the repo doesn't answer that question. The self-referential credibility loop is a real problem: the arXiv paper was presumably generated with ARIS, the tutorials were generated with ARIS, and the movie director demo demonstrates ARIS. Independently verifiable results from someone who didn't build the tool are absent. The Rust CLI and the Markdown skill files are architecturally disconnected: if you use just the skills in Claude Code (the advertised no-lock-in path), you get none of the streaming fixes, MCP routing, or multi-provider support that dominate the changelog. The two halves serve different users and should be separate repos.

View on GitHub →