// the find

kyegomez/OpenMythos

★ 14,571 · Python · MIT · updated May 2026

A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature.

OpenMythos is a speculative PyTorch/JAX implementation of what the author believes Claude Mythos's architecture looks like, based on public research and Twitter threads. It implements a Recurrent-Depth Transformer with MoE feed-forwards, switchable GQA/MLA attention, and LTI-stable injection parameters. This is fan theory as code — interesting ML research, not a working replica of anything Anthropic built.
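To make "recurrent-depth" concrete, here is a toy sketch of the looping idea in plain Python: one weight-shared block is applied repeatedly, and the input embedding is re-injected on every pass. It is not taken from open_mythos/main.py. The names `shared_block` and `recurrent_depth_forward`, the scalar weights `w`, `a`, `b`, and the tanh stand-in for a transformer block are all assumptions made for illustration.

```python
import math


def shared_block(h, w):
    """Toy stand-in for one transformer block (hypothetical): a bounded nonlinearity."""
    return [math.tanh(w * x) for x in h]


def recurrent_depth_forward(e, w, a, b, n_loops):
    """Run the same block n_loops times, re-injecting the embedding e on each pass.

    Update rule assumed for this sketch: h <- a*h + b*e + block(h).
    The weights (w, a, b) are reused on every loop, so extra loops add
    compute but no parameters.
    """
    h = [0.0] * len(e)
    for _ in range(n_loops):
        f = shared_block(h, w)
        h = [a * hi + b * ei + fi for hi, ei, fi in zip(h, e, f)]
    return h


if __name__ == "__main__":
    e = [1.0, -2.0]
    for loops in (1, 4, 64):
        print(loops, recurrent_depth_forward(e, w=0.5, a=0.9, b=0.1, n_loops=loops))
```

With 0 < a < 1 and a bounded block output, the hidden state stays bounded however many loops are run. That is the property the stability constraint is meant to guarantee in the full model.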

The LTI stability fix is the most technically credible part. Constraining the injection matrix A through a negative-diagonal parameterization and ZOH (zero-order hold) discretization keeps its spectral radius below 1, which is a real and well-motivated answer to the training instability that plagues looped transformers. The MLA attention implementation (a compressed KV latent with split RoPE/no-RoPE head dims) is a faithful port of DeepSeek-V2's approach and is worth studying on its own. The README is unusually honest about what is known versus speculated, and the reference list is solid: Parcae, Universal Transformers, and the Saunshi et al. reasoning paper are all real and relevant. Pre-configured model scales (1B to 1T) and a working training script against FineWeb-Edu lower the barrier to actually running experiments.
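The stability argument can be checked in a few lines. The sketch below assumes a diagonal continuous-time matrix A = -exp(raw_a), which is negative for any real raw_a, and a positive step size dt = softplus(raw_dt). Zero-order-hold discretization then gives A_bar = exp(dt * A), whose entries lie strictly between 0 and 1 in exact arithmetic. The function and parameter names are hypothetical, and this is plain Python rather than the repository's PyTorch code.

```python
import math


def softplus(x):
    """log(1 + exp(x)); adequate for moderate x (math.exp overflows for very large x)."""
    return math.log1p(math.exp(x))


def discretize_zoh(raw_a, raw_dt):
    """Zero-order-hold discretization of a diagonal LTI system (illustrative sketch).

    Continuous-time diagonal: A = -exp(raw_a) < 0 for any real raw_a.
    Step size:                dt = softplus(raw_dt) > 0.
    Discrete-time:            A_bar = exp(dt * A)
                              B_bar = (A_bar - 1) / A   (taking B = 1)
    """
    a_cont = [-math.exp(r) for r in raw_a]
    dt = [softplus(r) for r in raw_dt]
    a_bar = [math.exp(d * a) for d, a in zip(dt, a_cont)]
    b_bar = [(ab - 1.0) / a for ab, a in zip(a_bar, a_cont)]
    return a_bar, b_bar


def spectral_radius_diag(a_bar):
    """For a diagonal matrix, the spectral radius is the largest absolute entry."""
    return max(abs(x) for x in a_bar)


if __name__ == "__main__":
    a_bar, b_bar = discretize_zoh(raw_a=[-3.0, 0.0, 2.5], raw_dt=[-2.0, 0.0, 3.0])
    print("A_bar:", a_bar)
    print("spectral radius:", spectral_radius_diag(a_bar))
```

The unconstrained values raw_a and raw_dt can be optimized freely, and the constraint holds by construction rather than through a penalty term. In floating point, extreme values can still round A_bar to exactly 1.0 or 0.0, so a real implementation would likely clamp them.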

The core premise is unverifiable and almost certainly wrong in the details. This is architecture speculation dressed up as reconstruction, and the confidence in the README ('Mythos almost certainly has some version of this') is not earned. There are no trained checkpoints and no benchmark numbers showing that the looped model actually outperforms a baseline transformer at the same parameter count. The test file name 'bench_vs_transformer.py' suggests such a comparison exists, but its results appear nowhere in the README. The repo is essentially a single large file (open_mythos/main.py) with no published evals, so you can instantiate the model but have no signal as to whether the training recipe produces the claimed benefits. The tokenizer wraps 'openai/gpt-oss-20b', which as of mid-2026 is not a publicly available model, so the training script will fail for most people without modification.
