// the find
Unity-Technologies/ml-agents
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
ML-Agents is Unity's official toolkit for training game agents with reinforcement learning and imitation learning. The Python training side is built on PyTorch and ships PPO, SAC, and GAIL; the C# Unity side defines observations, actions, and rewards. It's for game developers who want smart NPCs and for RL researchers who want visually rich, physics-based environments without building them from scratch.
The 17+ polished example environments cover single-agent, multi-agent cooperative, and competitive setups, so you can see what good reward shaping looks like in practice. The Gym and PettingZoo wrappers mean your Unity env drops straight into any standard RL training loop. ONNX inference via Unity's Inference Engine means trained models run in-editor and in shipped builds without a Python runtime. MA-POCA for cooperative multi-agent training is genuinely good and not something most RL frameworks offer.
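To make "drops straight into any standard RL training loop" concrete, here is a minimal sketch of such a loop. It assumes the classic Gym interface, where `reset()` returns an observation and `step()` returns a 4-tuple. `StubEnv` and `run_episode` are hypothetical names for illustration, not part of ML-Agents. With the toolkit installed, you would pass in a Unity environment wrapped by the Gym wrapper instead of the stub. Check the wrapper's import path and step signature against the release you use.

```python
class StubEnv:
    """Hypothetical stand-in exposing the classic Gym interface.

    With ML-Agents you would pass a wrapped Unity environment instead;
    this stub only exists so the loop below runs without Unity.
    """

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self._t = 0
        return [0.0]

    def step(self, action):
        # Advance one step: reward 1.0 for action 1, otherwise 0.0.
        self._t += 1
        obs = [float(self._t)]
        reward = 1.0 if action == 1 else 0.0
        done = self._t >= self.episode_length
        return obs, reward, done, {}


def run_episode(env, policy):
    """Run one episode with `policy` (a callable obs -> action); return total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


print(run_episode(StubEnv(), lambda obs: 1))  # 5.0
```

The loop only touches `reset()` and `step()`, so any environment with that interface can be swapped in without changing the training code.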
The split between Python training and C# environment means debugging a training run requires context-switching between two runtimes and two languages, so reward shaping bugs are painful to trace. Training speed is bottlenecked by Unity's render loop unless you carefully strip out all rendering, and getting multiple headless instances running in parallel on a server is not a smooth experience. The toolkit is clearly maintained at Unity's pace, not the RL research community's: PPO and SAC date from 2017 and 2018, and there's no obvious path to plugging in newer methods like GRPO or RLHF. The documentation migration from GitHub Pages to Unity Package docs is incomplete, and the old URLs are now dead links scattered across Stack Overflow and forums.
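For the headless parallel setup mentioned above, the usual starting point is the `mlagents-learn` CLI with `--num-envs` and `--no-graphics`. The sketch below only assembles that command line. The helper name, config path, build path, and run id are hypothetical placeholders. As far as I know, `--no-graphics` is not suitable for environments that rely on visual observations, so treat that flag as an assumption to verify for your project.

```python
import shlex


def build_learn_command(config_path, env_path, run_id, num_envs=4, base_port=5005):
    """Assemble an mlagents-learn invocation for headless, parallel training.

    All paths and the run id are caller-supplied placeholders; nothing is executed here.
    """
    return [
        "mlagents-learn",
        config_path,
        f"--env={env_path}",          # path to a built Unity player
        f"--run-id={run_id}",         # label for this training run
        f"--num-envs={num_envs}",     # number of Unity instances to run in parallel
        f"--base-port={base_port}",   # first port; later instances use higher ports
        "--no-graphics",              # skip rendering (assumed: no visual observations)
    ]


# Hypothetical paths, shown only to illustrate the resulting command.
cmd = build_learn_command("config/ppo.yaml", "builds/MyEnv.x86_64", "headless_run_01")
print(shlex.join(cmd))
```

Building the command as a list keeps it safe to hand to `subprocess.run(cmd)` on the training server without worrying about shell quoting.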