// the find

lilianweng/deep-reinforcement-learning-gym

★ 315 · Python · updated Mar 2023

Deep reinforcement learning model implementation in Tensorflow + OpenAI gym

A teaching repo from Lilian Weng (of OpenAI/blog fame) implementing classic deep RL algorithms — Q-learning, DQN, REINFORCE, PPO, DDPG, Actor-Critic — in TensorFlow 1.x against OpenAI Gym. It exists to accompany a 2018 blog post, not to be a production library. Best for someone reading that post and wanting runnable code alongside it.

The algorithm selection is solid for learning fundamentals: you get both value-based (DQN with conv/LSTM/dense variants) and policy-gradient methods (REINFORCE, PPO), plus DDPG for continuous control. JSON-driven config makes it easy to swap hyperparameters without touching code. The companion blog post is genuinely one of the better RL explanations on the internet, so the code and writing reinforce each other. Lilian Weng's track record makes subtle algorithmic bugs less likely, though reputation is no substitute for checking the code against the post.
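To make the JSON-config point concrete, here is a minimal sketch of the general pattern: hyperparameters live in a JSON document and are overlaid on defaults, so a run can be changed without editing code. The key names (`env_name`, `policy_name`, `train_params`) and the `load_config` helper are illustrative assumptions, not the repo's actual schema.

```python
import json

# Hypothetical defaults; the real repo's keys and values may differ.
DEFAULTS = {
    "env_name": "CartPole-v1",
    "policy_name": "DqnPolicy",
    "train_params": {"lr": 0.001, "batch_size": 32, "n_episodes": 500},
}


def load_config(text, defaults=DEFAULTS):
    """Parse a JSON string and overlay it on the defaults (one level of nesting)."""
    overrides = json.loads(text)
    merged = {}
    for key, value in defaults.items():
        if isinstance(value, dict):
            # Merge nested sections key by key so partial overrides work.
            merged[key] = {**value, **overrides.get(key, {})}
        else:
            merged[key] = overrides.get(key, value)
    # Keep any extra top-level keys the defaults do not know about.
    for key, value in overrides.items():
        merged.setdefault(key, value)
    return merged


# Override only the learning rate; everything else falls back to the defaults.
example = '{"train_params": {"lr": 0.0005}}'
config = load_config(example)
print(config["train_params"])
```

In practice the JSON would be read from a file per experiment, which keeps each run's settings in one reviewable place.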

Pinned to TensorFlow 1.x and Python 3.6, which means you're fighting deprecations before you write a single line: TF1 sessions, `feed_dict`, none of it runs on a modern stack without a compatibility shim. OpenAI Gym has since been superseded by its maintained fork, Gymnasium, and the API changed along the way (`reset()` and `step()` now return different tuples), so code written against the old API is likely to error out. No activity since 2023, and the setup instructions still reference Homebrew on macOS as the only path. If you actually want to train something today, use Stable-Baselines3 instead; come here only to read the code.
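The Gym-to-Gymnasium break is easiest to see in code. The sketch below is not taken from the repo: `OldGymAPI` is a hypothetical adapter that makes a Gymnasium-style environment look like the older Gym interface, and `ToyEnv` is a stand-in that only mimics Gymnasium's return shapes so the example runs without any dependencies.

```python
class OldGymAPI:
    """Hypothetical adapter: expose a Gymnasium-style env through the old Gym API."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        # Gymnasium: reset() -> (observation, info). Old Gym: reset() -> observation.
        obs, _info = self.env.reset()
        return obs

    def step(self, action):
        # Gymnasium returns a 5-tuple with separate terminated/truncated flags;
        # old Gym returned a 4-tuple with a single `done` flag.
        obs, reward, terminated, truncated, info = self.env.step(action)
        done = terminated or truncated
        return obs, reward, done, info


class ToyEnv:
    """Stand-in that mimics Gymnasium's return shapes; not a real environment."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon  # episode ends on a time limit
        return self.t, 1.0, False, truncated, {}


# An old-style training loop runs unchanged against the wrapped env.
env = OldGymAPI(ToyEnv(horizon=3))
obs = env.reset()
done = False
total = 0.0
while not done:
    obs, reward, done, info = env.step(0)
    total += reward
print(total)
```

Collapsing `terminated` and `truncated` into one `done` flag loses the distinction between a true terminal state and a time limit, which matters for value bootstrapping, so treat this as a reading aid, not a porting strategy.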
