// the find
OptMLGroup/VRP-RL
Reinforcement Learning for Solving the Vehicle Routing Problem
Academic implementation of the NeurIPS 2018 paper that applies attention-based RL, building on pointer networks, to TSP and CVRP. It's a research artifact: the code exists to reproduce the paper's results, not to run in production routing systems.
- Clean separation between TSP and VRP problem formulations, making it easy to follow how the two differ in the attention model
- The attention mechanism in shared/attention.py directly maps to the paper, so it's useful for understanding how pointer networks work in combinatorial optimization
- Minimal dependencies — just NumPy, TensorFlow, and tqdm, so it's easy to get running if you have the right TF version
- Requires TensorFlow >=1.2, a 2017-era 1.x release that sits a full major version behind current TensorFlow and is incompatible with modern CUDA setups; expect to fight dependency hell before training a single epoch
- Last commit was May 2021, and the repo has not been updated to TF2; there's no PyTorch port here despite the acknowledgement referencing one
- Supports only the VRP10/20/50/100 instance sizes hardcoded in task_specific_params.py; nothing in the codebase handles arbitrary instance sizes or real-world constraints such as time windows
- No pretrained checkpoints are distributed, so you must train from scratch to get any results, and the paper's reported training times are substantial
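
For readers who want the gist of the pointer-style attention mentioned above without installing an old TensorFlow, here is a minimal NumPy sketch of generic additive attention with a feasibility mask. It is not code from the repo: the function name, argument names, shapes, and the random weights are all assumptions made for illustration, and the mask simply stands in for whatever rule marks a node as unselectable (for example, a customer that has already been served).

```python
import numpy as np


def pointer_attention(query, keys, W_q, W_k, v, mask):
    """Additive (Bahdanau-style) pointer attention over candidate nodes.

    Hypothetical helper for illustration, not the repo's API.
    query: (d,) decoder state; keys: (n, d) node embeddings;
    W_q, W_k: (d, d) projections; v: (d,) scoring vector;
    mask: (n,) boolean, True where a node may NOT be selected.
    Returns a probability distribution over the n nodes.
    """
    if mask.all():
        raise ValueError("every node is masked; nothing left to point at")
    # One scalar score per node: v . tanh(W_k k_i + W_q q)
    scores = np.tanh(keys @ W_k + query @ W_q) @ v
    # Masked nodes get -inf so they receive exactly zero probability
    scores = np.where(mask, -np.inf, scores)
    # Subtract the largest feasible score for numerical stability
    scores = scores - scores[~mask].max()
    weights = np.exp(scores)
    return weights / weights.sum()


# Toy usage with random, untrained weights (assumed sizes: 5 nodes, dim 8)
rng = np.random.default_rng(0)
n, d = 5, 8
keys = rng.normal(size=(n, d))
query = rng.normal(size=d)
W_q = rng.normal(size=(d, d))
W_k = rng.normal(size=(d, d))
v = rng.normal(size=d)
mask = np.array([True, False, False, True, False])  # nodes 0 and 3 unavailable
probs = pointer_attention(query, keys, W_q, W_k, v, mask)
next_node = int(np.argmax(probs))  # greedy decoding picks the likeliest node
```

The output is a distribution over input positions, not over a fixed vocabulary, which is what makes the "pointer" idea fit routing problems: the decoder chooses which node to visit next, and the mask keeps infeasible choices at zero probability.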