// the find

OptMLGroup/VRP-RL

★ 711 · Python · updated May 2021

Reinforcement Learning for Solving the Vehicle Routing Problem

Academic implementation of the NeurIPS 2018 paper of the same name, which uses attention-based RL (a pointer-network-style model) to solve the TSP and the capacitated VRP (CVRP). It's a research artifact: the code exists to reproduce paper results, not to be used in production routing systems.

- Clean separation between TSP and VRP problem formulations, making it easy to follow how the two differ in the attention model

- The attention mechanism in shared/attention.py directly maps to the paper, so it's useful for understanding how pointer networks work in combinatorial optimization

- Minimal dependencies — just NumPy, TensorFlow, and tqdm, so it's easy to get running if you have the right TF version

- Requires TensorFlow >=1.2, which in practice means the TF1 API: a full major version (and many minor releases) behind current TensorFlow 2.x, and tied to CUDA versions that modern setups no longer ship, so you'll fight dependency hell before training a single epoch

- Last commit was May 2021, and the repo has not been updated to TF2; there's no PyTorch port here despite the acknowledgement referencing one

- Supports only the VRP10/20/50/100 instance sizes hardcoded in task_specific_params.py; nothing in the codebase handles arbitrary instance sizes or real-world constraint types like time windows

- No pretrained checkpoints are distributed, so you must train from scratch to get any results, and the paper's reported training times are substantial
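
For readers who want the gist of the attention step mentioned above without installing TF1, here is a minimal NumPy sketch of a generic additive pointer-attention step with a feasibility mask. It is not the code in shared/attention.py: the function name, the weight names (`W_ref`, `W_q`, `v`), the shapes, and the random inputs are all assumptions made for illustration, and the paper's model has more moving parts than this single scoring layer.

```python
import numpy as np

def pointer_attention(node_emb, query, W_ref, W_q, v, mask):
    """Generic additive pointer attention (illustrative, not the repo's code).

    node_emb: (n, d) node embeddings
    query:    (d,)   decoder state
    W_ref, W_q: (d, d) projection matrices; v: (d,) scoring vector
    mask:     (n,) bool, True where a node may NOT be chosen
              (e.g. already visited, or demand exceeds remaining capacity).
    At least one node must be unmasked, otherwise the softmax is undefined.
    """
    # One score per node: v^T tanh(W_ref x_i + W_q q)
    scores = np.tanh(node_emb @ W_ref + query @ W_q) @ v
    # Infeasible nodes get -inf so their probability is exactly zero.
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the remaining nodes.
    scores = scores - scores.max()
    weights = np.exp(scores)
    return weights / weights.sum()

# Hypothetical toy instance: 5 nodes, 8-dimensional embeddings.
rng = np.random.default_rng(0)
n, d = 5, 8
node_emb = rng.normal(size=(n, d))
query = rng.normal(size=d)
W_ref = rng.normal(size=(d, d))
W_q = rng.normal(size=(d, d))
v = rng.normal(size=d)
mask = np.array([False, True, False, False, True])

probs = pointer_attention(node_emb, query, W_ref, W_q, v, mask)
next_node = int(np.argmax(probs))  # greedy decoding; training would sample instead
print(probs, next_node)
```

The output distribution "points" at one of the input nodes, which is what lets the same decoder handle TSP and CVRP: in this sketch only the mask (and whatever state feeds the query) would change between the two problems.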
