// the find

mveres01/pytorch-drl4vrp

★ 538 · Python · updated May 2020

Implementation of: Nazari, Mohammadreza, et al. "Deep Reinforcement Learning for Solving the Vehicle Routing Problem." arXiv preprint arXiv:1802.04240 (2018).

A PyTorch implementation of the 2018 Nazari et al. paper that uses attention-based deep RL (pointer networks + REINFORCE) to solve TSP and VRP without hand-crafted heuristics. Aimed at researchers wanting a working baseline for neural combinatorial optimization, not practitioners solving real routing problems.

The VRP masking logic is carefully designed and clearly documented. It handles depot revisits, demand-satisfaction constraints, and minibatch padding edge cases correctly, which is where most reproductions fall apart. Results are benchmarked against the paper's numbers, and the gaps are acknowledged rather than hidden behind cherry-picked runs. The training-time tables are unusually useful for anyone deciding whether to run this on their own hardware. The code is split cleanly into model, tasks, and trainer, with no framework magic hiding what is happening.
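To make the masking point concrete, here is a minimal sketch of the kind of rule involved. It is not the repository's code: the function name `vrp_mask`, its arguments, and the exact rules (a customer is feasible only if its remaining demand is positive and fits in the current load; the depot is blocked while the vehicle is parked there with feasible customers left) are assumptions chosen for illustration.

```python
def vrp_mask(load, demands, at_depot):
    """Return a 0/1 feasibility mask over nodes for one VRP instance.

    Hypothetical helper, not taken from the repository.
    load:     remaining vehicle capacity
    demands:  remaining demand per node; index 0 is the depot (demand 0)
    at_depot: True if the vehicle's previous stop was the depot
    """
    # A customer is feasible if it still has demand and that demand fits.
    mask = [0] + [1 if 0 < d <= load else 0 for d in demands[1:]]

    any_feasible_customer = any(mask[1:])
    all_served = all(d == 0 for d in demands[1:])

    # Depot rules (assumed):
    # - finished instances may only stay at the depot, which keeps them
    #   harmless while the rest of a minibatch is still routing;
    # - otherwise forbid back-to-back depot visits, unless no customer is
    #   feasible, so the mask never ends up all zeros.
    if all_served or not at_depot or not any_feasible_customer:
        mask[0] = 1
    return mask
```

For example, a vehicle that has just refilled at the depot gets `vrp_mask(1.0, [0, 0.3, 0.5], True) == [0, 1, 1]`, while one with too little load for the last open customer is sent back: `vrp_mask(0.2, [0, 0.3, 0.0], False) == [1, 0, 0]`.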

Frozen at PyTorch 0.4.1 from 2018. PyTorch has shipped several breaking API changes since, so expect to spend time porting before anything runs. Only greedy decoding is implemented at test time; the beam search and sampling strategies from the paper are missing, and those are where the meaningful quality gains come from. The VRP100 and TSP100 rows in the results table are blank, suggesting those scales either didn't converge or were never attempted. There are no pretrained weights for anything above VRP20, and the sample weights sit behind an informal Google Drive sharing link that may stop working.
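The decoding gap is easiest to see at the level of a single step. The sketch below contrasts greedy selection with sampling over a masked probability vector. It is an illustration only: `decode_step` and its parameters are hypothetical names, it uses plain Python rather than PyTorch, and it makes no claim about how the paper or the repository implements either strategy.

```python
import random


def decode_step(probs, mask, greedy=True, rng=random):
    """Pick the next node index from policy probabilities (hypothetical helper).

    probs: per-node probabilities from the policy
    mask:  0/1 feasibility mask of the same length
    rng:   the random module or a random.Random instance
    """
    # Zero out infeasible nodes before choosing.
    weights = [p * m for p, m in zip(probs, mask)]
    if sum(weights) <= 0:
        raise ValueError("no feasible node has positive probability")

    if greedy:
        # Greedy decoding: always take the most probable feasible node.
        return max(range(len(weights)), key=weights.__getitem__)

    # Sampling: draw a feasible node in proportion to its probability.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

A sampling-based evaluation would typically roll out many tours this way and keep the shortest, which costs more compute than a single greedy pass but can find better routes.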
