This repository contains the source code of the paper "Imitation Improvement Learning for Large-scale Capacitated Vehicle Routing Problems".
Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones.
In this work, we propose a new improvement learning-based framework based on imitation learning, where classical heuristics serve as experts that the policy model learns to mimic, producing similar or better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework that decomposes a large-scale CVRP into subproblems by sequentially clustering nodes in clockwise order around the depot, and then learns to solve the subproblems simultaneously.
Our approaches enhance state-of-the-art CVRP solvers while attaining competitive solution quality on several well-known datasets, including real-world instances with sizes up to 30,000 nodes.
Our best methods are able to achieve new state-of-the-art results for several large instances and generalize to a wide range of CVRP variants and solvers.
We also contribute new datasets and results to test the generalizability of our deep RL algorithms.
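As an illustration of the Clockwise Clustering idea described above, the decomposition can be sketched in a few lines: sort customers by their polar angle around the depot and cut the clockwise sequence into capacity-feasible groups. This is a minimal sketch under assumed inputs (coordinate tuples and scalar demands), not the repository's actual implementation in `env/VRPInstance.py`:

```python
import math

def clockwise_clusters(depot, customers, demands, capacity):
    """Illustrative sketch of Clockwise Clustering: sort customers by their
    polar angle around the depot (clockwise) and greedily cut the sorted
    sequence into capacity-feasible subproblems."""
    dx0, dy0 = depot
    # Clockwise order = decreasing counter-clockwise angle.
    order = sorted(range(len(customers)),
                   key=lambda i: -math.atan2(customers[i][1] - dy0,
                                             customers[i][0] - dx0))
    clusters, current, load = [], [], 0.0
    for i in order:
        if current and load + demands[i] > capacity:
            clusters.append(current)  # close the current subproblem
            current, load = [], 0.0
        current.append(i)
        load += demands[i]
    if current:
        clusters.append(current)
    return clusters
```

Each cluster then becomes a subproblem that can be solved independently, which is what allows the subproblems to be attacked simultaneously.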
- torch>=1.12.1
- torch_geometric>=2.1.0
- apex (optional)
- VRP heuristic C++ extension: `pip install git+https://github.com/vietbt/VRPcpp.git`
├── env
│ ├── Score.py # Solution cost class
│ ├── VRPNode.py # Node class
│ ├── InitSolution.py # Clockhand/DBCA initializer
│ ├── VRPInstance.py # CC & IIL frameworks
│ ├── VRPEnv.py # Main env wrapper
│ ├── VectorizedVRP.py # Wrap to multi-processing env
│ └── utils.py # Environment utilities
├── network
│ ├── encoder.py # GatConv encoder
│ ├── decoder.py # Pointer Attention for k-opt
│ ├── layers.py # Neural network layers
│ ├── model.py # PPO main model
│ └── policy.py # Policy and IIL loss function
├── init_solution.py # Initialize solutions
├── runner.py # Env runner with replay buffer
├── main.py # Train
└── evaluate.py # Evaluate
All 6 datasets used in our paper are available for download here. Download and decompress them into the current directory.
- Run `python init_solution.py --data-folder=<data-folder>` to initialize solutions for all instances in the input directory and save them to `__pycache__`.
- Run `python main.py --seed=<seed> --data-folder=<data-folder> --algo=<algo> ...` to train the deep RL model and save all optimal solutions to the `./logs` folder.
- `main.py` accepts many arguments, such as which heuristic to use, whether to use the imitation loss, whether to round solution costs (as the CVRPLIB authors do), the number of tours for sub-problem selection, the number of workers, etc.
- We ran the following command lines in our experiments:
| ID | Method | Command line |
| --- | --- | --- |
| 1 | IIL+VNS | `python main.py --data-folder=dataset/train/data_cvrp --algo=VNS --round-int --max-count=128` |
| 1 | RL+VNS | `python main.py --data-folder=dataset/train/data_cvrp --algo=VNS --round-int --max-count=128 --imitation-rate=0` |
| 1 | IIL+HGS | `python main.py --data-folder=dataset/train/data_cvrp --algo=HGS --round-int --max-count=48` |
| 1 | RL+HGS | `python main.py --data-folder=dataset/train/data_cvrp --algo=HGS --round-int --max-count=48 --imitation-rate=0` |
| 2 | IIL+HGS | `python main.py --data-folder=dataset/train/uniform_N500 --algo=HGS --max-count=4` |
| 2 | IIL+HGS | `python main.py --data-folder=dataset/train/uniform_N2000 --algo=HGS --max-count=1` |
| 2 | IIL+HGS | `python main.py --data-folder=dataset/train/uniform_N1000 --algo=HGS --max-count=2` |
| 3 | IIL+VNS | `python main.py --data-folder=dataset/train/data_dimacs --algo=VNS --round-int --max-count=1024` |
| 3 | RL+VNS | `python main.py --data-folder=dataset/train/data_dimacs --algo=VNS --round-int --max-count=1024 --imitation-rate=0` |
| 3 | IIL+HGS | `python main.py --data-folder=dataset/train/data_dimacs --algo=HGS --round-int --max-count=1024` |
| 3 | RL+HGS | `python main.py --data-folder=dataset/train/data_dimacs --algo=HGS --round-int --max-count=1024 --imitation-rate=0` |
| 4 | IIL+VNS | `python main.py --data-folder=dataset/train/realworld/Leuven --algo=VNS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 4 | IIL+HGS | `python main.py --data-folder=dataset/train/realworld/Leuven --algo=HGS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 4 | IIL+VNS | `python main.py --data-folder=dataset/train/realworld/Ghent --algo=VNS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 4 | IIL+HGS | `python main.py --data-folder=dataset/train/realworld/Ghent --algo=HGS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 4 | IIL+VNS | `python main.py --data-folder=dataset/train/realworld/Flanders --algo=VNS --max-count=1024 --min-extend-nodes=1024 --n-envs=16` |
| 4 | IIL+HGS | `python main.py --data-folder=dataset/train/realworld/Flanders --algo=HGS --max-count=1024 --min-extend-nodes=1024 --n-envs=16` |
| 4 | IIL+VNS | `python main.py --data-folder=dataset/train/realworld/Brussels --algo=VNS --max-count=1024 --min-extend-nodes=1024 --n-envs=16` |
| 4 | IIL+HGS | `python main.py --data-folder=dataset/train/realworld/Brussels --algo=HGS --max-count=1024 --min-extend-nodes=1024 --n-envs=16` |
| 4 | IIL+VNS | `python main.py --data-folder=dataset/train/realworld/Antwerp --algo=VNS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 4 | IIL+HGS | `python main.py --data-folder=dataset/train/realworld/Antwerp --algo=HGS --max-count=1024 --min-extend-nodes=512 --n-envs=16` |
| 5 | IIL+VNS | `python main.py --data-folder=dataset/train/data_evrp_wcci --algo=VNS --max-count=128` |
| 5 | RL+VNS | `python main.py --data-folder=dataset/train/data_evrp_wcci --algo=VNS --max-count=128 --imitation-rate=0` |
| 5 | IIL+HGS | `python main.py --data-folder=dataset/train/data_evrp_wcci --algo=HGS --max-count=24` |
| 5 | RL+HGS | `python main.py --data-folder=dataset/train/data_evrp_wcci --algo=HGS --max-count=24 --imitation-rate=0` |

- Add `--seed=<seed>` to each command line if you want more random seeds.
- Due to limited computing resources, we could not benchmark every method on every dataset.
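The `--imitation-rate` argument above controls how strongly the expert heuristic (VNS or HGS) supervises the policy (`0` means pure RL). A minimal sketch of such a mixed objective, in plain Python with hypothetical variable names; the actual loss, including the PPO machinery, lives in `network/policy.py`:

```python
import math

def mixed_loss(logits, action, advantage, expert_action, imitation_rate):
    """Sketch of an IIL-style objective: a policy-gradient term on the
    sampled action plus a cross-entropy term that imitates the expert
    (heuristic) action, blended by `imitation_rate`."""
    # Log-softmax over action logits (numerically stable).
    m = max(logits)
    z = sum(math.exp(l - m) for l in logits)
    log_probs = [l - m - math.log(z) for l in logits]
    rl_loss = -advantage * log_probs[action]    # REINFORCE-style RL term
    imitation_loss = -log_probs[expert_action]  # cross-entropy vs expert move
    return (1 - imitation_rate) * rl_loss + imitation_rate * imitation_loss
```

With `imitation_rate=0` only the RL term remains; raising it pushes the policy toward reproducing the expert's improvement moves.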
- Run `python evaluate.py --seed=<seed> --data-folder=<data-folder> --algo=<algo> --model-path=<trained-model-path> ...` to test the generalizability of our trained models.
- We ran the following command lines in our experiments:

| ID | Method | Command line |
| --- | --- | --- |
| 2 | IIL+HGS | `python evaluate.py --data-folder=dataset/test/uniform_N500 --algo=HGS --max-count=1 --model-path=logs/dataset_train_uniform_N500_HGS_0.1/seed_1/model.pt` |
| 2 | IIL+HGS | `python evaluate.py --data-folder=dataset/test/uniform_N1000 --algo=HGS --max-count=1 --model-path=logs/dataset_train_uniform_N1000_HGS_0.1/seed_1/model.pt` |
| 2 | IIL+HGS | `python evaluate.py --data-folder=dataset/test/uniform_N2000 --algo=HGS --max-count=1 --model-path=logs/dataset_train_uniform_N2000_HGS_0.1/seed_1/model.pt` |
| 6 | RL+VNS | `python evaluate.py --data-folder=dataset/test/data_evrp_random/seed_1 --algo=VNS --max-count=128 --imitation-rate=0 --model-path=logs/dataset_train_data_evrp_wcci_VNS_0.0/seed_1/model.pt` |
| 6 | IIL+VNS | `python evaluate.py --data-folder=dataset/test/data_evrp_random/seed_1 --algo=VNS --max-count=128 --model-path=logs/dataset_train_data_evrp_wcci_VNS_0.1/seed_1/model.pt` |
| 6 | RL+HGS | `python evaluate.py --data-folder=dataset/test/data_evrp_random/seed_1 --algo=HGS --max-count=24 --imitation-rate=0 --model-path=logs/dataset_train_data_evrp_wcci_HGS_0.0/seed_1/model.pt` |
| 6 | IIL+HGS | `python evaluate.py --data-folder=dataset/test/data_evrp_random/seed_1 --algo=HGS --max-count=24 --model-path=logs/dataset_train_data_evrp_wcci_HGS_0.1/seed_1/model.pt` |
| 6 | ... | ... |

- Note that dataset 6 contains 14 different folders, randomly generated with the same distribution as dataset 5. Change `--data-folder` and `--model-path` accordingly to evaluate all of them.
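To avoid typing the 14 evaluation commands by hand, they can be generated in a short loop. This sketch assumes the folders are named `seed_1` through `seed_14` (an assumption based on the example above; adjust to the actual directory names after decompressing the dataset):

```python
# Sketch: generate the 14 evaluate commands for dataset 6. The folder names
# seed_1 .. seed_14 are an assumption; adjust them to the actual names.
TEMPLATE = (
    "python evaluate.py --data-folder=dataset/test/data_evrp_random/seed_{i} "
    "--algo=HGS --max-count=24 "
    "--model-path=logs/dataset_train_data_evrp_wcci_HGS_0.1/seed_1/model.pt"
)

commands = [TEMPLATE.format(i=i) for i in range(1, 15)]
for cmd in commands:
    print(cmd)
```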
- We provide 6 notebooks for analyzing the results of the 6 datasets.
- Check these files for more detail:

| ID | Notebook |
| --- | --- |
| 1 | experiments_CVRP.ipynb |
| 2 | experiments_CVRP_Uniform.ipynb |
| 3 | experiments_DIMACS.ipynb |
| 4 | experiments_realworld.ipynb |
| 5 | experiments_EVRP.ipynb |
| 6 | experiments_EVRP_generation.ipynb |
...