Deep Policy Dynamic Programming for Vehicle Routing Problems
| Authors | |
|---|---|
| Publication date | 2022 |
| Host editors |
|
| Book title | Integration of Constraint Programming, Artificial Intelligence, and Operations Research |
| Book subtitle | 19th International Conference, CPAIOR 2022, Los Angeles, CA, USA, June 20-23, 2022 : proceedings |
| ISBN |
|
| ISBN (electronic) |
|
| Series | Lecture Notes in Computer Science |
| Event | 19th International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research |
| Pages (from-to) | 190–213 |
| Publisher | Cham: Springer |
| Organisations |
|
| Abstract |
Routing problems are a class of combinatorial problems with many
practical applications. Recently, end-to-end deep learning methods have
been proposed to learn approximate solution heuristics for such
problems. In contrast, classical dynamic programming (DP) algorithms
guarantee optimal solutions, but scale badly with the problem size. We
propose Deep Policy Dynamic Programming (DPDP), which aims to
combine the strengths of learned neural heuristics with those of DP
algorithms. DPDP prioritizes and restricts the DP state space using a
policy derived from a deep neural network, which is trained to predict
edges from example solutions. We evaluate our framework on the
travelling salesman problem (TSP), the vehicle routing problem (VRP) and
TSP with time windows (TSPTW) and show that the neural policy improves
the performance of (restricted) DP algorithms, making them competitive
to strong alternatives such as LKH, while also outperforming most other
‘neural approaches’ for solving TSPs, VRPs and TSPTWs with 100 nodes.
|
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.48550/arXiv.2102.11756 https://doi.org/10.1007/978-3-031-08011-1_14 |
| Downloads |
2102.11756
(Submitted manuscript)
978-3-031-08011-1_14
(Final published version)
|
| Permalink to this page | |
