Learning Hierarchical Planning-Based Policies from Offline Data

Open Access
Authors
Publication date 2023
Host editors
  • D. Koutra
  • C. Plant
  • M. Gomes Rodriguez
  • E. Baralis
  • F. Bonchi
Book title Machine Learning and Knowledge Discovery in Databases: Research Track
Book subtitle European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023 : proceedings
ISBN
  • 9783031434204
ISBN (electronic)
  • 9783031434211
Series Lecture Notes in Computer Science
Event 2023 European Conference on Machine Learning and Knowledge Discovery in Databases
Volume | Issue number IV
Pages (from-to) 489–505
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Hierarchical policy architectures that incorporate a planning component at the top level have shown superior performance and generalization in agent navigation tasks. Cost or safety reasons may, however, prevent training in an online (RL) fashion with continuous environment interaction. We therefore propose HORIBLe-VRN, an algorithm to learn a hierarchical policy with a top-level planning-based module from pre-collected data. A key challenge is to deal with the unknown, latent high-level (HL) actions. Our algorithm features an EM-style hierarchical imitation learning stage, incorporating HL action inference, and a subsequent offline RL refinement stage for the top-level policy. We empirically evaluate HORIBLe-VRN in a long-horizon, sparse-reward agent navigation task, investigating performance, generalization capabilities, and robustness with respect to sub-optimal demonstration data.
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-031-43421-1_29
Downloads
978-3-031-43421-1_29 (Final published version)