Pareto Local Policy Search for MOMDP Planning

Open Access
Authors
  • C. Kooijman
  • M. de Waard
  • M. Inja
  • D.M. Roijers
Publication date 2015
Host editors
  • M. Verleysen
Book title ESANN 2015
Book subtitle 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, April 22-23-24, 2015 : proceedings
ISBN
  • 9782875870148
Event 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Pages (from-to) 53-58
Publisher Louvain-la-Neuve: Ciaco
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Standard single-objective methods such as dynamic programming are not applicable to Markov decision processes (MDPs) with multiple objectives because they depend on a maximization function over rewards, which is not defined if the rewards are multi-dimensional. As a result, special multi-objective algorithms are needed to find a set of policies that contains all optimal trade-offs between objectives, i.e. a set of Pareto-optimal policies. In this paper, we propose Pareto Local Policy Search (PLoPS), a new planning method for multi-objective MDPs (MOMDPs) based on Pareto Local Search (PLS). This method produces a good set of policies by iteratively scanning the neighbourhood of locally non-dominated policies for improvements. It is fast because neighbouring policies can be quickly identified as improvements, and their values can be computed incrementally. We test the performance of PLoPS on several MOMDP benchmarks, and compare it to popular decision-theoretic and evolutionary alternatives. The results indicate that PLoPS outperforms the alternatives.
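The two ideas the abstract relies on, Pareto dominance between multi-dimensional values and a local search that repeatedly scans the neighbourhood of non-dominated solutions, can be illustrated with a minimal sketch. This is a generic Pareto Local Search loop, not the authors' PLoPS implementation: it does not evaluate policies incrementally, and every name in it (`dominates`, `pareto_filter`, `pareto_local_search`, and the toy problem) is a hypothetical chosen for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Vector = Tuple[float, ...]  # one value per objective (higher is better)


def dominates(u: Vector, v: Vector) -> bool:
    """True if u is at least as good as v everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_filter(values: List[Vector]) -> List[Vector]:
    """Keep only the vectors that no other vector in the list dominates."""
    return [v for v in values if not any(dominates(u, v) for u in values)]


def pareto_local_search(
    start: Hashable,
    neighbours: Callable[[Hashable], Iterable[Hashable]],
    evaluate: Callable[[Hashable], Vector],
) -> Dict[Hashable, Vector]:
    """Generic Pareto Local Search (illustrative; not the paper's PLoPS).

    Keeps an archive of mutually non-dominated solutions and explores the
    neighbourhood of each archived solution until nothing new is accepted.
    """
    archive: Dict[Hashable, Vector] = {start: evaluate(start)}
    unexplored = [start]
    while unexplored:
        current = unexplored.pop()
        if current not in archive:
            continue  # dominated and dropped after it was queued
        for cand in neighbours(current):
            if cand in archive:
                continue
            val = evaluate(cand)
            # Reject candidates that are dominated by, or equal to, an archived value.
            if any(dominates(v, val) or v == val for v in archive.values()):
                continue
            # Accept the candidate and drop everything it dominates.
            archive = {s: v for s, v in archive.items() if not dominates(val, v)}
            archive[cand] = val
            unexplored.append(cand)
    return archive


# Toy problem (an assumption for illustration): solutions are integers 0..6,
# neighbours differ by one, and the two objectives conflict for x >= 3.
def toy_neighbours(x: int) -> List[int]:
    return [y for y in (x - 1, x + 1) if 0 <= y <= 6]


def toy_evaluate(x: int) -> Vector:
    return (x, -(x - 3) ** 2)


front = pareto_local_search(0, toy_neighbours, toy_evaluate)
```

Starting from solution 0, the search climbs through dominated solutions and ends with an archive containing solutions 3 to 6, which are the non-dominated trade-offs of this toy problem. In an MOMDP setting the solutions would be policies and `evaluate` would return a policy's multi-objective value, but how PLoPS defines neighbourhoods and incremental evaluation is described in the paper itself, not here.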
Document type Conference contribution
Language English
Published at https://www.esann.org/proceedings/2015
Downloads
es2015-65 (Final published version)