Pareto Local Policy Search for MOMDP Planning
| Authors | |
|---|---|
| Publication date | 2015 |
| Host editors | |
| Book title | ESANN 2015 |
| Book subtitle | 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, April 22-23-24, 2015 : proceedings |
| ISBN | |
| Event | 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning |
| Pages (from-to) | 53-58 |
| Publisher | Louvain-la-Neuve: Ciaco |
| Organisations | |
| Abstract | Standard single-objective methods such as dynamic programming are not applicable to Markov decision processes (MDPs) with multiple objectives because they depend on a maximization function over rewards, which is not defined if the rewards are multi-dimensional. As a result, special multi-objective algorithms are needed to find a set of policies that contains all optimal trade-offs between objectives, i.e., a set of Pareto-optimal policies. In this paper, we propose Pareto Local Policy Search (PLoPS), a new planning method for multi-objective MDPs (MOMDPs) based on Pareto Local Search (PLS). This method produces a good set of policies by iteratively scanning the neighbourhood of locally non-dominated policies for improvements. It is fast because neighbouring policies can be quickly identified as improvements, and their values can be computed incrementally. We test the performance of PLoPS on several MOMDP benchmarks, and compare it to popular decision-theoretic and evolutionary alternatives. The results indicate that PLoPS outperforms the alternatives. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://www.esann.org/proceedings/2015 |
| Downloads | es2015-65 (Final published version) |
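
The abstract describes PLoPS only at a high level. As a rough illustration of the underlying Pareto Local Search loop it builds on, the sketch below runs plain PLS over the deterministic policies of a small tabular MOMDP. It is not the paper's algorithm: all names (`pls_plan`, `evaluate`, `add_to_archive`) and the toy problem are hypothetical, and the incremental value computation mentioned in the abstract is replaced by full per-policy evaluation for simplicity.

```python
# Illustrative sketch (assumption, not the paper's PLoPS): plain Pareto
# Local Search over deterministic policies of a small tabular MOMDP.
import numpy as np

def evaluate(policy, P, R, gamma):
    """Vector-valued policy evaluation: solve (I - gamma * P_pi) V = R_pi
    jointly for all objectives, returning the value vector of state 0."""
    S, A, K = R.shape
    idx = np.arange(S)
    P_pi = P[idx, policy]                      # (S, S) transitions under pi
    R_pi = R[idx, policy]                      # (S, K) reward vectors under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)
    return V[0]

def dominates(u, v):
    """Pareto dominance: u >= v everywhere and u > v somewhere."""
    return bool(np.all(u >= v) and np.any(u > v))

def add_to_archive(archive, pi, value):
    """Insert pi unless it is dominated (or duplicated); evict archive
    members that the new value dominates."""
    if pi in archive:
        return
    if any(dominates(v, value) or np.array_equal(v, value)
           for v, _ in archive.values()):
        return
    for q in [q for q, (v, _) in archive.items() if dominates(value, v)]:
        del archive[q]
    archive[pi] = (value, False)               # False = not yet explored

def pls_plan(P, R, gamma=0.95, n_seeds=4, seed=0):
    """Pareto Local Search: repeatedly expand an unexplored archive member
    by trying every single-state action change (its neighbourhood)."""
    rng = np.random.default_rng(seed)
    S, A, K = R.shape
    archive = {}                               # policy tuple -> (value, explored)
    for _ in range(n_seeds):
        pi = tuple(rng.integers(A, size=S))
        add_to_archive(archive, pi, evaluate(np.array(pi), P, R, gamma))
    while True:
        todo = [pi for pi, (_, done) in archive.items() if not done]
        if not todo:
            break                              # archive is a local Pareto set
        pi = todo[0]
        archive[pi] = (archive[pi][0], True)
        for s in range(S):
            for a in range(A):
                if a != pi[s]:
                    nb = pi[:s] + (a,) + pi[s + 1:]
                    add_to_archive(archive, nb,
                                   evaluate(np.array(nb), P, R, gamma))
    return {pi: v for pi, (v, _) in archive.items()}

# Toy usage on a random 4-state, 3-action, 2-objective MOMDP.
if __name__ == "__main__":
    rng = np.random.default_rng(42)
    S, A, K = 4, 3, 2
    P = rng.random((S, A, S))
    P /= P.sum(axis=2, keepdims=True)          # normalise transition rows
    R = rng.random((S, A, K))
    for pi, v in pls_plan(P, R).items():
        print(pi, np.round(v, 3))
```

Per the abstract, PLoPS itself gains its speed from quickly identifying neighbouring policies as improvements and computing their values incrementally, rather than re-solving the full evaluation system for every candidate as this sketch does.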