Exploiting best-match equations for efficient reinforcement learning
| Authors | |
|---|---|
| Publication date | 2011 |
| Journal | Journal of Machine Learning Research |
| Volume | 12 |
| Issue number | |
| Pages (from-to) | 2045-2094 |
| Organisations | |
| Abstract | This article presents and evaluates best-match learning, a new approach to reinforcement learning that trades off the sample efficiency of model-based methods with the space efficiency of model-free methods. Best-match learning works by approximating the solution to a set of best-match equations, which combine a sparse model with a model-free Q-value function constructed from samples not used by the model. We prove that, unlike regular sparse model-based methods, best-match learning is guaranteed to converge to the optimal Q-values in the tabular case. Empirical results demonstrate that best-match learning can substantially outperform regular sparse model-based methods, as well as several model-free methods that strive to improve the sample efficiency of temporal-difference methods. In addition, we demonstrate that best-match learning can be successfully combined with function approximation. |
| Document type | Article |
| Language | English |
| Published at | http://jmlr.csail.mit.edu/papers/v12/vanseijen11a.html |
| Downloads | 345513.pdf (Final published version) |
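For a loose, concrete picture of the idea the abstract describes, the sketch below shows a tabular agent that keeps a capacity-limited (sparse) model for some state-action pairs and falls back to plain model-free Q-learning for the rest. It is only an illustration under that assumption: it does not implement the paper's best-match equations, and the class, parameter, and method names are invented for this example.

```python
from collections import defaultdict
import random

# Hypothetical illustration only: a sparse model for a capped number of
# (state, action) pairs, combined with plain model-free Q-learning updates
# for samples the model has no room for. This is NOT the paper's
# best-match method; it merely sketches the model-based/model-free mix.

class SparseModelQLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.95, model_capacity=100):
        self.actions = actions
        self.alpha = alpha              # model-free step size
        self.gamma = gamma              # discount factor
        self.model_capacity = model_capacity
        self.Q = defaultdict(float)     # Q[(state, action)]
        self.model = {}                 # (state, action) -> (reward, next_state)

    def update(self, s, a, r, s_next):
        key = (s, a)
        if key in self.model or len(self.model) < self.model_capacity:
            # Sample fits in the sparse model: store it (deterministic model
            # assumed for simplicity) and back the Q-value up from the model.
            self.model[key] = (r, s_next)
            reward, nxt = self.model[key]
            self.Q[key] = reward + self.gamma * max(
                self.Q[(nxt, b)] for b in self.actions
            )
        else:
            # No room in the model: ordinary model-free Q-learning update.
            target = r + self.gamma * max(
                self.Q[(s_next, b)] for b in self.actions
            )
            self.Q[key] += self.alpha * (target - self.Q[key])

    def act(self, s, epsilon=0.1):
        # Epsilon-greedy action selection over the current Q-values.
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda b: self.Q[(s, b)])
```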