Factored Adaptation for Non-Stationary Reinforcement Learning

F. Feng; B. Huang; S. Magliacane; K. Zhang

doi:https://doi.org/10.48550/arXiv.2203.16582


Authors	F. Feng; B. Huang; S. Magliacane; K. Zhang
Publication date	2023
Host editors	S. Koyejo; S. Mohamed; A. Agarwal; D. Belgrave; K. Cho; A. Oh
Book title	36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Book subtitle	New Orleans, Louisiana, USA, 28 November-9 December 2022
ISBN	9781713871088
ISBN (electronic)	9781713873129
Series	Advances in Neural Information Processing Systems
Event	Thirty-sixth Conference on Neural Information Processing Systems
Volume | Issue number	41
Pages (from-to)	31957-31971
Publisher	San Diego, CA: Neural Information Processing Systems Foundation
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaptation approach that jointly learns both the causal structure, in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that, under standard assumptions, we can completely recover the causal graph representing the factored transition and reward function, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can consider general non-stationary scenarios with different function types and change frequencies, including changes across episodes and within episodes. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.
Document type	Conference contribution
Note	With supplemental file
Language	English
Published at	https://doi.org/10.48550/arXiv.2203.16582 (Accepted author manuscript)
Published at	https://papers.nips.cc/paper_files/paper/2022/hash/cf4356f994917177213c55ff438ddf71-Abstract-Conference.html (Accepted author manuscript)
Other links	https://www.proceedings.com/68431.html
Downloads	NeurIPS-2022-factored-adaptation-for-non-stationary-reinforcement-learning-Paper-Conference (Accepted author manuscript); 2203.16582_with appendix (Accepted author manuscript)
Supplementary materials	NeurIPS-2022-factored-adaptation-for-non-stationary-reinforcement-learning-Supplemental-Conference