Making Universal Policies Universal
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | AAMAS '25 |
| Book subtitle | Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems : May 19-23, 2025, Detroit, Michigan, USA |
| ISBN (electronic) | |
| Event | 24th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2025 |
| Pages (from-to) | 2553-2555 |
| Number of pages | 3 |
| Publisher | International Foundation for Autonomous Agents and Multiagent Systems |
| Organisations | |
| Abstract | The development of a generalist agent capable of solving a wide range of sequential decision-making tasks remains a significant challenge. We address this problem in a cross-agent setup where agents share the same observation space but differ in their action spaces. Our approach builds on the universal policy framework, which decouples policy learning into two stages: a diffusion-based planner that generates observation sequences and an inverse dynamics model that assigns actions to these plans. We propose a method for training the planner on a joint dataset composed of trajectories from all agents. This method offers the benefit of positive transfer by pooling data from different agents, while the primary challenge lies in adapting shared plans to each agent's unique constraints. We evaluate our approach on the BabyAI environment, covering tasks of varying complexity, and demonstrate positive transfer across agents. Additionally, we examine the planner's ability to generalise to unseen agents and show that our method outperforms traditional imitation learning approaches. |
| Document type | Conference contribution |
| Note | Extended abstract. |
| Language | English |
| Published at | https://doi.org/10.48550/arXiv.2502.14777 |
| Published at | https://www.ifaamas.org/Proceedings/aamas2025/pdfs/p2553.pdf https://dl.acm.org/doi/10.5555/3709347.3743934 |
| Other links | https://www.scopus.com/pages/publications/105009808607 |
| Downloads | p2553 (Final published version) |
| Permalink to this page | |
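
The two-stage decoupling described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the grid-position observations, the greedy `plan_observations` stub standing in for the diffusion-based planner, the lookup-table inverse dynamics, and the two hypothetical agents (one with four cardinal moves, one with eight). It shows only how a single shared observation plan is handed to agent-specific inverse dynamics, and how a plan can contain a transition that one agent cannot realise.

```python
from typing import Callable, Dict, List, Optional, Tuple

Obs = Tuple[int, int]   # grid position: the observation space shared by all agents
Action = str            # action names differ per agent


def plan_observations(start: Obs, goal: Obs) -> List[Obs]:
    """Hypothetical stand-in for the shared planner.

    The paper uses a diffusion model trained on pooled trajectories; this stub
    just walks greedily towards the goal and may take diagonal steps.
    """
    x, y = start
    plan = [(x, y)]
    while (x, y) != goal:
        x += (goal[0] > x) - (goal[0] < x)   # move one cell towards goal in x
        y += (goal[1] > y) - (goal[1] < y)   # move one cell towards goal in y
        plan.append((x, y))
    return plan


def make_inverse_dynamics(
    moves: Dict[Action, Tuple[int, int]],
) -> Callable[[Obs, Obs], Optional[Action]]:
    """Build an agent-specific inverse dynamics function from its move set.

    A learned model would be used in practice; a lookup table is enough here.
    """
    by_delta = {delta: name for name, delta in moves.items()}

    def inverse_dynamics(obs: Obs, next_obs: Obs) -> Optional[Action]:
        delta = (next_obs[0] - obs[0], next_obs[1] - obs[1])
        # None signals a transition this agent has no action for.
        return by_delta.get(delta)

    return inverse_dynamics


def assign_actions(
    plan: List[Obs], inverse_dynamics: Callable[[Obs, Obs], Optional[Action]]
) -> List[Optional[Action]]:
    """Label each consecutive pair of planned observations with an action."""
    return [inverse_dynamics(a, b) for a, b in zip(plan, plan[1:])]


# Two hypothetical agents with the same observations but different actions.
CARDINAL_MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}
EIGHT_WAY_MOVES = {
    **CARDINAL_MOVES,
    "northeast": (1, 1), "northwest": (-1, 1),
    "southeast": (1, -1), "southwest": (-1, -1),
}

if __name__ == "__main__":
    shared_plan = plan_observations((0, 0), (2, 1))
    print(shared_plan)
    print(assign_actions(shared_plan, make_inverse_dynamics(EIGHT_WAY_MOVES)))
    print(assign_actions(shared_plan, make_inverse_dynamics(CARDINAL_MOVES)))
```

In this toy setup the eight-way agent can label every step of the shared plan, while the cardinal-only agent gets `None` for the diagonal step. That is a small-scale picture of the challenge the abstract names: adapting shared plans to each agent's constraints.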
