Data Augmentation for Instruction Following Policies via Trajectory Segmentation

Open Access
Authors
Publication date 2025
Host editors
  • T. Walsh
  • J. Shah
  • Z. Kolter
Book title Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence
Book subtitle February 25-March 4, 2025, Philadelphia, Pennsylvania, USA
ISBN (electronic)
  • 9781577358978
Event 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Volume | Issue number 16
Pages (from-to) 17214-17222
Number of pages 9
Publisher Washington, DC: AAAI Press
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

The scalability of instructable agents in robotics or gaming is often hindered by limited data that pairs instructions with agent trajectories. However, large datasets of unannotated trajectories containing sequences of various agent behaviours (play trajectories) are often available. In a semi-supervised setup, we explore methods to extract labelled segments from play trajectories. The goal is to augment a small annotated dataset of instruction-trajectory pairs to improve the performance of an instruction-following policy trained downstream via imitation learning. When segment lengths vary little, recent video segmentation methods can effectively extract labelled segments. To remove this constraint on segment length, we propose Play Segmentation (PS), a probabilistic model that finds the most likely segmentations of extended subsegments while being trained only on individual instruction segments. Our results in a game environment and a simulated robotic gripper setting underscore the importance of segmentation: randomly sampled segments diminish performance, while incorporating labelled segments from PS improves policy performance to the level of a policy trained on twice the amount of labelled data.
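
The idea of finding a most likely segmentation of a long trajectory, given a model that only scores individual segments, can be illustrated with a small dynamic program. The sketch below is not the PS model from the paper. It is a generic illustration under assumptions: the function names, the toy scorer, the per-segment penalty, and the length bounds are all hypothetical and chosen only to make the example runnable.

```python
import math
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def best_segmentation(
    n_steps: int,
    score_segment: Callable[[int, int], Tuple[float, str]],
    min_len: int,
    max_len: int,
) -> List[Segment]:
    """Return the segmentation of [0, n_steps) with the highest total log-score.

    score_segment(start, end) is assumed to return the log-probability of the
    best label for that segment together with the label itself.
    """
    best = [-math.inf] * (n_steps + 1)  # best[t]: top score for the prefix [0, t)
    back: List[object] = [None] * (n_steps + 1)  # last segment of that prefix
    best[0] = 0.0
    for end in range(1, n_steps + 1):
        for length in range(min_len, min(max_len, end) + 1):
            start = end - length
            if best[start] == -math.inf:
                continue  # prefix [0, start) cannot be segmented
            log_p, label = score_segment(start, end)
            candidate = best[start] + log_p
            if candidate > best[end]:
                best[end] = candidate
                back[end] = (start, end, label)
    if best[n_steps] == -math.inf:
        raise ValueError("no segmentation satisfies the length bounds")
    # Follow the back-pointers from the end of the trajectory to the start.
    segments: List[Segment] = []
    end = n_steps
    while end > 0:
        segment = back[end]
        segments.append(segment)
        end = segment[0]
    segments.reverse()
    return segments


def make_toy_scorer(trajectory: Sequence[str], labels: Sequence[str] = ("a", "b")):
    """Hypothetical segment model: each step matches its label with prob. 0.9."""

    def score(start: int, end: int) -> Tuple[float, str]:
        best_lp, best_label = -math.inf, labels[0]
        for label in labels:
            lp = math.log(0.5)  # assumed per-segment cost, discourages tiny segments
            for obs in trajectory[start:end]:
                lp += math.log(0.9 if obs == label else 0.1)
            if lp > best_lp:
                best_lp, best_label = lp, label
        return best_lp, best_label

    return score


if __name__ == "__main__":
    play = list("aaabbbbaa")  # stand-in for an unannotated play trajectory
    print(best_segmentation(len(play), make_toy_scorer(play), min_len=1, max_len=5))
```

In this toy setting the segment scorer plays the role of a model trained on individual labelled segments, and the dynamic program searches over all boundary placements within the length bounds, so segments of different lengths are handled in a single pass.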

Document type Conference contribution
Language English
Published at https://doi.org/10.1609/aaai.v39i16.33892
Other links https://www.scopus.com/pages/publications/105003971992
Downloads
33892-Article Text-37960-1-2-20250410 (Final published version)