Algorithms for knowledge-guided sequential decision-making: Integrating graphs, demonstrations, human and cross-agent experience

Open Access
Authors
Supervisors
Co-supervisors
  • I. Tiddi
Award date 25-03-2026
ISBN
  • 9789083673240
Number of pages 154
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Reinforcement learning and imitation learning have become foundational frameworks for learning control, particularly within a two-stage pipeline that combines large-scale pretraining with reward-based fine-tuning. This thesis investigates mechanisms to improve the scalability and generalization of this pipeline across embodied, commonsense, and scientific AI domains. In the embodied AI setting, we address the pretraining data bottleneck in two ways: first, by developing a semi-supervised model that extracts task segments from unstructured videos, matching the performance obtained with five times as much labeled data; and second, by introducing a cross-agent framework that uses a shared diffusion planner to pool data from diverse embodiments. In the fine-tuning stage, where knowledge is typically transferred via pretrained weights, we instead propose transferring knowledge by integrating knowledge graphs directly into deep reinforcement learning algorithms. This approach exploits object class hierarchies to compose policies at multiple levels of abstraction, substantially improving generalization to unseen objects. Finally, for scientific agents, we explore how a reward signal for fine-tuning could be defined. By curating a large-scale dataset of metadata-annotated peer reviews, we show that specialized scientific embedding-based models predict citation and review scores more reliably than large language models, suggesting their utility as such a reward signal.
Document type PhD thesis
Language English