Algorithms for knowledge-guided sequential decision-making: Integrating graphs, demonstrations, human and cross-agent experience
| Authors | |
|---|---|
| Supervisors | |
| Co-supervisors | |
| Award date | 25-03-2026 |
| ISBN | |
| Number of pages | 154 |
| Organisations | |
| Abstract | Reinforcement learning and imitation learning have become foundational frameworks for learning control, particularly within a two-stage pipeline that combines large-scale pretraining with reward-based fine-tuning. This thesis investigates mechanisms to improve the scalability and generalization of this pipeline across embodied, commonsense, and scientific AI domains. In the embodied AI setting, we address the pretraining data bottleneck in two ways: first, by developing a semi-supervised model that extracts task segments from unstructured videos, matching the performance of a model trained with five times as much labeled data; and second, by introducing a cross-agent framework that uses a shared diffusion planner to pool data from diverse embodiments. In the fine-tuning stage, where knowledge is typically transferred via pretrained weights, we instead propose transferring knowledge by integrating knowledge graphs directly into deep reinforcement learning algorithms. This approach exploits object class hierarchies to compose policies at multiple levels of abstraction, substantially improving generalization to unseen objects. Finally, for scientific agents, we explore how a reward signal for fine-tuning might be defined. By curating a large-scale dataset of metadata-annotated peer reviews, we demonstrate that specialized scientific embedding-based models predict citation and review scores more reliably than large language models, suggesting their potential utility as a reward signal for fine-tuning. |
| Document type | PhD thesis |
| Language | English |
| Downloads | |
| Permalink to this page | |
