Algorithms for knowledge-guided sequential decision-making: Integrating graphs, demonstrations, human and cross-agent experience
| Authors | |
|---|---|
| Supervisors | |
| Co-supervisors | |
| Award date | 25-03-2026 |
| ISBN | |
| Number of pages | 154 |
| Organisations | |
| Abstract | Reinforcement learning and imitation learning have become foundational frameworks for learning control, particularly within a two-stage pipeline that combines large-scale pretraining with reward-based fine-tuning. This thesis investigates mechanisms to improve the scalability and generalization of this pipeline across embodied, commonsense, and scientific AI domains. In the embodied AI setting, we address the pretraining data bottleneck in two ways: first, by developing a semi-supervised model that extracts task segments from unstructured videos, matching the performance of a model trained with five times as much labeled data; and second, by introducing a cross-agent framework that uses a shared diffusion planner to pool data from diverse embodiments. In the fine-tuning stage, where knowledge is typically transferred via pretrained weights, we instead propose transferring knowledge by integrating knowledge graphs directly into deep reinforcement learning algorithms. This approach exploits object class hierarchies to compose policies at multiple levels of abstraction, substantially improving generalization to unseen objects. Finally, for scientific agents, we explore how a reward signal for fine-tuning might be defined. By curating a large-scale dataset of metadata-annotated peer reviews, we demonstrate that specialized scientific embedding-based models predict citation and review scores more reliably than large language models, suggesting their potential utility as a reward signal for fine-tuning. |
| Document type | PhD thesis |
| Language | English |
| Downloads | |
| Permalink to this page | |
