Out-of-distribution detection for reliable medical AI

M. Azizmalayeri

Authors	M. Azizmalayeri
Supervisors	A. Abu-Hanna
Cosupervisors	G. Cinà
Award date	03-07-2026
Number of pages	189
Organisations	Faculty of Medicine (AMC-UvA)
Abstract	This thesis investigates the impact of distribution shift on clinical prediction models and examines how out-of-distribution (OOD) detection can improve the reliability and safety of medical artificial intelligence systems in clinical practice. Clinical prediction models are typically developed using specific datasets, yet their performance may deteriorate when applied in different healthcare settings due to variations in patient populations, institutional practices, data collection procedures, and temporal changes in care delivery. Such shifts can result in unreliable predictions and affect clinical decision-making. To address this challenge, the thesis establishes a methodological foundation for applying OOD detection to medical data and demonstrates its clinical relevance across several studies. A benchmarking study evaluated existing OOD detection methods on medical tabular data under multiple distribution-shift scenarios, showing that although current methods can detect obvious OOD cases, they struggle when faced with more subtle shifts. The thesis further demonstrates the clinical consequences of distribution shift in ICU benchmarking and fall-risk prediction. In both applications, predictive performance deteriorated for OOD patients, leading to unreliable patient-level predictions and potentially affecting downstream assessments such as ICU quality evaluation and fall prevention strategies. These findings highlight the importance of integrating OOD detection into clinical workflows. In addition, this thesis proposes methodological improvements, including the Capturing Extreme Activations (CEA) approach and a framework for improving external validation by identifying distribution-shifted cases. Overall, this work demonstrates that strong average model performance alone does not guarantee reliable real-world deployment. By explicitly accounting for distribution shift, OOD detection can help identify unreliable predictions, improve the interpretation of model performance, and support safer, more trustworthy implementation of AI systems in healthcare.
Document type	PhD thesis
Language	English
Downloads	Thesis (complete) (Embargo up to 2028-07-03)
	Front matter
	Chapter 1: General introduction
	Chapter 2: A benchmark for out-of-distribution detection in medical tabular data
	Chapter 3: Mitigating overconfidence by capturing extreme activations
	Chapter 4: Managing atypical patients in ICU benchmarking in the Netherlands
	Chapter 5: Exploring the effects of underrepresented patients on fall prediction (Embargo up to 2028-07-03)
	Chapter 6: Rethinking external validation for the target population (Embargo up to 2028-07-03)
	Chapter 7: General discussion
	Summary; Samenvatting (Dutch summary); Publications; Portfolio; Acknowledgements

UvA-DARE

Digital Academic Repository
