Expert-defined Keywords Improve Interpretability of Retinal Image Captioning
| Authors | |
|---|---|
| Publication date | 2023 |
| Book title | Proceedings, 2023 IEEE Winter Conference on Applications of Computer Vision |
| Book subtitle | 3-7 January 2023, Waikoloa, Hawaii |
| ISBN | |
| ISBN (electronic) | |
| Series | WACV |
| Event | 23rd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023 |
| Pages (from-to) | 1859-1868 |
| Publisher | Los Alamitos, California: IEEE Computer Society |
| Organisations | |
| Abstract |
Automatic machine-learning-based (ML-based) medical report generation
systems for retinal images suffer from a relative lack of
interpretability, and such ML-based systems are therefore still not widely
accepted. The main reason is that trust is one of the key motivations
for interpretability, and humans do not trust blindly.
Precise technical definitions of interpretability still lack consensus,
which makes it difficult to build a human-comprehensible ML-based medical
report generation system. Heat maps/saliency maps, i.e., post-hoc
explanation approaches, are widely used to improve the interpretability
of ML-based medical systems, but they are well known to be
problematic. From an ML-based medical model's perspective, the
highlighted areas of an image are considered important for making a
prediction. From a doctor's perspective, however, even the hottest
regions of a heat map contain both useful and non-useful information.
Simply localizing a region therefore does not reveal exactly what it
was in that area that the model considered useful. Hence, post-hoc
explanation-based methods rely on humans, who may be biased,
to decide what a given heat map might mean. Interpretability
boosters, in particular expert-defined keywords, are effective carriers
of expert domain knowledge, and they are human-comprehensible. In this
work, we propose to exploit such keywords and a specialized
attention-based strategy to build a more human-comprehensible medical
report generation system for retinal images. Both the keywords and the
proposed strategy effectively improve interpretability. The proposed
method achieves state-of-the-art performance under the commonly used text
evaluation metrics BLEU, ROUGE, CIDEr, and METEOR. Project website:
https://github.com/Jhhuangkay/Expert-defined-Keywords-Improve-Interpretability-of-Retinal-Image-Captioning.
|
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1109/WACV56688.2023.00190 |
| Published at | https://openaccess.thecvf.com/content/WACV2023/html/Wu_Expert-Defined_Keywords_Improve_Interpretability_of_Retinal_Image_Captioning_WACV_2023_paper.html |
| Other links | https://www.proceedings.com/67559.html |
| Downloads | Wu_Expert-Defined_Keywords_Improve_Interpretability_of_Retinal_Image_Captioning_WACV_2023_paper (Accepted author manuscript) |
| Permalink to this page | |
