MATTE: Multi-task multi-scale attention

Open Access
Publication date 02-2023
Journal Computer Vision and Image Understanding
Article number 103622
Volume 228
Number of pages 10
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this work, we propose a general method for learning task- and scale-based attention representations in Multi-Task Learning (MTL) for vision. It relies on learning and maintaining cross-task and cross-scale representations of visual information, whose interaction contributes to a symmetrical improvement across the entire task pool. Beyond learning data representations, we additionally optimize for the most beneficial interaction between tasks and their representations at different scales. Our method adds an attention-modulated feature as residual information to the processing of each scale stage within the model, including the final layer of task outputs. We empirically show the effectiveness of our method through experiments with current multi-modal and multi-scale architectures on diverse MTL datasets. We evaluate MATTE on high- and low-level vision MTL problems against MTL and single-task learning (STL) counterparts. Across all experiments we report consistent qualitative and quantitative improvements.
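The abstract describes adding an attention-modulated feature as residual information at each scale stage. A minimal NumPy sketch of that idea, for one scale stage, is shown below; the function names, the pairwise task-interaction logits, and the feature shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_residual(task_feats, attn_logits):
    """Sketch of attention-modulated residual fusion at a single scale stage.

    task_feats  : (T, C) array, one feature vector per task at this scale
    attn_logits : (T, T) array, hypothetical learned task-interaction logits
    Returns the per-task features with a cross-task attention-weighted
    mixture added back as a residual.
    """
    attn = softmax(attn_logits, axis=-1)   # each task attends over all tasks
    shared = attn @ task_feats             # (T, C) attention-modulated feature
    return task_feats + shared             # residual addition, as in the abstract

# Toy usage: 3 tasks, 4-dimensional features at one scale
feats = np.arange(12, dtype=float).reshape(3, 4)
out = attention_residual(feats, np.zeros((3, 3)))
```

With zero logits the attention is uniform, so each task receives the mean of all task features as its residual; learned logits would instead weight the task interactions that the paper optimizes for.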
Document type Article
Language English
Published at https://doi.org/10.1016/j.cviu.2023.103622