MATTE: Multi-task multi-scale attention

Open Access
Publication date 02-2023
Journal Computer Vision and Image Understanding
Article number 103622
Volume 228
Number of pages 10
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this work, we propose a general method for learning task- and scale-based attention representations in Multi-Task Learning (MTL) for vision. It relies on learning and maintaining cross-task and cross-scale representations of visual information, whose interaction contributes to a symmetrical improvement across the entire task pool. Beyond learning data representations, we additionally optimize for the most beneficial interaction between tasks and their representations at different scales. Our method adds an attention-modulated feature as residual information to the processing of each scale stage within the model, including the final layer of task outputs. We empirically show the effectiveness of our method through experiments with current multi-modal and multi-scale architectures on diverse MTL datasets. We evaluate MATTE on high- and low-level vision MTL problems against MTL and single-task learning (STL) counterparts. Across all experiments we report consistent qualitative and quantitative improvements.
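The abstract describes adding an attention-modulated feature as residual information at each scale stage. A minimal NumPy sketch of that idea, for one scale stage, is shown below; the function names, the pairwise task-interaction logits, and the feature shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_residual(task_feats, attn_logits):
    """Sketch of attention-modulated residual fusion at a single scale stage.

    task_feats  : (T, C) array, one feature vector per task at this scale
    attn_logits : (T, T) array, hypothetical learned task-interaction logits
    Returns the per-task features with a cross-task attention-weighted
    mixture added back as a residual.
    """
    attn = softmax(attn_logits, axis=-1)   # each task attends over all tasks
    shared = attn @ task_feats             # (T, C) attention-modulated feature
    return task_feats + shared             # residual addition, as in the abstract

# Toy usage: 3 tasks, 4-dimensional features at one scale
feats = np.arange(12, dtype=float).reshape(3, 4)
out = attention_residual(feats, np.zeros((3, 3)))
```

With zero logits the attention is uniform, so each task receives the mean of all task features as its residual; learned logits would instead weight the task interactions that the paper optimizes for.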
Document type Article
Language English
Published at https://doi.org/10.1016/j.cviu.2023.103622