BoxeR: Box-Attention for 2D and 3D Transformers

Open Access
Authors
Publication date 2022
Book title 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Book subtitle New Orleans, Louisiana, 19-24 June 2022 : proceedings
ISBN
  • 9781665469470
ISBN (electronic)
  • 9781665469463
Series CVPR
Event 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Pages (from-to) 4763-4772
Publisher Los Alamitos, California: IEEE Computer Society
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this paper, we propose a simple attention mechanism that we call Box-Attention. It enables spatial interaction between grid features, as sampled from boxes of interest, and improves the learning capability of transformers for several vision tasks. Specifically, we present BoxeR, short for Box Transformer, which attends to a set of boxes by predicting their transformation from a reference window on an input feature map. BoxeR computes attention weights on these boxes by considering their grid structure. Notably, BoxeR-2D naturally reasons about box information within its attention module, making it suitable for end-to-end instance detection and segmentation tasks. By learning invariance to rotation in the box-attention module, BoxeR-3D is capable of generating discriminative information from a bird's-eye-view plane for 3D end-to-end object detection. Our experiments demonstrate that the proposed BoxeR-2D achieves better results on COCO detection and reaches performance comparable to the well-established and highly optimized Mask R-CNN on COCO instance segmentation. BoxeR-3D already obtains compelling performance for the vehicle category of Waymo Open, without any class-specific optimization. The code will be released.
Document type Conference contribution
Note With supplemental file
Language English
Published at https://doi.org/10.48550/arXiv.2111.13087 https://doi.org/10.1109/CVPR52688.2022.00473
Published at https://openaccess.thecvf.com/content/CVPR2022/html/Nguyen_BoxeR_Box-Attention_for_2D_and_3D_Transformers_CVPR_2022_paper.html
Other links https://github.com/kienduynguyen/BoxeR https://www.proceedings.com/65666.html