Unlocking Slot Attention by Changing Optimal Transport Costs

Open Access
Authors
  • Y. Zhang
  • D.W. Zhang
  • S. Lacoste-Julien
  • G.J. Burghouts
Publication date 2023
Journal Proceedings of Machine Learning Research
Event 40th International Conference on Machine Learning, ICML 2023
Volume | Issue number 202
Pages (from-to) 41931-41951
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to handle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose MESH (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.
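The contrast the abstract draws between regularized and unregularized optimal transport can be illustrated with a minimal Sinkhorn sketch. This is not the authors' MESH module. The function names `sinkhorn` and `entropy`, the uniform marginals, the `eps` values, and the iteration count are all assumptions chosen for illustration. The sketch shows two things: a smaller regularization strength gives a transport plan with lower entropy (closer to a hard assignment), and a fully tied cost matrix yields a uniform plan, so regularized transport on its own does not break ties.

```python
import numpy as np

def sinkhorn(cost, eps=1.0, n_iters=50):
    """Entropy-regularized optimal transport with uniform marginals (illustrative only)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # target row sums
    b = np.full(m, 1.0 / m)      # target column sums
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows to match a
        v = b / (K.T @ u)        # rescale columns to match b
    return u[:, None] * K * v[None, :]

def entropy(plan):
    """Shannon entropy of a transport plan, ignoring zero entries."""
    p = plan[plan > 0]
    return float(-(p * np.log(p)).sum())

cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
soft = sinkhorn(cost, eps=1.0)     # strong regularization: diffuse plan
sharp = sinkhorn(cost, eps=0.05)   # weak regularization: near-hard assignment
tied = sinkhorn(np.zeros((2, 2)))  # identical costs: the tie is not broken

print(entropy(soft) > entropy(sharp))
print(tied)
```

In this toy setting the sharper plan concentrates its mass on the diagonal, while the tied case spreads mass evenly over all four entries.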
Document type Article
Note Proceedings of the 40th International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USA
Language English
Published at https://proceedings.mlr.press/v202/zhang23ba.html