Unlocking Slot Attention by Changing Optimal Transport Costs
| Authors |
|
|---|---|
| Publication date | 2023 |
| Journal | Proceedings of Machine Learning Research |
| Event | 40th International Conference on Machine Learning, ICML 2023 |
| Volume / Issue number | 202 |
| Pages (from-to) | 41931-41951 |
| Organisations |
|
| Abstract |
Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to handle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose MESH (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.
|
| Document type | Article |
| Note | Proceedings of the 40th International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USA |
| Language | English |
| Published at | https://proceedings.mlr.press/v202/zhang23ba.html |
| Downloads |
zhang23ba
(Final published version)
|
