Object-Centric Diffusion for Efficient Video Editing

Authors
  • K. Kahatapitiya
  • A. Karjauv
  • D. Abati
  • F. Porikli
Publication date 2025
Host editors
  • A. Leonardis
  • E. Ricci
  • S. Roth
  • O. Russakovsky
  • T. Sattler
  • G. Varol
Book title Computer Vision – ECCV 2024
Book subtitle 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings
ISBN
  • 9783031729973
ISBN (electronic)
  • 9783031729980
Series Lecture Notes in Computer Science
Event The 18th European Conference on Computer Vision ECCV 2024
Volume LVII
Pages (from-to) 91–108
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Diffusion-based video editing has reached impressive quality and can transform the global style, local structure, or attributes of given video inputs, following textual edit prompts. However, such solutions typically incur heavy memory and computational costs to generate temporally coherent frames, either in the form of diffusion inversion and/or cross-frame attention. In this paper, we conduct an analysis of such inefficiencies, and suggest simple yet effective modifications that allow significant speed-ups whilst maintaining quality. Moreover, we introduce Object-Centric Diffusion to fix generation artifacts and further reduce latency by allocating more computation towards foreground edited regions, which are arguably more important for perceptual quality. We achieve this with two novel proposals: i) Object-Centric Sampling, which decouples the diffusion steps spent on salient and background regions, spending most on the former, and ii) Object-Centric Token Merging, which reduces the cost of cross-frame attention by fusing redundant tokens in unimportant background regions. Both techniques are readily applicable to a given video editing model without retraining, and can drastically reduce its memory and computational cost. We evaluate our proposals on inversion-based and control-signal-based editing pipelines, and show a latency reduction of up to 10× for comparable synthesis quality. Project page: qualcomm-ai-research.github.io/object-centric-diffusion.
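To give a concrete flavour of the second proposal, the sketch below illustrates the general idea behind Object-Centric Token Merging: background tokens are redundant across frames, so the most similar ones can be fused (here via a simple bipartite matching on cosine similarity, in the spirit of token-merging methods) while foreground tokens are kept intact. This is a minimal illustrative sketch, not the authors' implementation; the function name, the pairwise averaging, and the `merge_ratio` parameter are assumptions made for clarity.

```python
import numpy as np

def object_centric_token_merge(tokens, fg_mask, merge_ratio=0.5):
    """Illustrative sketch (not the paper's code): fuse the most redundant
    background tokens while leaving all foreground tokens untouched.

    tokens:  (N, D) array of token features
    fg_mask: (N,) boolean array, True for foreground (edited-object) tokens
    merge_ratio: fraction of background tokens to merge away (assumed knob)
    """
    fg = tokens[fg_mask]       # foreground tokens: never merged
    bg = tokens[~fg_mask]      # background tokens: merge candidates
    n_merge = int(len(bg) * merge_ratio)
    if n_merge == 0 or len(bg) < 2:
        return np.concatenate([fg, bg], axis=0)

    # Bipartite matching: split background tokens into alternating
    # "source" and "target" sets, match each source to its most
    # similar target by cosine similarity.
    src, dst = bg[0::2].copy(), bg[1::2].copy()
    normed = lambda x: x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)
    sim = normed(src) @ normed(dst).T      # (|src|, |dst|) cosine similarities
    best_dst = sim.argmax(axis=1)          # best target for each source
    best_sim = sim.max(axis=1)

    # Merge the n_merge most redundant sources into their targets (average).
    n_merge = min(n_merge, len(src))
    merge_ids = np.argsort(-best_sim)[:n_merge]
    keep_ids = np.setdiff1d(np.arange(len(src)), merge_ids)
    for i in merge_ids:
        j = best_dst[i]
        dst[j] = 0.5 * (dst[j] + src[i])   # fuse source into its target

    # Foreground is preserved exactly; background shrinks by n_merge tokens.
    return np.concatenate([fg, src[keep_ids], dst], axis=0)
```

Because merging only removes background tokens, the quadratic cost of cross-frame attention drops while the edited object keeps its full token budget.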
Document type Conference contribution
Note With supplementary material
Language English
Published at https://doi.org/10.1007/978-3-031-72998-0_6
Other links http://qualcomm-ai-research.github.io/object-centric-diffusion