Context-Infused Visual Grounding for Art
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | Computer Vision – ECCV 2024 Workshops |
| Book subtitle | Milan, Italy, September 29–October 4, 2024 : proceedings |
| ISBN | |
| ISBN (electronic) | |
| Series | Lecture Notes in Computer Science |
| Event | Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024 |
| Volume / Issue number | VI |
| Pages (from-to) | 118-136 |
| Publisher | Cham: Springer |
| Organisations | |
| Abstract | Many artwork collections contain textual attributes that provide rich and contextualised descriptions of artworks. Visual grounding offers the potential for localising subjects within these descriptions on images; however, existing approaches are trained on natural images and generalise poorly to art. In this paper, we present CIGAr (Context-Infused GroundingDINO for Art), a visual grounding approach that uses the artwork descriptions as context during training, thereby enabling visual grounding on art. In addition, we present a new dataset, Ukiyo-eVG, with manually annotated phrase-grounding annotations, and we set a new state-of-the-art for object detection on two artwork datasets. |
| Document type | Conference contribution |
| Note | With supplementary file |
| Language | English |
| Published at | https://doi.org/10.1007/978-3-031-91572-7_8 |
| Other links | https://www.scopus.com/pages/publications/105006904027 |
| Downloads | Context-Infused Visual Grounding for Art (Final published version) |
| Supplementary materials | |
