Tackling Attribute Fine-grainedness in Cross-modal Fashion Search with Multi-level Features
| Authors | |
|---|---|
| Publication date | 2021 |
| Book title | Proceedings of the 2021 SIGIR Workshop on eCommerce (SIGIR eCom’21) |
| Book subtitle | July 15, 2021, Virtual Event, Montreal, Canada |
| Event | SIGIR 2021 Workshop on eCommerce |
| Article number | workshop paper 3 |
| Number of pages | 8 |
| Publisher | New York, NY: ACM |
| Organisations | |
| Abstract | Leveraging information across modalities can facilitate customers throughout their journey, especially in the fashion domain, where the visual modality plays an important role. Fashion products have a variety of visual attribute groups, such as shapes, colors, and patterns. Every group is fine-grained, i.e., attributes within a group may be visually very similar, e.g., v-neck vs. round-neck. This fine-grainedness of fashion attributes makes cross-modal fashion retrieval more challenging. In this paper, we address the problem of attribute fine-grainedness in fashion cross-modal retrieval by leveraging multi-level feature representations. In particular, we replace the commonly used spatial segmentation approach with a multi-level feature approach. We compare our approach with state-of-the-art models in general and fashion cross-modal retrieval and evaluate it on the Fashion200K and Fashion-Gen datasets. We record a 43.4% relative increase in text-to-image retrieval and a 57.8% relative increase in image-to-text retrieval on the Fashion200K dataset, and a 48.6% relative increase in text-to-image retrieval and a 67.2% relative increase in image-to-text retrieval on the Fashion-Gen dataset, while reducing the number of model parameters by 70% compared with the baselines. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://sigir-ecom.github.io/ecom21Papers/paper16.pdf |
| Other links | https://sigir-ecom.github.io/ |
| Downloads | paper16 (Final published version) |
| Permalink to this page | |
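As an illustration of the multi-level feature idea described in the abstract, below is a minimal, hypothetical PyTorch sketch: it taps several intermediate stages of a ResNet-50 backbone, projects each pooled stage into a shared embedding space, and fuses the levels by averaging. The backbone choice, tapped layers, projection sizes, and mean fusion are assumptions made for illustration only, not the paper's exact architecture.

```python
# Hypothetical sketch of multi-level visual feature extraction for
# cross-modal retrieval. Layer choices, projection sizes, and the
# fusion scheme are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn
import torchvision.models as models


class MultiLevelImageEncoder(nn.Module):
    """Pools features from several ResNet stages instead of relying on a
    single spatially segmented final feature map."""

    def __init__(self, embed_dim: int = 512):
        super().__init__()
        backbone = models.resnet50(weights=None)
        # Stem shared by all levels.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        # Assumed tapped levels: conv2_x .. conv5_x.
        self.stages = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])
        self.pool = nn.AdaptiveAvgPool2d(1)
        # One linear projection per level into a shared embedding space.
        self.projections = nn.ModuleList(
            [nn.Linear(c, embed_dim) for c in (256, 512, 1024, 2048)])

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        x = self.stem(images)
        level_embeddings = []
        for stage, proj in zip(self.stages, self.projections):
            x = stage(x)
            pooled = self.pool(x).flatten(1)       # (B, C)
            level_embeddings.append(proj(pooled))  # (B, embed_dim)
        # Fuse levels (here: simple mean) and L2-normalise for retrieval.
        fused = torch.stack(level_embeddings, dim=0).mean(dim=0)
        return nn.functional.normalize(fused, dim=-1)


if __name__ == "__main__":
    encoder = MultiLevelImageEncoder()
    dummy = torch.randn(2, 3, 224, 224)
    print(encoder(dummy).shape)  # torch.Size([2, 512])
```

In such a setup, the fused image embedding would be matched against a text embedding (e.g., from a sentence encoder projected to the same dimension) with a cosine-similarity ranking loss; the multi-level pooling is what exposes both low-level cues (color, texture) and high-level cues (shape, neckline) to the retrieval objective.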
