Post-clustering merging with novel metrics for multi-label image collections

Open Access
Authors
Publication date 01-09-2025
Journal Expert Systems With Applications
Article number 127875
Volume | Issue number 288
Number of pages 11
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
This study addresses the task of clustering multi-label image collections, which is increasingly important in fields such as forensics, social media, and intelligence. Traditional classification models fall short in real-world scenarios where labeled data may not be available. Unsupervised clustering offers a way forward in such cases. Clustering of multi-label data should minimize the number of clusters an analyst must inspect to identify all instances of a specific label (cluster efficiency), while also reducing misplaced data within each cluster (cluster quality). Existing clustering algorithms applied to multi-label image collections generally place a strong emphasis on either cluster efficiency or cluster quality. We propose a Post-Clustering Merging algorithm that provides greater control over the balance between cluster efficiency and cluster quality in multi-label image collections and that can be applied to the results of existing clustering algorithms. We also introduce two external metrics designed for multi-label clustering: the Pairwise Jaccard Similarity Score and the Label Distribution Score. These metrics enable a nuanced evaluation of clustering quality and efficiency, respectively, in scenarios where single-label metrics are inadequate. We demonstrate the effectiveness of the algorithm on various multi-label image collections. The results indicate significant improvements: the algorithm not only gives more control but also reduces the trade-off between cluster quality and efficiency. This study fills a gap in the analysis of multi-label data collections and sets a foundation for future exploration in this domain.
Document type Article
Language English
Published at https://doi.org/10.1016/j.eswa.2025.127875
Other links https://www.scopus.com/pages/publications/105006710465