The NGT200 Dataset: Geometric Multi-View Isolated Sign Recognition
| Authors | |
|---|---|
| Publication date | 2024 |
| Journal | Proceedings of Machine Learning Research |
| Event | 41st International Conference on Machine Learning, ICML 2024 |
| Volume / Issue number | 251 |
| Pages (from-to) | 286-302 |
| Number of pages | 17 |
| Organisations | |
| Abstract | Sign Language Processing (SLP) provides a foundation for a more inclusive future in language technology; however, the field faces several significant challenges that must be addressed to achieve practical, real-world applications. This work addresses multi-view isolated sign recognition (MV-ISR), and highlights the essential role of 3D awareness and geometry in SLP systems. We introduce the NGT200 dataset, a novel spatio-temporal multi-view benchmark, establishing MV-ISR as distinct from single-view ISR (SV-ISR). We demonstrate the benefits of synthetic data and propose conditioning sign representations on spatial symmetries inherent in sign language. Leveraging an SE(2) equivariant model improves MV-ISR performance by 8-22 percent over the baseline. |
| Document type | Article |
| Note | Proceedings of the Geometry-grounded Representation Learning and Generative Modeling Workshop (GRaM) at ICML 2024, 29 July 2024, Vienna, Austria |
| Language | English |
| Published at | https://proceedings.mlr.press/v251/ranum24a.html |
| Downloads | ranum24a (Final published version) |
