Prompt Tuned Embedding Classification for Industry Sector Allocation

Open Access
Authors
Publication date 2024
Host editors
  • Y. Yang
  • A. Davani
  • A. Sil
  • A. Kumar
Book title Annual Conference of the North American Chapter of the Association for Computational Linguistics - Industry Track
Book subtitle Proceedings of the Conference (Industry) : NAACL 2024 : June 16-21, 2024
ISBN (electronic)
  • 9798891761209
Event 2024 Conference of the North American Chapter of the Association for Computational Linguistics
Pages (from-to) 108-118
Publisher Kerrville, TX: Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
We introduce Prompt Tuned Embedding Classification (PTEC) for classifying companies within an investment firm’s proprietary industry taxonomy, supporting their thematic investment strategy. PTEC assigns companies to the sectors they primarily operate in, conceptualizing this process as a multi-label text classification task. Prompt Tuning, usually deployed as a text-to-text (T2T) classification approach, ensures low computational cost while maintaining high task performance. However, T2T classification has limitations on multi-label tasks due to the generation of non-existing labels, permutation invariance of the label sequence, and a lack of confidence scores. PTEC addresses these limitations by utilizing a classification head in place of the Large Language Model's (LLM's) language head. PTEC surpasses both baselines and human performance while lowering computational demands. This indicates the continuing need to adapt state-of-the-art methods to domain-specific tasks, even in the era of LLMs with strong generalization abilities.
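The core idea the abstract describes can be sketched as follows. This is a toy illustration, not the authors' code: the frozen LLM's final hidden state is fed to a small trainable classification head whose sigmoid outputs give one independent confidence score per sector, so predictions are order-invariant and can never name a sector outside the taxonomy. All sizes, names, and the random stand-in for the encoder output are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; real models use hidden sizes in the thousands.
HIDDEN, N_SECTORS = 8, 4

# Trainable classification head replacing the LLM's language head.
# In PTEC these weights (and the soft prompt) are tuned; the LLM is frozen.
W_head = rng.normal(size=(N_SECTORS, HIDDEN))
b_head = np.zeros(N_SECTORS)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(hidden_state, threshold=0.5):
    """Multi-label prediction: each sector whose sigmoid score clears
    the threshold is assigned, independently of the other labels."""
    scores = sigmoid(W_head @ hidden_state + b_head)
    labels = [i for i, s in enumerate(scores) if s >= threshold]
    return scores, labels

# Stand-in for the frozen LLM's final hidden state for one company text.
hidden = rng.normal(size=HIDDEN)
scores, labels = classify(hidden)
print(scores.shape)  # one confidence score per sector in the taxonomy
```

Because every label gets its own score in [0, 1], the firm can rank or threshold assignments per sector, which a generative T2T decoder does not directly provide.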
Document type Conference contribution
Note With supplementary video
Language English
Published at https://doi.org/10.18653/v1/2024.naacl-industry.10