Short‐text feature expansion and classification based on nonnegative matrix factorization

Authors
Publication date 12-2022
Journal International Journal of Intelligent Systems
Volume | Issue number 37 | 12
Pages (from-to) 10066-10080
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
In this paper, a non‐negative matrix factorization feature expansion (NMFFE) approach was proposed to overcome the feature‐sparsity issue when expanding the features of short texts. First, we took the internal relationships between short texts and words into account when segmenting words from texts and constructing their relationship matrix. Second, we used the dual regularization non‐negative matrix tri‐factorization (DNMTF) algorithm to obtain the word clustering indicator matrix, from which the feature space was derived by dimensionality reduction. Third, closely related words were selected from the feature space and added to the short text to alleviate the sparsity issue. The experimental results showed that the short‐text classification accuracy of the NMFFE algorithm increased by 25.77%, 10.89%, and 1.79% on three data sets (Web snippets, Twitter sports, and AGnews, respectively) compared with the Word2Vec algorithm and the Char‐CNN algorithm. These results indicated that the NMFFE algorithm was better than the BOW algorithm and the Char‐CNN algorithm in terms of classification accuracy and robustness.
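The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a plain non-negative matrix tri-factorization with standard multiplicative updates and omits the dual regularization terms of DNMTF. Whitespace tokenization, raw term counts as the relationship matrix, cosine similarity between rows of the word factor as the "closeness" measure, and all function and parameter names (build_text_word_matrix, nmtf, expand_texts, k_text, k_word, top_n) are assumptions made for illustration only.

```python
import numpy as np


def build_text_word_matrix(texts):
    """Step 1 (assumed form): text-by-word count matrix from whitespace tokens."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.lower().split():
            X[i, index[w]] += 1.0
    return X, vocab


def nmtf(X, k_text, k_word, n_iter=200, eps=1e-9, seed=0):
    """Step 2 (simplified): factorize X ~ F @ S @ G.T with non-negative factors.

    Plain multiplicative updates for the squared Frobenius error.
    The graph regularizers that make DNMTF "dual regularized" are NOT included.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_text)) + 0.1       # text cluster indicators
    S = rng.random((k_text, k_word)) + 0.1  # cluster association matrix
    G = rng.random((m, k_word)) + 0.1       # word cluster indicators
    for _ in range(n_iter):
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


def expand_texts(texts, vocab, G, top_n=2):
    """Step 3 (assumed form): append the words nearest to each token in G-space."""
    index = {w: j for j, w in enumerate(vocab)}
    unit = G / (np.linalg.norm(G, axis=1, keepdims=True) + 1e-12)
    sim = unit @ unit.T            # cosine similarity between word rows
    np.fill_diagonal(sim, -1.0)    # a word is never its own neighbour
    expanded = []
    for t in texts:
        words = t.lower().split()
        extra = []
        for w in words:
            for j in np.argsort(-sim[index[w]])[:top_n]:
                cand = vocab[j]
                if cand not in words and cand not in extra:
                    extra.append(cand)
        expanded.append(words + extra)
    return expanded


# Toy corpus (made up for this sketch, not one of the paper's data sets).
texts = [
    "football match score",
    "football team goal",
    "stock market price",
    "market shares price",
]
X, vocab = build_text_word_matrix(texts)
F, S, G = nmtf(X, k_text=2, k_word=2)
for original, longer in zip(texts, expand_texts(texts, vocab, G)):
    print(original, "->", " ".join(longer))
```

The expanded token lists would then be vectorized and passed to any classifier. The cluster counts and the number of added words per token are free parameters here, and the values above are arbitrary.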
Document type Article
Language English
Published at https://doi.org/10.1002/int.22290
Published at https://zenodo.org/record/4042991