Multilingual and cross-lingual document classification: A meta-learning approach

Open Access
Authors
Publication date 2021
Host editors
  • P. Merlo
  • J. Tiedemann
  • R. Tsarfaty
Book title The 16th Conference of the European Chapter of the Association for Computational Linguistics
Book subtitle EACL 2021 : proceedings of the conference : April 19-23, 2021
ISBN (electronic)
  • 9781954085022
Event 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
Pages (from-to) 1966-1976
Number of pages 11
Publisher Stroudsburg, PA: Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods. In this work, we propose a meta-learning approach to document classification in limited-resource settings and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation to previously unseen languages; and multilingual joint training when limited target-language data is available during training. We conduct a systematic comparison of several meta-learning methods, investigate multiple settings in terms of data availability, and show that meta-learning thrives in settings with a heterogeneous task distribution. We propose a simple yet effective adjustment to existing meta-learning methods which allows for better and more stable learning, and set a new state of the art on several languages while performing on par on others, using only a small amount of labeled data.

Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2021.eacl-main.168
Other links
  • https://github.com/mrvoh/meta_learning_multilingual_doc_classification
  • https://www.scopus.com/pages/publications/85106376276
Downloads
2021.eacl-main.168 (Final published version)