The emergence of number and syntax units in LSTM language models

Open Access
Authors
  • Y. Lakretz
  • G. Kruszewski
  • T. Desbordes
  • D. Hupkes
  • S. Dehaene
  • M. Baroni
Publication date 2019
Host editors
  • J. Burstein
  • C. Doran
  • T. Solorio
Book title The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Book subtitle NAACL HLT 2019: Proceedings of the Conference: June 2–June 7, 2019
ISBN (electronic)
  • 9781950737130
Event 2019 Conference of the North American Chapter of the Association for Computational Linguistics
Volume | Issue number 1
Pages (from-to) 11-20
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Science (FNWI)
Abstract
Recent work has shown that LSTMs trained on a generic language-modeling objective capture syntax-sensitive generalizations such as long-distance number agreement. However, we have no mechanistic understanding of how they accomplish this remarkable feat. Some have conjectured that it depends on heuristics that do not truly take hierarchical structure into account. We present here a detailed study of the inner mechanics of number tracking in LSTMs at the single-neuron level. We discover that long-distance number information is largely managed by two “number units”. Importantly, the behaviour of these units is partially controlled by other units independently shown to track syntactic structure. We conclude that LSTMs are, to some extent, implementing genuinely syntactic processing mechanisms, paving the way to a more general understanding of grammatical encoding in LSTMs.
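The mechanism the abstract describes can be illustrated with a toy simulation. The sketch below is not the authors' trained network: it hand-sets the gates of a single LSTM memory cell so that its forget gate stays near 1 (retaining the subject's number across intervening words) and its input gate opens only on a number-marked subject, mimicking the behaviour reported for the paper's "number units". Zeroing the cell, as in an ablation study, destroys the stored number information. All gate values, the `run_number_unit` function, and the example sentence are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def run_number_unit(tokens, ablate=False):
    """Simulate one LSTM memory cell acting as a 'number unit'.

    tokens: list of (word, number) pairs, where number is +1 for a
    singular subject cue, -1 for a plural one, and 0 otherwise.
    Returns the per-token trace of the cell state c.
    """
    c = 0.0
    trace = []
    for word, num in tokens:
        # Hand-set gates (illustrative, not learned weights):
        f = sigmoid(5.0)                    # ~0.993: near-perfect memory
        i = sigmoid(5.0 if num != 0 else -5.0)  # open only on number cues
        g = math.tanh(5.0 * num)            # writes +1 / -1 for sg / pl
        c = f * c + i * g                   # standard LSTM cell update
        if ablate:
            c = 0.0                         # ablate the unit at every step
        trace.append((word, c))
    return trace

# Only the subject carries a cue here; in the real model, attractors
# like "cars" can interfere, which is what makes agreement hard.
sentence = [("the", 0), ("boy", +1), ("near", 0), ("the", 0),
            ("cars", 0), ("that", 0), ("we", 0), ("saw", 0)]
```

Running `run_number_unit(sentence)` leaves the final cell state close to +1 (singular is remembered across six distractor tokens), whereas `run_number_unit(sentence, ablate=True)` ends at exactly 0, so the number of the subject can no longer be read out at the verb.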
Document type Conference contribution
Note With supplementary material
Language English
Published at https://doi.org/10.18653/v1/N19-1002
Other links https://vimeo.com/347368203