Learning Opinion Summarizers by Selecting Informative Reviews

Open Access
Authors
Publication date 2021
Host editors
  • M.-C. Moens
  • X. Huang
  • L. Specia
  • S.W. Yih
Book title 2021 Conference on Empirical Methods in Natural Language Processing
Book subtitle EMNLP 2021 : proceedings of the conference : November 7-11, 2021
ISBN (electronic)
  • 9781955917094
Event 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Pages (from-to) 9424-9442
Number of pages 19
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract

Opinion summarization has traditionally been approached with unsupervised, weakly-supervised and few-shot learning techniques. In this work, we collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training. However, the number of reviews per product is large (320 on average), making summarization - and especially training a summarizer - impractical. Moreover, the content of many reviews is not reflected in the human-written summaries, so a summarizer trained on random review subsets hallucinates. To address both of these challenges, we formulate the task as jointly learning to select informative subsets of reviews and to summarize the opinions expressed in these subsets. The choice of the review subset is treated as a latent variable, predicted by a small and simple selector. The subset is then fed into a more powerful summarizer. For joint training, we use amortized variational inference and policy gradient methods. Our experiments demonstrate the importance of selecting informative reviews, which improves the quality of summaries and reduces hallucinations.
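
The policy-gradient part of the abstract can be illustrated with a toy sketch. It is not the authors' implementation (their code is at the GitHub link listed under "Other links"). It assumes one free logit per review instead of a learned selector network, samples the subset with replacement to keep the gradient simple, and replaces the summarizer's score with a hypothetical word-coverage reward. All function names and parameters are illustrative.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coverage_reward(selected_reviews, summary):
    """Hypothetical stand-in for the summarizer's score: the fraction of
    summary words that also occur in the selected reviews."""
    summary_words = set(summary.lower().split())
    review_words = set(" ".join(selected_reviews).lower().split())
    return len(summary_words & review_words) / len(summary_words)


def train_selector(reviews, summary, subset_size=2, steps=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(reviews)  # toy selector: one free logit per review
    baseline = 0.0                 # running mean reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Sample the latent review subset (with replacement, an assumption
        # made here only to keep the log-probability gradient simple).
        picks = rng.choices(range(len(reviews)), weights=probs, k=subset_size)
        reward = coverage_reward([reviews[i] for i in picks], summary)
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        # REINFORCE: d log p(picks) / d logit_j
        #   = (times j was picked) - subset_size * probs[j]
        for j in range(len(reviews)):
            grad = picks.count(j) - subset_size * probs[j]
            logits[j] += lr * advantage * grad
    return softmax(logits)


reviews = [
    "battery lasts long and charges fast",
    "arrived late box was damaged",
    "screen is bright and sharp",
    "my cat likes the box",
]
summary = "battery lasts long screen is bright"
probs = train_selector(reviews, summary)
```

After training, most of the selection probability should sit on the first and third reviews, the only ones whose content is reflected in the summary. In the setup the abstract describes, the reward would instead come from the summarizer, and the selector would be a trained model rather than a table of logits.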

Document type Conference contribution
Note With supplementary video
Language English
Published at https://doi.org/10.18653/v1/2021.emnlp-main.743
Other links https://github.com/abrazinskas/selsum https://www.scopus.com/pages/publications/85127408681
Downloads
2021.emnlp-main.743 (Final published version)