A Corpus for Multilingual Analysis of Online Terms of Service

Open Access
Authors
  • K. Drawzeski
  • Andrea Galassi
  • A. Jablonowska
  • F. Lagioia
  • M. Lippi
  • H.W. Micklitz
  • G. Sartor
  • G. Tagiuri
  • P. Torroni
Publication date 2021
Host editors
  • N. Aletras
  • I. Androutsopoulos
  • L. Barrett
  • C. Goanta
  • D. Preotiuc-Pietro
Book title Natural Legal Language Processing (NLLP) : proceedings of the 2021 workshop
Book subtitle EMNLP2021 : November 10, 2021, Punta Cana, Dominican Republic
ISBN (electronic)
  • 9781954085985
Event Natural Legal Language Processing workshop
Pages (from-to) 1-8
Publisher Stroudsburg, PA: The Association for Computational Linguistics
Organisations
  • Faculty of Law (FdR) - Amsterdam Center for European Law and Governance (ACELG)
Abstract We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, clauses that are potentially unfair to the consumer are annotated across nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.
Document type Conference contribution
Language English
Published at https://doi.org/10.18653/v1/2021.nllp-1.1