A Corpus for Multilingual Analysis of Online Terms of Service
| Authors | |
|---|---|
| Publication date | 2021 |
| Host editors | |
| Book title | Natural Legal Language Processing (NLLP) : proceedings of the 2021 workshop |
| Book subtitle | EMNLP2021 : November 10, 2021, Punta Cana, Dominican Republic |
| ISBN (electronic) | |
| Event | Natural Legal Language Processing workshop |
| Pages (from-to) | 1-8 |
| Publisher | Stroudsburg, PA: The Association for Computational Linguistics |
| Organisations | |
| Abstract | We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.18653/v1/2021.nllp-1.1 |
| Downloads | 2021.nllp-1.1 (Final published version) |