A Test Collection of Synthetic Documents for Training Rankers: ChatGPT vs. Human Experts
| Authors | |
|---|---|
| Publication date | 2023 |
| Book title | CIKM '23 |
| Book subtitle | Proceedings of the 32nd ACM International Conference on Information and Knowledge Management : October 21-25, 2023, Birmingham, England |
| ISBN (electronic) | |
| Event | 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023 |
| Pages (from-to) | 5311-5315 |
| Number of pages | 5 |
| Publisher | New York, NY: Association for Computing Machinery |
| Organisations | |
| Abstract | We investigate the usefulness of generative large language models (LLMs) in generating training data for cross-encoder re-rankers in a novel direction: generating synthetic documents instead of synthetic queries. We introduce a new dataset, ChatGPT-RetrievalQA, and compare the effectiveness of strong models fine-tuned on both LLM-generated and human-generated data. We build ChatGPT-RetrievalQA based on an existing dataset, the human ChatGPT comparison corpus (HC3), consisting of multiple public question collections featuring both human- and ChatGPT-generated responses. We fine-tune a range of cross-encoder re-rankers on either human-generated or ChatGPT-generated data. Our evaluation on MS MARCO DEV, TREC DL'19, and TREC DL'20 demonstrates that cross-encoder re-ranking models trained on LLM-generated responses are significantly more effective for out-of-domain re-ranking than those trained on human responses. For in-domain re-ranking, however, the human-trained re-rankers outperform the LLM-trained re-rankers. Our novel findings suggest that generative LLMs have high potential in generating training data for neural retrieval models and can be used to augment training data, especially in domains with less labeled data. ChatGPT-RetrievalQA presents various opportunities for analyzing and improving rankers with both human- and LLM-generated data. Our data, code, and model checkpoints are publicly available. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1145/3583780.3615111 |
| Other links | https://github.com/arian-askari/ChatGPT-RetrievalQA https://www.scopus.com/pages/publications/85178122401 |
| Downloads | 3583780.3615111 (Final published version) |
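The comparison described in the abstract, fine-tuning one set of re-rankers on human responses and another on ChatGPT responses, starts from (query, document, label) training pairs drawn from HC3-style records. The sketch below illustrates that pairing step only; the record shape, field names, and the naive cross-question negative sampling are assumptions for illustration, not the actual ChatGPT-RetrievalQA format (which follows MS MARCO-style collection/qrels files).

```python
# Sketch: turning HC3-style records (a question with both human- and
# ChatGPT-written answers) into (query, document, label) pairs suitable
# for cross-encoder re-ranker fine-tuning. Field names are hypothetical.

records = [
    {
        "question": "What causes seasons on Earth?",
        "human_answers": ["Seasons come from the tilt of Earth's axis."],
        "chatgpt_answers": ["Earth's axial tilt relative to its orbit causes seasons."],
    },
    {
        "question": "Why is the sky blue?",
        "human_answers": ["Air scatters blue light more than red light."],
        "chatgpt_answers": ["Rayleigh scattering favors shorter (blue) wavelengths."],
    },
]

def make_pairs(records, source):
    """Build (query, document, label) pairs for one training condition.

    source selects which responses act as positives: "human_answers" for
    the human-trained re-ranker, "chatgpt_answers" for the LLM-trained one.
    Negatives are sampled naively from the next record's answers; a real
    setup would typically use BM25- or model-mined hard negatives.
    """
    pairs = []
    for i, rec in enumerate(records):
        for ans in rec[source]:
            pairs.append((rec["question"], ans, 1))  # relevant pair
        other = records[(i + 1) % len(records)]
        if other is not rec:
            # cross-question negative: same query, unrelated answer
            pairs.append((rec["question"], other[source][0], 0))
    return pairs

human_pairs = make_pairs(records, "human_answers")
chatgpt_pairs = make_pairs(records, "chatgpt_answers")
```

Either list of pairs could then be fed to a standard cross-encoder trainer; the only variable between the two experimental conditions is which response field supplies the positives.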
