Spoken QA

Creators
Publication date 23-11-2021
Description Spoken questions automatically generated via Google Translate API for the following datasets (test splits only):
  • Natural Questions https://ir-datasets.com/beir.html#beir/nq
  • MS MARCO https://ir-datasets.com/beir.html#beir/msmarco/dev
  • Simple Questions (WikiData) https://github.com/askplatypus/wikidata-simplequestions
Script used for speech generation
The automated transcriptions for all spoken questions were made using the Facebook wav2vec2 large ASR model (wav2vec2-large-960h-lv60-self).
More info on the project: github.com/svakulenk0/spoken_qa
Publisher Zenodo
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Document type Dataset
DOI https://doi.org/10.5281/zenodo.5720655
Other links https://zenodo.org/record/5720655