Voice-QA: Evaluating the Impact of Misrecognized Words on Passage Retrieval

Autores UPV
Revista Lecture Notes in Computer Science


Question Answering is an Information Retrieval task where the query is posed using natural language and the expected result is a concise answer. Voice-activated Question Answering systems represent an interesting application, where the question is formulated by speech. In these systems, an Automatic Speech Recognition module can be used to transcribe the question. Thus, recognition errors may be introduced, producing a significant effect on the answer retrieval process. In this work we study the relationship between some features of misrecognized words and the retrieval results. The features considered are the redundancy of a word in the result set and its inverse document frequency calculated over the collection. The results show that the redundancy of a word may be an important clue on whether an error on it would deteriorate the retrieval results, at least if a closed model is used for speech recognition.