Statistical text-to-speech synthesis of Spanish subtitles

UPV Authors

Piqueras Gozalbes Santiago Romualdo, Agua Teba Miguel Ángel del, Giménez Pastor Adrián, Civera Saiz Jorge, Juan Císcar Alfonso

Year

2014

Journal

Lecture Notes in Computer Science

Abstract

Online multimedia repositories are growing rapidly, but language barriers remain difficult to overcome for many current and potential users. In this paper we describe a Spanish text-to-speech (TTS) system and apply it to the synthesis of transcribed and translated video lectures. We have developed a statistical parametric speech synthesis system in which the acoustic mapping is performed with either HMM-based or DNN-based acoustic models. To the best of our knowledge, this is the first time that a DNN-based TTS system has been implemented for the synthesis of Spanish. A comparative objective evaluation of the two models has been carried out. Our results show that DNN-based systems can reconstruct speech waveforms more accurately than HMM-based ones.
