Universidad de Costa Rica

Quality Assessment of HMM-Based Speech Synthesis Using Acoustical Vowel Analysis

Ing. Marvin Coto Jiménez, PhD.
Marvin Coto-Jiménez, John Goddard-Close, and Fabiola M. Martínez-Licona
Springer, Cham


Synthetic speech produced by a Hidden Markov Model (HMM)-based system is often reported to sound muffled compared with natural speech. There are several reasons for this effect: some precise, fine characteristics of natural speech are removed, minimized, or hidden in the modeling phase of the HMM system, and the resulting speech parameter trajectories become over-smoothed versions of the speech waveforms. Consequently, each synthetic voice built with an HMM-based system must be tested for speech quality. This usually requires costly subjective testing, so objective alternatives are of interest. This paper considers nine acoustic parameters related to jitter and shimmer and examines their statistical significance as objective measures of synthetic speech quality.
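To make the jitter and shimmer measures concrete, the following is a minimal sketch of the two most common variants, local jitter and local shimmer. It assumes the glottal cycle periods and per-cycle peak amplitudes of a vowel segment have already been extracted. The abstract does not list the nine parameters used in the paper, so these two definitions and the function names (`local_perturbation`, `local_jitter`, `local_shimmer`) are illustrative assumptions, not the authors' implementation.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values,
    divided by the mean of all values (a relative measure)."""
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    if any(v <= 0 for v in values):
        raise ValueError("periods and amplitudes must be positive")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_abs_diff / mean_value


def local_jitter(periods):
    """Cycle-to-cycle variation of the glottal period (assumed in seconds)."""
    return local_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the per-cycle peak amplitude."""
    return local_perturbation(amplitudes)


if __name__ == "__main__":
    # Hypothetical values for a short vowel segment, for illustration only.
    periods = [0.0100, 0.0102, 0.0099, 0.0101]
    amplitudes = [0.80, 0.78, 0.81, 0.79]
    print(f"local jitter:  {local_jitter(periods):.4f}")
    print(f"local shimmer: {local_shimmer(amplitudes):.4f}")
```

Both measures are dimensionless ratios and are often reported as percentages. A perfectly periodic signal with constant amplitude yields zero for both.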

© 2020 Escuela de Ingeniería Eléctrica, Universidad de Costa Rica.