Abstract: The article presents a limited‐vocabulary, speaker‐independent continuous speech recognition system for Estonian, based on hidden Markov models. The system is trained on an annotated Estonian speech database of 60 speakers, approximately 4 hours in duration. Words are modelled using clustered triphones with multiple Gaussian mixture components. The system is evaluated on a number recognition task and a simple medium‐vocabulary recognition task, and its performance is explored using acoustic models of increasing complexity. The number recognizer achieves an accuracy of 97%. The medium‐vocabulary system recognizes 82.9% of words correctly when operating in real time. The correctness increases to 90.6% if the real‐time…requirement is discarded.
Keywords: continuous speech recognition, hidden Markov models, Estonian
Citation: Informatica, vol. 15, no. 3, pp. 303-314, 2004