Discriminative spoken language understanding using word confusion networks

M Henderson, M Gašić, B Thomson… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org
2012 IEEE Spoken Language Technology Workshop (SLT), 2012
Current commercial dialogue systems typically use hand-crafted grammars for Spoken Language Understanding (SLU) operating on the top one or two hypotheses output by the speech recogniser. These systems are expensive to develop and they suffer from significant degradation in performance when faced with recognition errors. This paper presents a robust method for SLU based on features extracted from the full posterior distribution of recognition hypotheses encoded in the form of word confusion networks. Following [1], the system uses SVM classifiers operating on n-gram features, trained on unaligned input/output pairs. Performance is evaluated on both an off-line corpus and on-line in a live user trial. It is shown that a statistical discriminative approach to SLU operating on the full posterior ASR output distribution can substantially improve performance both in terms of accuracy and overall dialogue reward. Furthermore, additional gains can be obtained by incorporating features from the previous system output.
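To make the idea of n-gram features drawn from a word confusion network more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a confusion network can be represented as a list of slots, each mapping candidate words to posterior probabilities, and it takes the value of an n-gram feature to be the expected count of that n-gram, i.e. the sum over consecutive slots of the product of the word posteriors. The function name `expected_ngram_counts`, the data layout, and the toy network are all hypothetical. Null (skip) arcs and any feature normalisation are ignored here.

```python
from collections import defaultdict
from itertools import product


def expected_ngram_counts(cnet, max_n=3):
    """Return expected n-gram counts for a simplified confusion network.

    cnet is a list of slots; each slot is a dict mapping a word to its
    posterior probability (an assumed, simplified representation).
    """
    feats = defaultdict(float)
    for n in range(1, max_n + 1):
        # Slide a window of n consecutive slots over the network.
        for start in range(len(cnet) - n + 1):
            window = cnet[start:start + n]
            # Enumerate every word sequence through the window.
            for path in product(*(slot.items() for slot in window)):
                words = tuple(word for word, _ in path)
                prob = 1.0
                for _, posterior in path:
                    prob *= posterior
                feats[words] += prob
    return dict(feats)


# Toy two-slot network (illustrative values only).
cnet = [
    {"cheap": 0.7, "chinese": 0.3},
    {"food": 0.9, "foot": 0.1},
]
feats = expected_ngram_counts(cnet, max_n=2)
```

A feature dictionary of this kind could then be vectorised and passed to one SVM classifier per semantic item, in the spirit of the approach the abstract describes. Because every competing hypothesis contributes in proportion to its posterior, a word that is not in the top hypothesis can still influence the classifier.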