2010 Volume 7 Issue 6 Pages 390-396
This paper presents a novel token-based approach to spoken language identification (LID) using a Bayesian logistic regression model, which places a prior distribution on the parameters of the logistic regression model in order to avoid overfitting. Speech utterances are first decoded into token sequences; a hierarchical system then applies Bayesian logistic regression models to perform the LID task on these token sequences. Experiments conducted on the NIST LRE 2007 database show that the proposed approach achieves performance competitive with other state-of-the-art token-based approaches.
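To illustrate how a prior on the parameters regularizes logistic regression over token sequences, the following is a minimal sketch, not the paper's system. It assumes a zero-mean Gaussian prior (the abstract does not say which prior is used), bigram-count features, a two-language toy task, and plain gradient descent to find the MAP estimate. The function names, the toy token data, and all hyperparameters are hypothetical, and the hierarchical design mentioned above is not covered.

```python
import numpy as np
from collections import Counter


def bigram_features(tokens, vocab):
    """Map a token sequence to a normalized bigram-count vector over `vocab`."""
    counts = Counter(zip(tokens, tokens[1:]))
    vec = np.array([counts.get(bg, 0) for bg in vocab], dtype=float)
    total = vec.sum()
    return vec / total if total > 0 else vec


def fit_map_logreg(X, y, prior_var=1.0, lr=0.5, n_iter=2000):
    """MAP estimate of logistic regression weights.

    Assumption: zero-mean Gaussian prior with variance `prior_var` on the
    weights (no prior on the bias). The negative log-posterior is the
    logistic loss plus ||w||^2 / (2 * prior_var).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (p - y) + w / prior_var  # likelihood term + prior term
        grad_b = np.sum(p - y)
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Posterior probability of the positive class for each row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


# Hypothetical toy token sequences standing in for decoder output.
lang_a = [list("ababab"), list("abbaba"), list("babab")]
lang_b = [list("cdcdcd"), list("dcdccd"), list("cdcdc")]
seqs = lang_a + lang_b
y = np.array([1, 1, 1, 0, 0, 0], dtype=float)

# Bigram vocabulary built from the training sequences.
vocab = sorted({bg for s in seqs for bg in zip(s, s[1:])})
X = np.array([bigram_features(s, vocab) for s in seqs])

w, b = fit_map_logreg(X, y)
probs = predict_proba(X, w, b)
```

A smaller `prior_var` pulls the weights more strongly toward zero, which is how the prior limits overfitting when the number of n-gram features is large relative to the number of training utterances. The step size would need to be reduced for very small `prior_var` values to keep this simple gradient descent stable.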