Abstract
We present a method for audio source separation and localization from binaural recordings. The method combines a new generative probabilistic model with time-frequency masking. We suggest that device-dependent relationships between point-source positions and interaural spectral cues can be learnt and used to constrain a mixture model. This makes it possible to capture subtle separation and localization features embedded in the auditory data. We illustrate the method on mixtures of two and three speech signals in the presence of reverberation. Using standard evaluation metrics, we compare it with a recent binaural source separation and localization algorithm.
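To make the notions of interaural spectral cues and time-frequency masking concrete, here is a minimal sketch. It is not the model proposed in the paper. It computes the interaural level difference (ILD) and interaural phase difference (IPD) at each time-frequency point, then assigns each point to the source whose reference ILD is nearest. That nearest-ILD rule is a deliberately simple stand-in for the probabilistic mixture model. The function names, the STFT parameters, and the synthetic two-tone test signal are all assumptions made for illustration.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Short-time Fourier transform with a Hann window; returns (freq, frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

def interaural_cues(left, right, eps=1e-12):
    """Return both spectrograms, the ILD in dB and the IPD in radians."""
    L, R = stft(left), stft(right)
    ild = 20.0 * np.log10((np.abs(R) + eps) / (np.abs(L) + eps))
    ipd = np.angle(R * np.conj(L))
    return L, R, ild, ipd

def binary_masks(ild, source_ilds):
    """Assign each time-frequency point to the source with the nearest ILD.

    Hypothetical simplification: the reference ILDs are given, not learnt.
    """
    refs = np.asarray(source_ilds, dtype=float)[:, None, None]
    labels = np.argmin(np.abs(ild[None, :, :] - refs), axis=0)
    return [labels == k for k in range(len(source_ilds))]

# Synthetic example (assumed, not from the paper): two pure tones,
# one louder in the left channel and one louder in the right channel.
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 500 * t)    # falls on frequency bin 16
s2 = np.sin(2 * np.pi * 2000 * t)   # falls on frequency bin 64
left = 1.0 * s1 + 0.3 * s2
right = 0.3 * s1 + 1.0 * s2

L, R, ild, ipd = interaural_cues(left, right)
ref = 20.0 * np.log10(0.3)          # expected ILD of the first source, in dB
masks = binary_masks(ild, [ref, -ref])

# Masked left-channel spectrograms, one per estimated source.
separated = [L * m for m in masks]
```

In this toy case the two tones occupy different frequency bins, so each mask isolates one tone. Real speech mixtures overlap in time and frequency, and reverberation blurs the cues, which is why a learnt probabilistic model is used instead of a fixed nearest-value rule.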
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Deleforge, A., Horaud, R. (2012). A Latently Constrained Mixture Model for Audio Source Separation and Localization. In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_46
DOI: https://doi.org/10.1007/978-3-642-28551-6_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28550-9
Online ISBN: 978-3-642-28551-6
eBook Packages: Computer Science, Computer Science (R0)