Abstract
This paper describes the MIT Lincoln Laboratory system used in the person identification task of the CLEAR 2007 Evaluation. The task is divided into audio, visual, and multimodal subtasks. The audio identification system combines a GMM subsystem and an SVM subsystem, while the visual (face) identification system uses an appearance-based [Kernel] approach. The audio channels, which originate from a microphone array, were preprocessed with beamforming and noise preprocessing.
This work was sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors, and are not necessarily endorsed by the United States Government.
© 2008 Springer-Verlag Berlin Heidelberg
Brady, K. (2008). MIT Lincoln Laboratory Multimodal Person Identification System in the CLEAR 2007 Evaluation. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. CLEAR 2007, RT 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_22
DOI: https://doi.org/10.1007/978-3-540-68585-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68584-5
Online ISBN: 978-3-540-68585-2