MIT Lincoln Laboratory Multimodal Person Identification System in the CLEAR 2007 Evaluation

Kevin Brady¹

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4625)

Abstract

This paper describes the MIT Lincoln Laboratory system used in the person identification task of the CLEAR 2007 Evaluation. The task is divided into audio, visual, and multimodal subtasks. The audio identification system uses both a GMM and an SVM subsystem, while the visual (face) identification system uses an appearance-based [Kernel] approach. The audio channels, which originate from a microphone array, were first processed with beamforming and noise preprocessing.

This work was sponsored by the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors, and are not necessarily endorsed by the United States Government.

Author information

Authors and Affiliations

MIT Lincoln Laboratory, 244 Wood Street, Lexington, Massachusetts 02420, USA
Kevin Brady

Editor information

Rainer Stiefelhagen, Rachel Bowers, Jonathan Fiscus

Cite this paper

Brady, K. (2008). MIT Lincoln Laboratory Multimodal Person Identification System in the CLEAR 2007 Evaluation. In: Stiefelhagen, R., Bowers, R., Fiscus, J. (eds) Multimodal Technologies for Perception of Humans. RT CLEAR 2007 2007. Lecture Notes in Computer Science, vol 4625. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68585-2_22

DOI: https://doi.org/10.1007/978-3-540-68585-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68584-5
Online ISBN: 978-3-540-68585-2
eBook Packages: Computer Science, Computer Science (R0)
