DOI: 10.1145/1291233.1291391

The listening room: a speech-based interactive art installation

Published: 29 September 2007

Abstract

This paper presents The Listening Room, an interactive audio installation that holds more or less meaningful conversations with up to three people at a time. Conceived as an artwork that explores the boundaries between virtual and 'real world' experience, The Listening Room incorporates a number of speech technologies. We give an account of its conceptual framework and describe the technologies employed in its realisation. To give context to the ideas behind the installation, we describe it with reference to two previous interactive works by the authors: Face Value (2000) and Alter Ego (2005).

      Published In

      MM '07: Proceedings of the 15th ACM international conference on Multimedia
      September 2007
      1115 pages
      ISBN: 9781595937025
      DOI: 10.1145/1291233
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. human machine interaction
      2. interactive arts
      3. speech technology
      4. spoken dialogue systems

      Qualifiers

      • Article

      Conference

      MM07

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
