Speech Audio Retrieval Using Voice Query

  • Conference paper
Digital Libraries: Achievements, Challenges and Opportunities (ICADL 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4312)

Abstract

Multimedia data, including audio, video, and image archives, have become an increasingly prevalent resource in digital library systems. However, each of these data types may need specific tools to support effective and efficient retrieval. In this paper, we focus on retrieval from speech audio collections, which include audio books, speech recordings, interviews, and lectures. Currently, most audio retrieval systems rely on keyword, title, or author searches typed in by users. The system searches for the given keywords and returns a list of entire audio files that are potentially relevant to the query. Nonetheless, browsing for a particular section of a recording without knowing its actual content remains very difficult. Moreover, since audio transcription and keyword annotation are labor intensive and become infeasible for large collections, we introduce a preliminary framework that locates the subsections of the audio that correspond to a voice query made by a user. We demonstrate the utility of our approach on query retrieval tasks over various types of audio recordings. We also show that this simple framework can potentially help retrieve and locate the voice query within the audio accurately and efficiently.
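
The abstract does not say how the framework matches a voice query against a recording. As a rough illustration of what locating the subsection that corresponds to a query can look like, the sketch below uses subsequence dynamic time warping. It assumes the query and the recording have each been reduced to one number per frame (for example, a frame energy). The function name `locate_query` and the toy data are hypothetical, and this is not presented as the authors' method.

```python
import math


def locate_query(query, audio):
    """Find the segment of `audio` that best matches `query` under DTW.

    Both arguments are sequences of per-frame numbers (an assumption made
    for this sketch). Returns (start, end, cost) so that audio[start:end]
    is the best-matching segment and `cost` is its accumulated distance.
    """
    n, m = len(query), len(audio)
    if n == 0 or m == 0:
        raise ValueError("query and audio must be non-empty")

    # Row 0 of the cost table: a match may begin at any audio frame for free.
    prev_cost = [0.0] * (m + 1)
    # prev_start[j] records where the path reaching this cell entered the audio.
    prev_start = list(range(m + 1))

    for i in range(1, n + 1):
        cur_cost = [math.inf] * (m + 1)
        cur_start = [0] * (m + 1)
        for j in range(1, m + 1):
            dist = abs(query[i - 1] - audio[j - 1])
            candidates = [
                (prev_cost[j - 1], prev_start[j - 1]),  # advance both sequences
                (cur_cost[j - 1], cur_start[j - 1]),    # advance audio only
            ]
            if i > 1:
                # advance query only (not allowed straight out of row 0)
                candidates.append((prev_cost[j], prev_start[j]))
            best_cost, best_start = min(candidates)
            cur_cost[j] = dist + best_cost
            cur_start[j] = best_start
        prev_cost, prev_start = cur_cost, cur_start

    # The match may end at any audio frame: pick the cheapest end point.
    end = min(range(1, m + 1), key=lambda j: prev_cost[j])
    return prev_start[end], end, prev_cost[end]


if __name__ == "__main__":
    audio = [0, 0, 0, 2, 4, 6, 4, 2, 0, 0]   # toy "recording"
    query = [2, 4, 6, 4, 2]                  # toy "voice query"
    start, end, cost = locate_query(query, audio)
    print(start, end, cost)                  # audio[start:end] is the match
```

Because warping lets one query frame align with several audio frames (and the reverse), a query spoken faster or slower than the recorded passage can still be located. A practical system would more likely compare per-frame feature vectors than single numbers, and would need a cost threshold to reject recordings that do not contain the query at all.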



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ratanamahatana, C.A., Tohlong, P. (2006). Speech Audio Retrieval Using Voice Query. In: Sugimoto, S., Hunter, J., Rauber, A., Morishima, A. (eds) Digital Libraries: Achievements, Challenges and Opportunities. ICADL 2006. Lecture Notes in Computer Science, vol 4312. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11931584_56

  • DOI: https://doi.org/10.1007/11931584_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49375-4

  • Online ISBN: 978-3-540-49377-8

  • eBook Packages: Computer Science, Computer Science (R0)
