Vocal Access to a Newspaper Archive: Assessing the Limitations of Current Voice Information Access Technology

Fabio Crestani¹

Abstract

This paper presents the design and the current prototype implementation of an interactive vocal Information Retrieval system that can be used to access articles of a large newspaper archive using a telephone. The implementation of the system highlights the limitations of current voice information retrieval technology, in particular of speech recognition and synthesis. We present our evaluation of these limitations and address the feasibility of intelligent interactive vocal information access systems.

Author information

Authors and Affiliations

Department of Computer Science, University of Strathclyde, 26 Richmond Street, Glasgow G1 1XH, Scotland, UK
Fabio Crestani

About this article

Cite this article

Crestani, F. Vocal Access to a Newspaper Archive: Assessing the Limitations of Current Voice Information Access Technology. Journal of Intelligent Information Systems 20, 161–180 (2003). https://doi.org/10.1023/A:1021824019028

Issue Date: March 2003
DOI: https://doi.org/10.1023/A:1021824019028
