Abstract
This paper presents the design and the current prototype implementation of an interactive vocal Information Retrieval system that can be used to access articles of a large newspaper archive using a telephone. The implementation of the system highlights the limitations of current voice information retrieval technology, in particular of speech recognition and synthesis. We present our evaluation of these limitations and address the feasibility of intelligent interactive vocal information access systems.
Cite this article
Crestani, F. Vocal Access to a Newspaper Archive: Assessing the Limitations of Current Voice Information Access Technology. Journal of Intelligent Information Systems 20, 161–180 (2003). https://doi.org/10.1023/A:1021824019028
DOI: https://doi.org/10.1023/A:1021824019028