Abstract
This paper describes a number of experiments that explored issues surrounding the retrieval of spoken documents. Two issues were examined. The first was how best to use speech recogniser output to achieve the highest retrieval effectiveness. The second was the potential problems of retrieving from a so-called "mixed collection", i.e. one that contains both documents produced by a speech recognition system (with many errors) and documents produced by hand transcription (presumably near perfect). The first part of the work found that merging the transcripts of multiple recognisers showed the most promise. The second part showed that the term weighting scheme used in a retrieval system was important in determining whether the system was detrimentally affected when retrieving from a mixed collection.
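As an illustration only, the two ideas in the abstract can be sketched in a few lines of Python. The abstract does not say how transcripts were merged or which weighting schemes were compared, so everything here is an assumption: merging is taken to mean pooling the term occurrences from each recogniser's transcript, and the weighting is a plain tf*idf. The function names and the sample transcripts are hypothetical.

```python
import math
from collections import Counter


def merge_transcripts(transcripts):
    """Pool the terms of several recognisers' transcripts of ONE spoken
    document into a single term-frequency bag (assumed merging strategy)."""
    merged = Counter()
    for text in transcripts:
        merged.update(text.lower().split())
    return merged


def idf(term, collection):
    """Inverse document frequency of a term over a list of term bags."""
    df = sum(1 for doc in collection if term in doc)
    return math.log(len(collection) / df) if df else 0.0


def score(query, doc, collection):
    """Plain tf*idf score of a document for a whitespace-tokenised query."""
    return sum(doc[t] * idf(t, collection) for t in query.lower().split())


# Two hypothetical recogniser outputs for the same spoken document;
# the second recogniser mishears "prime" as "crime".
merged_doc = merge_transcripts([
    "the prime minister spoke",
    "the crime minister spoke",
])

# A hypothetical hand-transcribed document, giving a small mixed collection.
hand_doc = Counter("the weather forecast for glasgow".split())
mixed_collection = [merged_doc, hand_doc]

query_score = score("prime minister", merged_doc, mixed_collection)
```

In this sketch, terms that both recognisers agree on end up with a higher term frequency than terms only one of them produced, so a misrecognised word such as "crime" is still indexed but carries less weight. Recognition errors also tend to be rare across a collection, so an idf-style component can reward them; that is one plausible reason a weighting scheme might behave differently on a mixed collection, though the abstract itself does not give the mechanism.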
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sanderson, M., Crestani, F. (1998). Mixing and Merging for Spoken Document Retrieval. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_24
DOI: https://doi.org/10.1007/3-540-49653-X_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65101-7
Online ISBN: 978-3-540-49653-3
eBook Packages: Springer Book Archive