Abstract
A multilingual information retrieval method is presented in which users formulate the query in their preferred language to retrieve relevant information from a multilingual document collection. The method involves mono- and cross-language searches as well as the merging of their results. We adopt a corpus-based approach in which documents in different languages are associated if they cover a similar story. The resulting comparable corpus enables two novel techniques that we have developed. First, it enables Cross-Language Information Retrieval (CLIR) that does not suffer from the lack of vocabulary coverage we observed in approaches based on automatic Machine Translation (MT). Second, the aligned documents of this corpus make it possible to merge the results of mono- and cross-language searches. Using the TREC CLIR data, excellent results are obtained. In addition, our evaluation of the document alignments gives us new insights into the usefulness of comparable corpora.
Part of this work was carried out during the author’s time at the National Institute of Standards and Technology (NIST).
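The abstract states that aligned documents help merge the results of mono- and cross-language searches but does not say how. The following is a minimal sketch of one way such a merge could work, and it is not taken from the paper: it assumes each search returns a score per document, that scores from the two searches are on different scales, and that a least-squares line fitted on aligned document pairs found in both result lists can map one scale onto the other. The names `fit_line`, `merge_results`, and all document identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All x values identical: fall back to a constant mapping.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def merge_results(
    mono: Dict[str, float],
    cross: Dict[str, float],
    alignments: Dict[str, str],
) -> List[Tuple[str, float]]:
    """Merge two scored result lists into one ranking.

    mono:       doc id -> score from the mono-language search
    cross:      doc id -> score from the cross-language search
    alignments: cross-language doc id -> aligned mono-language doc id
                (assumed to come from the comparable corpus)
    """
    # Collect score pairs for aligned documents retrieved by both searches.
    xs, ys = [], []
    for cross_id, mono_id in alignments.items():
        if cross_id in cross and mono_id in mono:
            xs.append(cross[cross_id])
            ys.append(mono[mono_id])

    # With fewer than two pairs no line can be fitted; keep scores as-is.
    a, b = fit_line(xs, ys) if len(xs) >= 2 else (1.0, 0.0)

    # Map cross-language scores onto the mono-language scale, then rank.
    merged = list(mono.items())
    merged += [(doc_id, a * score + b) for doc_id, score in cross.items()]
    merged.sort(key=lambda pair: pair[1], reverse=True)
    return merged


if __name__ == "__main__":
    mono_hits = {"en1": 0.9, "en2": 0.5, "en3": 0.1}
    cross_hits = {"de1": 9.0, "de2": 5.0, "de3": 3.0}
    pairs = {"de1": "en1", "de2": "en2"}
    for doc_id, score in merge_results(mono_hits, cross_hits, pairs):
        print(doc_id, round(score, 3))
```

In this sketch the aligned pairs act as anchor points between the two score scales, so an unaligned cross-language document such as `de3` can still be placed relative to the mono-language hits.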
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Braschler, M., Schäuble, P. (1998). Multilingual Information Retrieval Based on Document Alignment Techniques. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_12
DOI: https://doi.org/10.1007/3-540-49653-X_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65101-7
Online ISBN: 978-3-540-49653-3
eBook Packages: Springer Book Archive