DOI: 10.1145/564376.564409

Resolving query translation ambiguity using a decaying co-occurrence model and syntactic dependence relations

Published: 11 August 2002

Abstract

Bilingual dictionaries are commonly used for query translation in cross-language information retrieval (CLIR). However, a dictionary usually offers several translations for each query term, which raises the problem of translation selection. Several recent studies have suggested using term co-occurrences to guide this selection. This paper presents two extensions that improve on these co-occurrence approaches. First, we extend the basic co-occurrence model by adding a decaying factor that decreases the mutual information as the distance between the terms increases. Second, we incorporate a triple translation model, in which syntactic dependence relations (represented as triples) are integrated. Our evaluation of translation accuracy shows that translating triples as units is more precise than word-by-word translation. Our CLIR experiments show that adding the decaying factor leads to substantial improvements over the basic co-occurrence model, and that the triple translation model brings further improvements.
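
The abstract does not give the formula for the decaying factor, so the following is only a minimal sketch of the idea under stated assumptions: the decay is taken to be exponential in the term distance, and the names `decayed_mi`, `cohesion`, `select_translation`, the parameter `alpha`, and the toy statistics are all hypothetical. It illustrates how damping mutual information by distance can change which translation candidates are selected.

```python
import math
from itertools import combinations, product


def decayed_mi(mi, distance, alpha=0.5):
    """Damp a mutual-information score as term distance grows.

    Assumption: exponential decay; a distance of 1 leaves the score unchanged.
    """
    return mi * math.exp(-alpha * (distance - 1))


def cohesion(terms, stats, alpha=0.5):
    """Sum the decayed MI over every pair of chosen translations.

    `stats` maps an unordered pair (frozenset) to (mi, distance).
    Pairs with no statistics contribute nothing.
    """
    total = 0.0
    for a, b in combinations(terms, 2):
        mi, distance = stats.get(frozenset((a, b)), (0.0, 1.0))
        total += decayed_mi(mi, distance, alpha)
    return total


def select_translation(candidates, stats, alpha=0.5):
    """Pick one candidate per query term so that total cohesion is maximal.

    Exhaustive search over all combinations; fine for short queries only.
    """
    return max(product(*candidates), key=lambda combo: cohesion(combo, stats, alpha))


# Toy data (invented for illustration): two candidates for the first query
# term, one for the second.
candidates = [["bank", "shore"], ["loan"]]
stats = {
    frozenset(("bank", "loan")): (3.0, 1.0),   # slightly lower MI, adjacent terms
    frozenset(("shore", "loan")): (3.5, 4.0),  # higher raw MI, but far apart
}

with_decay = select_translation(candidates, stats, alpha=0.5)
without_decay = select_translation(candidates, stats, alpha=0.0)
```

With `alpha=0.0` the decay is switched off and the raw mutual information decides, so the two calls can disagree: in this toy data the pair with the larger distance only wins when no decay is applied.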

Published In

SIGIR '02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
August 2002
478 pages
ISBN: 1581135610
DOI: 10.1145/564376

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CLIR
  2. co-occurrence
  3. parse
  4. query translation
  5. statistical model

Acceptance Rates

SIGIR '02 paper acceptance rate: 44 of 219 submissions (20%)
Overall acceptance rate: 792 of 3,983 submissions (20%)
