Panlingual Lexical Translation via Probabilistic Inference

Authors

  • Mausam University of Washington
  • Stephen Soderland University of Washington
  • Oren Etzioni University of Washington

DOI:

https://doi.org/10.1609/aaai.v24i1.7703

Keywords:

Multilinguality, Machine Translation, Lexical Translation, Translation Dictionaries, Probabilistic Inference over Graphs

Abstract

The bare minimum lexical resource required to translate between a pair of languages is a translation dictionary. Unfortunately, dictionaries exist for only a tiny fraction of the 49 million possible language pairs, making machine translation virtually impossible between most languages. This paper summarizes the last four years of our research, motivated by the vision of panlingual communication. Our research comprises three key steps. First, we compile over 630 freely available dictionaries from the Web and convert this data into a single representation: the translation graph. Second, we build several inference algorithms that infer translations between word pairs even when no dictionary lists them as translations. Finally, we run our inference procedure offline to construct PANDICTIONARY, a sense-distinguished, massively multilingual dictionary that has translations in more than 1000 languages. Our experiments assess the quality of this dictionary and find that, at a high precision of 0.9, it has 4 times as many translations as the English Wiktionary, the lexical resource closest to PANDICTIONARY.
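
The first two steps in the abstract (merging dictionaries into one graph, then inferring translations that no dictionary lists) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the inference procedure, so the sketch substitutes a simple noisy-OR over two-hop pivot paths. The names `build_graph` and `infer_translation_prob`, the `edge_prob` value of 0.8, and the toy dictionaries are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# A node is a (language, word) pair, e.g. ("en", "spring").
Node = tuple[str, str]


def build_graph(dictionaries):
    """Merge bilingual dictionaries into one undirected translation graph.

    Each dictionary is an iterable of (node_a, node_b) translation pairs.
    """
    graph = defaultdict(set)
    for entries in dictionaries:
        for a, b in entries:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def infer_translation_prob(graph, src, dst, edge_prob=0.8):
    """Estimate how likely src and dst are translations of each other.

    ASSUMPTION (illustrative only): every dictionary edge preserves the
    intended word sense with probability edge_prob, independently, so a
    two-hop path through a pivot word is valid with probability
    edge_prob ** 2. Independent pivots are combined with a noisy-OR.
    """
    neighbours_src = graph.get(src, set())
    if dst in neighbours_src:
        return 1.0  # some dictionary lists the pair directly
    pivots = neighbours_src & graph.get(dst, set())
    p_no_valid_path = 1.0
    for _ in pivots:
        p_no_valid_path *= 1.0 - edge_prob * edge_prob
    return 1.0 - p_no_valid_path


# Toy data: no dictionary links French and Spanish directly.
en_fr = [(("en", "spring"), ("fr", "printemps"))]
en_es = [(("en", "spring"), ("es", "primavera"))]
de_fr = [(("de", "Frühling"), ("fr", "printemps"))]
de_es = [(("de", "Frühling"), ("es", "primavera"))]

graph = build_graph([en_fr, en_es, de_fr, de_es])
score = infer_translation_prob(graph, ("fr", "printemps"), ("es", "primavera"))
print(f"P(printemps <-> primavera) ~ {score:.4f}")
```

A per-path probability below 1 reflects polysemy: English "spring" can also mean a coil or a water source, so a single pivot is weak evidence, while agreement between two independent pivots (English and German here) raises the score. A sense-distinguished dictionary such as the one the abstract describes would need to track senses explicitly rather than rely on one fixed probability.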

Published

2010-07-05

How to Cite

Mausam, Soderland, S., & Etzioni, O. (2010). Panlingual Lexical Translation via Probabilistic Inference. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 1686-1689. https://doi.org/10.1609/aaai.v24i1.7703

Issue

Section

New Scientific and Technical Advances in Research