Learning Spanish-Galician Translation Equivalents Using a Comparable Corpus and a Bilingual Dictionary

Pablo Gamallo Otero¹ &
José Ramom Pichel Campos²

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4919)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

So far, research on extracting translation equivalents from comparable, non-parallel corpora has not been very popular, mainly because its results have been poor compared with those obtained from aligned parallel corpora. The method proposed in this paper, which relies on seed patterns generated from external bilingual dictionaries, allows us to achieve results similar to those obtained from parallel corpora. In this way, the huge amount of comparable corpora available via the Web can be viewed as a never-ending source of lexicographic information. We describe the experiments performed on a comparable Spanish-Galician corpus.

Author information

Authors and Affiliations

Departamento de Língua Espanhola, Faculdade de Filologia, Universidade de Santiago de Compostela, Galiza, Spain
Pablo Gamallo Otero
Departamento de Tecnologia Linguística da Imaxin|Software, Santiago de Compostela, Galiza
José Ramom Pichel Campos

Editor information

Alexander Gelbukh

About this paper

Cite this paper

Gamallo Otero, P., Pichel Campos, J.R. (2008). Learning Spanish-Galician Translation Equivalents Using a Comparable Corpus and a Bilingual Dictionary. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_36

DOI: https://doi.org/10.1007/978-3-540-78135-6_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78134-9
Online ISBN: 978-3-540-78135-6
eBook Packages: Computer Science, Computer Science (R0)
