DOI: 10.3115/980845.980964

Methods and practical issues in evaluating alignment techniques

Published: 10 August 1998

Abstract

This paper describes the work achieved in the first half of a 4-year cooperative research project (ARCADE), financed by AUPELF-UREF. The project is devoted to the evaluation of parallel text alignment techniques. In its first period, ARCADE ran a competition between six systems on a sentence-to-sentence alignment task, which yielded two main types of results. First, a large reference bilingual corpus was created, comprising texts of different genres, each presenting a different degree of difficulty for the alignment task. Second, significant methodological progress was made on both the evaluation protocols and metrics and the algorithms used by the different systems. For the second phase, which is now underway, ARCADE has been opened to a larger number of teams, who will tackle the problem of word-level alignment.

Published In

ACL '98/COLING '98: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics - Volume 1
August 1998
768 pages

Sponsors

  • Government of Canada
  • Université de Montréal

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%

Cited By

  • (2007) Lexical-based alignment for reconstruction of structure in parallel texts. Proceedings of the 12th international conference on Applications of Natural Language to Information Systems, pp. 401-406. DOI: 10.5555/2394705.2394752. Online publication date: 27-Jun-2007
  • (2007) Semantic precision and recall for ontology alignment evaluation. Proceedings of the 20th international joint conference on Artificial intelligence, pp. 348-353. DOI: 10.5555/1625275.1625330. Online publication date: 6-Jan-2007
  • (2006) Alignment of paragraphs in bilingual texts using bilingual dictionaries and dynamic programming. Proceedings of the 11th Iberoamerican conference on Progress in Pattern Recognition, Image Analysis and Applications, pp. 824-833. DOI: 10.1007/11892755_85. Online publication date: 14-Nov-2006
  • (2006) Paragraph-level alignment of an English-Spanish parallel corpus of fiction texts using bilingual dictionaries. Proceedings of the 9th international conference on Text, Speech and Dialogue, pp. 61-67. DOI: 10.1007/11846406_8. Online publication date: 11-Sep-2006
  • (2005) Comparison, selection and use of sentence alignment algorithms for new language pairs. Proceedings of the ACL Workshop on Building and Using Parallel Texts, pp. 99-106. DOI: 10.5555/1654449.1654469. Online publication date: 29-Jun-2005
  • (2005) Nukti. Proceedings of the ACL Workshop on Building and Using Parallel Texts, pp. 75-78. DOI: 10.5555/1654449.1654462. Online publication date: 29-Jun-2005
  • (2005) Automatic identification of parallel documents with light or without linguistic resources. Proceedings of the 18th Canadian Society conference on Advances in Artificial Intelligence, pp. 354-365. DOI: 10.1007/11424918_37. Online publication date: 9-May-2005
  • (2004) Development in parsing technology. New developments in parsing technology, pp. 1-18. DOI: 10.5555/1139041.1139043. Online publication date: 1-Jan-2004
  • (2004) Building parallel corpora by automatic title alignment using length-based and text-based approaches. Information Processing and Management: an International Journal, 40:6, pp. 939-955. DOI: 10.1016/j.ipm.2003.11.002. Online publication date: 1-Nov-2004
  • (2001) Toward hierarchical models for statistical machine translation of inflected languages. Proceedings of the workshop on Data-driven methods in machine translation - Volume 14, pp. 1-8. DOI: 10.3115/1118037.1118044. Online publication date: 7-Jul-2001
