SpanAlign: Efficient Sequence Tagging Annotation Projection into Translated Data applied to Cross-Lingual Opinion Mining

Léo Jacqmin, Gabriel Marzinotto, Justyna Gromada, Ewelina Szczekocka, Robert Kołodyński, Géraldine Damnati

Abstract

Following the increasing performance of neural machine translation systems, the paradigm of using automatically translated data for cross-lingual adaptation is now studied in several application domains. The capacity to accurately project annotations remains, however, an issue for sequence tagging tasks, where annotations must be projected with correct spans. Additionally, when the task involves noisy user-generated text, the quality of translation and annotation projection can be affected. In this paper, we propose to tackle multilingual sequence tagging with a new span alignment method and apply it to opinion target extraction from customer reviews. We show that, provided suitable heuristics, translated data with automatic span-level annotation projection can yield improvements both for cross-lingual adaptation compared to zero-shot transfer, and for data augmentation compared to a multilingual baseline.

Anthology ID: 2021.wnut-1.27
Volume: Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)
Month: November
Year: 2021
Address: Online
Editors: Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 238–248
URL: https://aclanthology.org/2021.wnut-1.27/
DOI: 10.18653/v1/2021.wnut-1.27
Cite (ACL): Léo Jacqmin, Gabriel Marzinotto, Justyna Gromada, Ewelina Szczekocka, Robert Kołodyński, and Géraldine Damnati. 2021. SpanAlign: Efficient Sequence Tagging Annotation Projection into Translated Data applied to Cross-Lingual Opinion Mining. In Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021), pages 238–248, Online. Association for Computational Linguistics.
Cite (Informal): SpanAlign: Efficient Sequence Tagging Annotation Projection into Translated Data applied to Cross-Lingual Opinion Mining (Jacqmin et al., WNUT 2021)
PDF: https://aclanthology.org/2021.wnut-1.27.pdf