2024
Exploring NMT Explainability for Translators Using NMT Visualising Tools
Gabriela Gonzalez-Saez
|
Mariam Nakhle
|
James Turner
|
Fabien Lopez
|
Nicolas Ballier
|
Marco Dinarelli
|
Emmanuelle Esperança-Rodier
|
Sui He
|
Raheel Qader
|
Caroline Rossi
|
Didier Schwab
|
Jun Yang
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)
This paper describes work in progress on visualisation tools to foster collaboration between translators and computational scientists. We aim to describe how visualisation features can be used to explain translation and NMT outputs. We tested several visualisation functionalities with three NMT models for the Chinese-English, Spanish-English and French-English language pairs. We created three demos containing different visualisation tools and analysed them within the performance-explainability framework, focusing on the translator’s perspective.
The MAKE-NMTViz Project: Meaningful, Accurate and Knowledge-limited Explanations of NMT Systems for Translators
Gabriela Gonzalez-Saez
|
Fabien Lopez
|
Mariam Nakhle
|
James Turner
|
Nicolas Ballier
|
Marco Dinarelli
|
Emmanuelle Esperança-Rodier
|
Sui He
|
Caroline Rossi
|
Didier Schwab
|
Jun Yang
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2)
This paper describes MAKE-NMTViz, a project designed to help translators visualize neural machine translation outputs using explainable artificial intelligence visualization tools initially developed for computer vision.
Améliorer la traduction au niveau du document grâce au sur-echantillage négatif et au masquage ciblé
Gaëtan Caillaut
|
Mariam Nakhlé
|
Jingshu Liu
|
Raheel Qader
Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 1 : articles longs et prises de position
This work aims to improve the ability of machine translation systems to take into account the context in which the source sentence appears and thus, ultimately, to improve their overall performance. The approach we propose relies solely on the data and on how they are presented to the model during training, and is completely agnostic to the model architecture. We show that the performance of translation models on the en-fr pair can be improved simply by providing data that are more relevant to the target task, without modifying or complicating existing architectures, in particular the Transformer architecture commonly used by modern NLP systems. To this end, we present two data augmentation strategies (negative oversampling and targeted masking) designed to encourage the model to rely on context. We show, through appropriate metrics, that these methods improve the performance of translation systems without modifying either the model architecture or the training process.
Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task
Gaëtan Caillaut
|
Mariam Nakhlé
|
Raheel Qader
|
Jingshu Liu
|
Jean-Gabriel Barthélemy
Proceedings of the Ninth Conference on Machine Translation
Recent studies have showcased remarkable capabilities of decoder-only models in many NLP tasks, including translation. Yet the machine translation field has been largely dominated by encoder-decoder models based on the Transformer architecture. As a consequence, scaling laws of encoder-decoder models for neural machine translation have already been well studied, but decoder-only models have received less attention. This work explores the scaling laws of decoder-only models on the multilingual and multidomain translation task. We trained a collection of six decoder-only models, ranging from 70M to 7B parameters, on a sentence-level, multilingual (8 languages) and multidomain (9 domains) dataset. We conducted a series of experiments showing that the loss of decoder-only models can be estimated using a scaling law similar to the one discovered for large language models, but we also show that this scaling law generalizes poorly to very large models or to a different data distribution. We also study different scaling methods and show that scaling the depth and scaling the width of a model lead to similar test loss improvements, but with different impacts on the model’s efficiency.
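As a rough illustration of what estimating loss with a scaling law can look like, the sketch below fits a plain power law, loss ≈ a · N^(-b), to (model size, loss) pairs by least squares in log-log space. This is an assumption made for illustration only: the abstract does not give the functional form used in the paper, scaling laws for large language models often include additional terms (for example an irreducible loss constant or a data-size term), and the data points here are synthetic. The function names `fit_power_law` and `predict_loss` are hypothetical.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-b) by ordinary least squares in log-log space.

    Returns the pair (a, b). Hypothetical helper, not taken from the paper.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(loss) for loss in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(size), so a = exp(intercept) and b = -slope
    return math.exp(intercept), -slope


def predict_loss(a, b, size):
    """Extrapolate the fitted power law to another model size."""
    return a * size ** (-b)


# Synthetic points generated from a known law (a = 20, b = 0.1);
# these are NOT results from the paper.
sizes = [1e8, 3e8, 1e9, 3e9]
losses = [20.0 * n ** (-0.1) for n in sizes]
a, b = fit_power_law(sizes, losses)
```

Extrapolating with `predict_loss(a, b, size)` far beyond the fitted range is exactly the setting in which the abstract reports that such a law generalizes poorly.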
2023
L’évaluation de la traduction automatique du caractère au document : un état de l’art
Mariam Nakhlé
Actes de CORIA-TALN 2023. Actes des 16e Rencontres Jeunes Chercheurs en RI (RJCRI) et 25e Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL)
In recent years, the evaluation of machine translation, whether human or automatic, has run into difficulties. In the face of major advances in neural machine translation, evaluation has proven unreliable. Many new approaches have been proposed to improve evaluation protocols. The goal of this work is to provide an overview of the overall state of Machine Translation (MT) evaluation. We begin by presenting human evaluation approaches, then present automatic evaluation methods, distinguishing between families of approaches (surface-level and learned metrics), and pay particular attention to document-level evaluation, which takes context into account. Finally, we focus on the meta-evaluation of methods.
The MAKE-NMTVIZ System Description for the WMT23 Literary Task
Fabien Lopez
|
Gabriela González
|
Damien Hansen
|
Mariam Nakhle
|
Behnoosh Namdarzadeh
|
Nicolas Ballier
|
Marco Dinarelli
|
Emmanuelle Esperança-Rodier
|
Sui He
|
Sadaf Mohseni
|
Caroline Rossi
|
Didier Schwab
|
Jun Yang
|
Jean-Baptiste Yunès
|
Lichao Zhu
Proceedings of the Eighth Conference on Machine Translation
This paper describes the MAKE-NMTVIZ systems trained for the WMT 2023 Literary task. As a primary submission, we used the Train, Valid1 and test1 sets of the GuoFeng corpus (Wang et al., 2023) to fine-tune the mBART50 model with Chinese-English data. We used training parameters very similar to those of Lee et al. (2022) when fine-tuning mBART50. We trained for 3 epochs, using GELU as the activation function, with a learning rate of 0.05, dropout of 0.1 and a batch size of 16. We decoded using a beam search of size 5. For our contrastive1 submission, we implemented a fine-tuned concatenation transformer (Lupo et al., 2023). Training proceeded in two steps: (i) a sentence-level transformer was trained for 10 epochs using general, test1, and valid1 data (more details under the contrastive2 system); (ii) we then fine-tuned at document level using 3-sentence concatenation for 4 epochs using train, test2, and valid2 data. During fine-tuning, we used ReLU as the activation function, with an inverse square root learning rate schedule, dropout of 0.1, and a batch size of 64. We decoded using a beam search of size. For our contrastive2 and last submission, we implemented a sentence-level transformer model (Vaswani et al., 2017). The model was trained for 10 epochs using general-purpose, test1, and valid1 data. The training parameters were an inverse square root learning rate schedule, a dropout of 0.1, and a batch size of 64. We decoded using a beam search of size 4. We then compared the three translation outputs from an interdisciplinary perspective, investigating some of the effects of sentence- vs document-based training. Computer scientists, translators and corpus linguists discussed the remaining linguistic issues for this discourse-level literary translation.
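The inverse square root learning rate mentioned above can be sketched as follows. This is the commonly used form of the schedule (linear warmup to a peak rate, then decay proportional to the inverse square root of the step number); the abstract gives neither the peak rate nor the warmup length, so `peak_lr` and `warmup_steps` below are placeholder assumptions, and the function name is hypothetical.

```python
import math


def inverse_sqrt_lr(step, peak_lr=5e-4, warmup_steps=4000):
    """Learning rate at a given optimizer step (1-indexed).

    Linear warmup up to `peak_lr`, then decay as 1/sqrt(step).
    Default values are illustrative assumptions, not the paper's settings.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step < warmup_steps:
        # Warmup phase: grow linearly from near zero to peak_lr
        return peak_lr * step / warmup_steps
    # Decay phase: equals peak_lr at step == warmup_steps, then shrinks
    return peak_lr * math.sqrt(warmup_steps / step)
```

With these placeholder values the rate peaks at step 4000 and has halved by step 16000, since quadrupling the step count halves the inverse square root.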
Lingua Custodia’s Participation at the WMT 2023 Terminology Shared Task
Jingshu Liu
|
Mariam Nakhlé
|
Gaëtan Caillaut
|
Raheel Qader
Proceedings of the Eighth Conference on Machine Translation
This paper presents Lingua Custodia’s submission to the WMT23 Terminology shared task. Ensuring precise translation of technical terms plays a pivotal role in gauging the final quality of machine translation results. Our goal is to follow the terminology constraints while applying the machine translation system. Inspired by recent work on terminology control, we propose to annotate the machine learning training data by leveraging a synthetic dictionary extracted in a fully unsupervised way from the given parallel corpora. The model learned with this training data can then be used to translate text with a given terminology in a flexible manner. In addition, we introduce a careful re-sampling step for the annotated data in order to guide the model to see different terminology types enough times. In this task we consider all three language directions: Chinese to English, English to Czech and German to English. Our automatic evaluation metrics with the submitted systems show the effectiveness of the proposed method.
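One common way to annotate training data for terminology control is to insert the desired target term inline, next to the matching source term. The sketch below illustrates that general idea only; the abstract does not specify the annotation format used in the submission, so the tag strings, the function name `annotate_terms` and the example dictionary are all hypothetical. Matching relies on word boundaries, which suits space-delimited languages such as German or English but would need a segmenter for Chinese.

```python
import re


def annotate_terms(source, term_dict,
                   open_tag="<term>", sep="<trans>", close_tag="</term>"):
    """Wrap each dictionary term found in `source` with its target translation.

    `term_dict` maps source terms to target terms. The tag names are
    illustrative placeholders, not the format used in the paper.
    """
    if not term_dict:
        return source
    # Try longer terms first so multi-word terms win over their substrings
    terms = sorted(term_dict, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b"
    )

    def replace(match):
        term = match.group(1)
        return f"{open_tag} {term} {sep} {term_dict[term]} {close_tag}"

    return pattern.sub(replace, source)


# Toy German-to-English example with a made-up one-entry dictionary
annotated = annotate_terms("Der Zinssatz steigt.", {"Zinssatz": "interest rate"})
```

A model trained on sentences annotated this way can learn to copy the inline target term into its output, which is what makes the terminology constraint adjustable at translation time.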
Large Language Model Adaptation for Financial Sentiment Analysis
Pau Rodriguez Inserte
|
Mariam Nakhlé
|
Raheel Qader
|
Gaetan Caillaut
|
Jingshu Liu
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing
Natural language processing (NLP) has recently gained relevance within financial institutions by providing highly valuable insights into companies and markets’ financial documents. However, the landscape of the financial domain presents extra challenges for NLP, due to the complexity of the texts and the use of specific terminology. Generalist language models tend to fall short in tasks specifically tailored for finance, even when using large language models (LLMs) with great natural language understanding and generative capabilities. This paper presents a study on LLM adaptation methods targeted at the financial domain and with high emphasis on financial sentiment analysis. To this purpose, two foundation models with less than 1.5B parameters have been adapted using a wide range of strategies. We show that through careful fine-tuning on both financial documents and instructions, these foundation models can be adapted to the target domain. Moreover, we observe that small LLMs have comparable performance to larger scale models, while being more efficient in terms of parameters and data. In addition to the models, we show how to generate artificial instructions through LLMs to augment the number of samples of the instruction dataset.