Thesis etd-04122023-101720

Thesis type

Master's thesis

Author

GIANNINI, VALERIO

URN

etd-04122023-101720

Title

Enhancing the Quality of Machine Translation Systems through Contextual Fine-tuning

Department

INGEGNERIA DELL'INFORMAZIONE (Information Engineering)

Degree programme

ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING

Supervisors

supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Dott. Qiu, Disheng
supervisor Dott. Galatolo, Federico Andrea

Keywords

BLEU
COMET
context adaptation
context aware machine translation
contextual errors
contextual fine-tuning
deep learning
encoder-decoder architecture
human post-editing
human translators
human-in-the-loop
lexical ambiguities
machine translation
neural machine translation
real-time adaptation
state-of-the-art
transformer model
translation quality

Exam session start date

28/04/2023

Availability

Full

Abstract

Traditional industrial machine translation systems usually translate each sentence independently, ignoring any information that lies beyond the sentence boundary. This study aims to enhance a state-of-the-art machine translation system used by human translators by incorporating contextual information only at inference time, while the base model is still trained on sentence-level data, which is far more abundant than document-level data. Specifically, the proposed approach fine-tunes a general model at the sentence level at inference time using contextual information, enabling the system to produce better in-context translations. The proposed model has been extensively tested on a wide range of both contextual and non-contextual test sets. The results show that it outperforms the base model on automatic metrics (e.g., BLEU), indicating higher translation quality. Finally, a human error analysis was conducted to examine the model's ability to resolve contextual errors. It indicates that the proposed model is effective at resolving lexical ambiguities and producing more coherent translations. Overall, the results demonstrate the efficacy of the proposed approach at a relatively low cost: it can significantly improve translation quality in real-world situations where the meaning is ambiguous and context plays a crucial role in determining the correct translation.

File

File name	Size
Master_T...nnini.pdf	1.70 Mb

ETD

Digital archive of theses defended at the Università di Pisa
