ETD

Digital archive of theses discussed at the Università di Pisa

Thesis etd-09242020-113328


Thesis type
Master's thesis
Author
DWIVEDI, ABHINAV
URN
etd-09242020-113328
Title
Machine Learning and AI for the Professional Translation Market
Department
INFORMATICA
Degree programme
DATA SCIENCE AND BUSINESS INFORMATICS
Supervisors
supervisor Esuli, Andrea
supervisor Monreale, Anna
Keywords
  • ai
  • data science
  • deep learning
  • natural language processing
  • nlp
  • text to speech synthesis
Graduation session start date
09/10/2020
Availability
Not available for consultation
Release date
09/10/2090
Abstract
This report summarizes my work at Translated Srl, where I worked for six months under an internship contract.

During my internship, I worked mainly on two projects. I spent the first four months on the first project, speeding up ModernMT. We gathered existing resources on ways to speed up an NMT model, selected the approaches best suited to ModernMT, and experimented with them. We were looking for approaches that could be applied to ModernMT and that also work on the GPUs we have in production. I spent the remaining two months working on MateDub, a future product currently under development. MateDub aims to perform automatic video dubbing using Artificial Intelligence.
In both of these projects, I worked primarily on AWS instances. Most of my work was carried out on on-demand and spot instances of the g4dn and p3 families, which provide NVIDIA T4 (Turing architecture) and Tesla V100 (Volta architecture) GPUs respectively. Finally, the work was tested on the company's production GPUs. The Deep Learning tasks were done in Python, mainly using PyTorch libraries. The GPU profiling, debugging, and programming tasks were performed using NVIDIA Nsight, NVIDIA Compute, NVTX-python and NVIDIA TensorRT.
The data-related tasks were performed using Python and Audacity. The presentation and reporting were done using Overleaf, Google Docs, Apple Keynote and Tableau Desktop.