Thesis etd-07042024-014251
Thesis type
Master's thesis
Author
MYFTARAJ, DEIVI
URN
etd-07042024-014251
Title
Validation and Profiling of the Dynamic Shaping feature on the Intel Neural Processing Unit to optimize the inference of Transformer Neural Network
Department
INFORMATION ENGINEERING
Degree programme
ELECTRONIC ENGINEERING
Supervisors
supervisor Fanucci, Luca
Keywords
- neural processing unit
- transformer neural network
Graduation session start date
26/07/2024
Availability
Not available for consultation
Release date
26/07/2094
Abstract
In the field of hardware acceleration for artificial intelligence, Neural Processing Units (NPUs) represent a technological frontier of growing importance. This thesis, developed in collaboration with Intel, explores the validation and profiling of an innovative technique called "Dynamic Shaping," which aims to optimize and speed up the inference of Transformer neural networks. The main goal was to address the challenges posed by running Transformer models on specialized hardware, leveraging the new features related to Dynamic Shaping.
The work begins with an in-depth analysis of Transformer networks, highlighting their relevance in Natural Language Processing (NLP) and beyond, and discussing the challenges of implementing them on NPUs. The architecture of the NPU integrated into the Intel processor is then presented, along with its features. The core of the research is the study and application of Dynamic Shaping, a technique that dynamically adapts the shape of the processed data to optimize the use of computational resources.
Targeted modifications were made to a set of existing software resources and tools in order to configure, test, and validate the effectiveness of Dynamic Shaping. The results show significant improvements in inference speed and a reduction in resource consumption, confirming the effectiveness of this strategy.
File
File name | Size |
---|---|
The thesis is not available for consultation. |