Thesis etd-11052024-154638
Thesis type
Master's degree thesis
Author
PAGANI, DARIO
URN
etd-11052024-154638
Title
Microscaling Floating Point Formats for Large Language Models
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Supervisor: Prof. Cococcioni, Marco
Supervisor: Dr. Rossi, Federico
Keywords
- arbitrary-precision
- C++
- E3M4
- E4M3
- E5M2
- floating-point
- GPT
- IEEE 754
- LLM
- matrix multiplication
- microscaling
- minifloat
- Open Compute Project
- tiny-precision
- transformer
Defence session start date
26/11/2024
Availability
Full
Abstract
As large language models (LLMs) grow in size and complexity, their computational and memory demands for training and inference pose significant challenges. To maximize both the accuracy and the storage efficiency of numerical representations within LLMs, this thesis investigates microscaling floating-point formats. Such formats are based on sharing a single floating-point exponent across an entire block of values, each of which is represented by a low-precision floating-point number. Microscaling aims to lower memory footprint and compute overhead: the elements of a block are stored in 8-bit floating-point formats to reduce the per-value size, while the single shared exponent preserves an acceptable dynamic range for the block as a whole. The focus of this thesis is to study the theoretical foundations and prospective applications of microscaling floating-point formats, developing a C++ microscaling framework for boosting the efficiency of LLMs. Through a combination of analytical approaches and exploratory experiments, this thesis intends to lay the groundwork for future study in the resource-efficient development and deployment of large-scale language models.
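To illustrate the block-scaling idea described above, the following is a minimal C++ sketch (not the thesis framework itself): a block of floats shares one power-of-two scale chosen from the block's largest magnitude, and each element is stored as a scaled low-precision value. The names `MXBlock` and `quantize_block` are hypothetical, and element quantization is simplified to rounding the scaled value; a real implementation would pack each element into an 8-bit format such as E4M3 or E5M2.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical MX-style block: one shared power-of-two exponent
// (scale = 2^shared_exp) plus per-element values that are conceptually
// 8-bit; here they are kept as floats for clarity.
struct MXBlock {
    int8_t shared_exp;
    std::vector<float> elements;
};

MXBlock quantize_block(const std::vector<float>& v) {
    // 1. Find the largest magnitude in the block.
    float amax = 0.0f;
    for (float x : v) amax = std::max(amax, std::fabs(x));
    // 2. Pick the shared exponent so the largest element lands near the
    //    top of the element format's range (simplified choice; a true
    //    E4M3 encoder would target its maximum of 448).
    int e = (amax > 0.0f)
                ? static_cast<int>(std::floor(std::log2(amax))) - 8
                : 0;
    float scale = std::ldexp(1.0f, e);  // scale = 2^e
    // 3. Scale and round each element (stand-in for FP8 encoding).
    MXBlock b{static_cast<int8_t>(e), {}};
    for (float x : v) b.elements.push_back(std::round(x / scale));
    return b;
}

float dequantize(const MXBlock& b, std::size_t i) {
    // Reconstruct the value: element * 2^shared_exp.
    return b.elements[i] * std::ldexp(1.0f, b.shared_exp);
}
```

Because the scale is a power of two, dequantization is an exact exponent shift, so values that are themselves powers of two within the block's range round-trip exactly.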
File
| File name | Size |
|---|---|
| tesi_v_2...60500.pdf | 1.23 MB |