Thesis etd-11052024-154638
Thesis type
Master's degree thesis
Author
PAGANI, DARIO
URN
etd-11052024-154638
Title
Microscaling Floating Point Formats for Large Language Models
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
Supervisor: Prof. Cococcioni, Marco
Supervisor: Dr. Rossi, Federico
Keywords
- arbitrary-precision
- C++
- E3M4
- E4M3
- E5M2
- floating-point
- GPT
- IEEE 754
- LLM
- matrix multiplication
- microscaling
- minifloat
- Open Compute Project
- tiny-precision
- transformer
Defence session start date
26/11/2024
Availability
Full
Abstract
As large language models (LLMs) grow in size and complexity, their computational and memory demands for training and inference pose significant challenges. To maximize both the accuracy and the storage efficiency of numerical representations within LLMs, this thesis investigates microscaling floating-point formats. Such formats are based on sharing a single floating-point exponent across an entire block of values, each of which is represented by a low-precision floating-point number. Microscaling aims to lower memory footprint and compute overhead: the elements of a block are stored in 8-bit floating-point formats to reduce the per-value size, while the single shared exponent preserves an acceptable dynamic range for the block as a whole. The focus of this thesis is to study the theoretical foundations and prospective applications of microscaling floating-point formats, developing a C++ microscaling framework for boosting the efficiency of LLMs. Through a combination of analytical approaches and exploratory experiments, this thesis intends to lay the groundwork for future study in the resource-efficient development and deployment of large-scale language models.
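To illustrate the block-scaling idea described above, the following is a minimal C++ sketch (not the thesis framework itself): a block of floats shares one power-of-two scale chosen from the block's largest magnitude, and each element is stored as a scaled low-precision value. The names `MXBlock` and `quantize_block` are hypothetical, and element quantization is simplified to rounding the scaled value; a real implementation would pack each element into an 8-bit format such as E4M3 or E5M2.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical MX-style block: one shared power-of-two exponent
// (scale = 2^shared_exp) plus per-element values that are conceptually
// 8-bit; here they are kept as floats for clarity.
struct MXBlock {
    int8_t shared_exp;
    std::vector<float> elements;
};

MXBlock quantize_block(const std::vector<float>& v) {
    // 1. Find the largest magnitude in the block.
    float amax = 0.0f;
    for (float x : v) amax = std::max(amax, std::fabs(x));
    // 2. Pick the shared exponent so the largest element lands near the
    //    top of the element format's range (simplified choice; a true
    //    E4M3 encoder would target its maximum of 448).
    int e = (amax > 0.0f)
                ? static_cast<int>(std::floor(std::log2(amax))) - 8
                : 0;
    float scale = std::ldexp(1.0f, e);  // scale = 2^e
    // 3. Scale and round each element (stand-in for FP8 encoding).
    MXBlock b{static_cast<int8_t>(e), {}};
    for (float x : v) b.elements.push_back(std::round(x / scale));
    return b;
}

float dequantize(const MXBlock& b, std::size_t i) {
    // Reconstruct the value: element * 2^shared_exp.
    return b.elements[i] * std::ldexp(1.0f, b.shared_exp);
}
```

Because the scale is a power of two, dequantization is an exact exponent shift, so values that are themselves powers of two within the block's range round-trip exactly.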
File
| File name | Size |
|---|---|
| tesi_v_2...60500.pdf | 1.23 MB |