Thesis etd-04092019-165542

Thesis type

Master's thesis

URN

etd-04092019-165542

Title

VLSI Design of a Hardware Accelerator for Convolutional Neural Networks on the edge: the Keyword Spotting case study

Department

INFORMATION ENGINEERING

Degree programme

ELECTRONIC ENGINEERING

Supervisors

supervisor Prof. Fanucci, Luca
co-supervisor Dinelli, Gianmarco
co-supervisor Meoni, Gabriele

Keywords

Artificial Intelligence
Convolutional Neural Network
FPGA
Hardware accelerator
Keyword Spotting
Machine Learning

Graduation session start date

03/05/2019

Availability

Not available for consultation

Release date

03/05/2089

Abstract

In recent years, Convolutional Neural Networks have been used in a variety of applications thanks to their ability to carry out tasks with fewer parameters than other Deep Learning approaches. However, the power consumption and memory footprint constraints typical of edge and portable applications conflict with the accuracy and latency requirements that characterize these applications. For this reason, commercial hardware accelerators have become popular, thanks to architectures designed for the inference of general Convolutional Neural Network models.
Nevertheless, Field Programmable Gate Arrays are an interesting alternative, since they make it possible to implement a hardware architecture tailored to a specific Convolutional Neural Network model, with promising results in terms of power consumption and timing performance.
In this thesis, we propose a Field Programmable Gate Array hardware accelerator for a Separable Convolutional Neural Network designed for a Keyword Spotting application. We started from the model implemented in a previous work for the Intel Movidius Neural Compute Stick, chosen for its high accuracy despite its relatively small number of parameters and hidden layers. For our purposes, we quantized this model using a bit-true simulation and implemented a dedicated architecture.
We carried out a benchmark comparing the results on different Xilinx Field Programmable Gate Array families with the implementation on the Neural Compute Stick. The analysis shows that lower latency can be obtained with comparable accuracy and power consumption.

File

File name	Size
Thesis not available for consultation. Contact the author

ETD

Digital archive of theses defended at the Università di Pisa
