ETD System

Digital archive of theses defended at the Università di Pisa

 

Thesis etd-04062017-160419


Thesis type
Master's thesis
Author
DE BONIS, MICHELE
URN
etd-04062017-160419
Title
Development of a mobile application for Food Recognition using Convolutional Neural Networks
Department
INFORMATION ENGINEERING
Degree programme
COMPUTER ENGINEERING
Committee
supervisor Amato, Giuseppe
supervisor Falchi, Fabrizio
supervisor Gennaro, Claudio
Keywords
  • food recognition
  • android
  • mobile application
  • convolutional neural network
  • training and testing
Session start date
05/05/2017
Availability
full
Abstract
The aim of this thesis is to develop an Android mobile application for food recognition. To this end, Convolutional Neural Networks (CNNs) were trained on datasets of 101 dish classes available in the literature. CNNs usually run on powerful hardware. Since the objective is to run the CNN on a mobile phone, the challenge is to identify the network that performs best in terms of memory occupation, computational speed and accuracy.

The analysis covers the following CNNs: AlexNet, Residual Network, GoogLeNet (Inception), VGG, SqueezeNet, BWN and XNOR-Net.
The CNNs were trained in two main steps on both the ETHZ and UPMC datasets, from universities in Zurich and Paris, respectively.
In the first step, the CNNs were trained from scratch in order to select the best ones in terms of accuracy, computational speed and memory occupation. In the second step, the best CNNs identified in the first step were fine-tuned.
The frameworks used for training and testing the networks are Caffe, which is probably the most widely used, and Torch. To simplify both training and testing, the NVIDIA Deep Learning GPU Training System (DIGITS) was used. It provides an efficient and easy-to-use web interface for formatting images and setting network parameters.
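The selection in the first training step can be pictured with a minimal Python sketch. The abstract does not say how accuracy, speed and memory were weighed against each other, so the rule below (discard networks that exceed a size or latency budget, then keep the most accurate one) is an assumption, and the names `Candidate` and `pick_best` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """Measurements for one trained network (all fields are illustrative)."""
    name: str
    accuracy: float       # top-1 accuracy on the test set, in [0, 1]
    model_size_mb: float  # size of the stored weights, in megabytes
    latency_ms: float     # time for one forward pass on the target device


def pick_best(
    candidates: Iterable[Candidate],
    max_size_mb: float,
    max_latency_ms: float,
) -> Optional[Candidate]:
    """Return the most accurate candidate that fits both budgets.

    Assumed rule, not taken from the thesis: size and latency act as hard
    limits, and accuracy decides among the networks that satisfy them.
    Returns None when no candidate fits.
    """
    feasible = [
        c for c in candidates
        if c.model_size_mb <= max_size_mb and c.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c.accuracy)
```

Using hard budgets keeps the rule easy to explain; a weighted score over the three criteria would be an equally plausible alternative.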
Once the best CNN has been identified, the application itself is developed. The application operates in two modes: running the CNN locally on the smartphone using a RenderScript-based implementation, or querying the CNN deployed on a server as a web application.
Afterwards, a statistical study of the results was carried out in order to improve the user experience. By setting a threshold on the score that the CNN assigns to a dish, the application can tell the user whether the prediction is reliable. The application was then designed and developed using Android Studio.
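The score threshold described above can be sketched in a few lines of Python. This is only an illustration: the abstract does not give the threshold value or the score format, so the default of 0.5, the dictionary of per-class scores, and the function names `top_prediction` and `confident_label` are all assumptions.

```python
from typing import Dict, Optional, Tuple


def top_prediction(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    if not scores:
        raise ValueError("scores must not be empty")
    label = max(scores, key=lambda name: scores[name])
    return label, scores[label]


def confident_label(
    scores: Dict[str, float],
    threshold: float = 0.5,  # assumed value; the thesis derives its own
) -> Optional[str]:
    """Return the top label if its score reaches the threshold, else None.

    A None result lets the application warn the user that the prediction
    should not be trusted instead of showing a likely wrong dish.
    """
    label, score = top_prediction(scores)
    return label if score >= threshold else None
```

For example, scores of 0.82 for "pizza" and 0.18 for "lasagna" yield "pizza", while a flatter distribution whose top score is 0.40 yields None.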

Based on the results of this research, it can be concluded that GoogLeNet achieves the best accuracy while also having a reduced size. The CNN was ported to the mobile phone using a RenderScript-based implementation and was also deployed on a server as a web application.
File