Thesis etd-04062017-160419
Thesis type
Master's thesis
Author
DE BONIS, MICHELE
URN
etd-04062017-160419
Thesis title
Development of a mobile application for Food Recognition using Convolutional Neural Networks
Department
INGEGNERIA DELL'INFORMAZIONE
Course of study
COMPUTER ENGINEERING
Supervisors
supervisor Amato, Giuseppe
supervisor Falchi, Fabrizio
supervisor Gennaro, Claudio
Keywords
- food recognition
- android
- mobile application
- convolutional neural network
- training and testing
Graduation session start date
05/05/2017
Availability
Full
Summary
The aim of this thesis is to develop an Android mobile application for food recognition. To this end, a Convolutional Neural Network (CNN) has been trained on datasets of 101 classes of dishes available in the literature. CNNs usually run on powerful hardware. Since the objective is to run the CNN on a mobile phone, the challenge is to identify the best network in terms of memory occupation, computational speed and accuracy.
The analysis considers the following CNNs: AlexNet, Residual Network, GoogLeNet (Inception), VGG, SqueezeNet, BWN and XNOR-Net.
The training of the CNNs follows two main steps and has been carried out on both the ETHZ and UPMC datasets, which come from universities in Zurich and Paris, respectively.
In the first step, the CNNs have been trained from scratch in order to select the best ones in terms of accuracy, computational speed and memory occupation. In the second step, the best CNNs identified in this way have been fine-tuned.
The frameworks used for training and testing the networks are Caffe, which is probably the most widely used, and Torch. To simplify both training and testing, the NVIDIA Deep Learning GPU Training System (DIGITS) has been used. This system provides an efficient and easy-to-use web interface for formatting images and setting network parameters.
Once the best CNN has been identified, the application itself is developed. The application works in two modes: running the CNN locally on the smartphone using an implementation based on RenderScript, or querying the CNN deployed on a server as a web application.
Afterwards, a statistical study of the results has been carried out in order to improve the user experience. By setting a threshold on the score assigned by the CNN to a given dish, it is possible to tell the user whether the prediction is reliable. The application has then been designed and developed using Android Studio.
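The score threshold described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the example labels and the threshold value of 0.5 are all assumptions made for illustration, and the scores are assumed to be softmax probabilities computed from the network's raw outputs.

```python
import math


def softmax(logits):
    """Convert raw network outputs into scores that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_threshold(logits, labels, threshold=0.5):
    """Return (label, score, is_reliable) for the top-scoring class.

    `threshold` is a hypothetical value; in practice it would be chosen
    from a statistical study of the scores on held-out images.
    """
    scores = softmax(logits)
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best], scores[best] >= threshold


# Hypothetical labels and outputs, for illustration only.
labels = ["pizza", "sushi", "ramen"]
print(predict_with_threshold([4.0, 1.0, 0.5], labels))
print(predict_with_threshold([1.0, 0.9, 0.8], labels))
```

In the first call one class clearly dominates, so the prediction is flagged as reliable; in the second the scores are nearly uniform, so the application could warn the user instead of showing a single dish name.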
Based on the results of this research, it can be concluded that the best accuracy is obtained by GoogLeNet, which also has a small size. The CNN has been ported to the mobile phone using an implementation based on RenderScript. Moreover, the CNN has been deployed in a web application.
File
File name | Size |
---|---|
Developm...hesis.pdf | 6.52 Mb |