Thesis etd-06142022-120833

Thesis type

Master's thesis

Author

OLIVOTTO, VALENTINA

URN

etd-06142022-120833

Title

Small object detection on high-resolution images: a comparison of Slice & Merge approaches

Department

COMPUTER SCIENCE

Degree programme

DATA SCIENCE AND BUSINESS INFORMATICS

Supervisors

supervisor Prof. Gallicchio, Claudio

Keywords

fabric defect detection
high-resolution images
object detection
yolo

Graduation session start date

01/07/2022

Availability

Not available

Release date

01/07/2092

Abstract

This work compares three new approaches, based on the YOLOv5 architecture, for a defect detection task in an industrial laundry. Automated technologies in this sector are valuable for reducing costs and time and for increasing customer satisfaction. Current approaches are based on traditional Computer Vision techniques, which consider the characteristics of pixels and their statistical properties. These systems have many limitations in terms of robustness and hardware complexity. The idea is therefore to use Artificial Intelligence models based on Convolutional Neural Networks to overcome these issues. In particular, YOLO was considered the best choice for this purpose, since it achieves excellent performance on state-of-the-art real-time object detection tasks, in terms of both speed and accuracy.

The main issue in this case study is that the dataset contains high-resolution images, while the defects to predict are only a few pixels in size. Small object detection on large images is an open problem in the Computer Vision field, especially for industrial tasks, where the available hardware is limited. As the results of the two proposed baselines show, it is necessary to find a good trade-off between detection accuracy and computational cost. Thus, in this thesis, three different Slice & Merge approaches have been implemented and analyzed. The strategy is to slice the original input images in order to reduce computational costs while preserving the details of the high-resolution data. In particular, three slicing approaches have been considered to create the tiles, and the image_bbox_slicer library has been used to crop the images and generate their corresponding annotation files. YOLOv5s has then been used to train a model and detect the bounding boxes in the test set. Finally, since the final outcome should be the original whole image with all the predicted boxes, two merge methodologies have been proposed. Many merging heuristics are possible, so all the cases considered and the decisions taken are reported and explained.
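
The general Slice & Merge idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the record does not detail the three slicing strategies or the two merge heuristics, and the sketch uses neither image_bbox_slicer nor YOLOv5. It assumes square tiles with a fixed overlap, detections given as (x1, y1, x2, y2, score) in tile coordinates, and a plain greedy non-maximum suppression as a stand-in merge step. All function names, the tile size, the overlap, and the IoU threshold are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) along one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if tile >= length:
        return [0]
    step = tile - overlap
    origins = list(range(0, length - tile, step))
    origins.append(length - tile)  # last tile sits flush with the border
    return origins


def slice_grid(width: int, height: int, tile: int, overlap: int) -> List[Tuple[int, int]]:
    """Top-left corners (x, y) of all tiles, in row-major order."""
    return [(x, y)
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def to_global(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a box from tile coordinates to original-image coordinates."""
    x1, y1, x2, y2, score = box
    ox, oy = origin
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge(boxes: List[Box], iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS (assumed merge rule): keep the highest-scoring box of each overlapping group."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


# Usage: a 1000x800 image cut into 400-pixel tiles with a 100-pixel overlap.
grid = slice_grid(1000, 800, tile=400, overlap=100)
# Hypothetical per-tile detections; the first two are the same defect seen
# in the overlap between two neighbouring tiles.
detections = [
    to_global((350, 10, 390, 50, 0.9), (0, 0)),
    to_global((50, 10, 90, 50, 0.8), (300, 0)),
    to_global((10, 10, 20, 20, 0.7), (600, 400)),
]
merged = merge(detections)
```

In this sketch the overlap between tiles ensures that a defect lying on a tile border appears whole in at least one tile, at the price of duplicate detections that the merge step must then remove.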

The experimental analysis performed in the thesis shows that the introduced approaches are competitive: they reach performance comparable to that obtained on the high-resolution images while reducing the computational requirements of the deep learning algorithms.

File

File name	Size
Thesis not available. Contact the author

ETD

Digital archive of theses defended at the Università di Pisa
