Thesis etd-03272025-100114
Thesis type
Master's thesis
Author
MEINI, DENNY
URN
etd-03272025-100114
Title
Development of a Multimodal Data pipeline for Drawings Similarity
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Prof. Galatolo, Federico Andrea
supervisor Dott. Consoloni, Marco
supervisor Prof. Giordano, Vito
Keywords
- embeddings
- image similarity
- multimodality
- OCR
- perspective classification
- scraping
- technical drawings
Defense session start date
14/04/2025
Availability
Not available
Release date
14/04/2095
Abstract
Assessing the similarity of technical drawings is a crucial task, since it helps avoid many legal problems during the ideation and development of new products. However, finding similarities between technical drawings is extremely difficult, since both the images and the text have characteristics that make each of them imperfect for this task.
A promising solution is to develop a multimodal system.
Multimodality concerns the use of various types of data to gain more knowledge and a more complete view of the problem.
In this thesis, we apply multimodality to the field of technical document retrieval, where the data to be used are the text and the images of the documents. The goal is technical drawing similarity, that is, evaluating how similar two technical documents are by looking at their images. To this end, we used various techniques that allow us to extract information from both images and text.
This, combined with the development of a perspective classifier, allowed us to compare multiple images in the same space, in order to observe the similarities and differences between them.
File
File name | Size |
---|---|
The thesis is not available. |