ETD

Digital archive of the theses defended at the University of Pisa

Thesis etd-01272018-201951


Thesis type
Master's thesis
Author
ANCILOTTI, BEATRICE
URN
etd-01272018-201951
Title
The User Side of Innovation. Extracting User Categories from patents.
Department
INGEGNERIA DELL'ENERGIA, DEI SISTEMI, DEL TERRITORIO E DELLE COSTRUZIONI
Degree programme
INGEGNERIA GESTIONALE
Supervisors
supervisor Prof. Bonaccorsi, Andrea
supervisor Dr. Chiarello, Filippo
Keywords
  • text mining
  • categories
  • user
Defence session start date
21/02/2018
Availability
Not available for consultation
Release date
21/02/2088
Abstract
Given the competitiveness of the business environment and the pace of technological change, early identification of opportunities is crucial for strategy formulation. In addition, the rapid growth in the number of patent documents requires increasingly sophisticated tools for content analysis.
In this context, Data Mining techniques, and Text Mining in particular, have acquired growing importance for analyzing the contents of patents. Text Mining can discover relevant information in documents and transform text into data through different approaches, one of which is Natural Language Processing (NLP).
This work aims to establish a new method for identifying different categories of textual elements inside documents. In particular, considering the increasing variety of readers interested in patent analysis, this work focuses on users of patents. Because users' behavior and needs change continuously, new categories of readers, such as marketers and designers, are increasingly interested in patent analysis.
The input file was a list of 73286 items, generated by merging existing lists of entities, such as jobs and hobbies, and by using machine learning techniques. Starting from this list, the first step was to study how the number of patents citing a user can serve as an index of how generic or specific that user is.
Secondly, users were assigned to different categories by searching for tags. This second part of the work helps to understand how a generic user, in contrast to a specific one, can appear in different groups of entities. Thus, the more generic a user is, the higher the probability of finding that user in patents from different technological areas.
The work is structured as follows: Chapter 1 reviews the state of the art in text mining techniques, Chapter 2 defines the method used to achieve the purpose of the work, and Chapter 3 describes its application to the case study of users.
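The two measures mentioned in the abstract, the number of patents citing a user and the spread of that user across technological areas, can be illustrated with a minimal sketch. This is not the method implemented in the thesis, which is not available for consultation. The toy patents, the class codes, the user list, the function name `user_statistics` and the whole-word matching rule are all assumptions made only for this example.

```python
import re
from collections import defaultdict

# Toy data invented for illustration: (patent id, technology class, text).
# None of these records come from the thesis.
PATENTS = [
    ("P1", "A61", "A device that helps the patient and the nurse monitor dosage."),
    ("P2", "B60", "A seat adjusting to the driver; a patient transport mode is included."),
    ("P3", "G06", "Software that lets a patient share records with a physician."),
    ("P4", "B60", "A dashboard that warns the driver about lane departure."),
]
# Hypothetical user terms; the real input list held 73286 items.
USERS = ["patient", "driver", "nurse", "surgeon"]


def user_statistics(patents, users):
    """Return {user: (patent_count, class_count)} for each user term."""
    patent_ids = defaultdict(set)
    classes = defaultdict(set)
    for user in users:
        # Assumed matching rule: whole word, case-insensitive, optional plural "s".
        pattern = re.compile(r"\b" + re.escape(user) + r"s?\b", re.IGNORECASE)
        for pid, tech_class, text in patents:
            if pattern.search(text):
                patent_ids[user].add(pid)
                classes[user].add(tech_class)
    return {u: (len(patent_ids[u]), len(classes[u])) for u in users}


stats = user_statistics(PATENTS, USERS)
# Rank users from most to least cited.
for user, (n_patents, n_classes) in sorted(stats.items(), key=lambda kv: -kv[1][0]):
    print(f"{user:8s} patents={n_patents} classes={n_classes}")
```

In this toy corpus "patient" appears in three patents spread over three classes, while "driver" appears in two patents of a single class, so under these assumptions "patient" would count as the more generic user.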
File