
ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-01202022-135950


Thesis type
Master's thesis
Author
SENSOLI, FEDERICO
URN
etd-01202022-135950
Title
Study and development of machine learning methods to detect vocal fold benign lesions from audio recordings
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
BIONICS ENGINEERING
Supervisors
supervisor Mannini, Andrea
supervisor Cianchetti, Matteo
co-examiner Vanello, Nicola
Keywords
  • support vector machine
  • vocal folds benign lesions detection
  • machine learning
  • acoustic features
Graduation session start date
11/02/2022
Availability
Thesis not available for consultation
Abstract
In the current state of the art, the diagnosis of benign lesions of the vocal folds is still challenging. Analysing acoustic signals with Machine Learning models can be a viable solution to support clinical diagnosis.
The dataset comprised 418 audio signals recorded from four pathological groups (vocal fold cyst, polyp, nodules, and Reinke's edema, with 69, 141, 96, and 112 recordings, respectively). Each signal was processed by removing unvoiced parts and segmenting it into 50% overlapping sliding windows of 0.046 s (1024 samples per window at a 22050 Hz sampling rate). From each window, 66 features were extracted; seven statistical measures of each feature were then computed across windows: mean, standard deviation, skewness, kurtosis, and the 25th, 50th, and 75th percentiles. In addition, four features were extracted from the whole signal, leading to a total of 466 features (66 × 7 + 4). In our preliminary analysis, ANOVA or the Kruskal-Wallis test, with post-hoc analyses, was used to test the significance of the features in discriminating the four groups. The normality of each group was tested with the Kolmogorov-Smirnov test.
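As a rough illustration of the windowing and aggregation described above, the following standard-library Python sketch splits a signal into 50% overlapping windows of 1024 samples and reduces one per-window feature to the seven listed statistics. The function names are hypothetical, and the abstract does not specify whether population or sample standard deviation, which kurtosis convention, or which percentile interpolation was used; the choices below (population standard deviation, non-excess kurtosis, linear interpolation) are assumptions.

```python
import statistics

SAMPLE_RATE = 22050          # Hz, as stated in the abstract
WINDOW_SIZE = 1024           # samples per window, as stated in the abstract
HOP_SIZE = WINDOW_SIZE // 2  # 50% overlap between consecutive windows


def sliding_windows(signal, size=WINDOW_SIZE, hop=HOP_SIZE):
    """Return full-length overlapping windows; a trailing partial window is dropped."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100] (assumed convention)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(values):
    """Seven statistics of one per-window feature, computed across all windows."""
    n = len(values)
    mean = sum(values) / n
    std = statistics.pstdev(values)  # population std (assumption)
    if std == 0:
        skew, kurt = 0.0, 0.0
    else:
        skew = sum(((v - mean) / std) ** 3 for v in values) / n
        kurt = sum(((v - mean) / std) ** 4 for v in values) / n  # non-excess
    return {
        "mean": mean, "std": std, "skewness": skew, "kurtosis": kurt,
        "p25": percentile(values, 25),
        "p50": percentile(values, 50),
        "p75": percentile(values, 75),
    }


# Feature-count arithmetic from the abstract:
# 66 per-window features x 7 statistics + 4 whole-signal features.
N_FEATURES = 66 * 7 + 4
```

Each window lasts `WINDOW_SIZE / SAMPLE_RATE` seconds, roughly 0.046 s, and `N_FEATURES` evaluates to 466, matching the figures reported above.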
A total of 138 features were found to be associated with statistically significant differences across groups (significance threshold: p-value ≤ 0.05). For the classification task, a non-linear Support Vector Machine was trained and evaluated with 10-fold cross-validation on a feature set made of the 138 significant features. Model performance was assessed in terms of accuracy and average F1-score, and the results were further analysed within the male and female subgroups. Machine learning methods such as data augmentation and feature selection were used to improve classification performance, leading to accuracies of 55.7%, 79.7%, and 54.4% on all samples, males, and females, respectively. The results show 1) better performance on male samples and 2) pathology-specific performance that differs by gender: vocal fold nodules and polyps are better detected in male samples, whereas vocal fold cysts are better detected in female samples.
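The evaluation metrics and the 10-fold split mentioned above can be sketched in standard-library Python as follows. This is only an illustration: the abstract says "average F1-score" without naming the averaging scheme, so macro averaging (unweighted mean of per-class F1) is an assumption, as is the stratified fold assignment. All function names are hypothetical, and the classifier itself is not reproduced here.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, round-robin within each class,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        for j, idx in enumerate(by_class[label]):
            folds[(offset + j) % k].append(idx)
        offset += len(by_class[label])
    return folds


# Class sizes reported in the abstract (418 recordings in total).
labels = (["cyst"] * 69 + ["polyp"] * 141
          + ["nodules"] * 96 + ["reinke_edema"] * 112)
folds = stratified_folds(labels, k=10)
```

In each cross-validation round, one fold would serve as the test set and the remaining nine as the training set; `accuracy` and `macro_f1` would then be computed on the held-out predictions.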
Future developments might include exploring different classifiers, such as artificial neural networks, or other machine learning techniques like Principal Component Analysis. Finally, a healthy voice class could be included in the classification task to increase the number of potential users of our application.