ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-04172018-001601


Thesis type
Master's thesis
Author
MARINI, MARCO
URN
etd-04172018-001601
Title
A new speech analysis technique to improve the performance of an Automatic Speech Recognition system for people with disabilities
Department
INFORMATION ENGINEERING
Degree programme
EMBEDDED COMPUTING SYSTEMS
Supervisors
supervisor Prof. Fanucci, Luca
supervisor Prof. Vanello, Nicola
Keywords
  • ASR system
  • dysarthric people
  • persone con disabilità (people with disabilities)
  • impaired speech
  • disartria (dysarthria)
  • ausilio (aid)
  • innovative speech analysis
  • dysarthria
  • technological aid
Graduation session start date
07/05/2018
Availability
Not available for consultation
Release date
07/05/2088
Abstract
This thesis discusses dysarthria, in particular its causes and implications.
Since the main effect of dysarthria on the people affected is reduced social
interaction, the thesis investigates a possible aid system built on current
technology.
The proposed aid system is composed of several elements. The one that plays
the crucial role is the Automatic Speech Recognition (ASR) component, which
transcribes the audio signal into sequences of words.
Speech recognition accuracy is excellent on unimpaired speech, but it is hard
to recognize confusable words and discontinuous speech, both of which
characterize dysarthric speech. As a result, dysarthric speakers need major
innovations in the speech research field.
An in-depth analysis of ASR systems allowed us to understand why these
systems perform poorly on dysarthric speech. It is therefore necessary to
find a new type of voice feature that could improve the performance of the
ASR system.
Several experiments were carried out using speech recordings from five people
with disabilities. The results allowed us to develop a new speech analysis
technique, with a consequent improvement ranging from 31% to 81%, depending
on the user analysed.
File