Thesis etd-04152019-100924

Thesis type

Master's thesis

URN

etd-04152019-100924

Title

Audio-Augmented Dialogue Systems

Department

Computer Science

Degree programme

Computer Science

Supervisors

Supervisor: Bacciu, Davide
Supervisor: Cambria, Erik

Keywords

Audio features
Dialogue systems
Language generation
Multi-modality

Defense session start date

03/05/2019

Availability

Not available

Release date

03/05/2089

Abstract

Research on building dialogue systems that can converse naturally with humans has recently attracted considerable attention. Most work in this area assumes text-based conversation, where the user message is modeled as a sequence of words from a vocabulary. Real-world human conversation, in contrast, involves other modalities, such as voice, facial expression, and body language, which in certain scenarios can significantly influence the conversation.

In this work, we explore the impact of incorporating the audio features of the user message into the dialogue system. Specifically, we first design an auxiliary response classification task to refine raw audio features. Then we use word-level modality fusion to incorporate the audio features as additional context in our main generative model. Experiments show that our audio-augmented model outperforms the audio-free counterpart on perplexity, response diversity and human evaluation.
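As a rough illustration of what word-level modality fusion can look like, the sketch below pairs each word vector with an audio feature vector aligned to the same word and concatenates the two. This is only an assumption for illustration: the abstract does not state the fusion operator, the feature dimensions, or how audio is aligned to words, and the function name `fuse_word_level` and the toy values are hypothetical.

```python
from typing import List, Sequence


def fuse_word_level(
    word_vectors: Sequence[Sequence[float]],
    audio_vectors: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate each word vector with the audio vector aligned to that word.

    Assumption: audio features have already been aligned to words, so both
    inputs hold exactly one vector per word, in the same order.
    """
    if len(word_vectors) != len(audio_vectors):
        raise ValueError("expected exactly one audio vector per word")
    # Concatenation is one simple fusion choice; the thesis may use another.
    return [list(w) + list(a) for w, a in zip(word_vectors, audio_vectors)]


# Toy example: three words, 2-dimensional word vectors, 1-dimensional audio features.
words = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
audio = [[0.5], [0.7], [0.9]]
fused = fuse_word_level(words, audio)
print(len(fused), len(fused[0]))
```

Under this assumption, the fused per-word vectors would replace the plain word vectors as input to the encoder of the generative model, so the rest of the architecture is unchanged.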

Files

File name	Size
Thesis not available. Contact the author.

ETD

Digital archive of theses defended at the Università di Pisa
