ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-04122023-163558


Thesis type
Master's thesis
Author
IBRAHIM, AHMED SALAH TAWFIK
URN
etd-04122023-163558
Title
Development of a Conversational Software Agent for a Virtual Research Environment
Department
INGEGNERIA DELL'INFORMAZIONE
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Dr. Candela, Leonardo
Keywords
  • nlp
  • natural language processing
  • vre
  • virtual research environment
  • chatbot
  • conversational agent
  • transformers
  • information retrieval
  • conversational information retrieval
  • natural language generation
  • nlg
Graduation session start date
28/04/2023
Availability
Not available for consultation
Release date
28/04/2093
Abstract
Rapid progress in NLP has greatly improved the capabilities of conversational agents. At the same time, scientific communities increasingly rely on the resources provided by virtual research environments (VREs) for collaborative research. We therefore see a need to integrate conversational agents into VREs, where they can make the resources of those environments easier to use. In particular, we focus on developing a conversational agent capable of retrieving resources via natural language queries, open-book question answering, text summarization and resource recommendation. To achieve this, we fine-tune a transformer model to build an intent classifier, an entity extractor, an offensive language classifier and an ambiguous query classifier. We also fine-tune a sentence transformer to build a neural information retriever. Finally, we fine-tune a number of generative transformer models to generate answers to questions, summaries and replies to small talk. We integrate these models into a single conversational agent, which was then deployed on one VRE for an experimental phase. During this phase, users gave feedback through a form that asked them to evaluate the replies in terms of fluency, length, speed, correctness and usefulness. Users were generally satisfied with the replies they received; however, some replies were not useful, not correct, or both. This feedback highlighted the limitations of our system, such as its inability to engage in small talk and its tendency to hallucinate when the answer to a question does not exist in the papers of the VRE. It also highlighted some of its strengths, such as the neural retriever’s ability to return the most relevant content in the majority of cases.
File