Thesis etd-02042025-213315
Thesis type
Master's thesis
Author
CANZONERI, DANIELE
URN
etd-02042025-213315
Title
A Scalable Multi-Modal Perception System for Cognitive Architectures in Humanoid Robotics
Department
INGEGNERIA DELL'INFORMAZIONE (Information Engineering)
Degree programme
ARTIFICIAL INTELLIGENCE AND DATA ENGINEERING
Supervisors
supervisor Prof. Cimino, Mario Giovanni Cosimo Antonio
supervisor Prof. Galatolo, Federico Andrea
supervisor Dott. Cominelli, Lorenzo
Keywords
- artificial intelligence
- cognitive architectures
- deep learning
- human robot interaction
- micro services
- robotics
Exam session start date
21/02/2025
Availability
Not available for consultation
Release date
21/02/2028
Abstract
This thesis presents a perceptual system for Abel, a social robot, to construct a "meta-scene"—a coherent representation of its environment. We built a Go-based middleware that coordinates real-time data flow between neural network-powered modules.
Key modules include a Voice Activity Detector, Speech-to-Text converter, Object and Subject Detectors, Depth Estimator, and Saliency Estimator. These operate asynchronously, contributing to a shared representation maintained by the server. A Python-based client library enhances flexibility and extensibility.
Compared to existing middleware, our system offers superior modularity, low-latency processing, and adaptability. Benchmarks confirm its efficiency in handling sensory data and enabling multi-modal perception.
File
File name | Size |
---|---|
The thesis is not available for consultation. |