Thesis etd-01292021-184347

Thesis type

Master's thesis

Author

GIANNESSI, RAFFAELE

URN

etd-01292021-184347

Title

Health monitoring and recovery in embedded hypervisors

Department

INGEGNERIA DELL'INFORMAZIONE

Degree programme

EMBEDDED COMPUTING SYSTEMS

Supervisors

supervisor Prof. Buttazzo, Giorgio C.
supervisor Dott. Biondi, Alessandro
supervisor Ing. Cicero, Giorgiomaria

Keywords

Cyber-physical systems
Embedded Systems
Hypervisor
Isolation
Mixed-criticality
Monitor
Recovery
Safety
Virtualization

Exam session start date

19/02/2021

Availability

Not available for consultation

Release date

19/02/2091

Abstract

Software complexity in embedded systems is continuously increasing, while embedded computing platforms are becoming more powerful and heterogeneous in order to perform high-performance computations within limited power budgets. Modern embedded software systems are composed of subsystems with different levels of criticality and security, which makes the adoption of a single Operating System (OS) to handle all software tasks in a holistic fashion risky and inefficient. For this reason, virtualization technology is establishing itself as the de facto solution to securely and safely host mixed-criticality software on the same platform, by providing a multi-domain environment in which Real-Time Operating Systems (RTOSs) may coexist, in isolation, with General Purpose OSs (e.g., Linux). Due to their large code base and software complexity, the latter are much more prone to safety and security threats than RTOSs, hence calling for continuous monitoring to detect and react to failures of different kinds.
The aim of this thesis is to design and implement hypervisor-level mechanisms to detect failures in virtual machines (VMs) and to recover failed VMs that host the Linux OS in a mixed-criticality environment. The mechanisms have been realized within CLARE-Hypervisor, a fully-static type-1 real-time hypervisor targeting cyber-physical systems on heterogeneous platforms.
The main idea is to detect Linux crashes and perform a warm reset of the corresponding VM while keeping the rest of the system up, so that the other VMs can continue to run without experiencing any unwanted interference. The proposed monitoring technique combines two approaches: synchronous and asynchronous fault detection. The former relies on a direct notification to the hypervisor that the execution flow of the VM has reached the kernel panic code section. The latter relies on a watchdog timer implemented at the hypervisor level, with refresh notifications sent by the VM by means of a Linux kernel module.
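As an illustration of the two detection paths described above, the following is a minimal Python model, not the CLARE-Hypervisor implementation. All names (`VmMonitor`, `refresh`, `notify_panic`, `tick`) and the tick-based timeout are assumptions made for the sketch; the abstract does not specify the actual interfaces.

```python
from dataclasses import dataclass
from enum import Enum, auto


class VmState(Enum):
    RUNNING = auto()
    FAILED = auto()


@dataclass
class VmMonitor:
    """Toy per-VM health monitor; every name here is hypothetical."""
    timeout_ticks: int            # assumed watchdog period, in hypervisor ticks
    last_refresh: int = 0         # tick of the most recent refresh
    state: VmState = VmState.RUNNING
    reason: str = ""

    def refresh(self, now: int) -> None:
        # Asynchronous path: the guest kernel module periodically refreshes the watchdog.
        if self.state is VmState.RUNNING:
            self.last_refresh = now

    def notify_panic(self) -> None:
        # Synchronous path: the guest reports that it reached its kernel panic code.
        self._fail("panic")

    def tick(self, now: int) -> None:
        # Called on each hypervisor timer tick; flags the VM if refreshes have stopped.
        if self.state is VmState.RUNNING and now - self.last_refresh > self.timeout_ticks:
            self._fail("watchdog")

    def _fail(self, reason: str) -> None:
        # Record only the first detected failure.
        if self.state is VmState.RUNNING:
            self.state = VmState.FAILED
            self.reason = reason
```

In this model a hang that never reaches the panic code is still caught by the timeout, while an explicit panic is reported immediately without waiting for the watchdog to expire, which is one plausible reason for combining the two approaches.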
Once a failure is detected, a warm reset of the failed VM is performed without directly involving the hypervisor, since the recovery procedure runs within the context of the same VM. In this way, the workload of the recovery procedure is handled as normal VM execution, thus preserving all the isolation properties configured for the VM of interest.
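The isolation argument can be illustrated with a small Python model. It assumes, purely for illustration, that each VM receives a fixed budget per scheduling cycle; the abstract does not describe the actual scheduling policy, and the names `Vm`, `run_slot`, and `run_cycle` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    name: str
    needs_reset: bool = False
    reset_work_left: int = 0   # hypothetical units of recovery work
    useful_work: int = 0       # units of normal execution completed


def run_slot(vm: Vm, budget: int) -> None:
    """Spend one slot of `budget` units on `vm` and on no other VM."""
    if vm.needs_reset:
        # Recovery is charged to the failed VM's own budget.
        done = min(budget, vm.reset_work_left)
        vm.reset_work_left -= done
        budget -= done
        if vm.reset_work_left == 0:
            vm.needs_reset = False
    vm.useful_work += budget


def run_cycle(vms: list[Vm], budget: int) -> None:
    # Assumed fixed per-VM budget: every VM gets the same slot each cycle.
    for vm in vms:
        run_slot(vm, budget)
```

Under this assumption, a recovering VM loses only its own slots: a second VM scheduled in the same cycles completes the same amount of work whether or not the first one is resetting.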
Finally, experimental results are reported to demonstrate the feasibility of reliable and interference-free monitoring and recovery mechanisms for mixed-criticality cyber-physical systems equipped with the Xilinx UltraScale+ SoC, showing a negligible impact on boot latency, recovery time, and run-time overhead.

Files

File name	Size
Thesis not available for consultation. Contact the author.

ETD

Digital archive of theses defended at the Università di Pisa
