Thesis etd-06062024-121356
Thesis type
Doctoral thesis
Author
BUONGIORNO, ROSSANA
URN
etd-06062024-121356
Title
Optimizing Medical Image Segmentation using a Priori knowledge in Attention Mechanism-Enriched Convolutional Neural Networks
Scientific-disciplinary sector
ING-INF/06
Degree programme
INGEGNERIA DELL'INFORMAZIONE
Advisors
tutor Prof. Ducange, Pietro
supervisor Dr. Colantonio, Sara
supervisor Dr. Germanese, Danila
Keywords
- attention mechanisms
- deep learning
- medical image segmentation
- spatial priors
Examination session start date
13/06/2024
Availability
Not available for consultation
Release date
13/06/2027
Abstract
In recent years, medical image segmentation has undergone a remarkable shift, driven by the convergence of Deep Learning (DL) and medical imaging technologies. This convergence has led to significant progress, fundamentally altering how medical image analysis is approached. DL methods, notably Convolutional Neural Networks (CNNs), have played a pivotal role in this transformation: they extract features automatically from raw image data and achieve unparalleled levels of accuracy and sensitivity.
However, despite these advances, persistent challenges such as computational demands, data quality and availability, interpretability, and model generalization hinder the broad adoption of DL models in clinical environments. Moreover, while CNNs manage to autonomously extract and analyze image features with a good level of detail, they often struggle to identify regions in images that exhibit complexities that are challenging even to the human eye.
To address these issues, attention and recurrence mechanisms have been introduced. The former enhance the network's ability to focus on relevant regions in the image while ignoring irrelevant background, whereas the latter capture long-range dependencies between different areas of the image to obtain broader contextual information.
The first part of this doctoral thesis examines attention and recurrence mechanisms to determine their efficacy in binary medical image segmentation. Specifically, the objective was to identify the mechanism that strikes the optimal balance between resource utilization, data availability, and accurate segmentation outcomes for the given problem statement. The results of this analysis show that attention mechanisms improve segmentation accuracy by dynamically adjusting the weights assigned to different image regions, while also optimizing data requirements. However, effectively directing the CNN's attention remained challenging in scenarios requiring a clear and precise differentiation between subtle variations crucial for accurate diagnoses.
These challenges formed the basis for the second part of the thesis, which explores the integration of spatial priors into CNN architectures, specifically within a UNet-based framework enriched with the attention mechanism, namely the Attention-UNet. More precisely, by incorporating prior knowledge about the spatial location of the objects to be segmented, the proposed approach aims to enhance CNN effectiveness in the segmentation task. A new framework, called SPI-net, was designed for this purpose. SPI-net features an Attention-UNet as a backbone, an upstream block aimed at obtaining the spatial prior, and an additional novel branch featuring long skip connections to inject nuanced context-aware information into the decoding pathway of the network. This improves the network's understanding of underlying structures and enhances segmentation accuracy.
The experimental application and evaluation of SPI-net focused on the segmentation of COVID-19 infections, leveraging prior knowledge of disease spatial location to guide CNN attention. The results demonstrate the efficacy of SPI-net in accurately delineating disease patterns, outperforming traditional segmentation approaches. The comparative analysis highlights the limitations of conventional pre-processing operations, emphasizing the importance of integrating spatial priors into CNN architectures.
Overall, this research contributes to the advancement of medical image segmentation by implicitly incorporating prior knowledge into CNNs, offering insights and empirical evidence to enhance segmentation accuracy and interpretability. The findings extend beyond COVID-19 segmentation, offering a promising framework for various medical imaging applications and contributing to the evolution of CNNs as reliable tools in healthcare diagnostics.
File
File name | Size |
---|---|
The thesis is not available for consultation. |