Thesis etd-03262026-202240
Thesis type
Master's degree thesis
Author
NKAIWATEI, PURITY SIANTAYO
URN
etd-03262026-202240
Title
Security Threats and Defenses for Fuzzy Regression Trees in Federated Learning
Department
INFORMATION ENGINEERING
Degree programme
CYBERSECURITY
Supervisors
Supervisor Dr. Ruffini, Fabrizio
Supervisor Prof. Ducange, Pietro
Co-supervisor Daole, Mattia
Keywords
- Adversarial Machine Learning
- Artificial Intelligence
- Attacks in Federated Learning
- Defenses in Federated Learning
- Federated Learning
- Fuzzy Regression Trees
- Machine Learning
Thesis defence date
15/04/2026
Availability
Not available for consultation
Release date
15/04/2096
Abstract (English)
In recent years, Artificial Intelligence (AI) has transformed industrial operations across
healthcare, finance, security, and the Internet of Things (IoT). However, its adoption remains
cautious in high-stakes sectors, particularly in domains that handle sensitive or private
data. As a result, industries handling privacy-sensitive data require a system that produces predictions that are transparent and trustworthy.
In this context, transparency refers to the extent to which the internal functioning of
a model can be examined and its decisions interpreted. It involves the ability to trace
how inputs are processed and to provide explanations that are meaningful to human
stakeholders. Trustworthiness, in turn, encompasses a broader set of requirements in
artificial intelligence systems, including fairness, interpretability, robustness, explainability, safety and security. An AI system is therefore considered trustworthy not only when
it achieves strong predictive performance but also when its behaviour remains consistent,
its outputs can be justified, and sensitive information is adequately protected.
Federated Learning (FL) decentralizes the training process, allowing multiple data
owners to collaboratively build a shared model without exchanging raw data. Data
privacy is preserved, as raw data never leaves the source. However, the distributed
nature of federated learning introduces unique security vulnerabilities. Because training
is performed across multiple clients, any adversarial behaviour poses a significant risk
to the global model. Malicious participants can perform poisoning attacks by injecting corrupted data or manipulating statistics, leading to a degradation in the performance of
the global model.
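
To make this threat concrete, below is a minimal Python sketch of one federated averaging round with a single poisoned update. The scalar parameter vectors, equal client sizes, and the FedAvg-style weighted average are illustrative assumptions, not the actual federated FRT learning procedure studied in this thesis.

```python
import numpy as np

def fedavg(updates, sizes):
    """Size-weighted average of client parameter vectors (FedAvg-style).
    Hypothetical aggregator, for illustration only."""
    weights = np.asarray(sizes, dtype=float)
    return np.average(np.stack(updates), axis=0, weights=weights / weights.sum())

# Three honest clients agree on parameters near [1, 2]; one malicious
# client submits an update fitted on poisoned data.
honest = [np.array([1.0, 2.0]), np.array([1.1, 1.9]), np.array([0.9, 2.1])]
poisoned = np.array([50.0, -40.0])             # corrupted contribution
print(fedavg(honest, [100, 100, 100]))         # ~[1.0, 2.0]
print(fedavg(honest + [poisoned], [100] * 4))  # dragged to [13.25, -8.5]
```

Because the aggregator averages every contribution, a single extreme update is enough to pull the global parameters far from the honest consensus.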
In this work, we investigate the security of a Fuzzy Regression Tree (FRT) model learned in a federated fashion under data poisoning attacks. The model is interpretable by design: its rule-based predictions enhance transparency and trustworthiness. We perform a systematic experimental evaluation, analyzing how the performance of the global model changes in the presence of adversarial participants.
Due to the interpretable nature of the model, we also analyze how its structure changes. Unlike black-box (also referred to as "opaque") models, the FRT encodes
knowledge in the form of hierarchical decision paths that can be directly translated into
human-readable rules. Structural properties such as tree depth, path length, and number of nodes reflect how the model partitions the input space and formulates
its predictions. Changes in these properties provide insight into the reasoning process of
the model.
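
As an illustration of these structural measures, the following sketch computes node count, depth, and mean rule (root-to-leaf path) length on a generic tree. The Node class is a hypothetical stand-in, not the actual FRT data structure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    # Hypothetical stand-in for an FRT node: internal nodes hold
    # children (fuzzy splits); a node with no children is a leaf.
    children: List["Node"] = field(default_factory=list)

def num_nodes(node: Node) -> int:
    """Total node count: a coarse measure of model complexity."""
    return 1 + sum(num_nodes(c) for c in node.children)

def depth(node: Node) -> int:
    """Maximum root-to-leaf depth: the longest decision path."""
    return 1 + max((depth(c) for c in node.children), default=0)

def leaf_depths(node: Node, d: int = 1) -> List[int]:
    if not node.children:
        return [d]
    return [x for c in node.children for x in leaf_depths(c, d + 1)]

def avg_path_length(node: Node) -> float:
    """Mean rule length: longer paths mean harder-to-read rules."""
    depths = leaf_depths(node)
    return sum(depths) / len(depths)

# Toy tree with three leaves: 5 nodes, depth 3, mean path length 8/3.
tree = Node([Node([Node(), Node()]), Node()])
print(num_nodes(tree), depth(tree), avg_path_length(tree))
```

Tracking these quantities across federation rounds makes it possible to quantify how an attack distorts not just accuracy but the readability of the extracted rules.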
Our study accounts for realistic federated settings by incorporating data heterogeneity among participants, evaluated on two widely adopted regression datasets: California Housing and Parkinson Disease. This heterogeneity includes skewed distributions in both data quantity and feature space. In the presence of attacks and heterogeneity, the
model’s structural analysis becomes particularly relevant, as it allows assessment of not
only performance degradation but also the extent to which interpretability is affected.
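
One common recipe for simulating such quantity skew is a Dirichlet split of the training indices, sketched below; this is a generic illustration and not necessarily the exact partitioning scheme used in the experiments.

```python
import numpy as np

def quantity_skewed_split(n_samples: int, n_clients: int,
                          alpha: float = 0.5, seed: int = 0):
    """Partition sample indices with Dirichlet-distributed shard sizes.
    Smaller alpha -> more unequal data quantities across clients."""
    rng = np.random.default_rng(seed)
    proportions = rng.dirichlet(alpha * np.ones(n_clients))
    counts = np.floor(proportions * n_samples).astype(int)
    counts[-1] += n_samples - counts.sum()   # assign the rounding remainder
    idx = rng.permutation(n_samples)
    return np.split(idx, np.cumsum(counts)[:-1])

# e.g. splitting the 20,640 California Housing samples over 10 clients
shards = quantity_skewed_split(20_640, 10, alpha=0.3)
print(sorted(len(s) for s in shards))        # highly unequal shard sizes
```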
Existing literature predominantly focuses on classification tasks. Regression tasks remain comparatively underexplored in federated learning, particularly when considered in conjunction with interpretability and data heterogeneity. By focusing on fuzzy regression trees, this study addresses this gap and investigates how these factors interact under adversarial conditions, extending the analysis beyond the extensively studied classification-oriented work.
To mitigate these threats, we propose a trimmed aggregation mechanism, which is
a variation of "Trimmed Mean". In general, trimming is a robust statistical technique
in which extreme values are removed before aggregation, with the goal of reducing the
influence of outliers. This concept is applied to participants’ updates, where a fraction
of the most extreme contributions is discarded prior to aggregation. We evaluate the
effectiveness of this defense by comparing the model performance and structure under
attack with and without the proposed defense mechanism.
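
Since the defense builds on trimming, the classical coordinate-wise "Trimmed Mean" baseline can be sketched as follows. It operates on plain parameter vectors, assumed here for illustration; the variant proposed in the thesis adapts the idea to the federated FRT setting.

```python
import numpy as np

def trimmed_mean(updates, trim_frac=0.1):
    """Coordinate-wise trimmed mean: per parameter, drop the trim_frac
    smallest and largest client values, then average the rest."""
    stacked = np.sort(np.stack(updates), axis=0)  # sort each coordinate
    k = int(trim_frac * len(updates))             # values trimmed per side
    return stacked[k:len(updates) - k].mean(axis=0)

updates = [np.array([1.0, 2.0]), np.array([1.1, 1.9]),
           np.array([0.9, 2.1]), np.array([50.0, -40.0])]  # one outlier
print(trimmed_mean(updates, trim_frac=0.25))  # ~[1.05, 1.95], outlier gone
```

Compared with the plain average shown earlier, the extreme contribution is discarded per coordinate before aggregation, so a bounded fraction of malicious participants cannot arbitrarily shift the result.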
The observed results indicate that the presence of malicious participants can lead to a severe degradation in model performance. This effect becomes particularly evident as the number of malicious participants increases. The model's interpretability, measured through tree complexity and structure, severely deteriorates under the implemented attacks, with the tree becoming increasingly irregular and less linguistically coherent.
The results of this analysis expose the vulnerabilities in explainable federated regression models, providing a foundation for systematic design and development of appropriate
countermeasures.
Abstract (Italian)
File

| File name | Size |
|---|---|
| Thesis not available for consultation. | |