ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-03192025-164401


Thesis type
PhD thesis
Author
CARDIA, MARCO
URN
etd-03192025-164401
Title
Soft Sensors and Machine Learning for Automatic Water Quality Assessment
Scientific-disciplinary sector
INFO-01/A - Computer Science
Degree programme
COMPUTER SCIENCE
Supervisors
tutor Prof. Chessa, Stefano
supervisor Prof. Micheli, Alessio
supervisor Dr. Gambineri, Francesca
Keywords
  • Artificial Intelligence
  • Data Augmentation
  • Generative Adversarial Network
  • Machine Learning
  • Soft Sensing
  • Spectroscopy
  • UV-Vis spectroscopy
  • Wastewater Analysis
  • Water Quality Monitoring
Start date of the defence session
01/04/2025
Availability
Full
Abstract
Water quality monitoring is critical to ensuring environmental protection, industrial safety, and public health. Traditional laboratory-based methods, though accurate, are often slow, expensive, and labor-intensive, making them unsuitable for real-time decision-making in rapidly changing environments.
This thesis presents a novel methodology for water quality assessment that leverages Ultraviolet-Visible (UV-Vis) spectroscopy combined with machine learning to develop soft sensing systems for real-time monitoring.
The research focuses on two main applications: industrial wastewater from the highly polluting leather industry, and drinking water quality, where ensuring safety and regulatory compliance is fundamental.
In industrial contexts, the aim is to predict key water quality indicators such as Chemical Oxygen Demand (COD), Total Suspended Solids (TSS), and chlorides in real time, while in drinking water contexts, the focus is on parameters such as Total Organic Carbon (TOC), volatile organic compounds, metals, anions, cations, and microbiological parameters.
Key innovations of this research include robust preprocessing techniques to enhance data integrity and optimize model performance. Additionally, sophisticated feature extraction methods are developed, incorporating statistical measures, peak-based features, slope-based features, and Area Under the Curve (AUC) calculations to capture meaningful spectral information.
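The four feature families named above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `extract_spectral_features`, the feature names, and the specific definitions chosen (population standard deviation, strict local maxima, finite-difference slopes, trapezoidal AUC) are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def extract_spectral_features(wavelengths, absorbances):
    """Illustrative (hypothetical) feature extractor for one UV-Vis spectrum.

    wavelengths: strictly increasing sequence of wavelengths (e.g. in nm).
    absorbances: absorbance measured at each wavelength.
    """
    n = len(wavelengths)
    if n != len(absorbances) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")

    w, a = list(wavelengths), list(absorbances)
    features = {}

    # Statistical measures over the whole spectrum.
    features["mean"] = mean(a)
    features["std"] = pstdev(a)

    # Peak-based features: global maximum and number of strict local maxima.
    i_max = max(range(n), key=lambda i: a[i])
    features["peak_wavelength"] = w[i_max]
    features["peak_absorbance"] = a[i_max]
    features["n_local_peaks"] = sum(
        1 for i in range(1, n - 1) if a[i - 1] < a[i] > a[i + 1]
    )

    # Slope-based features: finite differences between adjacent points.
    slopes = [(a[i + 1] - a[i]) / (w[i + 1] - w[i]) for i in range(n - 1)]
    features["mean_slope"] = mean(slopes)
    features["max_abs_slope"] = max(abs(s) for s in slopes)

    # Area Under the Curve via the trapezoidal rule.
    features["auc"] = sum(
        0.5 * (a[i] + a[i + 1]) * (w[i + 1] - w[i]) for i in range(n - 1)
    )
    return features


# Toy triangular spectrum (made-up values, not data from the thesis).
example = extract_spectral_features(
    [200, 210, 220, 230, 240], [0.0, 1.0, 2.0, 1.0, 0.0]
)
```

A feature dictionary of this kind would typically be computed per spectrum and stacked into a table before being passed to a regression model; which features are actually used in the thesis is not specified in this abstract.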
The core of this work involves developing soft sensors that integrate UV-Vis spectroscopic data with machine learning models. A significant challenge addressed by this research is the limited availability of high-quality training data, particularly in highly polluted industrial environments. This issue is tackled using Conditional Generative Adversarial Networks (CGANs) for data augmentation.
The results show significant improvements in predictive performance when synthetic data are used, demonstrating the potential of CGANs to supplement real datasets effectively.
Furthermore, the research develops time series prediction models employing methods such as one-dimensional Convolutional Neural Networks (1D-CNNs) and Echo State Networks (ESNs) to forecast water quality indicators, enhancing proactive monitoring capabilities.
Moreover, to promote transparency and stakeholder adoption, techniques such as random forest feature importance and SHapley Additive exPlanations (SHAP) are employed to improve the interpretability of the machine learning models, providing insights into which spectral features are most important for predicting specific water quality parameters.
Overall, this research confirms the viability of soft sensing technologies combined with machine learning for automated, real-time water quality monitoring.