
ETD

Digital archive of theses defended at the Università di Pisa

Thesis etd-04042024-173731


Thesis type
Master's degree thesis
Author
ARGUMEDO POSADA, ALEJANDRO
URN
etd-04042024-173731
Title
Comparison of Machine learning techniques for predicting missing well-log samples: Application to the Lower Miocene Stratigraphic Sequence in the Matagorda Island.
Department
SCIENZE DELLA TERRA
Degree programme
GEOFISICA DI ESPLORAZIONE E APPLICATA
Supervisors
supervisor Dr. Aleardi, Mattia
co-supervisor Dr. Stucchi, Eusebio Maria
Keywords
  • artificial neural network
  • linear regression
  • long short term memory
  • machine learning
  • outliers
  • random forest
  • sonic log
Defense session start date
17/05/2024
Availability
Full
Abstract
Well logging provides a detailed description of subsurface properties. However, some logs may be missing because of borehole problems, tool failures, or cost limitations. In this work, I use four methods, namely multivariable linear regression, random forest (RF), artificial neural networks (ANN), and long short-term memory (LSTM) networks, to estimate missing values of the sonic log in the Lagarto Formation, and I demonstrate the robustness and prediction capability of each. Of the four methods, the LSTM achieves the best evaluation metrics, owing to its robustness and its ability to extract features from depth series. A total of 8 wells from the Matagorda Island area are used: 6 wells for training and 2 wells for testing the predictions. Two datasets were built, based on principal component analysis (PCA) and the cross-correlation matrix, to compare model performance. Outliers were detected and removed from both datasets with the unsupervised isolation forest algorithm. A grid-search scheme was applied to tune the hyperparameters of each ML method. I evaluate and compare the four methods on both datasets before and after outlier removal. The LSTM overcomes a well-known limitation of recurrent neural networks (namely, the short-memory issue) and yields final predictions with higher accuracy than the other three methods.
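The outlier-removal and hyperparameter-tuning steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the data are synthetic, the helper make_synthetic_well, the three input features, the contamination level, and the parameter grid are all assumptions chosen for illustration, and only the random forest is shown out of the four methods. It assumes scikit-learn and NumPy are available.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)


def make_synthetic_well(n_samples):
    """Hypothetical stand-in for well data: three input logs and a sonic-like target."""
    X = rng.normal(size=(n_samples, 3))
    noise = rng.normal(scale=0.1, size=n_samples)
    y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + noise
    return X, y


# Separate synthetic "training wells" and "test wells" (assumed sizes).
X_train, y_train = make_synthetic_well(600)
X_test, y_test = make_synthetic_well(200)

# Corrupt a few training samples to mimic bad log readings.
bad = rng.choice(len(X_train), size=30, replace=False)
X_train[bad] += rng.normal(scale=8.0, size=(30, 3))

# Unsupervised isolation forest: fit_predict returns +1 for inliers, -1 for outliers.
# The contamination value is an assumption, not a value taken from the thesis.
iso = IsolationForest(contamination=0.05, random_state=0)
inlier_mask = iso.fit_predict(np.column_stack([X_train, y_train])) == 1
X_clean, y_clean = X_train[inlier_mask], y_train[inlier_mask]

# Grid search over a small, illustrative hyperparameter grid with 3-fold CV.
param_grid = {"n_estimators": [50, 100], "max_depth": [4, 8]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0), param_grid, cv=3, scoring="r2"
)
search.fit(X_clean, y_clean)

# Evaluate the tuned model on the held-out synthetic test data.
y_pred = search.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
print("samples kept after outlier removal:", int(inlier_mask.sum()))
print("best hyperparameters:", search.best_params_)
print("test R2:", round(test_r2, 3))
```

Refitting the same search on X_train and y_train without the mask gives the "before outlier removal" counterpart, so the two test scores can be compared in the way the abstract describes.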