ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-06292022-184736


Thesis type
Master's thesis
Author
ATZORI, ELENA
URN
etd-06292022-184736
Title
Comparison of Machine Learning approaches for litho-fluid facies classification from pre-stack data
Department
SCIENZE DELLA TERRA (Earth Sciences)
Degree programme
GEOFISICA DI ESPLORAZIONE E APPLICATA (Exploration and Applied Geophysics)
Supervisors
supervisor Prof. Aleardi, Mattia
Keywords
  • facies
  • LSTM
  • TCN
  • machine learning
  • artificial neural networks
  • classification
  • seismic data
Start date of the graduation session
15/07/2022
Availability
Not available for consultation
Release date
15/07/2092
Abstract
In this thesis, different neural networks are compared for the identification of litho-fluid facies from pre-stack seismic data.

We build and test five architectures: a feed-forward neural network (FFN), two bidirectional Long Short-Term Memory networks (LSTM and LSTM 2), and two Temporal Convolutional Networks (TCN and TCN 2). The LSTM 2 and TCN 2 networks are shallower counterparts of LSTM and TCN, introduced to limit the overfitting that might affect the deeper architectures. The outcomes of the neural networks are benchmarked against the results of a Bayesian Quadratic Discriminant Analysis (BQDA).
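
As a point of reference for the benchmark, a quadratic discriminant rule under a Bayesian formulation can be written in its textbook form as below. This is not necessarily the exact variant used in the thesis. The notation is assumed here: x is the vector of elastic attributes at one sample, π_k is the prior probability of facies k, and μ_k and Σ_k are the mean and covariance of the Gaussian associated with facies k.

```latex
p(k \mid \mathbf{x}) =
  \frac{\pi_k \, \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
       {\sum_{j} \pi_j \, \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)},
\qquad
\hat{k} = \arg\max_k \left[ \ln \pi_k
  - \tfrac{1}{2} \ln \lvert \boldsymbol{\Sigma}_k \rvert
  - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^{\mathsf{T}}
    \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) \right]
```

Because each class carries its own covariance, the resulting decision boundaries are quadratic in x.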

Due to the lack of field seismic data, the work is carried out on synthetic data, derived from a 2D reference model that mimics a real geological setting where a turbiditic sequence hosts a gas-saturated reservoir. Indeed, the petrophysical properties for the reference model have been derived by integrating the information provided by exploration wells with the results of a petrophysical inversion of field seismic data. In this context, a properly calibrated rock physics model provides the link between the petrophysical properties and the elastic attributes of Vp, Vs, and density, whereas the facies have been established according to appropriate cutoff values.

The application of any machine learning approach requires proper training, validation, and test sets, which in our case consist of pseudologs of facies, Vp, Vs, and density generated according to a previously defined a priori model. In particular, we assume a Gaussian mixture (GM) prior for the elastic properties, in which each Gaussian component corresponds to a given litho-fluid class. This multimodal prior allows us to take into account the facies-dependent behavior of the elastic properties. The statistical properties of the mixture have been directly computed from the reference elastic model. The 1D facies profiles, on the other hand, are generated according to a Hidden Markov Model in which a transition probability matrix (calibrated on the reference model) guarantees the vertical continuity of the facies. After generating the facies profiles, the elastic properties of Vp, Vs, and density are distributed according to the prior GM assumption. In this case, a Gaussian vertical correlogram is also used to impose vertical continuity on the simulated properties.
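
The pseudolog generation described above can be illustrated with a minimal sketch, under several assumptions that are not from the thesis: the facies names, the transition matrix, the Vp statistics, and the correlation length are hypothetical values, and only Vp is simulated (the thesis draws Vp, Vs, and density jointly from the mixture). Vertical continuity is approximated here by filtering white noise with a Gaussian kernel, which yields an approximately Gaussian correlogram.

```python
import math
import random

# Hypothetical facies labels and transition matrix (illustrative values only).
FACIES = ["shale", "brine sand", "gas sand"]
TRANSITION = [
    [0.90, 0.07, 0.03],  # from shale
    [0.10, 0.85, 0.05],  # from brine sand
    [0.05, 0.05, 0.90],  # from gas sand
]
# Hypothetical per-facies mean and standard deviation of Vp (m/s).
VP_MEAN = [3000.0, 3300.0, 3100.0]
VP_STD = [80.0, 100.0, 100.0]


def simulate_facies(n_samples, transition, rng):
    """Draw a 1D facies profile from a first-order Markov chain."""
    n_states = len(transition)
    state = rng.randrange(n_states)
    profile = [state]
    for _ in range(n_samples - 1):
        # The next facies depends only on the current one.
        state = rng.choices(range(n_states), weights=transition[state])[0]
        profile.append(state)
    return profile


def correlated_noise(n_samples, corr_length, rng):
    """Unit-variance noise with an approximately Gaussian vertical correlogram,
    obtained by filtering white noise with a Gaussian kernel."""
    half = 3 * corr_length
    kernel = [math.exp(-((k / corr_length) ** 2)) for k in range(-half, half + 1)]
    norm = math.sqrt(sum(w * w for w in kernel))  # keeps the output variance at 1
    kernel = [w / norm for w in kernel]
    white = [rng.gauss(0.0, 1.0) for _ in range(n_samples + 2 * half)]
    return [
        sum(w * white[i + j] for j, w in enumerate(kernel))
        for i in range(n_samples)
    ]


def simulate_pseudolog(n_samples, rng):
    """Return (facies indices, Vp values) for one synthetic pseudolog."""
    facies = simulate_facies(n_samples, TRANSITION, rng)
    noise = correlated_noise(n_samples, corr_length=4, rng=rng)
    # Each sample is drawn from the Gaussian component of its own facies.
    vp = [VP_MEAN[f] + VP_STD[f] * z for f, z in zip(facies, noise)]
    return facies, vp
```

Calling `simulate_pseudolog(200, random.Random(0))` returns one facies profile and the matching Vp log; repeating the call with different seeds would build an ensemble to split into training, validation, and test sets.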

After generating the training, validation, and test sets, the optimal network architectures and hyperparameter configurations have been defined through a trial-and-error procedure.
The performed experiments approach the classification problem with increasing degrees of difficulty and can be divided into four groups:

1 – Classification of the Pseudologs;
2 – Classification of the Elastic Reference Model;
3 – Classification of Inverted Elastic Properties;
4 – Classification of the Seismic Data.


In the first experimental set-up, the networks classify a test set of pseudologs, generated as described before. In this case, the training, validation, and test sets share identical statistical characteristics. For this set-up, all the classification methods are able to predict the litho-facies with high accuracy.

In the second experimental set-up, we use the networks to classify the elastic reference model, whose statistical characteristics are slightly different from those assumed in the a priori model. In this experiment, the shallower of the two LSTM architectures (LSTM 2) performs better and is also the best classification method overall. The remaining methods nevertheless show good prediction capabilities and correctly identify most of the reservoir area.

The third set-up aims to classify the litho-facies from inverted elastic properties of Vp, Vs, and density. In this phase, the observed seismic data are computed from the reference elastic model by means of a 1D convolutional forward model based on the Zoeppritz equations, for incidence angles of 0, 15, and 30 degrees. The pre-stack data are inverted using a locally linearized AVA inversion, under the assumption of a Gaussian prior model. The inferred elastic properties are used as input for the facies classification. We first perform the experiment under ideal conditions (the noise and wavelet used to generate the observed data match those implemented in the forward model). Then, we assess the stability of the networks when the noise is underestimated during the inversion procedure, as well as when the amplitude and frequency of the wavelet are underestimated.
The results at this stage show an overall decrease in prediction accuracy with respect to stages 1 and 2. This was in part expected, as the lower resolution of the seismic data and the ill-conditioning of the elastic inversion introduce more ambiguity, making the discrimination between different saturation conditions more challenging. The performance of the LSTM 2 nevertheless stands out: it is the only method that correctly identifies the gas-bearing area, and it is also more robust to erroneous assumptions about the noise statistics and/or errors in the estimated source wavelet.
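
The convolutional forward model used to compute the seismic data can be sketched for the zero-angle trace only. This is a simplified illustration, not the thesis implementation: at normal incidence the reflection coefficient reduces to the acoustic impedance contrast, whereas the 15 and 30 degree traces would require the full Zoeppritz equations. The Ricker wavelet, the sampling interval, and the two-layer model values are assumptions made for the example.

```python
import math


def ricker(peak_freq_hz, dt_s, half_length):
    """Ricker wavelet with 2 * half_length + 1 samples and unit peak at t = 0."""
    samples = []
    for k in range(-half_length, half_length + 1):
        a = (math.pi * peak_freq_hz * k * dt_s) ** 2
        samples.append((1.0 - 2.0 * a) * math.exp(-a))
    return samples


def normal_incidence_reflectivity(vp, rho):
    """R_i = (Z_{i+1} - Z_i) / (Z_{i+1} + Z_i), with impedance Z = Vp * rho."""
    z = [v * r for v, r in zip(vp, rho)]
    return [(z[i + 1] - z[i]) / (z[i + 1] + z[i]) for i in range(len(z) - 1)]


def convolve_same(reflectivity, wavelet):
    """Convolve and keep the central part, so the trace has len(reflectivity) samples."""
    half = len(wavelet) // 2
    n = len(reflectivity)
    trace = []
    for i in range(n):
        total = 0.0
        for j, w in enumerate(wavelet):
            k = i - (j - half)  # reflectivity sample hit by wavelet sample j
            if 0 <= k < n:
                total += w * reflectivity[k]
        trace.append(total)
    return trace


# Two-layer toy model (hypothetical values): Vp in m/s, density in g/cm^3.
vp = [3000.0] * 10 + [3500.0] * 10
rho = [2.3] * 10 + [2.4] * 10
refl = normal_incidence_reflectivity(vp, rho)
wavelet = ricker(peak_freq_hz=30.0, dt_s=0.002, half_length=25)
trace = convolve_same(refl, wavelet)
```

With a single interface, the trace is a copy of the wavelet scaled by the reflection coefficient and centred on the interface, which makes the band-limiting effect of the wavelet on the recoverable elastic contrasts easy to see.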

Finally, in the last experimental set-up, we classify the litho-facies directly from the observed pre-stack data. For this classification task, we test the LSTM, the LSTM 2, and the TCN networks, which are able to treat the data as time series (we do not consider the TCN 2 further, as it was outperformed by the TCN in the previous tasks).
Since the input now consists of seismic data, a new generation phase is required to produce seismic training and validation sets. For the ensemble of facies profiles in the pseudolog training and validation datasets, the associated pre-stack responses are computed according to the previously described 1D convolutional model.
The LSTM, LSTM 2, and TCN are retrained and used to classify the observed pre-stack data. We test the performance of the networks when the noise and/or wavelet used to compute the training data differ from those of the observed data. In this final classification, the LSTM slightly outperforms the LSTM 2 and the TCN.

Our results demonstrate the applicability of the machine learning methods to the facies classification problem. In particular, the LSTM 2 has proven to be the most stable and robust method throughout the different classification tasks.
In general, particular care must be taken in the generation of an appropriate training set and in the training procedure itself.
A further development would be the application of these methods to field seismic data. In a field scenario, it is important to validate the seismic data against actual well log information in order to obtain a reliable training dataset.
File