ETD
Database of electronic academic theses and dissertations
Università di Pisa
University library system
Thesis etd-01152010-121349
Thesis type: Doctoral (Ph.D.) thesis
Author: BERTINETTO, CARLO GIUSEPPE
Email address: cgbertinetto@gmail.com, carlo.bertinetto@aalto.fi
URN: etd-01152010-121349
Title: PREDICTION OF THE PHYSICO-CHEMICAL PROPERTIES OF LOW AND HIGH MOLECULAR WEIGHT COMPOUNDS
Scientific disciplinary sector: CHIM/02 - Physical Chemistry
Degree programme: Chemical Sciences
Committee
Name                          Role
Prof. Maria Rosaria Tiné      supervisor
Prof. Walter Baratta          committee member
Prof. Bruno Marongiu          committee member
Prof. Roger Fuoco             committee member
Dr. Alessio Micheli           committee member
Keywords
  • IGC50
  • LC50
  • melting point
  • multi-linear regression
  • Pimephales promelas
  • pyridinium bromides
  • QSAR/QSPR
  • recursive neural networks
  • structured representation
  • Tetrahymena pyriformis
  • toxicity
  • glass transition temperature
  • (meth)acrylic homopolymers
  • (meth)acrylic copolymers
Defense session start date: 2010-02-18
Availability: mixed
Release date: 2050-02-18
Abstract
In the present Ph.D. Thesis, an innovative approach to derive Quantitative Structure-Property/Activity Relationships (QSPR/QSARs) was investigated and discussed by applying it to various predictive problems. This approach is based on the direct and adaptive treatment of molecular structure by means of a Recursive Neural Network (RNN). Chemical compounds are represented through appropriate graphical tools and no numerical descriptors are needed.
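The idea of treating a structure directly, rather than through precomputed descriptors, can be illustrated with a toy recursive encoder. This is only a minimal sketch under stated assumptions, not the architecture used in the thesis: the `Node` class, the one-hot labels, the two-dimensional state and all weight values are hypothetical, and the weights are fixed by hand rather than trained. Each node's state is computed from its own label and the states of its children, so the root state summarizes the whole tree.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    label: list                                   # one-hot vector naming the atom or group (hypothetical encoding)
    children: list = field(default_factory=list)  # ordered child subtrees


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(node, w_label, w_child):
    """Return the state of a subtree: children are encoded first, then combined."""
    total = matvec(w_label, node.label)
    for position, child in enumerate(node.children):
        child_state = encode(child, w_label, w_child)
        # One weight matrix per child position (an assumption of this sketch).
        contribution = matvec(w_child[position], child_state)
        total = [t + c for t, c in zip(total, contribution)]
    return [math.tanh(t) for t in total]


def predict(root, w_label, w_child, w_out, bias):
    """Map the root state to a single property value with a linear readout."""
    state = encode(root, w_label, w_child)
    return sum(w * s for w, s in zip(w_out, state)) + bias


# Hand-picked, untrained weights used purely for illustration.
W_LABEL = [[0.5, -0.3], [0.1, 0.8]]
W_CHILD = [
    [[0.2, 0.0], [0.0, 0.2]],    # first child position
    [[-0.1, 0.4], [0.3, 0.1]],   # second child position
]
W_OUT = [1.0, -1.0]

# A three-node tree: a root group with two substituents.
tree = Node([1.0, 0.0], [Node([1.0, 0.0]), Node([0.0, 1.0])])
print(predict(tree, W_LABEL, W_CHILD, W_OUT, bias=0.0))
```

In a real model the weights would be fitted to the training data, which is what makes the encoding adaptive; the sketch only shows the recursion over the structure.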
In the first part, the RNN-QSPR method was applied to predicting the melting point (Tm) of a set of 126 pyridinium bromides and the glass transition temperature (Tg) of a set of 337 (meth)acrylic homopolymers. Particular emphasis was placed on the representation of cyclic moieties, which can be achieved in different ways by exploiting the flexibility of the structured approach. Various representations were devised, each one having different advantages and sampling requirements. The performance did not show significant variations when passing from a more specific representation to a more general one. The best result obtained for the Tm of pyridinium bromides showed, for the test set of 37 molecules, a mean absolute residual (MAR) of 25 K, a standard error of prediction (S) of 29.6 K and a squared correlation coefficient (R²) of 0.62. The best outcome for the Tg of poly(meth)acrylates had MAR, S and R² values of 15.8 K, 20.4 K and 0.85, respectively, for the test set of 54 molecules.
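The three figures of merit quoted above can be computed from observed and predicted values as in the sketch below. The abstract does not give the formulas, so the definitions here are assumptions: MAR is taken as the mean of the absolute residuals, S as the root-mean-square residual with n in the denominator (the thesis may use a different denominator), and the squared correlation coefficient as the square of the Pearson correlation between observed and predicted values. The function names and the sample values are made up for illustration and are not data from the thesis.

```python
import math


def mean_absolute_residual(observed, predicted):
    """MAR: average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def standard_error(observed, predicted):
    """S: root-mean-square residual (assumption: n in the denominator)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def squared_correlation(observed, predicted):
    """Square of the Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


# Made-up temperatures in kelvin, for illustration only.
observed = [300.0, 320.0, 350.0, 400.0]
predicted = [310.0, 315.0, 340.0, 405.0]

print(mean_absolute_residual(observed, predicted))  # 7.5
print(standard_error(observed, predicted))
print(squared_correlation(observed, predicted))
```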
In the second part, the representation used for the treatment of homopolymers was expanded to treat copolymers. A data set containing the Tg of 275 random (meth)acrylic copolymers was investigated, either alone or mixed with homopolymer data. The prediction on copolymers was excellent, with MAR, S and R² for the 57 compounds in the test set of 4.9 K, 6.1 K and 0.98. The method yielded a good performance also on the total data set comprising homopolymers and copolymers together.
In the last part, the RNN approach was employed to model and predict the toxicity of two sets of aromatic molecules. The first data set involved the median growth impairment concentration (IGC50) of 221 phenols towards Tetrahymena pyriformis. The results were good for the training set, but the performance on the test set (41 molecules) was not on par with that of other methods in the literature. However, it must be stressed that the referenced methods employ a priori information synthesized into appropriate numerical descriptors, whereas our method does not make use of any background knowledge. The second data set concerned the median lethal concentration (LC50) of 69 substituted benzenes towards Pimephales promelas. This data set was also investigated by means of a descriptor-based multi-linear regression (MLR) technique. The performance was good for both calculations, yielding MAR ≈ 0.22, S ≈ 0.25 and R² ≈ 0.80 on the test set of 18 molecules. The results obtained by RNN and MLR were very similar, despite the radically different approaches of these two methods.
Files
File name                      Size       Estimated download time (hours:minutes:seconds)
                                          28.8 modem     56K modem      ISDN (64 Kb)   ISDN (128 Kb)  more than 128 Kb
Appendice1_2_3_4.pdf           279.89 Kb  00:01:17       00:00:39       00:00:34       00:00:17       00:00:01
Capitolo1.pdf                  181.40 Kb  00:00:50       00:00:25       00:00:22       00:00:11       < 00:00:01
Capitolo2.pdf                  926.18 Kb  00:04:17       00:02:12       00:01:55       00:00:57       00:00:04
Capitolo3.pdf                  2.64 Mb    00:12:13       00:06:17       00:05:30       00:02:45       00:00:14
Conclusioni.pdf                80.26 Kb   00:00:22       00:00:11       00:00:10       00:00:05       < 00:00:01
Copertina_Indice_Abstract.pdf  116.25 Kb  00:00:32       00:00:16       00:00:14       00:00:07       < 00:00:01
3 files are restricted at the author's request.