ETD

Digital archive of theses defended at the University of Pisa

Thesis etd-01172022-165332


Thesis type
PhD thesis
Author
VENTIMIGLIA, MARIA
URN
etd-01172022-165332
Title
ASTER-REP, a database of Asteraceae sequences for studying structure and function of transposable elements
Scientific-disciplinary sector
AGR/07
Course of study
SCIENZE AGRARIE, ALIMENTARI E AGRO-AMBIENTALI
Supervisors
tutor Dr. Mascagni, Flavia
Keywords
  • Asteraceae
  • transposable elements
  • database
  • exaptation
Defense session start date
27/01/2022
Availability
Not available for consultation
Release date
27/01/2062
Abstract
Since their discovery, transposable elements (TEs) have been found to be ubiquitously present in eukaryotic genomes. These elements, which represent a major part of the repetitive component of plant species, are capable of changing their location within the genome, generating genomic plasticity by inducing various chromosomal mutations and allelic diversity, thus contributing to the evolution of their host.
The Asteraceae is one of the largest and most economically important families of flowering plants and includes major crop species such as sunflower, globe artichoke, and lettuce. Despite this economic importance, only partial data are available on the genome composition and organization of this family.
Taking advantage of the increasing availability of plant genomic sequences, the principal aim of my Ph.D. project has been to create ASTER-REP: a comprehensive database of sequences isolated from species belonging to the Asteraceae family, for studying the structure and function of TEs. Six Asteraceae species with fully sequenced genome assemblies available in the National Center for Biotechnology Information (NCBI) GenBank database were selected for TE discovery: Helianthus annuus, Lactuca sativa, Cynara cardunculus var. scolymus, Artemisia annua, Carthamus tinctorius and Chrysanthemum seticuspe. Based on the most current classification system, TEs were identified for the following five orders: LTR-RE, SINE, TIR, MITE, and Helitron. A total of 334,747 full-length TEs were identified and included in ASTER-REP. The database is set up on a Linux-Apache-MySQL-PHP (LAMP) system, and its interface allows the user to choose the desired sequences by selecting the species, TE class and order and, where possible, TE superfamily and lineage. The search results can be visualized and downloaded as FASTA and GFF files. ASTER-REP represents a useful tool for studies on TE diversity and dynamics, and it could help to decipher genome structure and to infer the evolutionary processes that occurred during the divergence of the Asteraceae species. Furthermore, the discovery methods used can be applied to other plant species, favouring studies on genome structure and, in particular, on TEs.
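The selection-and-export workflow described above can be illustrated with a minimal sketch. It is not the ASTER-REP implementation (which runs on a LAMP stack): the `TERecord` fields, the function names, and the `ASTER-REP-sketch` source label are all hypothetical and chosen only to show how records filtered by species, order and superfamily might be written out as FASTA and GFF3.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class TERecord:
    """Hypothetical record layout; not the actual ASTER-REP schema."""
    te_id: str
    species: str
    order: str                  # e.g. "LTR-RE", "SINE", "TIR", "MITE", "Helitron"
    superfamily: Optional[str]  # may be unknown for some elements
    seqid: str                  # chromosome or scaffold name
    start: int                  # 1-based, inclusive (GFF convention)
    end: int
    strand: str                 # "+" or "-"
    sequence: str


def select(records: Iterable[TERecord],
           species: Optional[str] = None,
           order: Optional[str] = None,
           superfamily: Optional[str] = None) -> List[TERecord]:
    """Keep records matching every criterion that is not None."""
    return [
        r for r in records
        if (species is None or r.species == species)
        and (order is None or r.order == order)
        and (superfamily is None or r.superfamily == superfamily)
    ]


def to_fasta(records: Iterable[TERecord], width: int = 60) -> str:
    """Render records as FASTA, wrapping sequences at `width` characters."""
    lines = []
    for r in records:
        lines.append(f">{r.te_id} species={r.species.replace(' ', '_')} order={r.order}")
        lines.extend(r.sequence[i:i + width] for i in range(0, len(r.sequence), width))
    return "\n".join(lines) + "\n"


def to_gff3(records: Iterable[TERecord]) -> str:
    """Render records as GFF3 (nine tab-separated columns per feature)."""
    lines = ["##gff-version 3"]
    for r in records:
        attrs = f"ID={r.te_id};order={r.order}"
        if r.superfamily:
            attrs += f";superfamily={r.superfamily}"
        lines.append("\t".join([
            r.seqid, "ASTER-REP-sketch", "transposable_element",
            str(r.start), str(r.end), ".", r.strand, ".", attrs,
        ]))
    return "\n".join(lines) + "\n"
```

A call such as `to_fasta(select(records, species="Helianthus annuus", order="LTR-RE"))` would then mirror the species-and-order selection described in the abstract, under the assumed record layout.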
The repetitive component of the genomes of the selected species was investigated by comparative analysis, to infer the role that it may have played in evolution and speciation. For each species under consideration, Illumina paired-end reads available in the GenBank database were analysed through a clustering process using the RepeatExplorer2 software, which allowed us to estimate the abundance and variability of the repeat types. The large differences found between species are probably due to the fact that, after the species diverged, individual genomes followed different evolutionary dynamics in terms of composition and abundance of repeat elements.
Since long terminal repeat retrotransposons (LTR-REs) are the most represented elements within the repetitive component of plant genomes, attention was subsequently focused on this order of elements. First, a pool of LTR-REs from all six species was subjected to phylogenetic analyses to verify the evolutionary relationships between the species, confirming the annotation previously attributed to these TEs; then, an insertion time analysis of the same elements dated their proliferation to the last 15 million years or so, highlighting that the species show different insertion time profiles, specific to the different LTR-RE lineages.
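The abstract does not state how insertion times were computed. A widely used approach for LTR-REs, sketched here as an assumption rather than as the thesis method, relies on the fact that the two LTRs of an element are identical at insertion and then diverge independently: the Kimura two-parameter distance K between the aligned LTRs is converted to time as T = K / (2r), where r is a substitution rate per site per year. The function names and the rate passed by the caller are hypothetical; the appropriate rate depends on the species and must be taken from the literature.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(ltr5: str, ltr3: str) -> float:
    """Kimura two-parameter distance between two pre-aligned LTR sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log(inner)


def insertion_time_years(k: float, rate: float) -> float:
    """Convert LTR divergence k into years, given rate in substitutions/site/year."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k / (2 * rate)
```

The factor of 2 reflects that both LTRs accumulate substitutions after insertion, so the observed divergence is twice the per-lineage change.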
During evolution, the coding regions of TEs may undergo modifications leading to the loss of their self-replicative capability, the acquisition of new functions, and the beginning of their evolution under phenotypic selective pressure: they become novel genes, defined as exapted transposable element genes (ETEs). Possible ETEs derived from LTR-REs and TIR elements were searched for in the sunflower Helianthus annuus, an important model species whose repetitive component, mostly represented by TEs, amounts to about 80% of its genome. The sunflower genes showing similarity with TEs were investigated for the characteristics that distinguish TEs from genes, namely repetitiveness, similarity with already known TEs, siRNA coverage, and expression. Through this process, 3,530 sunflower genes were selected as validated ETEs. Their functional characterisation showed a significant involvement in disparate cellular functions, suggesting that ETEs affected several biological processes during sunflower evolution. The identification and characterisation of ETEs in sunflower highlighted the crucial role that the exaptation phenomenon plays in the creation of sequences with new functions, thus contributing to species evolution.
File