Digital archive of theses discussed at the University of Pisa


Thesis etd-01082016-164849

Thesis type
PhD thesis
Thesis title
Structural genomics of sunflowers (Helianthus spp.)
Academic discipline
Course of study
tutor Prof. Natali, Lucia
  • Helianthus
  • retrotransposon
  • Structural genomics
  • sunflowers
Graduation session start date
Release date
The mobile component of the genome is composed of sequences called transposable elements, which are able to move across the DNA using different replication mechanisms. These sequences, which are present in the nuclear genome of virtually all eukaryotes, have significant effects on the host genome. Aside from polyploidy, transposable elements that use the “copy and paste” mode of replication are the major driver of genome size increase; they are also able to cause mutations.
Developmental genetic studies of model organisms have clarified that retroelements may also play a role in the epigenetic settings of the genome, regulating chromatin organization in the nucleus and acting as control elements of gene expression.
Currently, little is known about the distribution of repetitive sequences and the overall genome organization in plants with medium-large genomes, in which repetitive DNA is most abundant and presumably most important. The Asteraceae is one of the largest and most economically important families of flowering plants and includes important crop species, such as the sunflower (Helianthus annuus). Despite its economic importance, there is no reference genome sequence for this family, and little information is available on its genome composition and organization. A detailed structural analysis of the sunflower genome is still to be completed, although a large amount of sequence data is available.
In my PhD work, next-generation sequencing techniques were used to study the repetitive component of the sunflower genome. First, by varying sequencing technology (Illumina or 454), coverage (0.55-1.25X), assemblers, and assembly procedures, a database of sunflower repetitive sequences (SUNREP) was produced. Among the 47,924 repeated sequences that constitute the database, retrotransposons were by far the most represented. Within long terminal repeat (LTR) retrotransposons, sequences belonging to the Gypsy superfamily were 2.3-fold more represented than those belonging to the Copia superfamily. In total, the repetitive component amounted to 81% of the sunflower genome, essentially confirming the results previously obtained by using a Sanger-sequenced shotgun library and a standard 454 whole-genome-shotgun approach. SUNREP has proven to be a useful tool for annotating the sunflower genome sequence and for studying genome evolution in dicotyledons.
Subsequently, the repetitive component was characterized in cultivated and wild genotypes to assess its intraspecific variation and possible role in domestication. Analysis of the most repeated sunflower LTR-retrotransposons revealed considerable variation in redundancy among genotypes. Such variation was found for both Gypsy and Copia retrotransposons and even for each LTR-retrotransposon lineage. Within each superfamily, different lineages showed different behaviors: for example, Athila and Maximus/SIRE elements were more redundant in wild than in cultivated genotypes, while the opposite trend was found for Chromovirus and AleII retrotransposons. Large variability among genotypes was also found in retrotransposon proximity to genes: these elements are closer to genes in wild than in cultivated sunflower genotypes. Such differences in LTR-retrotransposon redundancy and proximity to genes between cultivated and wild genotypes indicate a possible involvement of retrotransposons in sunflower domestication.
Finally, the repetitive component of the genome of Helianthus species was characterized to elucidate its evolution and its relationship to that of other species of the Asteraceae family. The genome structure is similar among the analyzed species, with LTR-retrotransposons representing the vast majority of repetitive sequences. No LTR-retrotransposon lineages or sublineages are specific to one or a few species. Conversely, large differences are observed in the relative abundance of LTR-retrotransposons, from the superfamily to the sublineage level. These results suggest that the Helianthus species shared the ancestors of the different LTR-retrotransposon sublineages and that, following species divergence, these sublineages underwent different rates of amplification or loss, while no new LTR-retrotransposon sublineage originated in the genome.