How can transposable elements affect the genome

The majority of LINE-1 sequences are transcriptionally inactive. The Alu sequences were named for the cleavage site for the AluI restriction enzyme that they share (Houck et al.).

In turn, the MIR elements are inactive. These sequences contribute to about 0. In turn, pseudogenes are DNA sequences that are related to functional genes but have lost at least part of their protein-coding ability. It has been found that the mRNA of pseudogenes can be reverse transcribed by the proteins encoded by LINE-1 sequences and inserted into other regions of the genome, creating processed pseudogenes.

It has been estimated that the human genome contains over 7, pseudogenes (Zhang et al.). When integrated close to active promoters, processed pseudogenes can be further transcribed. These transposons do not act via RNA intermediates and encode enzymes that enable their mobilization. Because they are inactive, a causal role in the etiology of human diseases is less likely (Darby and Sabunciyan). A recent review of human monogenic diseases caused by retrotransposition suggests that only the transposition of LINE-1, Alu, and SVA sequences might be deleterious and underlie the development of monogenic diseases (Kaer and Speek). Retrotransposition might affect various gene regions by altering their sequence or influencing their expression.

For instance, the Alu sequences contain several stop codons that may result in a truncated protein (Mighell et al.). This mechanism has been described in patients with hemophilia B caused by insertion of the Alu-Ya5 element into a protein-coding region of the factor IX gene (Vidaud et al.). When transposed into promoter regions, these sequences might affect gene expression. Another scenario originates from sequence homology, which can promote homologous recombination, leading to insertions and deletions.

Finally, the SVA tandems can mobilize exons, contributing to complex rearrangements. However, the effects of alterations in DNA sequence triggered by retrotransposition have not been found to underlie the development of common mental disorders.

Indeed, the majority of TEs in the human genome are hypermethylated (Pray, b). Although DNA methylation acts as a defense mechanism, it cannot be excluded that hypermethylation of newly inserted TEs leads to further changes in chromatin conformation, triggering changes in the expression of adjacent genes. Retrotransposition most likely occurs during early development, when epigenetic marks are removed (Darby and Sabunciyan). There are also some well-characterized histone modifications, including trimethylation of lysine 9 and lysine 27 of histone H3 (H3K9me3 and H3K27me3, respectively), which lead to heterochromatin formation and transcriptional silencing of TEs (Day et al.).

It should be noted that only a small subset of TEs has been reported to be involved in retrotransposition.

For instance, only 30 to 60 LINE-1 sequences in diploid cells are capable of retrotransposition (Sassaman et al.). In addition, the majority of LINE-1 sequences are methylated to a certain degree. It has been found that LINE-1 methylation might affect gene expression via specific mechanisms [for review see Kitkumthorn and Mutirangura]. Global DNA hypomethylation that progresses with aging has been associated with genomic instability (Jung and Pfeifer). Hypomethylated genome regions are prone to accumulate various types of DNA lesions, including oxidative damage, depurination, depyrimidination, and pathologic endogenous double-strand breaks (Mutirangura). The latter are now believed to act as intermediate products that drive genomic instability (Mutirangura). Accumulating evidence indicates that methylation of TEs might protect against genomic instability.

It remains largely unknown how changes in the expression of TEs might contribute to the development of mental disorders. It has been hypothesized that the presence of TEs in the human genome provides immunity against several infectious agents. Indeed, the mechanisms that contributed to HERV insertions are analogous to those used for replication by exogenous retroviruses (Grandi and Tramontano). Therefore, changes in the expression of TEs, e.

There is evidence that HERV-derived peptides may interact with innate immunity via various mechanisms. Emerging evidence indicates that exogenous viruses, including herpesviruses and influenza virus, might modulate the expression of HERV sequences. This mechanism might play a protective role and has been reviewed in detail by Grandi and Tramontano. In brief, HERV transcripts might interact with homologous RNA from exogenous retroviruses, leading to the formation of molecules that are recognized by pattern recognition receptors (PRRs), which act as innate immunity sensors.

The similarity of HERV proteins to those of exogenous retroviruses allows them to compete for cellular receptors. This similarity might also trigger complementation events that impair the formation of viral particles after cellular infection.

On the other hand, HERV proteins may suppress innate immunity. As mentioned above, expression of TEs might play an important role in shaping immune responses against exogenous infections. Aberrant immune-inflammatory responses have been reported in several mental disorders.

Also, a number of exogenous infections have been found to affect the risk of mental disorders. Below, we review studies investigating TEs and their epigenetic regulation in specific mental disorders, starting from the rationale of these studies, which is based on the contribution of immune-inflammatory processes.

A summary of human studies is provided in Table 1.

Table 1. Overview of human studies investigating the role of TEs in mental disorders.

Interestingly, studies in mouse models provide additional information on the potential use of ERV sequences as biomarkers: (i) higher expression of ERVs was observed both in peripheral blood mononuclear cells and in the brain, suggesting that an altered profile of peripheral ERV sequences may reflect similar alterations at the brain level; (ii) ERV overexpression in ASD mouse models is detectable from the prenatal stage until adulthood; and (iii) ERV overexpression in ASD mouse models is accompanied by increased expression of pro-inflammatory cytokines and Toll-like receptors.

Furthermore, a subsequent study in one of the models (mice prenatally exposed to valproic acid) provided evidence that higher levels of ERVs are also detectable in the offspring (second and third generations) of the mice exposed prenatally to valproic acid (Tartaglione et al.).

The levels of LINE-1 ORF1 and ORF2 transcripts have been investigated in four brain regions of patients with idiopathic autism (the frontal cortex, anterior cingulate, auditory cortex, and cerebellum). Elevated LINE-1 expression, together with lower binding affinity of the repressive MeCP2 protein and histone H3K9me3 to LINE-1 sequences, was observed only in the cerebellum, suggesting a lessening of epigenetic repression and consequently an increase in chromatin accessibility.

Interestingly, the increase in LINE-1 expression was also inversely correlated with glutathione redox status, consistent with reports indicating that LINE-1 expression is increased under pro-oxidant conditions (Shpyleva et al.). The overexpression of LINE-1 within a single brain region is suggestive of a mosaicism-like impact of retrotransposons and definitely needs further investigation. In partial agreement with the findings of increased LINE-1 expression in ASD, data on LINE-1 methylation status in lymphoblastoid peripheral cells have provided evidence of reduced methylation in a subgroup of patients with severe language impairment (Tangsuwansri et al.).

It has also been shown that the Alu sequence, the most abundant of all TEs in the human genome, deserves further research in ASD (Saeliw et al.). Indeed, this study investigated Alu methylation and expression in lymphoblastoid peripheral cells from ASD patients. Although the global methylation of Alu subfamilies was not significantly different between the ASD and control groups, when ASD samples were divided into phenotypic subgroups, methylation patterns of the AluS subfamily differed from those in the respective controls in two of the ASD subgroups, and within one of the subgroups (mild phenotype), Alu expression was correlated with methylation status.

Despite the limited sample size (particularly of subgroups), these data suggest that classifying ASD patients into phenotypic subgroups may be a useful tool for investigating associations of TEs with the highly heterogeneous ASD diagnostic construct.

It has been clearly demonstrated that winter-spring seasonality of birth as well as prenatal and postnatal infections increase the risk of developing schizophrenia (McGrath and Welham; Davies et al.).

Moreover, the largest genome-wide association study revealed that variation within the HLA genes is strongly associated with schizophrenia susceptibility (Ripke et al.).

Finally, schizophrenia patients present with several indices of subclinical inflammation in terms of pro-inflammatory cytokine profiles (Miller et al.).

On the basis of a meta-analysis, Arias et al. Accumulating evidence indicates altered expression of HERV sequences in patients with schizophrenia. Karlsson et al. These sequences were not detected in the CSF of individuals with non-inflammatory neurological diseases and healthy controls.

Increased levels of HERV-W-related gag and pol transcripts and a higher prevalence of gag and pol antigenemia in peripheral blood from patients with schizophrenia compared to healthy controls have been reported by several studies (Karlsson et al.).

The study by Perron et al. The HERV-W gag and env antigenemia has also been associated with subclinical inflammation in terms of elevated levels of CRP and pro-inflammatory cytokines (Perron et al.).

Interestingly, Huang et al. Interestingly, the expression level of the HERV-W gag protein has been found to be decreased in the cingulate gyrus and the hippocampus of patients with schizophrenia (Weis et al.). The env gene in this locus encodes syncytin-1, which is expressed at high levels in the human placenta (Blond et al.).

However, altered expression of this gene has been reported in the areas of active demyelination in patients with multiple sclerosis (Mameli et al.).

At this point, it should be noted that myelin alterations are widely observed in patients with schizophrenia (Mighdoll et al.). Although initial results regarding expression of the HERV-W sequences in schizophrenia patients are promising, they should be interpreted with caution.

Moreover, no conclusive association between HERV-W expression and other human pathologies has been documented so far [for review see Grandi and Tramontano]. Less is known about other families of HERVs in patients with schizophrenia.

Frank et al. Our group also tested peripheral blood methylation levels of HERV-K sequences in first-episode and multi-episode schizophrenia patients (Mak et al.). We found significantly lower levels of HERV-K methylation in first-episode schizophrenia patients compared to healthy controls. These alterations were not observed in multi-episode schizophrenia patients. Moreover, we did not find an association between HERV-K methylation levels and the deficit schizophrenia subtype, which refers to a subgroup of patients with enduring and persistent negative symptoms.

However, we found a significant positive correlation between the dosage of antipsychotics and HERV-K methylation levels in multi-episode schizophrenia patients.

It is also likely that antipsychotic drugs might affect methylation and expression of HERV-K sequences. In contrast to our findings, Diem et al. Some studies also investigated methylation status and expression levels of non-LTR sequences in patients with schizophrenia. Bundo et al. These findings were confirmed in induced pluripotent cells from patients with 22q11 deletion syndrome as well as in a mouse model of schizophrenia (maternal immune activation paradigm).

In agreement with these results, a significant increase in the number of intragenic LINE-1 insertions has been observed in the dorsolateral prefrontal cortex of patients with schizophrenia compared to healthy controls (Doyle et al.). In some studies, LINE-1 methylation was tested in peripheral blood leukocytes of patients with schizophrenia, providing mixed findings (Misiak et al.).

The study by our group revealed lower LINE-1 methylation only in patients with first-episode schizophrenia and a positive history of childhood trauma. Among various childhood adversities, emotional trauma was most strongly associated with LINE-1 methylation status. These results are in agreement with a previous study showing that LINE-1 methylation might be involved in resilience and susceptibility to developing post-traumatic stress disorder (Rusiecki et al.).

Moreover, increased expression of LINE-1 in response to stress has been reported in various cell lines (Li and Schmid; Capomaccio et al.).

Lower LINE-1 methylation levels in patients with schizophrenia and bipolar disorder were also reported by Li et al. Other studies revealed hypermethylation of LINE-1 sequences in patients with first-episode psychosis, paranoid schizophrenia, and methamphetamine-induced paranoia (Fachim et al.).

A recent systematic review indicates that prenatal infections might affect the risk of bipolar disorder (Marangoni et al.). However, this observation is based on fewer studies than those addressing the impact of prenatal infections on schizophrenia risk.

There is evidence that influenza infection during pregnancy is associated with a fourfold increase in the risk of bipolar disorder in the offspring (Parboosing et al.). Another study demonstrated that prenatal flu exposure increases the risk of bipolar disorder with psychotic features (Canetta et al.). Maternal infections in the second trimester might also contribute to the development of depressive symptoms in the adolescent offspring (Murphy et al.).

However, the impact of specific infectious agents has not been tested so far. Although all major mental disorders are characterized by co-existing subclinical inflammation, some differences in specific pro-inflammatory markers can be identified (Goldsmith et al.).

Therefore, it might be hypothesized that the mechanisms leading to subclinical inflammation in bipolar disorder, major depression, and schizophrenia-spectrum disorders are different. However, studies investigating expression of TEs do not support this hypothesis. For instance, over-expression of HERV-K sequences has been reported in brain samples of patients with bipolar disorder and schizophrenia (Frank et al.). Similarly, decreased expression of the HERV-W gag protein has been reported in the cingulate gyrus and hippocampus of patients with schizophrenia, bipolar disorder, and major depression (Weis et al.).

Finally, hypomethylation of LINE-1 elements in peripheral blood has been observed in patients with bipolar disorder and schizophrenia (Li et al.). Indeed, Perron et al. Expression levels of the HERV-W env sequence were also significantly higher in patients with bipolar disorder than in those with schizophrenia.

There is a general consensus that aging processes are associated with progressive loss of global DNA methylation and site-specific DNA hypermethylation (Jung and Pfeifer). Similarly, TEs are subjected to profound epigenetic modifications during aging that appear in the context of organismal and cellular senescence (Cardelli). Importantly, the study by Gentilini et al.

In turn, studies investigating changes in LINE-1 methylation have provided mixed findings (Bollati et al.).

Although specific retrotransposition events that may account for mental disorders in the manner observed in Mendelian diseases have not been identified so far, accumulating evidence indicates the involvement of altered expression and epigenetic regulation of TEs in the pathophysiology of schizophrenia, mood disorders, and ASD.

However, specific findings are similar in patients with various mental disorders, and thus their use as biomarkers is largely limited. Moreover, the direction of causality is yet to be determined. For instance, it cannot be excluded that altered expression of HERVs appears as a consequence of other epigenetic dysregulations that are widely observed in mental disorders. Additionally, severe mental disorders, including schizophrenia and mood disorders, are associated with high prevalence rates of somatic comorbidities, including autoimmune diseases, type 2 diabetes, and cardiovascular diseases, which have also been associated with altered epigenetic regulation of TEs (Cash et al.).

Interestingly, there are studies showing that the expression of various HERV sequences appears in a certain subgroup of patients with schizophrenia but not in healthy controls. These findings are consistent with previous studies showing that immune alterations can be observed only in a subgroup of patients characterized by poor response to treatment, and they support the concept of psychosis subtypes (Frydecka et al.). Other clinical correlates of subclinical inflammation in schizophrenia include, i.

However, studies investigating expression and epigenetic regulation of TEs in schizophrenia have so far been based on relatively small samples without comprehensive clinical assessment. Similarly, studies investigating the expression of TEs in patients with bipolar disorder did not control for mood status or the severity of psychopathological symptoms. Another important point is that causal inferences between TEs and mental disorders cannot be established.

Firstly, the critical periods during which alterations in the epigenetic regulation and expression of TEs appear remain unknown. Therefore, future studies should examine epigenetic processes that regulate the expression of TEs in patients at early stages of mental disorders or in individuals from clinical high-risk groups.

This is particularly important since several lifestyle characteristics that are highly prevalent among patients with mental disorders, e. Secondly, the role of HERVs in shaping innate immunity also remains problematic with respect to understanding causal associations.

On the one hand, expression of HERVs might condition resistance to exogenous infections; on the other hand, exogenous retroviruses have been found to affect the expression of HERVs. Therefore, it remains unknown whether altered expression profiles of HERVs in mental disorders represent a cause or a consequence of exogenous infections.

Future studies should examine the biological nature and the extent of associations between immune alterations in mental disorders and the expression of various TEs. Finally, more global concordance patterns in the expression of different TEs in mental disorders are yet to be examined: this could provide further insight into the specificity of methylation patterns across different TEs and additional information on their use as potential biomarkers. At this point, it is important to note that similar DNA methylation patterns have been described in brain samples and peripheral blood leukocytes of patients with schizophrenia (Van Den Oord et al.).

Another direction for the field is to disentangle the effects of stressful life events on the epigenetic regulation of TE expression. Early-life stress is a known risk factor for mood and psychotic disorders and correlates with a number of biological dysregulations in adults (Misiak et al.).

Acute stress has been found to increase the levels of H3K9me3 and to decrease the levels of H3K9me1 and H3K27me3 in the dentate gyrus and the CA1 layer of the hippocampus in rats (Milne et al.). In turn, chronic restraint stress for 21 days mildly increased the levels of H3K4me3 and reduced the levels of H3K9me3 in the dentate gyrus.

Treatment with fluoxetine reversed changes in the levels of H3K9me3 during chronic restraint stress. In turn, our group found lower methylation of LINE-1 sequences in peripheral blood leukocytes of patients with first-episode schizophrenia reporting a positive history of childhood trauma (Misiak et al.).

In light of these findings, future studies should further examine the effects of stress on the expression of TEs in patients from various clinical groups; preclinical studies could contribute to this aim.

BM and MS conceived the concept of this article. All authors contributed to the manuscript revision, read, and approved the submitted version.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Aporntewan, C. PLoS One 6:e
Arias, I. Infectious agents associated with schizophrenia: a meta-analysis.
Baillie, J. Somatic retrotransposition alters the genetic landscape of the human brain. Nature.
Baker, M. Acute stress and hippocampal histone H3 lysine 9 trimethylation, a retrotransposon silencing response.
Balestrieri, E. HERVs expression in autism spectrum disorders. PLoS One 7:e
Transcriptional activity of human endogenous retrovirus in Albanian children with autism spectrum disorders. New Microbiol.

This dataset of well-defined transposon lineages now provides the basis to further explore the factors controlling transposon dynamics. Founder elements may help us obtain better insights into common patterns which could explain how and why amplification starts. As described above, intergenic sequences show almost no conservation between homeologous loci.

That means they contain practically no TEs that had already inserted in the common ancestor of the subgenomes. Instead, ancestral sequences were removed over time and replaced by TEs that inserted more recently.

Despite this near-complete turnover of the TE space Fig. Most interestingly and strikingly, not only gene order but also distances between neighboring homeologs tend to be conserved between subgenomes (Fig.). Indeed, we found that the ratio of distances between neighboring homeologs has a strong peak at 1 (or 0 in log scale on Fig.).

These findings suggest that distances between genes are likely under selection pressure.

Comparison of distances between neighboring homeologs in the subgenomes. For each homeolog triplet, three ratios were calculated i. If the distance is similar in two subgenomes, the ratio will be close to 1. The distribution is compared to one where gene positions were randomized (see Methods).
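The ratio calculation described in this legend can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names, the choice of the three pairwise ratios (A/B, A/D, B/D), the base-2 logarithm, and the example distances are all assumptions, since the text does not spell them out here.

```python
import math
from itertools import combinations


def pairwise_distance_ratios(dist_a, dist_b, dist_d):
    """Ratios of the distance between the same two neighboring homeologs,
    compared across subgenomes (assumed pairs: A/B, A/D, B/D)."""
    distances = {"A": dist_a, "B": dist_b, "D": dist_d}
    return {
        f"{x}/{y}": distances[x] / distances[y]
        for x, y in combinations("ABD", 2)
    }


def to_log2(ratios):
    """Log-transform the ratios so that equal distances map to 0."""
    return {name: math.log2(value) for name, value in ratios.items()}


# Hypothetical distances (bp) between two neighboring genes in each subgenome.
ratios = pairwise_distance_ratios(120_000, 120_000, 60_000)
log_ratios = to_log2(ratios)
print(ratios)
print(log_ratios)
```

A ratio of 1 (0 after log transformation) means the two subgenomes have the same spacing between the neighboring genes. Collecting such values over all triplets, and over a set with randomized gene positions, would give the two distributions being compared.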

This indicates that distances between homeologs are conserved, despite the near-complete absence of conservation of intergenic sequences between subgenomes. We found this constrained distribution irrespective of the chromosome compartments, i. However, constraints applied on intergenic distances seem relaxed (broader peak in Fig.). At this point, we can only speculate about the possible impact of meiotic recombination as a driving force towards maintaining a stable chromosome organization.

Previous studies have shown that recombination in highly repetitive genomes occurs mainly in or near genes [ 41 ]. We hypothesize that spacing of genes is preserved for proper expression regulation or proper pairing during meiosis. Previous studies on introgressions of divergent haplotypes in large-genome grasses support this hypothesis. For instance, highly divergent haplotypes which still preserve the spacing of genes have been maintained in wheats of different ploidy levels at the wheat Lr10 locus [ 42 ].

The sequences flanking genes have a very distinct TE composition compared to the overall TE space. At the superfamily level, the A, B, and D subgenomes exhibit the same biased composition in gene-surrounding regions (Additional file 1: Figure S). We then computed, independently for each subgenome, the enrichment ratio of each TE family present in the promoters of protein-coding genes (2 kb upstream of the transcription start site, TSS) compared to its overall proportion in copy number, considering the TE families with at least copies.
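As a rough illustration of such an enrichment ratio, the sketch below compares the proportion of each family among promoter-located copies with its proportion among all copies and reports the log2 of that ratio. The function name, the input layout (one family label per TE copy), and the `min_copies` parameter are assumptions made for this example. The copy-number threshold used in the study is not recoverable from the text, so it is left as an argument.

```python
import math
from collections import Counter


def promoter_log2_enrichment(promoter_copies, genome_copies, min_copies):
    """Log2 ratio of a family's proportion in promoters to its proportion
    genome-wide. Inputs are iterables with one family label per TE copy."""
    promoter_counts = Counter(promoter_copies)
    genome_counts = Counter(genome_copies)
    n_promoter = sum(promoter_counts.values())
    n_genome = sum(genome_counts.values())

    enrichment = {}
    for family, genome_count in genome_counts.items():
        # Skip low-copy families and families absent from promoters
        # (the log of a zero proportion is undefined).
        if genome_count < min_copies or promoter_counts[family] == 0:
            continue
        promoter_prop = promoter_counts[family] / n_promoter
        genome_prop = genome_count / n_genome
        enrichment[family] = math.log2(promoter_prop / genome_prop)
    return enrichment


# Toy data: MITE copies are over-represented in promoters, Gypsy copies are not.
genome = ["MITE"] * 2 + ["Gypsy"] * 6
promoters = ["MITE", "MITE", "Gypsy", "Gypsy"]
enrichment = promoter_log2_enrichment(promoters, genome, min_copies=2)
print(enrichment)
```

Positive values indicate over-representation in promoters and negative values under-representation. Running the function once per subgenome would give the three values compared across A, B, and D.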

Considering a strong bias, i. While it was previously known that MITEs were enriched in promoters of genes, here we show that this bias is not restricted to MITEs but rather involves many other families. Again, although TEs that shaped the direct gene environment have inserted independently in the A, B, and D diploid lineages, their evolution converged to three subgenomes showing very similar TE composition.

To go further, we showed that the tendency of TE families to be enriched in, or excluded from, promoters was extremely conserved between the A, B, and D subgenomes Fig. In other words, when a family is over- or under-represented in the promoter regions of one subgenome, it is also true for the two other subgenomes.

We did not find any family that was enriched in gene promoters in one subgenome while under-represented in gene promoters of another subgenome.

TE landscape surrounding genes. Genes from the three subgenomes were treated separately. For all genes, the 10 kb upstream of the transcription start site (TSS) and the 10 kb downstream of the transcription end site were analyzed. Abundance of the different TE families was compiled for all genes of each subgenome. The plots include only those superfamilies that are specifically enriched near genes and which are otherwise less abundant in intergenic sequences.

Enrichment analyses of TE families within gene promoters. The y-axis represents the log2 ratio of the proportion i. Positive and negative values represent an over- and under-representation of a given family in the promoters, respectively. Log2 ratios were calculated for the three subgenomes independently (A, green; B, violet; D, orange), and the three values are represented here as a stacked histogram. Only highly repeated families (copies or more) are represented, with one panel per superfamily. Families are ordered decreasingly along the x-axis according to the whole-genome log2 ratio.

Superfamily is generally, but not always, a good indicator of the enrichment of TEs in genic regions (Fig.).

We confirmed that class 2 DNA transposons (especially MITEs) are enriched in promoters, while Gypsy retrotransposons tend to be excluded from the close vicinity of genes. Our results showed that this is also true for Copia.

Thus, the TE turnover did not change the highly organized genome structure. Given that not only proportions but also enrichment patterns remained similar for almost all TE families after A-B-D divergence, we suggest that TEs tend to be at equilibrium in the genome, with amplification compensating for their deletion (as described in [29]) and with the families enriched around genes having remained the same.

We investigated the influence of neighboring TEs on gene expression. Indeed, TEs are so abundant in the wheat genome that genes are almost systematically flanked by a TE in the direct vicinity. The density as well as the diversity of TEs in the vicinity of genes allow us to speculate on potential relationships between TEs and gene expression regulation.

We used the gene expression network built by [26] based on an exhaustive set of wheat RNA-seq data. Genes were clustered into 39 expression modules sharing a common expression profile across all samples.

We also grouped unexpressed genes to study the potential influence of TEs on the silencing of neighboring genes. For each gene, the closest TE upstream was retrieved, and we investigated potential correlations through an enrichment analysis (each module was compared to the full gene set).
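One way to compare a module with the full gene set is a one-sided hypergeometric test. The text does not name the statistical test that was used, so the sketch below is only an assumed stand-in, and the function name, argument names, and toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the full set
    K: genes in the full set whose closest upstream TE is from the family
    n: genes in the expression module
    k: genes in the module whose closest upstream TE is from the family
    """
    upper = min(n, K)
    favorable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favorable / comb(N, n)


# Toy example: all 5 genes of a module carry the family, out of 5 such
# genes among 10 genes in total.
p_value = hypergeom_enrichment_p(k=5, n=5, K=5, N=10)
print(p_value)
```

In practice such a test would be repeated for every family and module, so the resulting p-values would need a multiple-testing correction before any family is called enriched.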

Despite the close association between genes and TEs, no strong enrichment for a specific family was observed for any module or for the unexpressed genes. We then studied the TE landscape upstream of wheat homeolog triplets, focusing on 19, triplets 58, genes with an orthologous relationship between A, B, and D subgenomes. For each triplet, we retrieved the closest TE flanking the TSS and investigated the level of conservation of flanking TEs between homeologs. This suggests that most TEs present upstream of triplets were not selected for by the presence of common regulatory elements across homeologs.
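A comparison of flanking TEs across homeolog triplets could be tallied along the following lines. This is only a sketch: it assumes that conservation is judged by comparing the family of the closest TE for each of the A, B, and D copies, which may be coarser than the criterion actually applied in the study. The triplet identifiers, family names, and category labels are hypothetical.

```python
from collections import Counter


def classify_triplet(family_a, family_b, family_d):
    """Classify one triplet by how many homeologs share the flanking TE family."""
    n_distinct = len({family_a, family_b, family_d})
    labels = {1: "conserved_in_all", 2: "conserved_in_two", 3: "not_conserved"}
    return labels[n_distinct]


def summarize_conservation(triplets):
    """triplets: {triplet_id: (family_A, family_B, family_D)} -> Counter of labels."""
    return Counter(classify_triplet(*families) for families in triplets.values())


# Toy input, invented for illustration only.
triplets = {
    "t1": ("famA", "famA", "famA"),
    "t2": ("famA", "famB", "famA"),
    "t3": ("famA", "famB", "famC"),
    "t4": ("famC", "famD", "famE"),
}
summary = summarize_conservation(triplets)
```

A large share of triplets in the "not_conserved" category would be in line with the interpretation that upstream TEs were mostly not retained for shared regulatory elements.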

These TE-derived CNSs are on average bp, which is three times smaller than the average size of gene-flanking TE fragments (on average bp), suggesting that only a portion of the ancestrally inserted TEs are under selection pressure. They represent a wide range of different families of diverse elements belonging to all the different superfamilies. The majority of homeolog triplets have relatively similar expression patterns [ 26 , 44 ], contrary to what was found for older polyploid species like maize [ 45 ].

In synthetic polyploid wheat, it was shown that repression of D subgenome homeologs was related to silencing of neighbor TEs [ 46 ]. Thus, we focused on triplets for which two copies are coexpressed while the third is silenced. However, enrichment analysis did not reveal any significant enrichment of specific TE families in promoters of the silenced homeologs. We also examined transcriptionally dynamic triplets across tissues [ 44 ].

Again, no TE enrichment in promoters was observed. These results suggest that recent changes in gene expression are not due to specific families recently inserted in the close vicinity of genes.

The chromosome-scale assembly of the wheat genome provided an unprecedented genome-wide view of the organization and impact of TEs in such a complex genome. Since they diverged, the A, B, and D subgenomes have experienced a near-complete TE turnover, although polyploidization did not massively reactivate TEs. This turnover contrasted drastically with the high level of gene synteny.

Apart from genes, there was no conservation of the TE space between homeologous loci. Surprisingly, however, the TE families that have shaped the A, B, and D subgenomes are the same, and, unexpectedly, their proportions and intrinsic properties (gene-prone or not) are quite similar despite their independent evolution in the diploid lineages. These novel insights contradict the previous model of evolution with amplification bursts followed by rapid silencing.

Our results suggest a role of TEs at the structural level.

The Triticum aestivum cv. Chinese Spring genome sequence was annotated as described in [ 26 ]. TE modeling was achieved through a similarity search approach based on the ClariTeRep curated databank of repeated elements [ 48 ], developed specifically for the wheat genome, and with the CLARITE program that was developed to model TEs and reconstruct their nested structure [ 17 ]. For the annotation, we used the ClariTeRep naming system, which assigns simple numbers to individual families and subfamilies; e.

Since many TE families have been previously named, we provided this previous name in parentheses. All candidates were annotated for PfamA domains with hmmer3 [ 50 ] and stringently filtered for canonical elements by the following criteria: (1) presence of at least one typical retrotransposon domain (RT, RH, INT, GAG); (2) removal of mis-predictions based on inconsistent domains, e.

When this was not possible, the prediction was classified as RLX. Muscle was used for multiple alignments of each cluster [ 52 ] in a fast mode (-maxiters 2 -diags1). To build phylogenetic trees, we used tree2 from the Muscle output, which was created in the second iteration with a Kimura distance matrix, and trees were visualized with the ete3 toolkit [ 53 ]. The lifespan of an individual LTR-RT subfamily was defined as the 5th to 95th percentile interval between the oldest and youngest insertions.
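The lifespan definition above can be made concrete with a short sketch. It assumes that each insertion of a subfamily has an estimated age (the unit, for example million years, is an assumption), and it uses linear interpolation between data points for the percentiles. The function name is hypothetical.

```python
import statistics


def subfamily_lifespan(insertion_ages):
    """Return (p5, p95, lifespan) for the insertion ages of one subfamily.

    statistics.quantiles with n=20 yields 19 cut points at 5% steps, so the
    first and last cut points are the 5th and 95th percentiles.
    """
    cuts = statistics.quantiles(insertion_ages, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]
    return p5, p95, p95 - p5


# Toy ages, invented for illustration only: 101 insertions aged 0..100.
ages = list(range(101))
p5, p95, lifespan = subfamily_lifespan(ages)
```

Trimming the 5% youngest and 5% oldest insertions keeps a few outlier age estimates from inflating the lifespan of a subfamily.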

The densities for the chromosomal heat-maps were calculated using a sliding window of 4 Mb with a step of 0. For the comparison of distances separating neighbor genes, homeologous triplets located in the three chromosomal compartments (distal, interstitial, and proximal; Additional file 1: Table S2) were treated separately. This was done because gene density is lower in interstitial and proximal regions, and because the latter show a lack of genetic recombination.
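A sliding-window density of the kind described for the heat-maps might be computed as sketched below. The 4 Mb window comes from the text; the step size is truncated in the source, so it is left as a required parameter rather than guessed. Counting features by their start coordinate and reporting features per Mb are assumptions of this sketch.

```python
from bisect import bisect_left


def sliding_window_density(starts, chrom_length, step, window=4_000_000):
    """Return a list of (window_start, features_per_Mb) along one chromosome.

    starts: start coordinates of the features (e.g. TEs of one family).
    The last windows are clipped at the chromosome end.
    """
    starts = sorted(starts)
    densities = []
    win_start = 0
    while win_start < chrom_length:
        win_end = min(win_start + window, chrom_length)
        # Number of starts in the half-open interval [win_start, win_end).
        count = bisect_left(starts, win_end) - bisect_left(starts, win_start)
        size_mb = (win_end - win_start) / 1e6
        densities.append((win_start, count / size_mb))
        win_start += step
    return densities


# Toy coordinates, invented for illustration only; the 2 Mb step is arbitrary.
toy = sliding_window_density(
    [500_000, 1_500_000, 5_000_000], chrom_length=8_000_000, step=2_000_000
)
```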

Furthermore, we considered only triplets where all three homeologous genes are found on the homeologous chromosomes. Comparison of homeologous gene pairs from distal regions was done in two ways, both of which yielded virtually identical results. Distances were measured from one gene to the one that follows downstream. However, there were many small local inversions between the different subgenomes. Thus, if a gene on the B or D subgenome was oriented in the opposite direction compared to its homeologous copy in the A subgenome, it was assumed that that gene is part of a local inversion.

Therefore, the distance to the preceding gene on the chromosome was calculated. The second approach was more stringent, based only on triplets for which all three homeologs are in the same orientation in the three subgenomes. The results obtained from the two approaches were extremely similar, and we presented only the results from the second, more stringent, approach. For the control dataset, we picked a number of random positions along the chromosomes that is equal to the number of homeologs for that chromosome group.
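The first, inversion-tolerant approach to measuring distances can be sketched as follows. It assumes that "downstream" means the next gene along the chromosome coordinate, and that the orientation of the A-subgenome copy serves as the reference. The data layout and function name are hypothetical.

```python
def neighbor_distance(genes, index, reference_strand):
    """Distance from genes[index] to its neighbor, tolerating local inversions.

    genes: list of (start, end, strand) tuples sorted by start on one chromosome.
    reference_strand: strand of the homeologous copy on the A subgenome.
    Returns None when the required neighbor does not exist.
    """
    start, end, strand = genes[index]
    if strand == reference_strand:
        # Same orientation as the A copy: measure to the following gene.
        if index + 1 >= len(genes):
            return None
        return genes[index + 1][0] - end
    # Opposite orientation: assume a local inversion and measure
    # to the preceding gene on the chromosome instead.
    if index == 0:
        return None
    return start - genes[index - 1][1]


# Toy genes, invented for illustration only.
genes = [(100, 200, "+"), (500, 600, "-"), (1000, 1200, "+")]
same_orientation = neighbor_distance(genes, 1, "-")  # uses the following gene
inverted = neighbor_distance(genes, 1, "+")          # uses the preceding gene
```

The second, more stringent approach corresponds to keeping only triplets for which the strand of every copy equals the reference strand, so the inversion branch is never taken.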

Then, homeologous gene identifiers were assigned to these positions from top to bottom to preserve the order of genes but randomize the distances between them. This was done once for all three chromosomal compartments. Histograms of the distributions of the distance ratios between homeologs were produced with RStudio. The significance of the differences between the largest group of actual and randomized gene positions (peak of the histogram) was established with a chi-square test.
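The randomized control and the significance test can be illustrated with a small sketch. The layout of the contingency table is not given in the text, so the 2x2 table used here (genes inside versus outside the histogram peak, for actual versus randomized positions) is an assumption, as are the function names and the toy numbers. The p-value uses the closed form for a chi-square distribution with one degree of freedom.

```python
import math
import random


def randomized_positions(gene_ids_in_order, chrom_length, rng):
    """Assign random positions to genes while preserving their order."""
    positions = sorted(rng.randrange(chrom_length) for _ in gene_ids_in_order)
    return dict(zip(gene_ids_in_order, positions))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


rng = random.Random(0)  # fixed seed so the control is reproducible
control = randomized_positions(["g1", "g2", "g3", "g4"], 1_000_000, rng)

# Toy counts, invented for illustration only:
# actual: 30 ratios in the peak bin, 70 outside; randomized: 10 in, 90 outside.
chi2, p_value = chi_square_2x2(30, 70, 10, 90)
```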

We developed a Perl script gffGetClosestTe. Enrichment analyses were then automated using R scripts. Only families accounting for copies or more in the whole genome were considered. The log2 ratio was calculated only for expression modules representing at least coexpressed genes, and we considered only log2 ratio values for families accounting for copies or more.
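Retrieving the closest upstream TE for a gene can be sketched as below. This is not the authors' Perl script, whose details are not given here; it is a minimal Python illustration that assumes TEs and genes are described by start and end coordinates on the same chromosome, and that "upstream" is interpreted relative to the gene strand. All names are hypothetical.

```python
def closest_upstream_te(gene_start, gene_end, strand, tes):
    """Return (family, distance) of the nearest TE upstream of a gene, or None.

    tes: iterable of (start, end, family) tuples on the same chromosome.
    For a '+' gene, upstream TEs end at or before the gene start;
    for a '-' gene, upstream TEs start at or after the gene end.
    TEs overlapping the gene are ignored in this sketch.
    """
    best = None
    for te_start, te_end, family in tes:
        if strand == "+":
            if te_end > gene_start:
                continue
            distance = gene_start - te_end
        else:
            if te_start < gene_end:
                continue
            distance = te_start - gene_end
        if best is None or distance < best[1]:
            best = (family, distance)
    return best


# Toy annotation, invented for illustration only.
tes = [(100, 200, "famA"), (300, 450, "famB"), (900, 1000, "famC")]
plus_hit = closest_upstream_te(500, 800, "+", tes)
minus_hit = closest_upstream_te(500, 800, "-", tes)
```

The family returned for each gene is the kind of value that would then be counted per expression module and compared with the full gene set in the enrichment step.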

For that, we developed a Perl script getTeHomeologs.

Plant transposable elements: where genetics meets genomics. Nat Rev Genet.
A unified classification system for eukaryotic transposable elements.
Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. Genome Res.
Feschotte C. Transposable elements and the evolution of regulatory networks.
Centromeric localization and adaptive evolution of an Arabidopsis histone H3 variant. Plant Cell.
A physical, genetic and functional sequence assembly of the barley genome.
International Rice Genome Sequencing Project. The map-based sequence of the rice genome.
Unlocking the barley genome by chromosomal and comparative genomics.
The Sorghum bicolor genome and the diversification of grasses.
The B73 maize genome: complexity, diversity, and dynamics.
Exceptional diversity, non-random distribution, and rapid evolution of retroelements in the B73 maize genome. PLoS Genet.
Fu H, Dooner HK.


