Multiple effects govern endogenous retrovirus survival patterns in human gene introns

Genome Biol. 2006;7(9):R86. doi: 10.1186/gb-2006-7-9-r86.

Abstract

Background: Endogenous retroviruses (ERVs) and solitary long terminal repeats (LTRs) have a significant antisense bias when located in gene introns, suggesting strong negative selective pressure on such elements when they are oriented in the same transcriptional direction as the enclosing gene. It has been assumed that this bias reflects the presence of strong transcriptional regulatory signals within LTRs, but little work has been done to investigate this phenomenon further.
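As an illustration of what "antisense bias" means operationally, the sketch below counts intronic elements by orientation relative to the enclosing gene and tests the counts against an equal sense/antisense expectation with an exact two-sided binomial test. It is a minimal sketch under stated assumptions, not the method used in the study: the function names, the strand encoding ("+"/"-"), and the 50% null expectation are all hypothetical choices made for this example.

```python
from math import comb


def count_orientations(pairs):
    """Count (sense, antisense) elements.

    `pairs` is an iterable of (element_strand, gene_strand) tuples,
    each strand given as "+" or "-" (an assumed encoding).
    """
    sense = sum(1 for elem, gene in pairs if elem == gene)
    antisense = sum(1 for elem, gene in pairs if elem != gene)
    return sense, antisense


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than observing k successes in n trials under success rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


# Hypothetical toy data, not taken from the paper.
toy_pairs = [("-", "+"), ("-", "+"), ("+", "-"), ("+", "+"), ("-", "+")]
sense, antisense = count_orientations(toy_pairs)
p_value = binomial_two_sided_p(antisense, sense + antisense)
print(sense, antisense, p_value)
```

With real annotation data, an antisense count far above half of the total and a small p-value would correspond to the bias described above; the toy input here is far too small to show anything.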

Results: In the analysis reported here, we found significant differences between individual human ERV families in their prevalence within genes and in their degree of antisense bias, and we show that, regardless of orientation, ERVs of most families are less likely to be found in introns than in intergenic regions. Examination of density profiles of ERVs across transcriptional units and of the transcription signals present in the consensus ERVs suggests the importance of splice acceptor sites, in conjunction with splice donor and polyadenylation signals, as the major targets for selection against most families of ERVs/LTRs. Furthermore, analysis of annotated human mRNA splicing events involving ERV sequences revealed that the relatively young human ERVs (HERVs), HERV9 and HERV-K (HML-2), are involved in no human mRNA splicing events at all when oriented antisense to gene transcription, while elements in the sense direction in transcribed regions show considerable bias for use of strong splice sites.
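A density profile across transcriptional units, as mentioned above, can be sketched by expressing each element's position as a fraction of the gene length, measured from the transcription start, and counting elements per bin. The code below is an illustrative sketch only: the record layout, the use of element midpoints, the ten-bin default, and the function names are assumptions made for this example and are not taken from the study.

```python
def relative_position(midpoint, gene_start, gene_end, gene_strand):
    """Fractional position of an element midpoint along a gene.

    0.0 is the transcription start and 1.0 the transcription end.
    Assumes gene_start < gene_end in genomic coordinates and that the
    midpoint lies within the gene; minus-strand genes are flipped.
    """
    frac = (midpoint - gene_start) / (gene_end - gene_start)
    return frac if gene_strand == "+" else 1.0 - frac


def density_profile(records, n_bins=10):
    """Count elements per relative-position bin.

    `records` is an iterable of
    (midpoint, gene_start, gene_end, gene_strand) tuples.
    """
    counts = [0] * n_bins
    for midpoint, gene_start, gene_end, gene_strand in records:
        pos = relative_position(midpoint, gene_start, gene_end, gene_strand)
        idx = min(int(pos * n_bins), n_bins - 1)  # keep pos == 1.0 in last bin
        counts[idx] += 1
    return counts


# Hypothetical toy records, not taken from the paper.
toy_records = [
    (1050, 1000, 2000, "+"),  # near the 5' end of a plus-strand gene
    (1950, 1000, 2000, "+"),  # near the 3' end of a plus-strand gene
    (1950, 1000, 2000, "-"),  # near the 5' end of a minus-strand gene
]
print(density_profile(toy_records))
```

Building separate profiles for sense and antisense elements would allow the two orientations to be compared along the gene; dips at particular positions in one orientation would then be candidates for the kind of selection discussed above.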

Conclusion: Our observations suggest suppression of splicing among young intronic ERVs oriented antisense to gene transcription, which may account for their reduced mutagenicity and higher fixation rate in gene introns.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Endogenous Retroviruses / genetics*
  • Gene Expression Regulation*
  • Genome, Human
  • Humans
  • Introns / genetics*
  • Molecular Sequence Data
  • RNA Splicing
  • Terminal Repeat Sequences / genetics
  • Transcription, Genetic