In silico identification of novel selenoproteins in the Drosophila melanogaster genome

EMBO Rep. 2001 Aug;2(8):697-702. doi: 10.1093/embo-reports/kve151.

Abstract

In selenoproteins, incorporation of the amino acid selenocysteine is specified by the UGA codon, which normally serves as a stop signal. The alternative decoding of UGA is conferred by an mRNA structure, the SECIS element, located in the 3'-untranslated region of the selenoprotein mRNA. Because of this non-standard use of the UGA codon, current computational gene prediction methods are unable to identify selenoproteins in eukaryotic genome sequences. Here we describe a method to predict selenoproteins in genomic sequences. It relies on the prediction of SECIS elements in coordination with the prediction of genes in which the strong codon bias characteristic of protein coding regions extends beyond a TGA codon interrupting the open reading frame. We applied the method to the Drosophila melanogaster genome and predicted four potential selenoprotein genes. One of them belongs to a known family of selenoproteins, and we tested two of the other predictions experimentally, with positive results. Finally, we characterized the expression pattern of these two novel selenoprotein genes.
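
As a rough illustration of one ingredient of such a search, the following Python sketch scans the forward strand of a DNA sequence for open reading frames that start at ATG, contain exactly one in-frame TGA, and end at the next in-frame stop codon. It is not the method used in the paper: it omits SECIS element prediction and codon bias scoring entirely, ignores introns and the reverse strand, and the function name `find_tga_interrupted_orfs` and the parameter `min_codons_each_side` are hypothetical names chosen for this example.

```python
# Toy scan for ORFs interrupted by a single in-frame TGA.
# Assumption: forward strand only, no introns, no SECIS or codon bias check.
STOPS = {"TAA", "TAG", "TGA"}


def find_tga_interrupted_orfs(seq, min_codons_each_side=2):
    """Return (start, tga_pos, end) tuples of 0-based coordinates.

    start   : index of the ATG start codon
    tga_pos : index of the in-frame TGA treated as a candidate Sec codon
    end     : index just past the terminating stop codon
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        tga_pos = None
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon not in STOPS:
                continue
            if codon == "TGA" and tga_pos is None:
                # First in-frame TGA: read through it as a candidate Sec codon.
                tga_pos = i
                continue
            # Any later stop (or a first stop that is not TGA) ends the frame.
            if tga_pos is not None:
                upstream = (tga_pos - start) // 3        # codons before the TGA
                downstream = (i - tga_pos) // 3 - 1      # codons after the TGA
                if (upstream >= min_codons_each_side
                        and downstream >= min_codons_each_side):
                    hits.append((start, tga_pos, i + 3))
            break
    return hits


if __name__ == "__main__":
    candidate = "ATG" + "GCT" * 3 + "TGA" + "GCC" * 3 + "TAA"
    print(find_tga_interrupted_orfs(candidate))
```

In a fuller pipeline, each candidate returned here would still need to be checked for coding potential on both sides of the TGA and for a plausible SECIS element downstream, which are the two signals the abstract describes.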

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Cell Line
  • Codon, Terminator / genetics*
  • Drosophila melanogaster / embryology
  • Drosophila melanogaster / genetics*
  • Gene Expression Profiling
  • Genome*
  • Humans
  • In Situ Hybridization
  • Insect Proteins / chemistry
  • Insect Proteins / genetics*
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Proteins / chemistry
  • Proteins / genetics*
  • Regulatory Sequences, Nucleic Acid / genetics
  • Selenium Radioisotopes / metabolism
  • Selenocysteine / metabolism*
  • Selenoproteins
  • Sequence Alignment

Substances

  • Codon, Terminator
  • Insect Proteins
  • Proteins
  • Selenium Radioisotopes
  • Selenoproteins
  • Selenocysteine