ButterflyBase: a platform for lepidopteran genomics

Nucleic Acids Res. 2008 Jan;36(Database issue):D582-7. doi: 10.1093/nar/gkm853. Epub 2007 Oct 12.

Abstract

With over 100 000 species and a large community of evolutionary biologists, population ecologists, pest biologists and genome researchers, the Lepidoptera are an important insect group. Genomic resources [expressed sequence tags (ESTs), genome sequence, genetic and physical maps, proteomic and microarray datasets] are growing, but until now there has been no single access and analysis portal for this group. Here we present ButterflyBase (http://www.butterflybase.org), a unified resource for lepidopteran genomics. A total of 273 077 ESTs from more than 30 different species have been clustered to generate stable unigene sets, and robust protein translations have been derived from each unigene cluster. Clusters and their protein translations are annotated with BLAST-based similarity, Gene Ontology (GO), enzyme classification (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms, and are also searchable using similarity tools such as BLAST and MS-BLAST. The database supports many needs of the lepidopteran research community, including molecular marker development, orthologue prediction for deep phylogenetics, and detection of rapidly evolving proteins likely involved in host-pathogen or other evolutionary processes. ButterflyBase is expanding to include additional genomic sequence, ecological and mapping data for key species.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromosome Mapping
  • Databases, Genetic*
  • Expressed Sequence Tags
  • Genome, Insect*
  • Genomics
  • Insect Proteins / chemistry
  • Insect Proteins / genetics
  • Internet
  • Lepidoptera / classification
  • Lepidoptera / genetics*
  • Phylogeny
  • Proteomics
  • User-Computer Interface

Substances

  • Insect Proteins