A compilation of soybean ESTs: generation and analysis

Randy Shoemaker; Paul Keim; Lila Vodkin; Ernest Retzel; Sandra W Clifton; Robert Waterston; David Smoller; Virginia Coryell; Anupama Khanna; John Erpelding; Xiaowu Gai; Volker Brendel; Christina Raph-Schmidt; E G Shoop; C J Vielweber; Matt Schmatz; Deana Pape; Yvette Bowers; Brenda Theising; John Martin; Michael Dante; Todd Wylie; Cheryl Granger

doi:10.1139/g01-150

Genome. 2002 Apr;45(2):329-38. doi: 10.1139/g01-150.

Affiliation

¹ USDA-ARS, Corn Insect and Crop Genetics Research Unit, and Department of Agronomy, Iowa State University, Ames 50011, USA.

PMID: 11962630

Abstract

Whole-genome sequencing is fundamental to understanding the genetic composition of an organism. Given the size and complexity of the soybean genome, an alternative approach is targeted random-gene sequencing, which provides an immediate and productive method of gene discovery. In this study, more than 120,000 soybean expressed sequence tags (ESTs) generated from more than 50 cDNA libraries were evaluated. These ESTs coalesced into 16,928 contigs and 17,336 singletons. On average, each contig was composed of 6 ESTs and spanned 788 bases. The average sequence length submitted to dbEST was 414 bases. Using only those libraries generating more than 800 ESTs each and only those contigs with 10 or more ESTs each, correlated patterns of gene expression among libraries and genes were discerned. Two-dimensional qualitative representations of contig and library similarities were generated based on expression profiles. Genes with similar expression patterns and, potentially, similar functions were identified. These studies provide a rich source of publicly available gene sequences as well as valuable insight into the structure, function, and evolution of a model crop legume genome.
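
The filtering step described in the abstract (libraries with more than 800 ESTs, contigs with 10 or more ESTs) followed by a comparison of expression profiles might be sketched as below. This is an illustration only, not the authors' pipeline: the abstract does not name the similarity measure, so Pearson correlation is an assumption, as is taking both totals over the unfiltered table. The function names, the toy counts, and the scaled-down library threshold used in the example are all hypothetical.

```python
from math import sqrt


def filter_counts(counts, library_min=800, contig_min=10):
    """Keep libraries with more than `library_min` ESTs and contigs with
    at least `contig_min` ESTs.

    `counts` maps contig -> {library: number of ESTs}. Totals are taken
    over the unfiltered table (an assumption; the abstract does not state
    the order in which the two filters were applied).
    """
    library_totals = {}
    for per_library in counts.values():
        for library, n in per_library.items():
            library_totals[library] = library_totals.get(library, 0) + n
    kept = sorted(lib for lib, total in library_totals.items()
                  if total > library_min)

    profiles = {}
    for contig, per_library in counts.items():
        if sum(per_library.values()) >= contig_min:
            # Expression profile: EST counts in each retained library.
            profiles[contig] = [per_library.get(lib, 0) for lib in kept]
    return kept, profiles


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed measure)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile has no defined correlation
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, far smaller than the real EST collection.
toy_counts = {
    "contig_A": {"root": 6, "leaf": 1, "seed": 5},
    "contig_B": {"root": 12, "leaf": 2, "seed": 10},
    "contig_C": {"root": 1, "leaf": 9, "seed": 2},
    "contig_D": {"root": 1, "flower": 2},
}

# The library threshold is scaled down to 5 so the toy data passes it;
# the abstract's value is 800.
libraries, profiles = filter_counts(toy_counts, library_min=5, contig_min=10)
r_ab = pearson(profiles["contig_A"], profiles["contig_B"])
r_ac = pearson(profiles["contig_A"], profiles["contig_C"])
```

In this toy table, contig_A and contig_B rise and fall together across libraries, while contig_C shows the opposite pattern; contig_D and the flower library are dropped by the thresholds. Pairwise values of this kind are one possible input to the two-dimensional similarity representations the abstract mentions.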

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Contig Mapping
DNA, Complementary / analysis
Expressed Sequence Tags*
Gene Expression Regulation, Plant
Gene Library
Genome, Plant
Glycine max / genetics*

Substances

DNA, Complementary