Generation of synthetic transcriptome data with defined statistical properties for the development and testing of new analysis methods

Genomics Proteomics Bioinformatics. 2007 Feb;5(1):45-52. doi: 10.1016/S1672-0229(07)60013-8.

Abstract

We have previously developed a combined signal/variance distribution model that accounts for the particular statistical properties of datasets generated on the Applied Biosystems AB1700 transcriptome system. Here we show that this model can be efficiently used to generate synthetic datasets with statistical properties virtually identical to those of the actual data by aid of the JAVA application ace.map creator 1.0 that we have developed. The fundamentally different structure of AB1700 transcriptome profiles requires re-evaluation, adaptation, or even redevelopment of many of the standard microarray analysis methods in order to avoid misinterpretation of the data on the one hand, and to draw full benefit from their increased specificity and sensitivity on the other hand. Our composite data model and the ace.map creator 1.0 application thereby not only present proof of the correctness of our parameter estimation, but also provide a tool for the generation of synthetic test data that will be useful for further development and testing of analysis methods.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Internet
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Transcription, Genetic / genetics*