We have devised a novel analysis approach, percentile analysis for differential gene expression (PADGE), for identifying genes differentially expressed between two groups of heterogeneous samples. PADGE was designed to compare expression profiles of sample subgroups at a series of percentile cutoffs and to examine the trend of relative expression between sample groups as expression level increases. Simulation studies showed that PADGE has greater statistical power than the t-statistic, cancer outlier profile analysis (COPA) (Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun XW, Varambally S, Cao X, Tchinda J, Kuefer R, Lee C, Montie JE, Shah RB, Pienta KJ, Rubin MA, Chinnaiyan AM. Science 310: 644-648, 2005), and kurtosis (Teschendorff AE, Naderi A, Barbosa-Morais NL, Caldas C. Bioinformatics 22: 2269-2275, 2006). Application of PADGE to microarray data sets from tumor tissues demonstrated its utility in prioritizing cancer genes encoding potential therapeutic targets or diagnostic markers. A web application was developed that allows researchers to analyze large gene expression data sets from heterogeneous biological samples and to identify genes differentially expressed between subsets of sample classes using PADGE and other available approaches.
Availability: http://www.cgl.ucsf.edu/Research/genentech/padge/.
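
As a rough illustration of comparing two sample groups at a series of percentile cutoffs, the following Python sketch computes, for a single gene, the difference between the two groups' expression values at each cutoff. It is not the PADGE statistic itself, which the abstract does not specify. The function names `percentile` and `percentile_differences`, the default cutoffs, the linear-interpolation percentile, and the use of a plain difference are all assumptions made for illustration.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a non-empty sequence, q in [0, 100].

    Hypothetical helper for illustration; not taken from PADGE.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    # Fractional position of the q-th percentile within the sorted values.
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def percentile_differences(group_a, group_b, cutoffs=(50, 75, 90, 95)):
    """Difference (group_a minus group_b) in expression at each percentile cutoff.

    Assumed, simplified stand-in for a percentile-based comparison of one gene
    across two sample groups.
    """
    return {q: percentile(group_a, q) - percentile(group_b, q) for q in cutoffs}


# Toy expression values for one gene: a single sample in group_a is elevated.
group_a = [0.0, 1.0, 2.0, 3.0, 14.0]
group_b = [0.0, 1.0, 2.0, 3.0, 4.0]
differences = percentile_differences(group_a, group_b, cutoffs=(50, 75, 90, 100))
for cutoff, diff in differences.items():
    print(cutoff, diff)
```

In this toy case the groups agree at the lower cutoffs and diverge only at the higher ones, which is the kind of trend across increasing expression level that a percentile-based comparison is meant to expose when only a subset of samples over-expresses a gene.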