Directed indices for exploring gene expression data

Michael LeBlanc; Charles Kooperberg; Thomas M Grogan; Thomas P Miller

doi:10.1093/bioinformatics/btg079

Bioinformatics. 2003 Apr 12;19(6):686-93. doi: 10.1093/bioinformatics/btg079.

Affiliation

¹ Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.

PMID: 12691980

Abstract

Motivation: Large expression studies with clinical outcome data are becoming available for analysis. An important goal is to identify genes or clusters of genes whose expression is related to patient outcome. While clustering methods are useful data exploration tools, they do not directly relate the expression data to clinical outcome. Conversely, methods that rank genes by their univariate significance do not incorporate gene function or relationships to previously identified genes. In addition, after sifting through potentially thousands of genes, algorithms that report summary estimates (e.g. regression coefficients or error rates) should address the potentially large bias introduced by gene selection.

Results: We developed a gene index technique that generalizes methods that rank genes by their univariate associations to patient outcome. Genes are ordered based on simultaneously linking their expression both to patient outcome and to a specific gene of interest. The technique can also be used to suggest profiles of gene expression related to patient outcome. A cross-validation method is shown to be important for reducing bias due to adaptive gene selection. The methods are illustrated on a recently collected gene expression data set based on 160 patients with diffuse large cell lymphoma (DLCL).
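
The two ideas in the Results paragraph can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the form of the index, so the code below assumes a simple weighted sum of absolute Pearson correlations, with a continuous outcome standing in for survival. The names `outcome_corr`, `directed_index`, `apparent_vs_cv` and the `weight` parameter are hypothetical, and the data are simulated noise sized like the 160-patient study, not the DLCL data. Setting `weight=1.0` recovers a plain univariate ranking, which is the sense in which such an index generalizes it.

```python
import numpy as np

def outcome_corr(expr, outcome):
    """Pearson correlation of every gene (column of expr) with the outcome."""
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    y = (outcome - outcome.mean()) / outcome.std()
    return z.T @ y / expr.shape[0]

def directed_index(expr, outcome, target, weight=0.5):
    """Assumed index: blend of |corr with outcome| and |corr with target gene|.

    expr    : (n_patients, n_genes) expression matrix
    outcome : (n_patients,) continuous outcome (stand-in for survival)
    target  : column index of the gene of interest
    weight  : 1.0 gives a purely univariate outcome ranking,
              0.0 ranks only by similarity to the target gene
    """
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
    r_target = z.T @ z[:, target] / expr.shape[0]
    r_outcome = outcome_corr(expr, outcome)
    return weight * np.abs(r_outcome) + (1.0 - weight) * np.abs(r_target)

def apparent_vs_cv(expr, outcome, n_folds=5, seed=0):
    """Compare the apparent association of the top-ranked gene with a
    cross-validated estimate in which selection is redone in each fold."""
    rng = np.random.default_rng(seed)
    n = expr.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    # Apparent estimate: select and evaluate on the same patients.
    apparent = float(np.abs(outcome_corr(expr, outcome)).max())
    held_out = []
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Select the best gene using training patients only.
        r_train = outcome_corr(expr[train], outcome[train])
        best = int(np.argmax(np.abs(r_train)))
        # Evaluate that gene on the held-out patients, keeping the sign
        # of the training association so chance reversals count against it.
        r_test = np.corrcoef(expr[test, best], outcome[test])[0, 1]
        held_out.append(np.sign(r_train[best]) * r_test)
    return apparent, float(np.mean(held_out))

# Simulated noise: no gene is truly related to the outcome.
rng = np.random.default_rng(1)
expr = rng.normal(size=(160, 1000))
outcome = rng.normal(size=160)

scores = directed_index(expr, outcome, target=0, weight=0.5)
top10 = np.argsort(scores)[::-1][:10]   # highest index scores first
apparent, cv_est = apparent_vs_cv(expr, outcome)
print("top genes by index:", top10)
print("apparent vs cross-validated association:", apparent, cv_est)
```

On pure noise the apparent association of the selected gene is inflated because the maximum over many genes is taken on the same data used for evaluation, whereas the cross-validated estimate repeats the selection inside each training fold and so is not subject to that optimism.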

Publication types

Comparative Study
Evaluation Study
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.
Validation Study

MeSH terms

Algorithms*
Gene Expression Profiling / methods*
Gene Expression Profiling / standards*
Genetic Testing / methods*
Genetic Testing / standards
Humans
Lymphoma, Non-Hodgkin / genetics*
Lymphoma, Non-Hodgkin / mortality*
Models, Genetic
Models, Statistical
Reproducibility of Results
Risk Assessment / methods*
Risk Assessment / standards
Sensitivity and Specificity
Survival Analysis
