Cell subset prediction for blood genomic studies

Christopher R Bolen; Mohamed Uduman; Steven H Kleinstein

doi:10.1186/1471-2105-12-258

BMC Bioinformatics. 2011 Jun 24:12:258. doi: 10.1186/1471-2105-12-258.

Authors

Christopher R Bolen¹, Mohamed Uduman, Steven H Kleinstein

Affiliation

¹ Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06511, USA.

Abstract

Background: Genome-wide transcriptional profiling of patient blood samples offers a powerful tool to investigate underlying disease mechanisms and to guide personalized treatment decisions. Most studies are based on analysis of total peripheral blood mononuclear cells (PBMCs), a mixed population. In this case, accuracy is inherently limited since cell subset-specific differential expression of gene signatures will be diluted by RNA from other cells. While using specific PBMC subsets for transcriptional profiling would improve our ability to extract knowledge from these data, it is rarely obvious which cell subset(s) will be the most informative.

Results: We have developed a computational method (Subset Prediction from Enrichment Correlation, SPEC) to predict the cellular source for a pre-defined list of genes (i.e. a gene signature) using only data from total PBMCs. SPEC does not rely on the occurrence of cell subset-specific genes in the signature, but rather takes advantage of correlations with subset-specific genes across a set of samples. Validation using multiple experimental datasets demonstrates that SPEC can accurately identify the source of a gene signature as myeloid or lymphoid, as well as differentiate between B cells, T cells, NK cells and monocytes. Using SPEC, we predict that myeloid cells are the source of the interferon-therapy response gene signature associated with HCV patients who are non-responsive to standard therapy.
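The correlation idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the per-sample score used here (the mean z-score of a gene set) is an assumed stand-in for the enrichment statistic, and the function names, gene names, and simulated data are all hypothetical. The sketch scores the query signature and each subset-specific marker set in every total-PBMC sample, correlates the scores across samples, and reports the subset whose markers track the signature most closely.

```python
import numpy as np

def enrichment_scores(expr, gene_index, genes):
    """Per-sample score for a gene set: mean z-score of its genes.
    (Assumed stand-in for an enrichment statistic, not the published one.)"""
    rows = [gene_index[g] for g in genes if g in gene_index]
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mean) / std
    return z[rows].mean(axis=0)

def predict_subset(expr, gene_names, signature, subset_markers):
    """Correlate the signature score with each subset's marker score
    across samples; return the best-correlated subset and all correlations."""
    gene_index = {g: i for i, g in enumerate(gene_names)}
    sig_score = enrichment_scores(expr, gene_index, signature)
    correlations = {}
    for subset, markers in subset_markers.items():
        marker_score = enrichment_scores(expr, gene_index, markers)
        correlations[subset] = float(np.corrcoef(sig_score, marker_score)[0, 1])
    best = max(correlations, key=correlations.get)
    return best, correlations

# Simulated total-PBMC data (hypothetical): each sample is a mixture whose
# myeloid fraction varies, and the lymphoid fraction is the remainder.
rng = np.random.default_rng(0)
n_samples = 30
myeloid_frac = rng.uniform(0.1, 0.5, n_samples)
lymphoid_frac = 1.0 - myeloid_frac

# M* and L* are subset-specific marker genes; S* are signature genes that are
# expressed by myeloid cells but do not appear in any marker list.
gene_names = ["M1", "M2", "M3", "L1", "L2", "L3", "S1", "S2", "S3"]
source = [myeloid_frac] * 3 + [lymphoid_frac] * 3 + [myeloid_frac] * 3
expr = np.array([10.0 * f + rng.normal(0.0, 0.2, n_samples) for f in source])

best, corrs = predict_subset(
    expr,
    gene_names,
    signature=["S1", "S2", "S3"],
    subset_markers={"myeloid": ["M1", "M2", "M3"], "lymphoid": ["L1", "L2", "L3"]},
)
print(best, corrs)
```

In this toy setting the signature shares no genes with either marker list, yet its score rises and falls with the myeloid markers because both depend on the myeloid fraction of each sample. That shared dependence on sample composition is the signal the approach exploits.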

Conclusions: SPEC is a powerful technique for blood genomic studies. It can help identify specific cell subsets that are important for understanding disease and therapy response. SPEC is widely applicable since only gene expression profiles from total PBMCs are required, and thus it can easily be used to mine the massive amount of existing microarray or RNA-seq data.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Computational Biology / methods
Gene Expression Profiling*
Hepatitis C, Chronic / drug therapy*
Hepatitis C, Chronic / pathology*
Humans
Leukocytes, Mononuclear / metabolism
Polymorphism, Single Nucleotide
