A molecular classification of human mesenchymal stromal cells

PeerJ. 2016 Mar 24;4:e1845. doi: 10.7717/peerj.1845. eCollection 2016.

Abstract

Mesenchymal stromal cells (MSC) are widely used for the study of mesenchymal tissue repair, and increasingly adopted for cell therapy, despite the lack of consensus on the identity of these cells. In part this is due to the lack of specificity of MSC markers. Distinguishing MSC from other stromal cells such as fibroblasts is particularly difficult using standard analysis of surface proteins, and there is an urgent need for improved classification approaches. Transcriptome profiling is commonly used to describe and compare different cell types; however, efforts to identify specific markers of rare cellular subsets may be confounded by the small sample sizes of most studies. Consequently, it is difficult to derive reproducible, and therefore useful, markers. We addressed the question of MSC classification with a large integrative analysis of many public MSC datasets. We derived a sparse classifier (the Rohart MSC test) that distinguished MSC from non-MSC samples with >97% accuracy on an internal training set of 635 samples from 41 studies generated on 10 different microarray platforms. The classifier was validated on an external test set of 1,291 samples from 65 studies generated on 15 different platforms, with >95% accuracy. The genes that contribute to the MSC classifier formed a protein-interaction network that included known MSC markers. Further evidence of the relevance of this new MSC panel came from the high number of Mendelian disorders associated with mutations in more than 65% of the network. These disorders result in mesenchymal defects, particularly affecting skeletal growth and function. The Rohart MSC test is a simple in silico test that accurately discriminates MSC from fibroblasts, other adult stem/progenitor cell types, or differentiated stromal cells. It has been implemented in the www.stemformatics.org resource to assist researchers wishing to benchmark their own MSC datasets or data from the public domain.
The code is available from the CRAN repository, and all data used to generate the MSC test are available to download via the Gene Expression Omnibus or the Stemformatics resource.
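
To illustrate what "sparse classifier" means in the abstract above, the following is a minimal sketch only. It is not the Rohart MSC test and not the authors' code: it swaps in L1-penalised logistic regression fitted by proximal gradient descent, and runs on synthetic data. The panel size, the three "informative" genes, the effect size, the penalty `lam`, and every function name are assumptions made for the example. The point is the workflow the abstract describes: fit on a training set, keep only the genes with non-zero weight, then measure accuracy on samples the model has not seen.

```python
import math
import random

N_GENES = 20              # hypothetical panel size (assumption)
INFORMATIVE = (0, 1, 2)   # genes that truly differ between classes (assumption)


def simulate(n_samples, rng):
    """Return (X, y): synthetic expression rows and 0/1 labels (1 = 'MSC')."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        shift = 2.0 if label == 1 else -2.0
        row = [rng.gauss(0.0, 1.0) + (shift if j in INFORMATIVE else 0.0)
               for j in range(N_GENES)]
        X.append(row)
        y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, lr=0.1, n_iter=300):
    """Proximal gradient descent on mean log-loss + lam * ||w||_1."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, row)) + b
            residual = sigmoid(score) - label
            grad_b += residual
            for j in range(p):
                grad_w[j] += residual * row[j]
        b -= lr * grad_b / n  # the intercept is not penalised
        w = [soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam)
             for j in range(p)]
    return w, b


def predict(X, w, b):
    """Classify each row as 1 when its linear score is positive."""
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) + b > 0 else 0
            for row in X]


rng = random.Random(0)
X_train, y_train = simulate(200, rng)   # stands in for the training set
X_test, y_test = simulate(100, rng)     # stands in for the external test set

weights, intercept = fit_l1_logistic(X_train, y_train)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
predictions = predict(X_test, weights, intercept)
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)

print("genes with non-zero weight:", selected)
print("held-out accuracy:", accuracy)
```

The L1 penalty sets the weights of uninformative genes to exactly zero, which is what makes the resulting gene panel small. Real multi-study data would also need cross-platform normalisation, which this sketch does not attempt.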

Keywords: Data integration; Mesenchymal stromal cells; Meta-analysis; Sparse classification; Stem cell classification; Transcriptome.

Grants and funding

This work was funded by an Australian Research Council Grant SR1101002 to Stem Cells Australia (CAW), ARC Discovery Project DP130100777 to CAW and KALC, JEM Research Foundation philanthropic funding to CAW, and an NHMRC Project Grant APP1023368 to NMF and KK. CAW is funded by a QLD Government Smart Futures Fellowship. KK was supported by NHMRC Career Development Fellowship 1023371. JP was supported by the National Heart Foundation Australia. KALC is funded in part by the Australian Cancer Research Foundation (ACRF) for the Diamantina Individualised Oncology Care Centre at the University of Queensland Diamantina Institute. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.