projectR: an R/Bioconductor package for transfer learning via PCA, NMF, correlation and clustering

Gaurav Sharma; Carlo Colantuoni; Loyal A Goff; Elana J Fertig; Genevieve Stein-O'Brien

doi:10.1093/bioinformatics/btaa183

Bioinformatics. 2020 Jun 1;36(11):3592-3593. doi: 10.1093/bioinformatics/btaa183.

Authors

Gaurav Sharma¹, Carlo Colantuoni²˒³, Loyal A Goff²˒⁴˒⁵, Elana J Fertig¹˒⁶˒⁷, Genevieve Stein-O'Brien²˒⁴˒⁵˒⁶

Affiliations

¹ Department of Biomedical Engineering.
² Department of Neuroscience.
³ Department of Neurology.
⁴ Kavli Neurodiscovery Institute.
⁵ Department of Genetic Medicine.
⁶ Department of Oncology.
⁷ Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA.

Abstract

Motivation: Dimension reduction techniques are widely used to interpret high-dimensional biological data. Features learned from these methods are used to discover both technical artifacts and novel biological phenomena. Such feature discovery is critically important in the analysis of large single-cell datasets, where the lack of a ground truth limits validation and interpretation. Transfer learning (TL) can be used to relate the features learned from one source dataset to a new target dataset, enabling biologically driven validation by evaluating their use in, or association with, additional sample annotations in that independent target dataset.

Results: We developed an R/Bioconductor package, projectR, to perform TL for analyses of genomics data via TL of clustering, correlation and factorization methods. We then demonstrate the utility of TL for integrated data analysis with an example for spatial single-cell analysis.
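
As a rough illustration of the correlation flavor of TL described above, the sketch below scores each sample in a target dataset against gene weights learned from a source dataset, using Pearson correlation over the genes the two share. It is a minimal sketch under stated assumptions, not projectR's API: the function names (pearson, project_by_correlation), the dictionary layout, and the toy gene and sample names are all hypothetical, and Python is used only for illustration, since the package itself is written in R.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def project_by_correlation(patterns, target):
    """Score target samples against source-derived patterns.

    patterns: {pattern_name: {gene: weight}} learned from the source dataset
              (hypothetical layout, assumed for this sketch).
    target:   {sample_name: {gene: expression}} from the independent target dataset.
    Returns   {pattern_name: {sample_name: correlation over shared genes}}.
    """
    scores = {}
    for pattern_name, weights in patterns.items():
        scores[pattern_name] = {}
        for sample_name, expression in target.items():
            # Only genes measured in both datasets can be compared.
            shared = sorted(set(weights) & set(expression))
            scores[pattern_name][sample_name] = pearson(
                [weights[g] for g in shared],
                [expression[g] for g in shared],
            )
    return scores


# Toy data: one source pattern and two target samples (all values made up).
patterns = {"pattern_1": {"geneA": 1.0, "geneB": 2.0, "geneC": 3.0, "geneD": 0.5}}
target = {
    "sample_1": {"geneA": 2.0, "geneB": 4.0, "geneC": 6.0, "geneX": 9.0},
    "sample_2": {"geneA": 3.0, "geneB": 2.0, "geneC": 1.0},
}
scores = project_by_correlation(patterns, target)
```

The resulting per-sample scores can then be compared with sample annotations in the target dataset, which is the validation step the Motivation paragraph describes. Restricting the comparison to shared genes is a design choice of this sketch, made because source and target datasets rarely measure identical feature sets.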

Availability and implementation: projectR is available on Bioconductor and at https://github.com/genesofeve/projectR.

Contact: [email protected] or [email protected].

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Cluster Analysis
Genomics*
Machine Learning
Single-Cell Analysis
Software*
