Motivation: Molecular signatures are critical for inferring the proportions of cell types from bulk transcriptomics data. However, identifying these signatures relies on prior biological knowledge of the cell types under study. For less well-characterized biological material, a data-driven approach is required to uncover the underlying classes and generate ad hoc signatures from healthy or pathogenic tissue.
Results: We present a new approach, A2Sign: Agnostic Algorithms for Signatures, based on a non-negative tensor factorization (NTF) strategy that allows us to identify cell-type-specific molecular signatures, greatly reduce collinearities and also account for inter-individual variability. We propose a global framework that can be applied to uncover molecular signatures for cell-type deconvolution in arbitrary tissues using bulk transcriptome data. We also present two new molecular signatures for deconvolution of up to 16 immune cell types using microarray or RNA-seq data.
Availability and implementation: All steps of our analysis were implemented in annotated Python notebooks (https://github.com/paulfogel/A2SIGN). To perform NTF, we used the NMTF package, which can be installed with pip.
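For readers unfamiliar with NTF, the following is a minimal sketch of a non-negative CP (PARAFAC) factorization of a three-way tensor using multiplicative updates, written with NumPy only. It does not use or reproduce the NMTF package API, and it is not the A2Sign pipeline. The function name `ntf_cp`, the tensor shape, and the reading of the three axes (for example samples, genes and individuals) are illustrative assumptions.

```python
import numpy as np


def ntf_cp(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Sketch of non-negative CP factorization: X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r].

    Hypothetical helper for illustration only (not the NMTF package API).
    Uses Lee-Seung-style multiplicative updates, which keep all factors >= 0.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Random non-negative initialisation of the three factor matrices.
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    errors = []
    for _ in range(n_iter):
        # Each update multiplies a factor by (numerator / denominator), both non-negative.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        # Track the relative reconstruction error after each sweep.
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
    return A, B, C, errors


# Synthetic example: a tensor built from known non-negative rank-3 factors.
rng = np.random.default_rng(1)
A0, B0, C0 = rng.random((6, 3)), rng.random((40, 3)), rng.random((5, 3))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)

A, B, C, errors = ntf_cp(X, rank=3)
print(f"relative error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

In this sketch each column of one factor matrix would play the role of a component profile along that axis; how components map to cell-type signatures in practice depends on the preprocessing and selection steps described in the notebooks.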
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2021. Published by Oxford University Press. All rights reserved.