A2Sign: Agnostic Algorithms for Signatures - a universal method for identifying molecular signatures from transcriptomic datasets prior to cell-type deconvolution

Bioinformatics. 2022 Jan 27;38(4):1015-1021. doi: 10.1093/bioinformatics/btab773.

Abstract

Motivation: Molecular signatures are critical for inferring the proportions of cell types from bulk transcriptomics data. However, these signatures are usually identified with methods that rely on prior biological knowledge of the cell types being studied. When working with less well-characterized biological material, a data-driven approach is required to uncover the underlying classes and generate ad hoc signatures from healthy or pathogenic tissue.
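
To make concrete how a signature matrix is used to infer cell-type proportions from a bulk profile, the following is a minimal sketch under stated assumptions, not the procedure used by the authors. It models the bulk profile as a non-negative mixture of signature columns and solves for the mixture weights with generic multiplicative updates for non-negative least squares. The function name `deconvolve`, the synthetic signature matrix, and the proportions `true_p` are all hypothetical and chosen for illustration only.

```python
import numpy as np

def deconvolve(bulk, signature, n_iter=2000, eps=1e-12):
    """Estimate non-negative proportions p such that signature @ p ~ bulk.

    Hypothetical helper: generic multiplicative updates for non-negative
    least squares, not the deconvolution method of the paper.
    """
    n_types = signature.shape[1]
    p = np.full(n_types, 1.0 / n_types)   # start from uniform proportions
    st_b = signature.T @ bulk             # S^T b, fixed across iterations
    st_s = signature.T @ signature        # S^T S, fixed across iterations
    for _ in range(n_iter):
        # Multiplicative step keeps every entry of p non-negative.
        p *= st_b / (st_s @ p + eps)
    return p / p.sum()                    # rescale so proportions sum to 1

# Synthetic example (assumed data): 200 genes, 4 cell types.
rng = np.random.default_rng(0)
signature = rng.gamma(shape=2.0, scale=1.0, size=(200, 4))
true_p = np.array([0.5, 0.3, 0.15, 0.05])
bulk = signature @ true_p                 # noise-free bulk profile

est_p = deconvolve(bulk, signature)
print(np.round(est_p, 3))
```

The estimate is only as good as the signature matrix: strongly collinear signature columns make the proportions poorly determined, which is why the choice of signatures matters before deconvolution.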

Results: We present a new approach, A2Sign: Agnostic Algorithms for Signatures, based on a non-negative tensor factorization (NTF) strategy that allows us to identify cell-type-specific molecular signatures, greatly reduce collinearities and also account for inter-individual variability. We propose a global framework that can be applied to uncover molecular signatures for cell-type deconvolution in arbitrary tissues using bulk transcriptome data. We also present two new molecular signatures for deconvolution of up to 16 immune cell types using microarray or RNA-seq data.
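
As an illustration of the general idea behind non-negative tensor factorization, the sketch below fits a rank-R non-negative CP model to a three-way array using multiplicative updates in plain NumPy. It is a generic textbook-style scheme under assumed conditions, not the A2Sign pipeline and not the API of the NMTF package. The function name `ntf`, the tensor shape, and the rank are hypothetical; the abstract does not specify how the tensor axes are arranged.

```python
import numpy as np

def ntf(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Non-negative CP factorization X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r].

    Hypothetical helper using multiplicative updates; all factors stay >= 0.
    Returns the three factor matrices and the relative error per iteration.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    x_norm = np.linalg.norm(X)
    errors = []
    for _ in range(n_iter):
        # Update each factor in turn with the other two held fixed.
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        recon = np.einsum('ir,jr,kr->ijk', A, B, C)
        errors.append(np.linalg.norm(X - recon) / x_norm)
    return A, B, C, errors

# Synthetic rank-3 tensor (assumed shape, for illustration only).
rng = np.random.default_rng(1)
A0, B0, C0 = rng.random((30, 3)), rng.random((20, 3)), rng.random((5, 3))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)

A, B, C, errors = ntf(X, rank=3)
print(f"relative error: first iteration {errors[0]:.3f}, last iteration {errors[-1]:.3f}")
```

In a factorization of this kind, each component pairs a loading vector along one axis with loading vectors along the other two, so a component can in principle be read as one latent class together with its profile across the remaining modes. How such components are turned into signatures is described in the paper and its notebooks, not in this sketch.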

Availability and implementation: All steps of our analysis were implemented in annotated Python notebooks (https://github.com/paulfogel/A2SIGN). To perform NTF, we used the NMTF package, which can be installed with pip.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms*
  • Exome Sequencing
  • Gene Expression Profiling
  • RNA-Seq
  • Transcriptome*