DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies

Genome Biol. 2020 Aug 17;21(1):207. doi: 10.1186/s13059-020-02091-3.

Abstract

Deep mutational scanning (DMS) enables multiplexed measurement of the effects of thousands of variants of proteins, RNAs, and regulatory elements. Here, we present a customizable pipeline, DiMSum, that provides an end-to-end solution for obtaining variant fitness and error estimates from raw sequencing data. A key innovation of DiMSum is its interpretable error model, which captures the main sources of variability arising in DMS workflows and outperforms previous methods. DiMSum is available as an R/Bioconda package and provides summary reports to help researchers diagnose common DMS pathologies and take remedial steps in their analyses.
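
To illustrate what "variant fitness and error estimates" from sequencing counts can look like, the following is a minimal sketch of a generic count-based calculation. It is not DiMSum's code or its error model: the abstract gives no formulas, so the log-ratio fitness definition, the Poisson assumption on read counts, and the function name `fitness_and_error` are all assumptions made for illustration. A pure counting-noise estimate like this one is a lower bound that ignores the other sources of variability the abstract refers to.

```python
import math


def fitness_and_error(n_in, n_out, wt_in, wt_out):
    """Hypothetical helper: log-ratio fitness of a variant relative to wild type.

    n_in, n_out   -- variant read counts before and after selection
    wt_in, wt_out -- wild-type read counts before and after selection
    Returns (fitness, sigma), where sigma reflects sequencing count noise only.
    """
    if min(n_in, n_out, wt_in, wt_out) <= 0:
        raise ValueError("all read counts must be positive")

    # Enrichment of the variant, normalised to the enrichment of wild type.
    fitness = math.log(n_out / n_in) - math.log(wt_out / wt_in)

    # Assumption: counts are Poisson distributed, so Var[log N] is roughly 1/N
    # (delta method). Variances of the four independent counts add.
    sigma = math.sqrt(1 / n_in + 1 / n_out + 1 / wt_in + 1 / wt_out)
    return fitness, sigma


# Example with made-up counts: a variant that halves in frequency
# while wild type stays constant.
f, s = fitness_and_error(n_in=100, n_out=50, wt_in=1000, wt_out=1000)
print(f"fitness = {f:.3f}, sigma = {s:.3f}")
```

Under these assumptions, low-count variants receive large error estimates, which is why error estimates based on counts alone tend to understate the uncertainty of well-covered variants when other experimental noise is present.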

Keywords: Bioconda; Bioinformatic pipeline; Deep mutational scanning; R package; Statistical model; Variant effect prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • DNA Mutational Analysis / methods*
  • High-Throughput Nucleotide Sequencing / methods
  • Models, Genetic
  • Molecular Diagnostic Techniques / methods*
  • Mutation*
  • Polymerase Chain Reaction
  • Proteins / genetics
  • Software

Substances

  • Proteins