A systematic evaluation of nucleotide properties for CRISPR sgRNA design

BMC Bioinformatics. 2017 Jun 6;18(1):297. doi: 10.1186/s12859-017-1697-6.

Abstract

Background: CRISPR is a versatile gene editing tool that has revolutionized genetic research in the past few years. Optimizing sgRNA design to improve the efficiency of target DNA cleavage is critical to the success of CRISPR screens.

Results: Borrowing knowledge from oligonucleotide design and nucleosome occupancy models, we systematically evaluated candidate features computed from a number of nucleic acid, thermodynamic and secondary structure models on real CRISPR datasets. Our results showed that taking position-dependent dinucleotide features into account improved the prediction of effective sgRNAs, giving an area under the receiver operating characteristic curve (AUC) >0.8, whereas including additional features offered only marginal improvement (∼2% increase in AUC).
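To make the two central terms above concrete, the following is a minimal Python sketch of what a position-dependent dinucleotide encoding and an AUC calculation could look like. It is an illustration only and is not taken from the predictSGRNA package: the function names `position_dinucleotide_features` and `auc`, the one-hot layout, and the example sequence are all assumptions, and the sketch uses only the Python standard library.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 ordered nucleotide pairs: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]
DINUC_INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def position_dinucleotide_features(seq):
    """One-hot encode each adjacent nucleotide pair by its position.

    A sequence of length n yields (n - 1) * 16 binary features, so the
    same dinucleotide at two different positions sets different features.
    (Hypothetical encoding chosen for illustration.)
    """
    seq = seq.upper()
    if set(seq) - set(NUCLEOTIDES):
        raise ValueError("sequence must contain only A, C, G, T")
    features = [0] * ((len(seq) - 1) * len(DINUCLEOTIDES))
    for pos in range(len(seq) - 1):
        pair = seq[pos:pos + 2]
        features[pos * len(DINUCLEOTIDES) + DINUC_INDEX[pair]] = 1
    return features


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    guide = "GACGTTACGGATCCAGTCAA"  # made-up 20-nt sequence, not real data
    x = position_dinucleotide_features(guide)
    print(len(x), sum(x))  # 19 positions * 16 pairs = 304 features, 19 set
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 3 of 4 pairs ranked correctly
```

In a setup like this, the binary feature vectors would be passed to a classifier and its predicted scores compared against observed sgRNA efficiency labels with `auc`; which classifier and which labels to use are left open here, since the abstract does not specify them.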

Conclusion: Using a machine-learning approach, we proposed an accurate model for predicting sgRNA efficiency. An R package, predictSGRNA, implementing the predictive model is available at http://www.ams.sunysb.edu/~pfkuan/softwares.html#predictsgrna .

Keywords: CRISPR; Machine learning; Predictive modeling; Thermodynamics.

MeSH terms

  • Animals
  • Area Under Curve
  • Clustered Regularly Interspaced Short Palindromic Repeats / genetics*
  • Gene Editing
  • Internet
  • Machine Learning
  • Nucleotides / metabolism*
  • ROC Curve
  • Thermodynamics
  • User-Computer Interface*

Substances

  • Nucleotides