epiGBS2: Improvements and evaluation of highly multiplexed, epiGBS-based reduced representation bisulfite sequencing

Mol Ecol Resour. 2022 Jul;22(5):2087-2104. doi: 10.1111/1755-0998.13597. Epub 2022 Mar 3.

Abstract

Several reduced-representation bisulfite sequencing methods have been developed in recent years to determine cytosine methylation de novo in nonmodel species. Here, we present epiGBS2, a laboratory protocol based on epiGBS with a revised bioinformatics pipeline for a wide range of species with or without a reference genome. epiGBS2 is cost- and time-efficient, and its computational workflow is designed to be user-friendly and reproducible. The library protocol allows a flexible choice of restriction enzymes and a double digest. The bioinformatics pipeline is integrated in the Snakemake workflow management system, which makes it modular and easy to execute and keeps parameter settings for important computational steps flexible. We implemented Bismark for alignment and methylation analysis, and we preprocessed alignment files by double masking to enable single nucleotide polymorphism calling with Freebayes (epiFreebayes). The performance of several critical steps in epiGBS2 was evaluated against baseline data sets from Arabidopsis thaliana and great tit (Parus major), which confirmed its overall good performance. We provide a detailed description of the laboratory protocol and an extensive manual of the bioinformatics pipeline, which is publicly accessible on GitHub (https://github.com/nioo-knaw/epiGBS2) and Zenodo (https://doi.org/10.5281/zenodo.4764652).
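
As a minimal illustration of the quantity that bisulfite-based methylation analysis reports, the sketch below estimates a per-site methylation level from read counts. It relies only on the general principle that bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. It is not the epiGBS2 or Bismark implementation; the function name `methylation_level` and the `min_coverage` parameter are hypothetical and chosen for illustration.

```python
from typing import Optional


def methylation_level(c_count: int, t_count: int, min_coverage: int = 1) -> Optional[float]:
    """Estimate the methylation level at one cytosine position.

    c_count: reads showing C at the site (interpreted as methylated).
    t_count: reads showing T at the site (interpreted as unmethylated, converted).
    min_coverage: hypothetical filter; sites with fewer informative reads
                  than this are reported as None instead of a ratio.
    """
    if c_count < 0 or t_count < 0:
        raise ValueError("read counts must be non-negative")

    coverage = c_count + t_count
    # Avoid division by zero and skip sites below the coverage filter.
    if coverage == 0 or coverage < min_coverage:
        return None

    return c_count / coverage


if __name__ == "__main__":
    # Three reads kept C and one read shows T at this site.
    print(methylation_level(3, 1))
    # Same site, but filtered out by a stricter coverage threshold.
    print(methylation_level(3, 1, min_coverage=10))
```

The same C-versus-T read signal would also arise from a genuine C-to-T polymorphism, which is why bisulfite alignments need dedicated preprocessing before conventional SNP calling.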

Keywords: DNA methylation; SNP calling; bisulfite sequencing; de novo reference; double digest; nonmodel species; reduced representation.

MeSH terms

  • DNA Methylation
  • High-Throughput Nucleotide Sequencing / methods
  • Sequence Analysis, DNA / methods
  • Software*
  • Sulfites*

Substances

  • Sulfites
  • hydrogen sulfite