Bioinformatic data processing pipelines in support of next-generation sequencing-based HIV drug resistance testing: the Winnipeg Consensus

J Int AIDS Soc. 2018 Oct;21(10):e25193. doi: 10.1002/jia2.25193.

Abstract

Introduction: Next-generation sequencing (NGS) has several advantages over conventional Sanger sequencing for HIV drug resistance (HIVDR) genotyping, including detection and quantitation of low-abundance variants bearing drug resistance mutations (DRMs). However, the high HIV genomic diversity, unprecedented large volume of data, complexity of analysis and potential for error pose significant challenges for data processing. Several NGS analysis pipelines have been developed and used in HIVDR research; however, the absence of uniformity in data processing strategies results in lack of consistency and comparability of outputs from different pipelines. To fill this gap, an international symposium on bioinformatic strategies for NGS-based HIVDR testing was held in February 2018 in Winnipeg, Canada, convening laboratory scientists, bioinformaticians and clinicians involved in four recently developed, publicly available NGS HIVDR pipelines. The goal of this symposium was to establish a consensus on effective bioinformatic strategies for NGS data management and its use for HIVDR reporting.

Discussion: Essential functionalities of an NGS HIVDR pipeline were divided into five analytic blocks: (1) NGS read quality control (QC)/quality assurance (QA); (2) NGS read alignment and reference mapping; (3) HIV variant calling and variant QC; (4) NGS HIVDR reporting; and (5) extended data applications and additional considerations for data management. The consensuses reached among the participants on all major aspects of these blocks are summarized here. They encompass not only recommended data management and analysis strategies, but also detailed bioinformatic approaches that help ensure accuracy of the derived HIVDR analysis outputs for both research and potential clinical use.

Conclusions: While NGS is being adopted more broadly in HIVDR testing laboratories, data processing is often a bottleneck hindering its generalized application. The proposed standardization of NGS read QC/QA, read alignment and reference mapping, variant calling and QC, HIVDR reporting and relevant data management strategies in this "Winnipeg Consensus" may serve as a starting guideline for NGS HIVDR data processing that informs the refinement of existing pipelines and those yet to be developed. Moreover, the bioinformatic strategies presented here may apply more broadly to NGS data analysis of microbes harbouring significant genomic diversity.

Keywords: HIV drug resistance test; Winnipeg Consensus; bioinformatics; guideline; next-generation sequencing; pipeline.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology*
  • Consensus
  • Drug Resistance, Viral / genetics
  • HIV / drug effects*
  • HIV / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans