Haplotype Inference Using Long-Read Nanopore Sequencing: Application to GSTA1 Promoter

Mol Biotechnol. 2024 Jun 17. doi: 10.1007/s12033-024-01213-7. Online ahead of print.

Abstract

Recovering true haplotypes can have important clinical consequences. Because determining them in the laboratory is difficult, haplotypes are most often obtained through inference. In this paper, we show that Oxford Nanopore sequencing can recover the true haplotypes of the GSTA1 promoter region. Eight lymphoblastoid cell lines (LCLs) with potentially ambiguous haplotypes were used to characterize how well Oxford Nanopore sequencing phases the correct GSTA1 promoter haplotypes. The results were compared with Sanger sequencing and with the haplotypes inferred in the 1000 Genomes Project. The average read length was 813 bp for a PCR product of 1336 bp. Sequencing coverage was highest in the middle of the PCR product and decreased to 50% at its ends. SNPs separated by less than 200 bp showed > 90% correct haplotypes, and at a distance of 1089 bp this proportion still exceeded 58%. The number of PCR cycles influenced the generation of hybrid haplotypes, whereas extension and annealing times did not. These results demonstrate that this long-read sequencing methodology can accurately determine haplotypes without the need for inference. The technology proved robust, but the success of phasing nonetheless depends on the distances between SNPs and on their frequencies.

Keywords: GSTA1; Haplotypes; Nanopore; Phasing; Sequencing.
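
The read-based phasing idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the SNP positions (100 and 300), and the toy read counts are all hypothetical, and reads are assumed to be already aligned and reduced to a mapping from reference position to observed base. The sketch counts the allele combinations seen on single reads that span both SNPs, then reports the two best-supported haplotypes for a diploid sample together with the fraction of spanning reads that support them. The remaining reads stand in for hybrid haplotypes or sequencing errors.

```python
from collections import Counter


def count_haplotypes(reads, positions):
    """Count allele combinations observed on single reads.

    reads: iterable of dicts mapping reference position -> observed base
           (an assumed, simplified representation of aligned reads)
    positions: tuple of SNP positions to phase together
    """
    counts = Counter()
    for read in reads:
        # Only reads covering every SNP position carry phase information.
        if all(p in read for p in positions):
            counts[tuple(read[p] for p in positions)] += 1
    return counts


def call_diploid_phase(counts):
    """Return the two best-supported haplotypes and their share of spanning reads."""
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    top = counts.most_common(2)
    supported = sum(n for _, n in top)
    return [hap for hap, _ in top], supported / total


# Toy data (invented for illustration): two true haplotypes, C-G and T-A,
# a few hybrid reads, and ten reads that cover only the first SNP.
reads = (
    [{100: "C", 300: "G"}] * 46
    + [{100: "T", 300: "A"}] * 45
    + [{100: "C", 300: "A"}] * 5
    + [{100: "T", 300: "G"}] * 4
    + [{100: "C"}] * 10
)

counts = count_haplotypes(reads, (100, 300))
haplotypes, fraction = call_diploid_phase(counts)
print(haplotypes, round(fraction, 2))  # [('C', 'G'), ('T', 'A')] 0.91
```

In this toy example the ten reads covering a single SNP are ignored, which mirrors why shorter reads and lower coverage at the amplicon ends reduce the evidence available for phasing distant SNPs.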