Accurate sampling and deep sequencing of the HIV-1 protease gene using a Primer ID

Cassandra B Jabara; Corbin D Jones; Jeffrey Roach; Jeffrey A Anderson; Ronald Swanstrom

doi:10.1073/pnas.1110064108

Proc Natl Acad Sci U S A. 2011 Dec 13;108(50):20166-71. doi: 10.1073/pnas.1110064108. Epub 2011 Nov 30.

Authors

Cassandra B Jabara¹, Corbin D Jones, Jeffrey Roach, Jeffrey A Anderson, Ronald Swanstrom

Affiliation

¹ Department of Biology, Lineberger Comprehensive Cancer Center, University of North Carolina Center for AIDS Research, Carolina Center for Genome Sciences, Chapel Hill, NC 27599, USA.

Abstract

Viruses can create complex genetic populations within a host, and deep sequencing technologies allow extensive sampling of these populations. Limitations of these technologies, however, potentially bias this sampling, particularly when a PCR step precedes the sequencing protocol. Typically, an unknown number of templates are used in initiating the PCR amplification, and this can lead to unrecognized sequence resampling creating apparent homogeneity; also, PCR-mediated recombination can disrupt linkage, and differential amplification can skew allele frequency. Finally, misincorporation of nucleotides during PCR and errors during the sequencing protocol can inflate diversity. We have solved these problems by including a random sequence tag in the initial primer such that each template receives a unique Primer ID. After sequencing, repeated identification of a Primer ID reveals sequence resampling. These resampled sequences are then used to create an accurate consensus sequence for each template, correcting for recombination, allelic skewing, and misincorporation/sequencing errors. The resulting population of consensus sequences directly represents the initial sampled templates. We applied this approach to the HIV-1 protease (pro) gene to view the distribution of sequence variation of a complex viral population within a host. We identified major and minor polymorphisms at coding and noncoding positions. In addition, we observed dynamic genetic changes within the population during intermittent drug exposure, including the emergence of multiple resistant alleles. These results provide an unprecedented view of a complex viral population in the absence of PCR resampling.
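The consensus step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `consensus_by_primer_id`, the `min_reads` threshold of 3, the strict-majority rule, and the use of `N` for unresolved positions are all assumptions made for illustration. It also assumes the Primer ID has already been extracted from each read and that reads sharing a Primer ID are aligned to the same length.

```python
from collections import Counter, defaultdict


def consensus_by_primer_id(reads, min_reads=3):
    """Collapse resampled reads into one consensus sequence per Primer ID.

    reads: iterable of (primer_id, sequence) pairs (hypothetical input format).
    min_reads: assumed minimum number of reads needed to call a consensus.
    """
    # Group reads by their Primer ID; repeats of an ID indicate resampling
    # of the same original template.
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        # Too few reads to outvote an error: skip this template (assumption).
        if len(seqs) < min_reads:
            continue
        # This sketch does no alignment, so skip groups of unequal length.
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Require a strict majority; otherwise mark the position ambiguous.
            bases.append(base if 2 * count > len(seqs) else "N")
        consensus[primer_id] = "".join(bases)
    return consensus


example_reads = [
    ("AAAA", "ACGT"), ("AAAA", "ACGT"), ("AAAA", "ACTT"),  # one read has an error
    ("CCCC", "ACGT"),                                      # seen once, dropped
    ("GGGG", "TTTT"), ("GGGG", "TTTT"), ("GGGG", "TTTT"), ("GGGG", "TATT"),
]
result = consensus_by_primer_id(example_reads)
```

Under these assumptions, the number of keys in `result` counts the templates that were actually sampled, and an isolated misincorporation or sequencing error within a group is outvoted by the other reads carrying the same Primer ID.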

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Alleles
Base Sequence
Codon / genetics
DNA Primers / metabolism*
DNA, Complementary / biosynthesis
Drug Resistance, Multiple, Viral / drug effects
Drug Resistance, Multiple, Viral / genetics
Genes, Viral / genetics*
Genetic Variation / drug effects
HIV Protease / genetics*
HIV-1 / drug effects
HIV-1 / enzymology
HIV-1 / genetics
High-Throughput Nucleotide Sequencing / methods*
Humans
Linkage Disequilibrium / genetics
Molecular Sequence Data
Phylogeny
Protease Inhibitors / pharmacology
RNA, Viral / genetics
Templates, Genetic

Substances

Codon
DNA Primers
DNA, Complementary
Protease Inhibitors
RNA, Viral
HIV Protease
p16 protease, Human immunodeficiency virus 1
