Theoretical and experimental assessment of degenerate primer tagging in ultra-deep applications of next-generation sequencing

Richard H Liang; Theresa Mo; Winnie Dong; Guinevere Q Lee; Luke C Swenson; Rosemary M McCloskey; Conan K Woods; Chanson J Brumme; Cynthia K Y Ho; Janke Schinkel; Jeffrey B Joy; P Richard Harrigan; Art F Y Poon

doi:10.1093/nar/gku355

Nucleic Acids Res. 2014 Jul;42(12):e98. doi: 10.1093/nar/gku355. Epub 2014 May 7.

Affiliations

¹ BC Centre for Excellence in HIV/AIDS, Vancouver, BC, V6Z 1Y6, Canada.
² BC Centre for Excellence in HIV/AIDS, Vancouver, BC, V6Z 1Y6, Canada.
³ Section of Clinical Virology, Department of Medical Microbiology, Academic Medical Center, 1105AZ Amsterdam, The Netherlands.
⁴ BC Centre for Excellence in HIV/AIDS, Vancouver, BC, V6Z 1Y6, Canada; Department of Medicine, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada.

Abstract

Primer IDs (pIDs) are random oligonucleotide tags used in next-generation sequencing to identify sequences that originate from the same template. These tags are produced by degenerate primers during the reverse transcription of RNA molecules into cDNA. The use of pIDs helps to track the number of RNA molecules carried through amplification and sequencing, and allows resolution of inconsistencies between reads sharing a pID. Three potential issues complicate these applications. First, multiple cDNAs may share a pID by chance; we found that while preventing any cDNAs from sharing a pID may be infeasible, it is still practical to limit the number of these collisions. Second, a pID must be observed in at least three sequences to allow error correction; as such, pIDs observed only once or twice must be rejected. If the sequencing product contains copies from a high number of RT templates but produces few reads, our findings indicate that rejecting such pIDs will discard a great deal of data. Third, the use of pIDs could influence amplification and sequencing. We examined the effects of several intrinsic and extrinsic factors on sequencing reads at both the individual and ensemble level.
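The first two issues in the abstract can be illustrated with a short calculation. The sketch below is not the authors' model. It assumes that every pID of a given length is equally likely (4^length possible tags) and that the number of reads per template follows a Poisson distribution. The function names and parameters are hypothetical and chosen only for illustration.

```python
import math


def prob_no_collision(n_templates, tag_length):
    """Probability that all templates receive distinct pIDs.

    Assumption (not from the paper): each of the 4**tag_length tags
    is equally likely, so this is the classic birthday problem.
    """
    n_tags = 4 ** tag_length
    if n_templates > n_tags:
        return 0.0
    # Sum logs to avoid underflow when multiplying many small factors.
    log_p = sum(math.log1p(-i / n_tags) for i in range(n_templates))
    return math.exp(log_p)


def expected_collided_templates(n_templates, tag_length):
    """Expected number of templates sharing a pID with at least one other."""
    n_tags = 4 ** tag_length
    p_shared = 1.0 - (1.0 - 1.0 / n_tags) ** (n_templates - 1)
    return n_templates * p_shared


def fraction_pids_rejected(mean_reads_per_template, min_reads=3):
    """Among pIDs seen at least once, the fraction seen fewer than
    min_reads times (and so rejected).

    Assumption (not from the paper): reads per template are Poisson
    with the given mean, and collisions are ignored.
    """
    lam = mean_reads_per_template
    pmf = [math.exp(-lam) * lam ** j / math.factorial(j)
           for j in range(min_reads)]
    p_observed = 1.0 - pmf[0]
    return sum(pmf[1:]) / p_observed


if __name__ == "__main__":
    # Hypothetical scenario: 1000 templates tagged with 8-nt pIDs.
    print(prob_no_collision(1000, 8))
    print(expected_collided_templates(1000, 8))
    # Shallow versus deep sequencing relative to the number of templates.
    print(fraction_pids_rejected(0.5))
    print(fraction_pids_rejected(10.0))
```

Under these assumptions, the probability of avoiding every collision falls quickly as the number of templates grows, while the expected number of collided templates stays a small fraction of the total. The rejected fraction grows as the mean number of reads per template shrinks. Both trends match the qualitative points made in the abstract.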

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

DNA Primers / chemistry*
DNA, Complementary / chemistry
HIV / genetics
Hepacivirus / genetics
High-Throughput Nucleotide Sequencing / methods*
Humans
Polymerase Chain Reaction
RNA, Viral / blood
RNA, Viral / chemistry
Sequence Analysis, RNA

Substances

DNA Primers
DNA, Complementary
RNA, Viral
