Randomness in DNA Encoded Library Selection Data Can Be Modeled for More Reliable Enrichment Calculation

Letian Kuai; Thomas O'Keeffe; Christopher Arico-Muendel

doi:10.1177/2472555218757718

SLAS Discov. 2018 Jun;23(5):405-416. doi: 10.1177/2472555218757718. Epub 2018 Feb 13.

Authors

Letian Kuai¹, Thomas O'Keeffe¹, Christopher Arico-Muendel¹

Affiliation

¹ GlaxoSmithKline, Cambridge, MA, USA.

PMID: 29437521
DOI: 10.1177/2472555218757718

Abstract

DNA Encoded Libraries (DELs) use unique DNA sequences to tag each chemical warhead within a library mixture to enable deconvolution following affinity selection against a target protein. With next-generation sequencing, millions to billions of sequences can be read and counted to report binding events. This unprecedented capability has enabled researchers to synthesize and analyze numerically large chemical libraries. Despite the common perception that each library member undergoes a miniaturized affinity assay, selections with higher complexity libraries often produce results that are difficult to rank order. In this study, we aimed to understand the robustness of DEL selection by examining the sequencing readouts of warheads and chemotype families among a large number of experimentally repeated selections. The results revealed that (1) the output of DEL selection is intrinsically noisy but can be reliably modeled by the Poisson distribution, and (2) Poisson noise is the dominating noise at low copy counts and can be estimated even from a single experiment. We also discuss the shortcomings of data analyses based on directly using copy counts and their linear transformations, and propose a framework that incorporates proper normalization and confidence interval calculation to help researchers better understand DEL data.
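As a rough illustration of why Poisson noise matters at low copy counts, the sketch below computes an exact (Garwood-style) Poisson confidence interval for a single observed copy count, then rescales it by the count expected under uniform sampling. The abstract does not spell out the authors' procedure, so this is a generic textbook construction and not the paper's framework. The function names (`poisson_cdf`, `poisson_ci`, `normalized_interval`), the uniform-sampling normalization, and the numbers in the usage note are all assumptions made for the example.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)  # P(X = 0); underflows for very large lam
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        total += term
    return min(total, 1.0)


def _bisect_decreasing(f, lo, hi, iters=200):
    """Root of a function f that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def poisson_ci(k, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for the Poisson mean given count k.

    Intended for the low copy counts discussed in the abstract; the
    term-by-term CDF is not suitable for counts in the hundreds or more.
    """
    hi = k + 10.0 * math.sqrt(k + 1) + 10.0  # generous search bound
    # Upper limit U solves P(X <= k; U) = alpha / 2.
    upper = _bisect_decreasing(lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, hi)
    if k == 0:
        lower = 0.0
    else:
        # Lower limit L solves P(X >= k; L) = alpha / 2,
        # i.e. P(X <= k - 1; L) = 1 - alpha / 2.
        lower = _bisect_decreasing(
            lambda lam: poisson_cdf(k - 1, lam) - (1 - alpha / 2), 0.0, hi
        )
    return lower, upper


def normalized_interval(k, total_reads, library_size, alpha=0.05):
    """Hypothetical normalization: divide the interval by the count expected
    if every library member were sampled uniformly (total_reads / library_size)."""
    expected = total_reads / library_size
    lower, upper = poisson_ci(k, alpha)
    return lower / expected, upper / expected
```

For instance, `poisson_ci(3)` gives an interval that is several times wider than the count itself, which is one way to see why ranking compounds by raw copy counts of 1 to 5 is unreliable. With made-up values such as `normalized_interval(3, total_reads=1_000_000, library_size=2_000_000)`, the same interval is expressed as a fold-change over the uniform expectation.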

Keywords: DEL; DNA Encoded Library; Poisson; data analysis; normalization.

MeSH terms

Base Sequence / genetics
DNA / genetics*
Data Analysis
Drug Discovery / methods
Gene Library
Small Molecule Libraries / metabolism

Substances

Small Molecule Libraries
DNA