Integrity, standards, and QC-related issues with big data in pre-clinical drug discovery

Biochem Pharmacol. 2018 Jun:152:84-93. doi: 10.1016/j.bcp.2018.03.014. Epub 2018 Mar 15.

Abstract

The tremendous expansion of data analytics and of public and private big datasets presents an important opportunity for pre-clinical drug discovery and development. In the life sciences, the growth of genetic, genomic, transcriptomic, and proteomic data is partly driven by a rapid decline in experimental costs as biotechnology improves throughput, scalability, and speed. Yet far too many researchers underestimate the challenges and consequences associated with data integrity and quality standards. Given the effect of data integrity on scientific interpretation, these issues have significant implications during pre-clinical drug development. We describe standardized approaches for maximizing the utility of publicly available or privately generated biological data and address some of the common pitfalls. We also discuss the increasing interest in integrating and interpreting cross-platform data. The principles outlined here should serve as a useful broad guide for existing analytical practices and pipelines and as a tool for developing additional insights into therapeutics using big data.

Keywords: Big data; Exome; Genomics; Microarray; RNA-seq; Transcriptomics.

Publication types

  • Review

MeSH terms

  • Big Data*
  • Biomedical Research / standards*
  • Drug Discovery*
  • Quality Control*