Bias in High-Throughput Analysis of miRNAs and Implications for Biomarker Studies

Anal Chem. 2016 Feb 16;88(4):2088-95. doi: 10.1021/acs.analchem.5b03376. Epub 2016 Jan 27.

Abstract

High-throughput molecular technologies, including microarrays and next-generation sequencing (NGS), are known to carry a certain degree of bias. To quantify the actual impact of the biomarker discovery platform on miRNA profiles, we first performed a meta-analysis: raw data of 1 539 microarrays and 705 NGS blood-borne miRNomes were statistically evaluated, suggesting a substantial influence of the technology on biomarker profiles. We observed a highly significant dependency of the expression level on the miRNA nucleotide composition. Higher expression in NGS was discovered for uracil-rich miRNAs (p = 7 × 10⁻³⁷), while high expression in microarrays was found predominantly for guanine-rich miRNAs (p = 3 × 10⁻³³). To verify the findings, 10 identical replicates of one individual were measured using NGS and microarrays (2 525 miRNAs from miRBase version 21). Overall, we calculated a correlation coefficient of 0.414 between the two technologies. Detailed analysis, however, revealed that the correlation was observed only for miRNAs in the early miRBase versions (<8). The majority of miRNAs (2 013 from miRBase version 8 onward) was not correlated between microarray and NGS. Specifically, we observed 67 miRNAs with a median read count above 10 in NGS that were not detected in any of the 10 replicated array experiments. In contrast, 234 miRNAs were discovered in all 10 replicated array measurements but were not found in any of the NGS experiments of the same individual. While the first group had an average guanine content of 22%, the latter group had an average guanine content of 41%. Selected concordant and discordant miRNAs were tested in quantitative real-time polymerase chain reaction (RT-qPCR) experiments, again of the same individual, providing further evidence for the substantial bias depending on the base composition. As a consequence, biomarkers that have been discovered by specific high-throughput technologies have to be carefully considered. Especially for validation of the platform, the selection of reasonable candidates is essential.
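
The discordance criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy miRNA identifiers, the sequences, and the read counts are all hypothetical, and the "not found in NGS" condition is assumed here to mean zero reads in every replicate.

```python
from statistics import median


def guanine_fraction(seq: str) -> float:
    """Fraction of guanine bases in an RNA sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return seq.count("G") / len(seq) if seq else 0.0


def classify(ngs_counts, array_detected, min_median_reads=10):
    """Label one miRNA using the two discordance rules stated in the abstract.

    ngs_counts:     read counts, one per NGS replicate
    array_detected: booleans, one per array replicate (True = detected)
    """
    # Rule 1: median NGS read count above the threshold, never seen on the array.
    if median(ngs_counts) > min_median_reads and not any(array_detected):
        return "ngs_only"
    # Rule 2: seen on every array replicate, never seen in NGS
    # (assumption: "not found" means zero reads in each replicate).
    if all(array_detected) and all(count == 0 for count in ngs_counts):
        return "array_only"
    return "other"


# Hypothetical toy data: name -> (sequence, NGS read counts, array detection calls).
toy_mirnas = {
    "toy-miR-A": ("UUAUUGCUUAAGAAUACUCGUAU", [25, 30, 18], [False, False, False]),
    "toy-miR-B": ("GGGCGGAGGGAAGUAGGUCCGUG", [0, 0, 0], [True, True, True]),
    "toy-miR-C": ("UGAGGUAGUAGGUUGUAUAGUU", [120, 95, 110], [True, True, True]),
}

for name, (seq, counts, detected) in toy_mirnas.items():
    label = classify(counts, detected)
    print(f"{name}: {label}, guanine fraction = {guanine_fraction(seq):.2f}")
```

Averaging `guanine_fraction` over the miRNAs in each label gives the kind of group-level comparison reported above (22% versus 41% guanine); the threshold of 10 reads is the only parameter taken from the abstract.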

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers / analysis*
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • MicroRNAs / analysis*
  • MicroRNAs / genetics
  • Oligonucleotide Array Sequence Analysis
  • Real-Time Polymerase Chain Reaction
  • Sequence Analysis, RNA

Substances

  • Biomarkers
  • MicroRNAs