Exaggerated false positives by popular differential expression methods when analyzing human population samples

Yumei Li; Xinzhou Ge; Fanglue Peng; Wei Li; Jingyi Jessica Li

doi:10.1186/s13059-022-02648-4

Genome Biol. 2022 Mar 15;23(1):79. doi: 10.1186/s13059-022-02648-4.

Authors

Yumei Li^#¹, Xinzhou Ge^#², Fanglue Peng³, Wei Li⁴, Jingyi Jessica Li⁵,⁶,⁷,⁸,⁹

Affiliations

¹ Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA.
² Department of Statistics, University of California, Los Angeles, CA, 90095, USA.
³ Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, 77030, USA.
⁴ Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA.
⁵ Department of Statistics, University of California, Los Angeles, CA, 90095, USA.
⁶ Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA, 90095, USA.
⁷ Department of Human Genetics, University of California, Los Angeles, CA, 90095, USA.
⁸ Department of Computational Medicine, University of California, Los Angeles, CA, 90095, USA.
⁹ Department of Biostatistics, University of California, Los Angeles, CA, 90095, USA.

^# Contributed equally.

Abstract

When identifying differentially expressed genes between two conditions using human population RNA-seq samples, we found by permutation analysis that two popular bioinformatics methods, DESeq2 and edgeR, have unexpectedly high false discovery rates (FDRs). Expanding the analysis to limma-voom, NOISeq, dearseq, and the Wilcoxon rank-sum test, we found that every method except the Wilcoxon rank-sum test often fails to control the FDR. In particular, the actual FDRs of DESeq2 and edgeR sometimes exceed 20% when the target FDR is 5%. Based on these results, we recommend the Wilcoxon rank-sum test for population-level RNA-seq studies with large sample sizes.
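The idea behind the permutation check can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a normalized expression matrix stored as one list of values per gene, uses a normal approximation (with tie correction, without continuity correction) for the Wilcoxon rank-sum p-value, and applies the Benjamini-Hochberg adjustment. All function names are hypothetical. Shuffling the condition labels breaks any true association, so genes called significant on shuffled labels can be read as false positives.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def ranksum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: sum of t^3 - t over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # all values identical: no evidence of a difference
        return 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_discoveries(expr, labels, fdr=0.05):
    """Count genes with adjusted p-value <= fdr; labels are 0/1 per sample."""
    pvals = []
    for gene in expr:
        x = [v for v, g in zip(gene, labels) if g == 0]
        y = [v for v, g in zip(gene, labels) if g == 1]
        pvals.append(ranksum_pvalue(x, y))
    return sum(q <= fdr for q in benjamini_hochberg(pvals))


def permuted_discoveries(expr, labels, n_perm=20, fdr=0.05, seed=0):
    """Discovery counts after shuffling the condition labels n_perm times."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        counts.append(count_discoveries(expr, shuffled, fdr))
    return counts
```

Comparing `count_discoveries(expr, labels)` on the real labels with the counts returned by `permuted_discoveries(expr, labels)` gives a rough sense of how many calls could arise by chance. For small groups an exact rank-sum test would be preferable to the normal approximation used here.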

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology* / methods
Gene Expression Profiling* / methods
Humans
RNA-Seq
Sample Size
Sequence Analysis, RNA / methods
