Adjusting for gene-specific covariates to improve RNA-seq analysis

Bioinformatics. 2023 Aug 1;39(8):btad498. doi: 10.1093/bioinformatics/btad498.

Abstract

Summary: This article proposes a novel method for controlling the positive false discovery rate (pFDR) when testing gene-specific hypotheses with a gene-specific covariate, such as gene length. We assume that the null probability depends on the covariate. In this setting, we propose a rejection rule that accounts for heterogeneity among tests by using two distinct types of null probabilities. Following Storey's q-value framework, we establish a pFDR estimator for a given rejection rule. We provide a condition on the posterior probability of a type I error that equivalently characterizes our rejection rule. We also present a cross-validation procedure for selecting a tuning parameter so as to maximize the expected number of hypotheses declared significant. A simulation study demonstrates that our method is comparable to or better than existing methods across realistic scenarios. In our data analysis, we find support for the method's premise that the null probability varies with a gene-specific covariate.

Availability and implementation: The source code repository is publicly available at https://github.com/hsjeon1217/conditional_method.
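
To make the ingredients named in the Summary concrete, the following minimal Python sketch shows the standard Storey-style estimates that the q-value framework builds on: the null proportion and the pFDR for a fixed p-value threshold. It is not the authors' rejection rule or their estimator. The function names (storey_pi0, pfdr_estimate, pi0_by_covariate_bin), the tuning value lam = 0.5, and the idea of estimating the null proportion within quantile bins of the covariate are all assumptions introduced here, purely to illustrate what it means for the null probability to vary with a covariate such as gene length.

```python
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """Storey's null-proportion estimate: #{p > lam} / (m * (1 - lam)), capped at 1."""
    pvals = np.asarray(pvals, dtype=float)
    return min(1.0, float(np.mean(pvals > lam)) / (1.0 - lam))

def pfdr_estimate(pvals, t, pi0):
    """Storey-style pFDR estimate for the rule 'reject when p <= t'."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    n_rej = max(int(np.sum(pvals <= t)), 1)   # avoid division by zero
    pr_any = 1.0 - (1.0 - t) ** m             # crude estimate of Pr(at least one rejection)
    return min(1.0, pi0 * m * t / (n_rej * pr_any))

def pi0_by_covariate_bin(pvals, covariate, n_bins=2, lam=0.5):
    """Hypothetical helper (not from the paper): estimate pi0 separately
    within quantile bins of a gene-specific covariate."""
    pvals = np.asarray(pvals, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    edges = np.quantile(covariate, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only, so bin labels run from 0 to n_bins - 1.
    labels = np.digitize(covariate, edges[1:-1], right=True)
    return np.array([storey_pi0(pvals[labels == b], lam) for b in range(n_bins)])

# Toy inputs, invented for illustration only.
pvals = [0.001, 0.002, 0.003, 0.004, 0.01, 0.02, 0.6, 0.9]
gene_length = [1, 2, 3, 4, 5, 6, 7, 8]

pi0_all = storey_pi0(pvals)
pfdr_at_005 = pfdr_estimate(pvals, t=0.05, pi0=pi0_all)
pi0_bins = pi0_by_covariate_bin(pvals, gene_length, n_bins=2)
```

On this toy input the pooled estimate of the null proportion is 0.5, while the per-bin estimates differ sharply between the lower and upper halves of the covariate. That is the kind of heterogeneity a covariate-aware rule is meant to exploit. The repository linked above is the reference for the actual method.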

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Data Analysis*
  • Probability
  • RNA-Seq
  • Research Design*