DP2LM: leveraging deep learning approach for estimation and hypothesis testing on mediation effects with high-dimensional mediators and complex confounders

Biostatistics. 2024 Jul 1;25(3):818-832. doi: 10.1093/biostatistics/kxad037.

Abstract

Traditional linear mediation analysis has inherent limitations when it comes to handling high-dimensional mediators. Particularly, accurately estimating and rigorously inferring mediation effects is challenging, primarily due to the intertwined nature of the mediator selection issue. Despite recent developments, the existing methods are inadequate for addressing the complex relationships introduced by confounders. To tackle these challenges, we propose a novel approach called DP2LM (Deep neural network-based Penalized Partially Linear Mediation). This approach incorporates deep neural network techniques to account for nonlinear effects in confounders and utilizes the penalized partially linear model to accommodate high dimensionality. Unlike most existing works that concentrate on mediator selection, our method prioritizes estimation and inference on mediation effects. Specifically, we develop test procedures for testing the direct and indirect mediation effects. Theoretical analysis shows that the tests maintain the Type-I error rate. In simulation studies, DP2LM demonstrates its superior performance as a modeling tool for complex data, outperforming existing approaches in a wide range of settings and providing reliable estimation and inference in scenarios involving a considerable number of mediators. Further, we apply DP2LM to investigate the mediation effect of DNA methylation on cortisol stress reactivity in individuals who experienced childhood trauma, uncovering new insights through a comprehensive analysis.

Keywords: deep neural networks; direct effects; high-dimensional mediators; hypothesis testing; indirect effects; intricate nonlinear confounders.

MeSH terms

  • Deep Learning*
  • Humans
  • Mediation Analysis*
  • Models, Statistical