Partial correlation matrix estimation using ridge penalty followed by thresholding and re-estimation

Biometrics. 2014 Sep;70(3):765-73. doi: 10.1111/biom.12186. Epub 2014 May 20.

Abstract

Motivated by the problem of constructing gene co-expression networks, we propose a statistical framework for estimating a high-dimensional partial correlation matrix by a three-step approach. We first obtain a penalized estimate of the partial correlation matrix using a ridge penalty. Next, we select the non-zero entries of the partial correlation matrix by hypothesis testing. Finally, we re-estimate the partial correlation coefficients at these non-zero entries. In the second step, the null distribution of the test statistics derived from penalized partial correlation estimates has not been established. We address this challenge by estimating the null distribution from the empirical distribution of the test statistics of all the penalized partial correlation estimates. Extensive simulation studies demonstrate the good performance of our method. Application to yeast cell cycle gene expression data shows that our method delivers better predictions of protein-protein interactions than the graphical lasso.
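
The three steps described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the ridge parameter `lam`, and the test level `alpha` are hypothetical. Step 2 fits a normal null to the Fisher z-transformed estimates using the median and MAD, a simple stand-in for the empirical null estimator used in the paper. Step 3 uses node-wise least squares on the selected neighbors as one possible re-estimation scheme.

```python
import math
import numpy as np


def ridge_partial_corr(X, lam):
    """Step 1 (sketch): invert the ridge-regularized correlation matrix
    and rescale the inverse to partial correlations."""
    R = np.corrcoef(X, rowvar=False)
    omega = np.linalg.inv(R + lam * np.eye(R.shape[0]))
    d = np.sqrt(np.diag(omega))
    P = -omega / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P


def select_edges(P, alpha=0.01):
    """Step 2 (sketch): test each off-diagonal entry against a null
    estimated from all entries (assumption: median/MAD normal fit)."""
    p = P.shape[0]
    iu = np.triu_indices(p, k=1)
    z = np.arctanh(P[iu])                      # Fisher z-transform
    center = np.median(z)                      # empirical null location
    scale = 1.4826 * np.median(np.abs(z - center))  # empirical null spread
    stat = np.abs(z - center) / scale
    # Two-sided p-values under the fitted normal null.
    pvals = np.array([math.erfc(s / math.sqrt(2.0)) for s in stat])
    keep = pvals < alpha
    A = np.zeros((p, p), dtype=bool)
    A[iu[0][keep], iu[1][keep]] = True
    return A | A.T


def reestimate(X, A):
    """Step 3 (sketch): unpenalized re-estimation on the selected support.
    Each variable is regressed on its selected neighbors, and coefficient
    pairs are combined as sign(b_ij) * sqrt(b_ij * b_ji)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    p = Z.shape[1]
    B = np.zeros((p, p))
    for i in range(p):
        nb = np.flatnonzero(A[i])
        if nb.size:
            B[i, nb] = np.linalg.lstsq(Z[:, nb], Z[:, i], rcond=None)[0]
    # Pairs with disagreeing signs are set to zero by the clip.
    P = np.sign(B) * np.sqrt(np.clip(B * B.T, 0.0, None))
    np.fill_diagonal(P, 1.0)
    return P


# Toy usage on simulated data with a chain-structured precision matrix.
rng = np.random.default_rng(0)
p, n = 10, 500
omega_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega_true), size=n)

P1 = ridge_partial_corr(X, lam=0.1)   # step 1: penalized estimate
A = select_edges(P1, alpha=0.01)      # step 2: selected non-zero entries
P3 = reestimate(X, A)                 # step 3: re-estimated coefficients
```

The ridge term keeps the matrix invertible even when the number of variables exceeds the sample size, which is why a penalized estimate comes first. The re-estimation step then removes the shrinkage bias at the entries that survive testing.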

Keywords: Co‐expression network; Empirical null distribution; Graphical model; Partial correlation matrix; Ridge regression.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cell Cycle Proteins / metabolism*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Fungal / physiology
  • Models, Statistical
  • Protein Interaction Mapping / methods*
  • Regression Analysis
  • Saccharomyces cerevisiae / cytology
  • Saccharomyces cerevisiae / metabolism*
  • Saccharomyces cerevisiae Proteins / metabolism*

Substances

  • Cell Cycle Proteins
  • Saccharomyces cerevisiae Proteins