Correlation between gene expression and clinical data through linear and nonlinear principal components analyses: muscular dystrophies as case studies

OMICS. 2009 Jun;13(3):173-84. doi: 10.1089/omi.2009.0003.

Abstract

The high dimensionality of microarray data and the complex dependence structure among genes make data analysis extremely challenging. In the last decade, several statistical techniques have been proposed to tackle genome-wide expression data; however, clinical and molecular data associated with pathologies have often been considered as separate dimensions of the same phenomenon, especially when clinical variables lie in a multidimensional space. A better understanding of the relationships between clinical and molecular data can be obtained if both data types are combined and integrated. In this work we adopt a multidimensional correlation strategy together with linear and nonlinear principal component analyses to integrate genetic and clinical information obtained from two sets of dystrophic patients. With this approach we decompose different aspects of clinical manifestations and correlate these features with the corresponding patterns of differential gene expression.
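As a rough illustration of the general idea, the sketch below runs a linear PCA on a matrix of clinical variables and then correlates each gene's expression profile with the resulting component scores. It is not the authors' pipeline: the abstract does not give implementation details, the nonlinear PCA step is not covered, and the function names (`clinical_pcs`, `gene_pc_correlation`), the matrix sizes, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def clinical_pcs(clinical, n_components=2):
    """Linear PCA (via SVD) on standardized clinical variables.

    clinical: array of shape (patients, variables).
    Returns the component scores and the fraction of variance explained.
    """
    z = (clinical - clinical.mean(axis=0)) / clinical.std(axis=0, ddof=1)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def gene_pc_correlation(expression, scores):
    """Pearson correlation of every gene (column) with every PC score."""
    e = expression - expression.mean(axis=0)
    p = scores - scores.mean(axis=0)
    e = e / np.linalg.norm(e, axis=0)
    p = p / np.linalg.norm(p, axis=0)
    return e.T @ p  # shape: (genes, components)


# Synthetic data only (hypothetical sizes): 20 patients, 4 clinical
# variables, 50 genes. No real patient data is used here.
rng = np.random.default_rng(0)
clinical = rng.normal(size=(20, 4))
expression = rng.normal(size=(20, 50))

scores, explained = clinical_pcs(clinical, n_components=2)

# Make gene 0 follow the first clinical component, so that the
# correlation step has something to detect.
expression[:, 0] = scores[:, 0] + 0.1 * rng.normal(size=20)

corr = gene_pc_correlation(expression, scores)
top_gene = int(np.argmax(np.abs(corr[:, 0])))
print("gene most correlated with clinical PC1:", top_gene)
```

Genes whose correlation with a given clinical component is large in absolute value would be the candidates associated with that aspect of the clinical picture; in a real analysis the correlations would also need a significance assessment that accounts for the number of genes tested.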

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods
  • Computer Simulation
  • Databases, Genetic
  • Gene Expression Profiling / methods
  • Gene Expression*
  • Humans
  • Microarray Analysis
  • Models, Genetic*
  • Muscular Dystrophies / genetics*
  • Muscular Dystrophies / physiopathology
  • Principal Component Analysis*
  • Statistics as Topic