Correcting transcription factor gene sets for copy number and promoter methylation variations

Drug Dev Res. 2014 Sep;75(6):343-7. doi: 10.1002/ddr.21220.

Abstract

Gene set analysis provides a method to generate statistical inferences across sets of linked genes, primarily using high-throughput expression data. Common gene sets include biological pathways, operons, and targets of transcriptional regulators. In higher eukaryotes, especially when dealing with diseases with strong genetic and epigenetic components such as cancer, copy number loss and gene silencing through promoter methylation can eliminate the possibility that a gene is transcribed. This, in turn, can adversely affect the estimation of transcription factor or pathway activity from a set of target genes, as some of the targets may not be responsive to transcriptional regulation. Here we introduce a simple filtering approach that removes genes from consideration if they show copy number loss or promoter methylation, and demonstrate the improvement in inference of transcription factor activity in a simulated dataset based on the background expression observed in normal head and neck tissue.

Keywords: copy number variations; gene set analysis; promoter methylation; simulated dataset; transcription factor gene sets.
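
The filtering approach described in the abstract can be illustrated with a minimal sketch. The abstract does not give thresholds, data formats, or the scoring statistic, so everything below is an assumption made for illustration: the cutoff values, the per-sample dictionaries keyed by gene symbol, the function names, and the use of mean target expression as a stand-in for an activity estimate. Genes with no copy number or methylation measurement are kept, on the assumption that there is no evidence they are unresponsive.

```python
# Hypothetical cutoffs; these values are NOT taken from the paper.
CN_LOSS_CUTOFF = -0.3      # assumed log2 copy-number ratio at or below which a gene counts as lost
METHYLATION_CUTOFF = 0.7   # assumed promoter beta value at or above which a gene counts as methylated


def filter_gene_set(gene_set, copy_number, methylation,
                    cn_cutoff=CN_LOSS_CUTOFF, meth_cutoff=METHYLATION_CUTOFF):
    """Return the genes in gene_set that show neither copy number loss
    nor promoter methylation in one sample.

    copy_number and methylation are dicts mapping gene symbol to a value
    for that sample. Genes absent from a dict are treated as unaffected.
    """
    kept = []
    for gene in gene_set:
        cn = copy_number.get(gene)
        beta = methylation.get(gene)
        has_loss = cn is not None and cn <= cn_cutoff
        is_methylated = beta is not None and beta >= meth_cutoff
        if not (has_loss or is_methylated):
            kept.append(gene)
    return kept


def mean_target_expression(genes, expression):
    """Average expression over the given genes (an assumed, simplified
    stand-in for a transcription factor activity score)."""
    values = [expression[g] for g in genes if g in expression]
    return sum(values) / len(values) if values else float("nan")


# Toy data for a single sample; gene names and values are invented.
targets = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
copy_number = {"GENE_A": 0.05, "GENE_B": -1.2, "GENE_C": 0.1, "GENE_D": 0.0}
methylation = {"GENE_A": 0.1, "GENE_B": 0.2, "GENE_C": 0.9}
expression = {"GENE_A": 2.0, "GENE_B": 0.0, "GENE_C": 0.0, "GENE_D": 4.0}

filtered_targets = filter_gene_set(targets, copy_number, methylation)
unfiltered_score = mean_target_expression(targets, expression)
filtered_score = mean_target_expression(filtered_targets, expression)
```

In this toy sample, GENE_B (copy number loss) and GENE_C (promoter methylation) are dropped before scoring, so their silenced expression no longer pulls down the estimate computed from the remaining targets.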

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • DNA Methylation
  • Epigenesis, Genetic
  • Gene Dosage*
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Neoplasms / genetics*
  • Promoter Regions, Genetic*
  • Software
  • Transcription Factors / genetics*

Substances

  • Transcription Factors