Transactional database transformation and its application in prioritizing human disease genes

IEEE/ACM Trans Comput Biol Bioinform. 2012 Jan-Feb;9(1):294-304. doi: 10.1109/TCBB.2011.58. Epub 2011 Mar 16.

Abstract

Binary (0,1) matrices, commonly known as transactional databases, can represent many kinds of application data, including gene-phenotype data where “1” represents a confirmed gene-phenotype relation and “0” represents an unknown relation. It is natural to ask what information is hidden behind these “0”s and “1”s. Unfortunately, recent matrix completion methods, though very effective in many cases, are unlikely to infer anything interesting from these (0,1)-matrices. To address this challenge, we propose INDEVI, a very succinct and effective algorithm that performs independent-evidence-based transactional database transformation. Each entry of a (0,1)-matrix is evaluated by the “independent evidence” (maximal supporting patterns) extracted for that entry from the whole matrix. The value of an entry, whether 0 or 1, has no effect at all on its independent evidence. An experiment on a gene-phenotype database shows that our method is highly promising for ranking candidate genes and predicting unknown disease genes.
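The abstract does not spell out how INDEVI extracts its patterns, so the following is only a minimal sketch of the independence property it describes, not the published algorithm. It assumes a simplified notion of a "supporting pattern" for entry (i, j): the set of other columns that row i shares with some other row that has a 1 in column j. The function names and the scoring rule (sum of the sizes of the maximal patterns) are hypothetical choices made for illustration.

```python
def supporting_patterns(matrix, i, j):
    """Return maximal column sets shared by row i and another row with a 1 in column j.

    Assumption: this is a simplified stand-in for "maximal supporting patterns";
    it is not the pattern definition used by INDEVI.
    """
    n_cols = len(matrix[i])
    # Columns where row i has a 1, deliberately skipping column j so that
    # matrix[i][j] is never read.
    row_i_cols = {c for c in range(n_cols) if c != j and matrix[i][c] == 1}

    candidates = set()
    for k, row in enumerate(matrix):
        if k == i or row[j] != 1:
            continue  # only other rows that have a 1 in column j give evidence
        shared = frozenset(c for c in row_i_cols if row[c] == 1)
        if shared:
            candidates.add(shared)

    # Keep only patterns that are not strict subsets of another pattern.
    return [p for p in candidates if not any(p < q for q in candidates)]


def evidence_score(matrix, i, j):
    """Hypothetical score: total size of the maximal supporting patterns."""
    return sum(len(p) for p in supporting_patterns(matrix, i, j))
```

With rows as genes and columns as phenotypes, `evidence_score(matrix, i, j)` can be computed for every entry to obtain a real-valued matrix in which the unknown "0" entries of a column can be ranked. Because the entry itself is never read, flipping `matrix[i][j]` between 0 and 1 leaves its score unchanged.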

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic*
  • Disease / genetics*
  • Humans
  • Models, Genetic
  • Phenotype