Unsupervised learning from complex data: the matrix incision tree algorithm

Pac Symp Biocomput. 2001:30-41. doi: 10.1142/9789814447362_0004.

Abstract

Analysis of large-scale gene expression data requires novel methods for knowledge discovery and predictive model building as well as clustering. Organizing data into meaningful structures is one of the most fundamental modes of learning. A DNA microarray data set can be viewed as a set of mutually associated genes in a high-dimensional space. This paper describes a novel method to organize a complex high-dimensional space into successive lower-dimensional spaces based on the geometric properties of the data structure in the absence of a priori knowledge. The matrix incision tree algorithm reveals the hierarchical structural organization of observed data by determining the successive hyperplanes that 'optimally' separate the data hyperspace. The algorithm was tested on published data sets, yielding promising results.
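
The abstract does not state how each separating hyperplane is chosen, so the following is only a rough illustration of the general idea of building a tree by successive hyperplane splits, not the matrix incision tree algorithm itself. As a stand-in criterion it uses the leading principal direction of the current subset as the hyperplane normal and cuts at the subset mean (a technique usually called principal direction divisive partitioning). The names `principal_direction` and `split_tree`, and the parameters `min_size` and `max_depth`, are hypothetical and chosen for this sketch.

```python
import math
import random


def principal_direction(rows, iters=200, seed=0):
    """Approximate the leading principal axis of already-centered rows
    by power iteration on X^T X (assumed stand-in for an 'optimal' cut)."""
    d = len(rows[0])
    rng = random.Random(seed)
    # Random start so the vector is unlikely to be orthogonal to the axis.
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in rows]
        w = [sum(proj[i] * rows[i][j] for i in range(len(rows)))
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate subset: no variance along v
        v = [x / norm for x in w]
    return v


def split_tree(points, indices=None, min_size=2, depth=0, max_depth=3):
    """Recursively bisect points with hyperplanes; returns a nested dict.

    Leaves hold the indices of the points that fall in that region."""
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) < 2 * min_size or depth >= max_depth:
        return {"leaf": indices}
    d = len(points[0])
    mean = [sum(points[i][j] for i in indices) / len(indices)
            for j in range(d)]
    centered = [[points[i][j] - mean[j] for j in range(d)] for i in indices]
    normal = principal_direction(centered)
    # Signed distance of each point from the hyperplane through the mean.
    scores = [sum(c[j] * normal[j] for j in range(d)) for c in centered]
    left = [i for i, s in zip(indices, scores) if s <= 0.0]
    right = [i for i, s in zip(indices, scores) if s > 0.0]
    if not left or not right:
        return {"leaf": indices}
    return {
        "normal": normal,
        "point": mean,
        "left": split_tree(points, left, min_size, depth + 1, max_depth),
        "right": split_tree(points, right, min_size, depth + 1, max_depth),
    }


if __name__ == "__main__":
    # Toy data, invented for illustration: two separated groups in 3-D.
    pts = [[0, 0, 0], [0.1, 0, 0.1], [0, 0.1, 0],
           [5, 5, 5], [5.1, 5, 5], [5, 5.1, 5.1]]
    print(split_tree(pts, max_depth=1))
```

Each internal node records the hyperplane as a normal vector and a point on the plane, so the nested dictionary can be read as a hierarchy of successively smaller regions of the original space. Which side is labelled `left` or `right` depends on the sign of the computed normal and carries no meaning.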

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Cluster Analysis
  • Data Interpretation, Statistical
  • Gene Expression Profiling / statistics & numerical data
  • Humans
  • Leukemia / genetics
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*