Motivation: Unsupervised clustering of microarray data may detect potentially important, but not obvious characteristics of samples, for instance subgroups of diagnoses with distinct gene profiles or systematic errors in experimentation.
Results: Multidimensional clustering (mdclust) is a method, which identifies sets of sample clusters and associated genes. It applies iteratively two-means clustering and score-based gene selection. For any phenotype variable best matching sets of clusters can be selected. This provides a method to identify gene-phenotype associations, suited even for settings with a large number of phenotype variables. An optional model based discriminant step may reduce further the number of selected genes.