Discovering combinatorial interactions in survival data

David A Duverle; Ichiro Takeuchi; Yuko Murakami-Tonami; Kenji Kadomatsu; Koji Tsuda

doi:10.1093/bioinformatics/btt532

Bioinformatics. 2013 Dec 1;29(23):3053-9. doi: 10.1093/bioinformatics/btt532. Epub 2013 Sep 13.

Authors

David A Duverle¹, Ichiro Takeuchi, Yuko Murakami-Tonami, Kenji Kadomatsu, Koji Tsuda

Affiliation

¹ Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan; Department of Computer Science, Nagoya Institute of Technology, Nagoya, Japan; Division of Molecular Oncology, Aichi Cancer Center, Nagoya, Japan; and Department of Molecular Biology, Nagoya University Graduate School of Medicine, Nagoya, Japan.

Abstract

Motivation: Although several methods exist to relate high-dimensional gene expression data to various clinical phenotypes, finding combinations of features in such input remains a challenge, particularly when fitting complex statistical models such as those used for survival studies.

Results: Our proposed method builds on existing 'regularization path-following' techniques to produce regression models that can extract, from large-scale data, arbitrarily complex patterns of input features (such as gene combinations) that relate to a known clinical outcome. By exploiting the structure of the data and using itemset mining techniques, we avoid the combinatorial complexity issues typically encountered with such methods, and our algorithm runs in time of the same order of magnitude as single-variable versions. Applied to data from several clinical studies of cancer patient survival time, our method produced a number of promising gene-interaction candidates whose tumour-related roles appear to be confirmed by the literature.
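
To make the idea of searching gene combinations against a survival outcome more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes binarized expression data (1 = gene "on"), enumerates conjunctions of genes with a simple support threshold (the anti-monotone pruning used in itemset mining), and scores each conjunction by the gradient of the Cox partial log-likelihood at zero coefficients, which is the quantity an L1 path-following method would use to pick the first feature to enter the model. The function names, the toy data, and the `max_size` and `min_support` parameters are all hypothetical.

```python
def cox_score(feature, times, events):
    """Gradient of the Cox partial log-likelihood at beta = 0 for one feature.

    For every observed event, add the feature value of the subject who had
    the event minus the mean feature value over the risk set (subjects whose
    time is >= the event time).
    """
    score = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [feature[j] for j in range(n) if times[j] >= times[i]]
        score += feature[i] - sum(risk) / len(risk)
    return score


def conjunction(X, items):
    """Binary column that is 1 only where all genes in `items` are on."""
    return [int(all(row[g] for g in items)) for row in X]


def best_combination(X, times, events, max_size=2, min_support=2):
    """Level-wise search over gene sets, pruning those with low support."""
    n_genes = len(X[0])
    best = (0.0, ())
    frequent = [()]
    for _ in range(max_size):
        next_frequent = []
        for base in frequent:
            start = base[-1] + 1 if base else 0
            for g in range(start, n_genes):
                items = base + (g,)
                column = conjunction(X, items)
                # Anti-monotone pruning: a superset can never have more
                # support, so an infrequent set is not extended further.
                if sum(column) < min_support:
                    continue
                next_frequent.append(items)
                s = cox_score(column, times, events)
                if abs(s) > abs(best[0]):
                    best = (s, items)
        frequent = next_frequent
    return best


# Toy data (hypothetical): 6 patients, 3 binarized genes.
X = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
]
times = [1, 2, 3, 4, 5, 6]   # survival times
events = [1, 1, 1, 1, 1, 1]  # 1 = death observed, 0 = censored

best = best_combination(X, times, events)
print(best)
```

On this toy data the search selects the pair `(0, 1)`: the two patients with the shortest survival are the only ones with both genes on, so the combination scores higher than either gene alone. A practical method would add a bound on the score of all supersets to prune the search far more aggressively than a support threshold alone.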

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Breast Neoplasms / genetics
Breast Neoplasms / mortality*
Computational Biology / methods*
Female
Gene Expression Profiling
Gene Regulatory Networks*
Humans
Likelihood Functions
Logistic Models
Models, Biological
Neoplasm Proteins / genetics*
Neuroblastoma / genetics
Neuroblastoma / mortality*
Proportional Hazards Models
Risk Factors
Survival Rate

Substances

Neoplasm Proteins