A New Algorithm to Optimize Maximal Information Coefficient

PLoS One. 2016 Jun 22;11(6):e0157567. doi: 10.1371/journal.pone.0157567. eCollection 2016.

Abstract

The maximal information coefficient (MIC) captures dependences between paired variables, including both functional and non-functional relationships. In this paper, we develop a new method, ChiMIC, to calculate MIC values. The ChiMIC algorithm uses the chi-square test to terminate grid optimization, which removes the maximal grid size restriction of the original ApproxMaxMI algorithm. Computational experiments show that the ChiMIC algorithm can maintain the same MIC values for noiseless functional relationships but gives much smaller MIC values for independent variables. For noisy functional relationships, the ChiMIC algorithm reaches the optimal partition much faster. Furthermore, the MCN values based on MIC calculated by ChiMIC better capture the complexity of functional relationships, and the statistical power of MIC calculated by ChiMIC is higher than that of MIC calculated by ApproxMaxMI. Moreover, the computational cost of ChiMIC is much lower than that of ApproxMaxMI. We apply the MIC values to feature selection and obtain better classification accuracy using features selected by the MIC values from ChiMIC.
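The chi-square stopping idea can be illustrated with a minimal sketch. The abstract does not specify how the test is set up, so everything here is an assumption for illustration: the r x 2 contingency table (counts of points per y-bin on each side of a candidate cut), the significance level alpha = 0.05, and the hypothetical names chi2_sf and split_is_significant. It is not the authors' implementation. A grid search built on it would keep adding a cut only while the test rejects independence, rather than stopping at a fixed maximal grid size.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total


def split_is_significant(left_counts, right_counts, alpha=0.05):
    """Chi-square independence test on the r x 2 table made by a candidate cut.

    left_counts[i] / right_counts[i] are the numbers of points in y-bin i that
    fall left / right of the cut (an assumed layout, not taken from the paper).
    """
    # Drop empty y-bins: they carry no information and would give 0 expected.
    rows = [(l, r) for l, r in zip(left_counts, right_counts) if l + r > 0]
    n_left = sum(l for l, _ in rows)
    n_right = sum(r for _, r in rows)
    n = n_left + n_right
    if len(rows) < 2 or n_left == 0 or n_right == 0:
        return False  # degenerate table, nothing to test
    stat = 0.0
    for l, r in rows:
        row_total = l + r
        for observed, col_total in ((l, n_left), (r, n_right)):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = len(rows) - 1  # (r - 1) * (2 - 1)
    return chi2_sf(stat, df) < alpha


if __name__ == "__main__":
    # A cut that separates the y-bins cleanly versus one that changes nothing.
    print(split_is_significant([30, 0], [0, 30]))
    print(split_is_significant([15, 15], [15, 15]))
```

Under these assumptions, a cut that leaves the y-distribution unchanged on both sides is rejected, which is consistent with the abstract's report of much smaller MIC values for independent variables.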

MeSH terms

  • Algorithms*
  • Databases as Topic
  • Humans
  • Information Theory*
  • Models, Theoretical*
  • Neoplasms / classification
  • Statistics as Topic
  • Time Factors

Grants and funding

This work was supported by the Doctoral Foundation of Ministry of Education of China (No. 20124320110002, ZY), http://www.cutech.edu.cn/cn/index.htm; Science and Technology Planning Projects of Changsha, China (No. K1406018-21, ZY), http://www.cssti.cn/html/cskjw/index.html; and National Natural Science Foundation of China (No. 31471789, FL), http://www.nsfc.gov.cn/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.