Tuning model parameters in class-imbalanced learning with precision-recall curve

Biom J. 2019 May;61(3):652-664. doi: 10.1002/bimj.201800148. Epub 2018 Dec 12.

Abstract

A central question in class-imbalanced learning is which assessment metric to use. The precision-recall curve (PRC) is rarely used in practice compared with its alternative, the receiver operating characteristic (ROC) curve. This study investigates the performance of the PRC as an evaluation criterion for class-imbalanced data and focuses on comparing the PRC with the ROC. The advantages of the PRC over the ROC in assessing class-imbalanced data are also investigated and tested with our proposed algorithm, which tunes all of the model parameters, in simulation studies and real data examples. The results show that the PRC is competitive with the ROC as a performance measure for tuning model parameters on class-imbalanced data. The PRC can be considered an effective alternative assessment for preprocessing skewed data (such as variable selection) and for building a classifier in class-imbalanced learning.
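To make the PRC-versus-ROC comparison concrete, the following is a minimal sketch of choosing between two parameter settings by area under each curve. It is not the algorithm proposed in the paper. The labels, the scores, and the names `setting_a` and `setting_b` are hypothetical toy values chosen only for illustration, and the area under the PRC is approximated here by average precision, which is an assumption of this sketch.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive gets the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve: the mean of the
    precision values at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall level
    return total / n_pos


# Hypothetical imbalanced toy data: 2 positives followed by 10 negatives.
labels = [1, 1] + [0] * 10

# Hypothetical classifier scores produced under two parameter settings.
candidates = {
    # One positive ranked first, the other ranked seventh.
    "setting_a": [0.95, 0.45,
                  0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05],
    # Both positives ranked just below the two top-scoring negatives.
    "setting_b": [0.75, 0.65,
                  0.90, 0.80, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02],
}

# Pick the setting that maximizes each criterion.
best_by_roc = max(candidates, key=lambda k: roc_auc(labels, candidates[k]))
best_by_prc = max(candidates, key=lambda k: average_precision(labels, candidates[k]))

for name, scores in candidates.items():
    print(name,
          "ROC AUC = %.3f" % roc_auc(labels, scores),
          "AP = %.3f" % average_precision(labels, scores))
print("selected by ROC:", best_by_roc)
print("selected by PRC:", best_by_prc)
```

On this toy data the two criteria select different settings: the ROC-based score rewards ranking both positives above most negatives, while the PRC-based score rewards placing a positive at the very top of the ranking. This shows only that the criteria can disagree, not which one is preferable in general.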

Keywords: class imbalance; measurement; parameter tuning; precision-recall curve; receiver operating characteristic.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biometry / methods*
  • Brain Injuries, Traumatic / diagnosis
  • Brain Injuries, Traumatic / metabolism
  • Colonic Neoplasms / diagnosis
  • Colonic Neoplasms / genetics
  • Colonic Neoplasms / physiopathology
  • Humans
  • Machine Learning*
  • Models, Statistical*
  • ROC Curve
  • Support Vector Machine