A modified k-TSP algorithm and its application in LC-MS-based metabolomics study of hepatocellular carcinoma and chronic liver diseases

J Chromatogr B Analyt Technol Biomed Life Sci. 2014 Sep 1:966:100-8. doi: 10.1016/j.jchromb.2014.05.044. Epub 2014 Jun 2.

Abstract

In systems biology, the ability to discern meaningful information that reflects the nature of related problems from large amounts of data has become a key issue. The classification method using top scoring pairs (TSP), which measures the features of a data set in pairs and selects the top-ranked feature pairs to construct the classifier, has been a powerful tool in genomics data analysis because of its simplicity and interpretability. This study examined the relationship between two features, modified the ranking criteria of the k-TSP method to measure the discriminative ability of each feature pair more accurately, and provided a correspondingly improved classification procedure. Tests on eight public data sets showed the validity of the modified method. The modified k-TSP method was then applied to our serum metabolomics data derived from liquid chromatography-mass spectrometry analysis of hepatocellular carcinoma (HCC) and chronic liver diseases. Based on the 27 selected feature pairs, HCC and chronic liver diseases were accurately distinguished using principal component analysis, and the feature pairs revealed certain profound metabolic disturbances related to liver disease development.
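The pairwise ranking idea behind TSP can be illustrated with a minimal sketch. It shows the commonly described baseline k-TSP scheme, not the modified ranking criteria of this study, which the abstract does not specify. Each feature pair (i, j) is scored by the absolute difference between the two classes in the frequency of the ordering x_i < x_j; the top k disjoint pairs are kept, and each pair casts one vote. The function names, the toy data, and the omission of the secondary tie-breaking score are assumptions made for illustration only.

```python
from itertools import combinations


def tsp_scores(samples, labels):
    """Score each feature pair by |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    n_features = len(samples[0])
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    scores = {}
    for i, j in combinations(range(n_features), 2):
        p0 = sum(s[i] < s[j] for s in class0) / len(class0)
        p1 = sum(s[i] < s[j] for s in class1) / len(class1)
        scores[(i, j)] = (abs(p0 - p1), p0, p1)
    return scores


def top_k_pairs(scores, k):
    """Keep the k highest-scoring pairs that share no feature.

    Ties are broken by enumeration order here (an assumption of this sketch).
    """
    chosen, used = [], set()
    ranked = sorted(scores.items(), key=lambda item: -item[1][0])
    for (i, j), (_, p0, p1) in ranked:
        if i in used or j in used:
            continue
        # Record whether "x_i < x_j" is the ordering typical of class 0.
        chosen.append((i, j, p0 > p1))
        used.update((i, j))
        if len(chosen) == k:
            break
    return chosen


def predict(sample, pairs):
    """Majority vote over the selected pairs (use an odd k to avoid ties)."""
    votes_for_1 = 0
    for i, j, less_means_class0 in pairs:
        observed_less = sample[i] < sample[j]
        if observed_less != less_means_class0:
            votes_for_1 += 1
    return 1 if 2 * votes_for_1 > len(pairs) else 0


# Hypothetical toy data: 6 samples, 4 features, two classes.
samples = [
    [1.0, 2.0, 5.0, 4.0],
    [1.5, 3.0, 6.0, 2.0],
    [0.5, 2.5, 4.0, 3.0],
    [3.0, 1.0, 2.0, 5.0],
    [4.0, 2.0, 1.0, 6.0],
    [5.0, 1.5, 3.0, 4.0],
]
labels = [0, 0, 0, 1, 1, 1]

scores = tsp_scores(samples, labels)
pairs = top_k_pairs(scores, k=1)
```

Because the classifier depends only on the relative ordering within each pair, it is invariant to any monotone rescaling applied to a whole sample, which is one reason the approach is considered simple and interpretable.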

Keywords: Feature selection; LC–MS; Liver diseases; Metabolomics; TSP; Top scoring pairs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Carcinoma, Hepatocellular / blood*
  • Carcinoma, Hepatocellular / metabolism
  • Chromatography, Liquid / methods*
  • Humans
  • Liver Diseases / blood*
  • Liver Diseases / metabolism
  • Liver Neoplasms / blood*
  • Liver Neoplasms / metabolism
  • Mass Spectrometry / methods*
  • Metabolome*
  • Metabolomics / methods*