Detect influential points of feature rankings

Comput Biol Chem. 2025 Jan 4:115:108339. doi: 10.1016/j.compbiolchem.2024.108339. Online ahead of print.

Abstract

Background: Feature rankings are crucial in bioinformatics but can be distorted by influential points (IPs), which are often overlooked. This study investigates the impact of IPs on feature rankings and proposes an IP detection method.

Methods: We use a leave-one-out approach to assess each case's influence on feature rankings by comparing the rankings obtained before and after its removal. Rank changes are measured with a novel rank comparison method that uses adaptive, top-prioritized weights adjusted to the distribution of rank changes. Our IP detection method was evaluated on several public datasets.
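The leave-one-out idea described above can be illustrated with a minimal sketch. It is not the findIPs implementation: the abstract does not give the weight formula, so the exponentially decaying weights, the `decay` parameter, the t-statistic ranking, and the function names `rank_features` and `loo_influence` are all assumptions made for illustration, and the data are simulated.

```python
import numpy as np


def rank_features(X, y):
    """Rank features by absolute two-sample t statistic (rank 1 = strongest)."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = np.abs(a.mean(axis=0) - b.mean(axis=0)) / se
    order = np.argsort(-t)
    ranks = np.empty(len(t), dtype=int)
    ranks[order] = np.arange(1, len(t) + 1)
    return ranks


def loo_influence(X, y, decay=0.5):
    """Weighted total rank change caused by removing each case in turn.

    The weights are a hypothetical stand-in for the adaptive weights in the
    abstract: a feature counts more when it is near the top of either the
    original or the leave-one-out ranking.
    """
    base = rank_features(X, y)
    scores = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        loo = rank_features(X[keep], y[keep])
        weights = np.exp(-decay * (np.minimum(base, loo) - 1))
        scores[i] = np.sum(weights * np.abs(loo - base))
    return scores


# Simulated data: 30 cases per group, 30 features, 5 of them truly different.
rng = np.random.default_rng(0)
n_per_group, n_features = 30, 30
y = np.repeat([0, 1], n_per_group)
X = rng.normal(size=(2 * n_per_group, n_features))
X[y == 1, :5] += np.array([5.0, 4.0, 3.0, 2.0, 1.0])

# Plant one extreme value in case 0 that masks the strongest feature.
X[0, 0] = 100.0

scores = loo_influence(X, y)
print("most influential case:", int(np.argmax(scores)))
```

In this simulation the planted value inflates the variance of the strongest feature, so that feature drops far down the original ranking and returns to the top only when case 0 is left out. Case 0 should therefore receive by far the largest score, while removing any ordinary case mostly reshuffles low-ranked noise features that carry little weight.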

Results: Our method identified potential IPs in several TCGA gene expression datasets, revealing that IPs can severely distort feature rankings. These rank changes can in turn affect downstream analyses such as pathway enrichment, which suggests that IP detection is necessary when deriving feature rankings.

Conclusions: IPs significantly impact feature rankings and subsequent analyses; routine IP detection is necessary yet underutilized. Our method is available in the R package findIPs.

Keywords: Adaptive weights; TCGA; Feature rankings; Influential points; Rank comparison.