Bayesian optimization driven strategy for detecting credit card fraud with Extremely Randomized Trees

MethodsX. 2024 Nov 16:13:103055. doi: 10.1016/j.mex.2024.103055. eCollection 2024 Dec.

Abstract

Credit card usage has surged, heightening concerns about fraud. To address this, credit card fraud detection (CCFD) systems employ machine learning algorithms to analyze transaction behavior. However, the complexity and imbalance of credit card data can cause conventional models to overfit. We propose TP-ERT, an Extremely Randomized Trees model whose hyperparameters are tuned by Bayesian optimization with the Tree-structured Parzen Estimator (TPE), to detect fraudulent transactions. TP-ERT uses greater randomness in split points and feature selection to capture diverse transaction patterns, improving model generalization. The model is evaluated on real-world credit card transaction data. Experimental results show that TP-ERT outperforms the other CCFD systems compared. Furthermore, our validation shows that TPE achieves a higher F1 score than the other optimization techniques tested.

• The optimized Extremely Randomized Trees model is a viable artificial intelligence tool for detecting credit card fraud.
• Model hyperparameters are tuned with the Tree-structured Parzen Estimator, a Bayesian optimization strategy, to efficiently explore the hyperparameter space and identify the best combination of hyperparameters. This enables the model to capture intricate patterns in the transactions, resulting in enhanced model performance.
• The empirical findings show that the proposed approach outperforms the other machine learning models on a real-world credit card transaction dataset.
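
The randomness in split points and feature selection mentioned in the abstract can be illustrated with a minimal sketch of how an Extremely Randomized Trees node might choose its split. This is not the authors' implementation: the function names (`gini`, `extra_trees_split`), the binary 0/1 labels, and the use of Gini impurity as the score are assumptions made for illustration. The idea shown is that each candidate feature gets one cut point drawn uniformly at random between its minimum and maximum, instead of a search for the optimal threshold.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def extra_trees_split(X, y, k, rng):
    """Choose a node split from k random features with one random cut each.

    Returns (impurity_decrease, feature_index, threshold), or None if every
    sampled feature is constant. Hypothetical helper for illustration only.
    """
    n_features = len(X[0])
    parent_impurity = gini(y)
    best = None
    for f in rng.sample(range(n_features), k):  # random feature subset
        values = [row[f] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # a constant feature cannot separate the samples
        threshold = rng.uniform(lo, hi)  # random cut point, not an optimized one
        left = [label for row, label in zip(X, y) if row[f] < threshold]
        right = [label for row, label in zip(X, y) if row[f] >= threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        decrease = parent_impurity - weighted
        if best is None or decrease > best[0]:
            best = (decrease, f, threshold)
    return best


# Toy data (made up): feature 0 separates the two classes, feature 1 does not.
X = [[0.1, 5.0], [0.2, 3.0], [0.8, 4.0], [0.9, 6.0]]
y = [0, 0, 1, 1]
print(extra_trees_split(X, y, k=2, rng=random.Random(0)))
```

Because each threshold is drawn at random, individual trees are weaker but less correlated with one another, which is the usual argument for why averaging many such trees can generalize better.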

Keywords: Credit card fraud detection; Extremely Randomized Trees; Machine learning; Optimization; TP-ERT: TPE-optimized Extremely Randomized Trees; Tree-structured Parzen Estimator.
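
As a rough illustration of the Tree-structured Parzen Estimator idea named in the abstract and keywords, the sketch below tunes a single continuous hyperparameter. It is a heavily simplified, assumption-laden version and not the authors' method: the function names (`kde`, `tpe_minimize`), the fixed kernel bandwidth, the `gamma` quantile, and the toy objective are all hypothetical. Practical TPE implementations use adaptive bandwidths and priors, and they handle mixed and conditional search spaces. The loop splits past trials into a "good" and a "bad" group, models each group with a Parzen (kernel) density, and evaluates next the candidate that looks most typical of the good group relative to the bad one.

```python
import math
import random


def kde(x, points, bandwidth):
    """Parzen (Gaussian kernel) density estimate at x."""
    if not points:
        return 0.0
    total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
    return total / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))


def tpe_minimize(objective, low, high, n_init=10, n_iter=30, gamma=0.25,
                 n_candidates=24, seed=0):
    """Minimize objective(x) over [low, high]; returns (best_x, best_loss)."""
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth: a simplifying assumption
    # Random start-up trials, since no density model exists yet.
    trials = []
    for _ in range(n_init):
        x = rng.uniform(low, high)
        trials.append((x, objective(x)))
    for _ in range(n_iter):
        trials.sort(key=lambda t: t[1])
        n_good = max(1, int(gamma * len(trials)))
        good = [x for x, _loss in trials[:n_good]]  # best gamma fraction
        bad = [x for x, _loss in trials[n_good:]]   # everything else
        # Draw candidates from l(x), the density fitted to the good trials.
        candidates = []
        for _c in range(n_candidates):
            c = rng.gauss(rng.choice(good), bandwidth)
            candidates.append(min(high, max(low, c)))  # clip to the bounds
        # Evaluate the candidate with the largest ratio l(x) / g(x).
        chosen = max(
            candidates,
            key=lambda c: kde(c, good, bandwidth) / (kde(c, bad, bandwidth) + 1e-12),
        )
        trials.append((chosen, objective(chosen)))
    return min(trials, key=lambda t: t[1])


# Toy objective standing in for a validation loss such as 1 - F1 (hypothetical).
best_x, best_loss = tpe_minimize(lambda x: (x - 0.3) ** 2, 0.0, 1.0)
print(best_x, best_loss)
```

In a setting like the one the abstract describes, the objective would instead train an Extremely Randomized Trees model with the proposed hyperparameters and return a validation loss, and the search space would cover several hyperparameters at once.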