Analysis of the 10-day ultra-marathon using a predictive XG boost model

BMC Res Notes. 2024 Dec 19;17(1):372. doi: 10.1186/s13104-024-07028-8.

Abstract

Objective: Ultra-marathon running races are held as distance-limited or time-limited events, the latter ranging from 6 h to 10 days. Only a few runners compete in 10-day events, and little is known so far about the athletes' origins, performance, and event characteristics. The aim of the present study was to investigate the origin and performance of these runners and the fastest race locations. A machine learning model based on the XGBoost algorithm was built to predict running speed from the athlete's age, gender, country of origin, the country where the race takes place, the type of race, and the kind of running surface. Model explainability tools were then used to investigate how each independent variable influenced the predicted running speed.
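
The modelling idea described above (encode categorical predictors, fit a boosted tree ensemble to running speed, then ask which predictors the ensemble relied on) can be illustrated with a minimal sketch. This is not the authors' pipeline and it does not use the XGBoost library: it swaps in plain gradient boosting with one-split trees (stumps) written in standard-library Python, and ranks predictors by accumulated squared-error reduction, which is similar in spirit to a gain-based importance. The data are synthetic, and all names (`make_toy_data`, `fit_boosted_stumps`, the category labels, the effect sizes) are hypothetical and chosen only for illustration. Only four of the study's six predictors are mimicked.

```python
import random

def make_toy_data(n=400, seed=0):
    """Synthetic records (NOT the study's data): categorical predictors -> speed in km/h."""
    rng = random.Random(seed)
    rows, speeds = [], []
    for _ in range(n):
        row = {
            "age_group": rng.choice(["30-39", "40-49", "50-59"]),
            "gender": rng.choice(["F", "M"]),
            "origin": rng.choice(["A", "B", "C"]),
            "surface": rng.choice(["asphalt", "dirt", "track"]),
        }
        # Assumed effects: only surface and origin influence speed in this toy setup.
        speed = 4.0
        speed += {"asphalt": 0.5, "dirt": -0.8, "track": 0.0}[row["surface"]]
        speed += {"A": 0.3, "B": 0.0, "C": -0.2}[row["origin"]]
        speed += rng.gauss(0, 0.1)
        rows.append(row)
        speeds.append(speed)
    return rows, speeds

def one_hot(rows):
    """Expand each categorical field into 0/1 columns named 'field=value'."""
    columns = sorted({f"{k}={v}" for r in rows for k, v in r.items()})
    pairs = [c.split("=", 1) for c in columns]
    X = [[1.0 if r[field] == value else 0.0 for field, value in pairs] for r in rows]
    return columns, X

def mean(values):
    return sum(values) / len(values)

def fit_boosted_stumps(X, y, n_rounds=60, learning_rate=0.3):
    """Gradient boosting for squared error, using one-split trees on 0/1 columns."""
    base = mean(y)
    pred = [base] * len(y)
    stumps, gain = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        sse_before = sum(r * r for r in resid)
        best = None
        for j in range(len(X[0])):
            left = [r for row, r in zip(X, resid) if row[j] == 0.0]
            right = [r for row, r in zip(X, resid) if row[j] == 1.0]
            if not left or not right:
                continue  # column is constant, so it cannot split the data
            ml, mr = mean(left), mean(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, ml, mr)
        if best is None:
            break
        sse, j, ml, mr = best
        gain[j] += sse_before - sse  # error reduction credited to this column
        step_l, step_r = learning_rate * ml, learning_rate * mr
        stumps.append((j, step_l, step_r))
        pred = [p + (step_r if row[j] == 1.0 else step_l) for p, row in zip(pred, X)]
    return base, stumps, gain

def predict(model, x):
    base, stumps, _ = model
    return base + sum(step_r if x[j] == 1.0 else step_l for j, step_l, step_r in stumps)

def importance_by_field(columns, gain):
    """Sum per-column gains back to the original predictor and sort descending."""
    totals = {}
    for column, g in zip(columns, gain):
        field = column.split("=", 1)[0]
        totals[field] = totals.get(field, 0.0) + g
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))

rows, speeds = make_toy_data()
columns, X = one_hot(rows)
model = fit_boosted_stumps(X, speeds)
ranking = importance_by_field(columns, model[2])
rmse = mean([(predict(model, x) - s) ** 2 for x, s in zip(X, speeds)]) ** 0.5

for field, total_gain in ranking.items():
    print(f"{field:10s} {total_gain:8.2f}")
print(f"training RMSE: {rmse:.3f} km/h")
```

In this toy setup the ranking simply recovers the effects that were built into the synthetic data. In a real analysis the ranking would depend on the fitted model and the chosen importance measure, and it should be checked on held-out data rather than on the training set.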

Results: The model rated the origin of the athlete as the most important predictor, followed by age group, running on a dirt path, gender, running on asphalt, and event location. Running on a dirt path led to a significant reduction in running speed, while running on asphalt was associated with faster running speeds compared to other surfaces. Most athletes came from the USA, followed by Russia, Germany, Ukraine, the Czech Republic, and Slovakia. Most of the runners competed in the USA. The fastest 10-day runners were from Finland and Israel. The fastest 10-day races were held in Greece.

Conclusions: Most 10-day runners originated from the USA, but the fastest runners originated from Finland and Israel. The fastest race courses were in Greece. Running on dirt paths led to a significant reduction in running speed, while running on asphalt led to faster running speeds.

Keywords: Age group; Gender; Machine learning; Nationality; Origin; Performance; Ultra-endurance.

MeSH terms

  • Adult
  • Athletes* / statistics & numerical data
  • Athletic Performance* / physiology
  • Athletic Performance* / statistics & numerical data
  • Female
  • Humans
  • Machine Learning
  • Male
  • Marathon Running* / physiology
  • Middle Aged
  • Physical Endurance / physiology
  • Running / physiology
  • United States
  • Young Adult