Sex estimation from long bones: a machine learning approach

Int J Legal Med. 2023 Nov;137(6):1887-1895. doi: 10.1007/s00414-023-03072-4. Epub 2023 Aug 1.

Abstract

Sex estimation from skeletal remains is one of the crucial issues in forensic anthropology. Long bones can be a valid alternative for sex estimation when more dimorphic bones are absent or degraded, which rules out the first-line methods. The purpose of this study was to generate and compare classification models for sex estimation based on combined measurements of long bones using machine learning classifiers. Eighteen measurements from four long bones (radius, humerus, femur, and tibia) were taken from a total of 2141 individuals. Five machine learning methods were employed to predict sex: linear discriminant analysis (LDA), penalized logistic regression (PLR), random forest (RF), support vector machine (SVM), and artificial neural network (ANN). With cross-validation, the classification algorithms using all bones generated highly accurate models, with accuracies ranging from 90 to 92% on the validation sample. Classification with isolated bones yielded accuracies between 83.3 and 90.3% on the validation sample. In both cases, random forest stood out with the highest accuracy and appears to be the best model for our investigation. This study upholds the value of combined long bones for sex estimation and provides models that can be applied with high accuracy to different populations.
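
The evaluation described in the abstract (fit a classifier on bone measurements, then score it with cross-validation) can be illustrated with a minimal sketch. It is not the authors' code: it covers only one of the five methods (an L2-penalized logistic regression fitted by gradient descent), uses two synthetic measurements instead of the eighteen real ones, and assumes 5 folds, since the abstract does not state the cross-validation scheme. All function names, means, standard deviations, and hyperparameters are hypothetical.

```python
import math
import random


def make_synthetic(n=200, seed=0):
    """Hypothetical data: two bone lengths (mm) per individual; 1 = male, 0 = female."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        sex = i % 2
        # Assumed means and SDs for illustration only, not values from the study.
        femur = rng.gauss(450 if sex else 415, 20)
        humerus = rng.gauss(325 if sex else 298, 15)
        rows.append(([femur, humerus], sex))
    rng.shuffle(rows)
    return rows


def standardize(train, test):
    """Scale features with the training fold's mean and SD only, to avoid leakage."""
    d = len(train[0][0])
    n = len(train)
    means = [sum(x[j] for x, _ in train) / n for j in range(d)]
    sds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in train) / n) for j in range(d)]

    def scale(rows):
        return [([(x[j] - means[j]) / sds[j] for j in range(d)], y) for x, y in rows]

    return scale(train), scale(test)


def fit_plr(train, lam=0.01, lr=0.1, epochs=200):
    """L2-penalized logistic regression fitted by batch gradient descent."""
    d = len(train[0][0])
    n = len(train)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in train:
            z = b + sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "male"
            for j in range(d):
                grad_w[j] += (p - y) * x[j]
            grad_b += p - y
        # The penalty term lam * w shrinks the weights; the intercept is not penalized.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Classify as male (1) when the linear score is positive, else female (0)."""
    w, b = model
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


def cross_val_accuracy(rows, k=5):
    """Mean accuracy over k folds; each fold is held out once as the validation set."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        train_s, test_s = standardize(train, test)
        model = fit_plr(train_s)
        correct = sum(predict(model, x) == y for x, y in test_s)
        scores.append(correct / len(test))
    return sum(scores) / k


accuracy = cross_val_accuracy(make_synthetic())
print(f"mean cross-validated accuracy: {accuracy:.3f}")
```

Comparing several classifiers, as the study does, would amount to swapping `fit_plr` and `predict` for another method inside the same fold loop, so that every model is scored on identical held-out individuals.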

Keywords: Forensic anthropology; Long bones; Machine learning algorithms; Sex prediction; Sexual dimorphism; Statistical models.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Discriminant Analysis
  • Female
  • Forensic Anthropology* / methods
  • Humans
  • Logistic Models
  • Machine Learning*
  • Male
  • Middle Aged
  • Neural Networks, Computer*
  • Sex Determination by Skeleton* / methods
  • Support Vector Machine
  • Young Adult