Evaluating Accuracy in Five Commercial Sleep-Tracking Devices Compared to Research-Grade Actigraphy and Polysomnography

Sensors (Basel). 2024 Jan 19;24(2):635. doi: 10.3390/s24020635.

Abstract

The development of consumer sleep-tracking technologies has outpaced the scientific evaluation of their accuracy. In this study, five consumer sleep-tracking devices, research-grade actigraphy, and polysomnography were used simultaneously to monitor the overnight sleep of fifty-three young adults in the lab for one night. Biases and limits of agreement were assessed to determine how sleep stage estimates for each device and research-grade actigraphy differed from polysomnography-derived measures. Every device, except the Garmin Vivosmart, was able to estimate total sleep time comparably to research-grade actigraphy. All devices overestimated wake time on nights with shorter wake times and underestimated it on nights with longer wake times. For light sleep, absolute bias was low for the Fitbit Inspire and Fitbit Versa. The Withings Mat and Garmin Vivosmart overestimated shorter light sleep and underestimated longer light sleep. The Oura Ring underestimated light sleep of any duration. For deep sleep, bias was low for the Withings Mat and Garmin Vivosmart, while the other devices overestimated shorter and underestimated longer deep sleep times. For REM sleep, bias was low for all devices. Taken together, these results suggest that proportional bias patterns in consumer sleep-tracking technologies are prevalent and could have important implications for their overall accuracy.
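The bias and limits-of-agreement assessment mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact computation, so this assumes the conventional Bland-Altman form: the difference is device minus polysomnography (PSG), the limits are the bias plus or minus 1.96 sample standard deviations, and proportional bias is summarized as the least-squares slope of the differences on the PSG values. The function names and the wake-time values below are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    Assumes difference = device - reference and 95% limits of
    agreement = bias +/- 1.96 * sample SD of the differences.
    """
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def proportional_bias_slope(device, reference):
    """Least-squares slope of (device - reference) regressed on reference.

    A negative slope means the device overestimates short durations
    and underestimates long ones.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    ref_mean = mean(reference)
    diff_mean = mean(diffs)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    if sxx == 0:
        raise ValueError("reference values are all identical")
    sxy = sum((r - ref_mean) * (d - diff_mean)
              for r, d in zip(reference, diffs))
    return sxy / sxx


# Hypothetical wake-time minutes for five nights (illustrative only).
psg_wake = [10.0, 20.0, 40.0, 60.0, 90.0]
device_wake = [25.0, 30.0, 40.0, 50.0, 65.0]

bias, lower, upper = bland_altman(device_wake, psg_wake)
slope = proportional_bias_slope(device_wake, psg_wake)
print(f"bias={bias:.1f} min, LoA=({lower:.1f}, {upper:.1f}), slope={slope:.2f}")
```

In this made-up example the mean bias is small (-2 minutes), yet the slope of -0.5 shows the device overestimating wake on short-wake nights and underestimating it on long-wake nights. This is the kind of proportional bias pattern the abstract describes, and it shows why a low average bias alone can hide systematic error.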

Keywords: Fitbit; Garmin; Oura; Withings; actigraphy; polysomnography; sleep.

MeSH terms

  • Actigraphy* / methods
  • Humans
  • Polysomnography / methods
  • Reproducibility of Results
  • Sleep
  • Sleep Stages
  • Sleep Wake Disorders*
  • Young Adult

Grants and funding

This research was funded by an Honors Research Grant to JC from the Commonwealth Honors College at the University of Massachusetts.