An investigation into online videos as a source of safety hazard reports

J Safety Res. 2018 Jun:65:89-99. doi: 10.1016/j.jsr.2018.03.004. Epub 2018 Mar 14.

Abstract

Introduction: Despite the advantages of video-based product reviews relative to text-based reviews in detecting possible safety hazard issues, video-based product reviews have received no attention in prior literature. This study focuses on online video-based product reviews as possible sources to detect safety hazards.

Methods: We use two common text mining methods, sentiment words and smoke words, to detect safety issues mentioned in videos on the world's most popular video sharing platform, YouTube.
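As an illustration only, the word-list screening step could be sketched as below. The abstract does not publish the word lists or the matching rules, so the two small sets, the tokenizer, and the function names are hypothetical placeholders rather than the study's actual method; the only detail taken from the abstract is that words are matched in a video's title or description.

```python
import re

# Hypothetical placeholder lists; the study's real smoke-word and
# sentiment-word lists are not given in the abstract.
SMOKE_WORDS = {"fire", "burn", "shock", "choking", "recall"}
NEGATIVE_WORDS = {"bad", "terrible", "worst", "awful", "disappointed"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def flag_video(title, description):
    """Report which smoke words and negative words appear in the
    title or description of a single video."""
    tokens = tokenize(title + " " + description)
    return {
        "smoke": sorted(tokens & SMOKE_WORDS),
        "negative": sorted(tokens & NEGATIVE_WORDS),
    }


result = flag_video("Charger caught fire", "Worst purchase ever")
```

A video flagged by either list would then go to a human reviewer, since word matching alone does not establish that a hazard is actually described.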

Results: A total of 15,402 product review videos from YouTube were identified as containing either negative sentiment words or smoke words, and each was viewed manually to verify whether a hazard was indeed mentioned. 496 true safety issues (3.2%) were found. Of the 9,453 videos that contained smoke words, 322 (3.4%) mentioned safety issues, vs. only 174 (2.9%) of the 5,949 videos with negative sentiment words. Only 1% of randomly selected videos mentioned safety hazards.
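The reported percentages follow directly from the counts above. The short check below recomputes them; the function and variable names are illustrative and not taken from the paper.

```python
def hit_rate(true_hazards, videos_viewed):
    """Percentage of manually viewed videos that mentioned a true hazard."""
    return 100.0 * true_hazards / videos_viewed


# Counts as reported in the Results paragraph.
smoke_rate = hit_rate(322, 9453)
sentiment_rate = hit_rate(174, 5949)
overall_rate = hit_rate(322 + 174, 9453 + 5949)

print(f"smoke words:     {smoke_rate:.1f}%")
print(f"sentiment words: {sentiment_rate:.1f}%")
print(f"overall:         {overall_rate:.1f}%")
```

The two subgroups sum to the reported totals (9,453 + 5,949 = 15,402 videos and 322 + 174 = 496 safety issues), so the overall rate is simply the pooled rate of the two groups.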

Conclusions: Comparing the number of videos with true safety issues that contain sentiment words vs. smoke words in their title or description, we show that smoke words are a more accurate predictor of safety hazards in video-based product reviews than sentiment words. This research also identifies words that are indicative of true hazards versus false positives in online video-based product reviews.

Practical applications: The smoke word lists and word sub-groups generated in this paper can be used by manufacturers and consumer product safety organizations to more efficiently identify product safety issues from online videos. This project also provides realistic baselines for resource estimates for future projects that aim to discover safety issues from online videos or reviews.

Keywords: Online video sharing; Product recall; Safety hazard; Smoke words; Text mining.

MeSH terms

  • Data Mining*
  • Humans
  • Safety / statistics & numerical data*
  • Social Media / statistics & numerical data*
  • Video Recording / statistics & numerical data*