Product safety professionals must assess the risks to consumers associated with the foreseeable uses and misuses of products. In this study, we investigate the utility of generative artificial intelligence (AI), specifically large language models (LLMs) such as ChatGPT, across several tasks involved in the product risk assessment process. For a set of six consumer products, prompts were developed related to failure mode identification, the construction and population of a failure mode and effects analysis (FMEA) table, risk mitigation identification, and guidance to product designers, users, and regulators. These prompts were submitted to ChatGPT, and the outputs were recorded. A survey was administered to product safety professionals to assess the quality of the outputs. We found that ChatGPT generally performed better at divergent thinking tasks such as brainstorming potential failure modes and risk mitigations. However, there were errors and inconsistencies in some of the results, and the guidance provided was perceived as overly generic, occasionally outlandish, and not reflective of the depth of knowledge held by a subject matter expert. A sample of other LLMs given the same tasks showed similar patterns of strengths and weaknesses. Despite these challenges, a role for LLMs may still exist in product risk assessment to assist in ideation, while experts may shift their focus to critical review of AI-generated content.
Keywords: FMEA; Generative AI; Product safety.
© 2024 The Author(s). Risk Analysis published by Wiley Periodicals LLC on behalf of Society for Risk Analysis.