The content accuracy of off-the-shelf large language models (LLMs) mirrors that of the unregulated Internet from which these generative artificial intelligence models draw their content. Because error rates approximate 30% for treatment recommendations in the management of common musculoskeletal conditions, seeking expert opinion remains paramount. However, custom LLMs represent an excellent opportunity to infuse niche, bespoke expertise from the many specialties and subspecialties within medicine. Methods of customizing these generative models fall broadly into four categories: prompt engineering; "retrieval-augmented generation," which prioritizes retrieval of relevant information from a specific domain of data; "fine-tuning," which refines a basic pretrained model for health care-related vernacular and acronyms; and "agentic augmentation," in which software breaks down complex tasks into smaller ones, recruits multiple LLMs (with or without retrieval-augmented generation), optimizes the output, decides internally whether the response is appropriate or sufficient, and can even pass an unmet outcome to a human for supervision ("phone a friend"). Custom LLMs offer physicians and their associated organizations the rare opportunity to regain control of the profession by re-establishing authority in an increasingly digital landscape.
Copyright © 2024 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
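As a rough illustration of two of the customization methods named in the abstract, the sketch below combines a toy form of retrieval-augmented generation with an agentic "phone a friend" fallback. Everything in it is an assumption made for illustration: the `CORPUS` contents are placeholders rather than clinical guidance, `generate` is a stand-in for a real model call, and the keyword lookup substitutes for the embedding-based search a production system would more likely use.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical curated corpus standing in for expert-vetted specialty content.
# The passages are placeholders, not clinical guidance.
CORPUS = {
    "rotator cuff tear": "Placeholder guidance text A (illustrative only).",
    "meniscus tear": "Placeholder guidance text B (illustrative only).",
}


@dataclass
class Answer:
    text: str
    source: Optional[str]  # corpus topic the answer was grounded in, if any
    escalated: bool        # True when the question was handed to a human


def retrieve(question: str) -> Optional[Tuple[str, str]]:
    """Return (topic, passage) for the first corpus topic named in the question."""
    lowered = question.lower()
    for topic, passage in CORPUS.items():
        if topic in lowered:
            return topic, passage
    return None


def generate(question: str, passage: str) -> str:
    """Stand-in for an LLM call; a real system would send a prompt to a model."""
    return f"Based on the curated source: {passage}"


def is_sufficient(draft: str, passage: str) -> bool:
    """Crude self-check: accept the draft only if it quotes the retrieved passage."""
    return bool(draft) and passage in draft


def answer(question: str) -> Answer:
    """Retrieve, generate, self-check, and escalate to a human when unsure."""
    hit = retrieve(question)
    if hit is None:
        return Answer("Referred to a clinician for review.", None, True)
    topic, passage = hit
    draft = generate(question, passage)
    if not is_sufficient(draft, passage):
        return Answer("Referred to a clinician for review.", topic, True)
    return Answer(draft, topic, False)
```

Calling `answer("How is a meniscus tear managed?")` returns a response grounded in the matching corpus entry, whereas a question on a topic absent from the corpus is marked `escalated=True` rather than answered from unvetted material. The design choice this is meant to show is that the curated corpus, not the general-purpose model, determines what the system is allowed to say.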