We present the motivation, experience, and learnings from a data challenge on subgroup identification conducted at a large pharmaceutical corporation. The challenge aimed to explore approaches to subgroup identification for future clinical trials. To mimic a realistic setting, participants had access to four Phase III clinical trials to derive a subgroup and predict its treatment effect in a future study that was not accessible to them. A total of 30 teams with around 100 participants registered for the challenge, primarily from the Biostatistics organization. We outline the motivation for running the challenge, the challenge rules, and the logistics. Finally, we present the results of the challenge, the participant feedback, and the learnings. We also present our view on the implications of the results for exploratory analyses of treatment effect heterogeneity.
Keywords: common task framework; data science; machine learning; subgroup analysis; subgroup identification.
© 2024 John Wiley & Sons Ltd.