Background: A systematic risk identification system has the potential to test marketed drugs for important Health Outcomes of Interest or HOI. For each HOI, multiple definitions are used in the literature, and some of them are validated for certain databases. However, little is known about the effect of different definitions on the ability of methods to estimate their association with medical products.
Objectives: Alternative definitions of HOI were studied for their effect on the performance of analytical methods in observational outcome studies.
Methods: A set of alternative definitions for three HOI were defined based on literature review and clinical diagnosis guidelines: acute kidney injury, acute liver injury and acute myocardial infarction. The definitions varied by the choice of diagnostic codes and the inclusion of procedure codes and lab values. They were then used to empirically study an array of analytical methods with various analytical choices in four observational healthcare databases. The methods were executed against predefined drug-HOI pairs to generate an effect estimate and standard error for each pair. These test cases included positive controls (active ingredients with evidence to suspect a positive association with the outcome) and negative controls (active ingredients with no evidence to expect an effect on the outcome). Three different performance metrics where used: (i) Area Under the Receiver Operator Characteristics (ROC) curve (AUC) as a measure of a method's ability to distinguish between positive and negative test cases, (ii) Measure of bias by estimation of distribution of observed effect estimates for the negative test pairs where the true effect can be assumed to be one (no relative risk), and (iii) Minimal Detectable Relative Risk (MDRR) as a measure of whether there is sufficient power to generate effect estimates.
Results: In the three outcomes studied, different definitions of outcomes show comparable ability to differentiate true from false control cases (AUC) and a similar bias estimation. However, broader definitions generating larger outcome cohorts allowed more drugs to be studied with sufficient statistical power.
Conclusions: Broader definitions are preferred since they allow studying drugs with lower prevalence than the more precise or narrow definitions while showing comparable performance characteristics in differentiation of signal vs. no signal as well as effect size estimation.