Feasibility of reusing time-matched controls in an overlapping cohort

Bénédicte Delcoigne; Niels Hagenbuch; Maria Ec Schelin; Agus Salim; Linda S Lindström; Jonas Bergh; Kamila Czene; Marie Reilly

doi:10.1177/0962280216669744

Stat Methods Med Res. 2018 Jun;27(6):1818-1829. doi: 10.1177/0962280216669744. Epub 2016 Sep 21.

Authors

Bénédicte Delcoigne¹, Niels Hagenbuch¹, Maria Ec Schelin², Agus Salim³, Linda S Lindström¹,⁴, Jonas Bergh¹, Kamila Czene¹, Marie Reilly¹

Affiliations

¹ Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
² Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
³ Department of Mathematics and Statistics, La Trobe University, Victoria, Australia.
⁴ Department of Surgery, University of California, San Francisco, CA, USA.

PMID: 27659169
DOI: 10.1177/0962280216669744

Abstract

The methods developed for secondary analysis of nested case-control data have been illustrated only in simplified settings in a common cohort and have not found their way into biostatistical practice. This paper demonstrates the feasibility of reusing prior nested case-control data in a realistic setting where a new outcome is available in an overlapping cohort where no new controls were gathered and where all data have been anonymised. Using basic information about the background cohort and sampling criteria, the new cases and prior data are "aligned" to identify the common underlying study base. With this study base, a Kaplan-Meier table of the prior outcome extracts the risk sets required to calculate the weights to assign to the controls to remove the sampling bias. A weighted Cox regression, implemented in standard statistical software, provides unbiased hazard ratios. Using the method to compare cases of contralateral breast cancer to available controls from a prior study of metastases, we identified a multifocal tumor as a risk factor that has not been reported previously. We examine the sensitivity of the method to an imperfect weighting scheme and discuss its merits and pitfalls to provide guidance for its use in medical research studies.
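
The weighting step can be illustrated with a minimal sketch. The abstract does not state the weight formula, so the code below assumes the commonly used Kaplan-Meier type inclusion probability: a control's chance of ever being sampled is one minus the product, over all prior-outcome event times at which the control was at risk, of the chance of not being picked at that time. The function name, its arguments, and the toy data are hypothetical, and the sketch ignores any additional matching criteria that the original sampling may have used.

```python
from typing import List, Sequence


def km_type_inclusion_probs(
    entry: Sequence[float],
    exit_: Sequence[float],
    is_case: Sequence[bool],
    m: int,
) -> List[float]:
    """Kaplan-Meier type inclusion probabilities (assumed formula).

    entry, exit_ : start and end of follow-up for every member of the study base
    is_case      : True if follow-up ended with the PRIOR outcome
    m            : number of controls sampled per prior case (assumption)

    Prior cases get probability 1.0; a non-case j gets
    1 - prod over event times t with j at risk of (1 - m / (n(t) - 1)),
    where n(t) is the size of the risk set at t.
    """
    n = len(exit_)

    def at_risk(j: int, t: float) -> bool:
        # Subject j is under observation at time t.
        return entry[j] < t <= exit_[j]

    probs = []
    for j in range(n):
        if is_case[j]:
            probs.append(1.0)  # prior cases are always included
            continue
        not_sampled = 1.0
        for i in range(n):
            if not is_case[i]:
                continue
            t = exit_[i]  # event time of prior case i
            if not at_risk(j, t):
                continue
            # Everyone at risk except the case itself is eligible as a control.
            eligible = sum(at_risk(k, t) for k in range(n)) - 1
            if eligible <= 0:
                continue
            not_sampled *= 1.0 - min(m, eligible) / eligible
        probs.append(1.0 - not_sampled)
    return probs


if __name__ == "__main__":
    # Toy study base of four subjects, all entering at time 0 (made-up data).
    probs = km_type_inclusion_probs(
        entry=[0, 0, 0, 0],
        exit_=[1.0, 2.0, 3.0, 4.0],
        is_case=[True, False, True, False],
        m=1,
    )
    for j, p in enumerate(probs):
        print(j, round(p, 3))
```

Under these assumptions, each sampled control would receive the weight 1/p and each case the weight 1 before being passed to a weighted Cox regression in standard software. Subjects with an inclusion probability of zero were never eligible for sampling and so would not appear among the controls.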

Keywords: GLM weights; Kaplan–Meier type weights; Nested case-control; cost-efficiency; secondary analysis; weighted Cox regression.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Breast Neoplasms / drug therapy
Breast Neoplasms / pathology
Case-Control Studies
Cohort Studies*
Data Analysis*
Feasibility Studies
Female
Humans
Kaplan-Meier Estimate
Middle Aged
Proportional Hazards Models*