Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs. A Powerful and Economical Tool

K Ohneberg; M Wolkewitz; J Beyersmann; M Palomar-Martinez; P Olaechea-Astigarraga; F Alvarez-Lerma; M Schumacher

doi:10.3414/ME14-01-0113

Methods Inf Med. 2015;54(6):505-14. doi: 10.3414/ME14-01-0113. Epub 2015 Jun 25.

Authors

K Ohneberg¹, M Wolkewitz, J Beyersmann, M Palomar-Martinez, P Olaechea-Astigarraga, F Alvarez-Lerma, M Schumacher

Affiliation

¹ Kristin Ohneberg, Institute for Medical Biometry and Statistics, Medical Center - University of Freiburg, Stefan-Meier-Str. 26, 79104 Freiburg, Germany.

PMID: 26108707
DOI: 10.3414/ME14-01-0113

Abstract

Background: Sampling from a large cohort to obtain a subsample sufficient for statistical analysis is a frequently used method for handling large data sets in epidemiological studies with limited resources for exposure measurement. In clinical studies, however, when interest lies in the influence of a potential risk factor, cohort studies in which all individuals enter the analysis are often the first choice.

Objectives: Our aim is to close the gap between epidemiological and clinical studies with respect to design and power considerations. Schoenfeld's formula for the number of events required for a Cox proportional hazards model is fundamental to this. Our objective is to compare the power of a full cohort analysis with that of a nested case-control design and a case-cohort design.
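As an illustration of the role Schoenfeld's formula plays here, the following is a minimal sketch of the required number of events for a binary covariate in a Cox model. The function name, the parameter names, and the example values (hazard ratio 2, 50% exposed, two-sided 5% level, 80% power) are assumptions chosen for illustration and are not taken from the article.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, p_exposed, alpha=0.05, power=0.80):
    """Approximate number of events needed to detect `hazard_ratio`
    for a binary covariate with prevalence `p_exposed` (two-sided test).

    Schoenfeld's formula: d = (z_{1-alpha/2} + z_{power})^2
                              / (p * (1 - p) * log(HR)^2)
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    return (z_alpha + z_beta) ** 2 / (p_exposed * (1 - p_exposed) * log_hr ** 2)


# Hypothetical example values, not from the article.
events = schoenfeld_events(hazard_ratio=2.0, p_exposed=0.5)
print(math.ceil(events))  # round up to a whole number of events
```

Note that the formula depends on the number of events rather than the number of patients, which is why designs that keep all cases and sample only the non-cases can remain efficient when events are rare.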

Methods: We compare power formulas for sampling designs and cohort studies. In our data example we simultaneously apply a nested case-control design with a varying number of controls matched to each case, a case-cohort design with varying subcohort size, a random subsample, and a full cohort analysis. For each design we calculate the standard error of the estimated regression coefficients and the mean number of distinct persons for whom covariate information is required.
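To make the nested case-control sampling step concrete, here is a rough sketch of risk-set sampling with m controls per case, together with a count of the distinct persons for whom covariate information would be needed. The data layout (tuples of id, follow-up time, event indicator), the function names, and the simulated toy cohort are all hypothetical; this is not the authors' implementation, and it ignores matching factors and left truncation.

```python
import random


def sample_nested_case_control(cohort, m, seed=0):
    """Risk-set sampling: for each case, draw up to `m` controls from
    the persons still under observation at the case's event time.

    `cohort` is a list of (person_id, time, event) tuples.
    Returns a list of (case_id, [control_ids]) pairs.
    """
    rng = random.Random(seed)
    sampled_sets = []
    for case_id, case_time, event in cohort:
        if not event:
            continue
        # Everyone with follow-up at least as long as the case is at risk;
        # later cases may serve as controls for earlier ones.
        at_risk = [pid for pid, t, _ in cohort if t >= case_time and pid != case_id]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled_sets.append((case_id, controls))
    return sampled_sets


def distinct_persons(sampled_sets):
    """Number of distinct persons appearing as a case or a control."""
    ids = set()
    for case_id, controls in sampled_sets:
        ids.add(case_id)
        ids.update(controls)
    return len(ids)


# Simulated toy cohort with a low event rate (values are arbitrary).
rng = random.Random(42)
cohort = [(i, rng.expovariate(1.0), rng.random() < 0.1) for i in range(200)]

for m in (1, 2, 4):
    sets_m = sample_nested_case_control(cohort, m)
    print(m, len(sets_m), distinct_persons(sets_m))
```

Because a person can be drawn as a control more than once and can later become a case, the number of distinct persons is generally smaller than the number of cases times (m + 1).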

Results: The formulas for the power of a nested case-control design and of a case-cohort design are directly connected to the power of a cohort study through the well-known Schoenfeld formula. The loss in precision of parameter estimates is relatively small compared to the saving in resources.
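One way to see this connection is to scale the information in Schoenfeld's formula by a design-specific efficiency factor. The sketch below assumes the standard large-sample approximation that a nested case-control design with m controls per case has efficiency m / (m + 1) relative to the full cohort under the null hypothesis; this factor is an assumption brought in for illustration, and the exact formulas in the article (in particular for the case-cohort design, which is not shown) may differ. Function names and example values are hypothetical.

```python
import math
from statistics import NormalDist


def cox_power(n_events, hazard_ratio, p_exposed, alpha=0.05, efficiency=1.0):
    """Approximate power for a binary covariate in a Cox model,
    obtained by solving Schoenfeld's formula for the power.

    `efficiency` scales the information: 1.0 corresponds to the full cohort.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    information = efficiency * n_events * p_exposed * (1 - p_exposed)
    return z.cdf(math.sqrt(information) * abs(math.log(hazard_ratio)) - z_alpha)


def ncc_efficiency(m):
    """Assumed relative efficiency of 1:m nested case-control sampling
    versus the full cohort (approximation under the null hypothesis)."""
    return m / (m + 1)


# Hypothetical scenario: 100 events, hazard ratio 1.5, 50% exposed.
full_power = cox_power(100, 1.5, 0.5)
print(f"full cohort: {full_power:.3f}")
for m in (1, 2, 4):
    power_m = cox_power(100, 1.5, 0.5, efficiency=ncc_efficiency(m))
    print(f"nested case-control, m={m}: {power_m:.3f}")
```

Under this approximation, most of the attainable power is recovered with a handful of controls per case, since the efficiency factor m / (m + 1) approaches 1 quickly as m grows.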

Conclusions: Nested case-control and case-cohort studies, but not random subsamples, yield an attractive alternative for analyzing clinical studies in the situation of a low event rate. Power calculations can be conducted straightforwardly to quantify the loss of power relative to the savings in the number of patients when using a sampling design instead of analyzing the full cohort.

Keywords: Case-cohort design; cohort study; nested case-control design; power; sample size.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Case-Control Studies*
Cohort Studies*
Data Interpretation, Statistical
Outcome Assessment, Health Care / methods*
Proportional Hazards Models*
Research Design*
Sample Size*