The recent successes of GWAS based on large sample sizes motivate combining independent datasets to obtain larger sample sizes and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious associations in population-based association analyses. In family-based association analyses that infer missing parental genotypes from allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling from a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with an arbitrary number of siblings, but it requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test, Association in the Presence of Linkage (APL), to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test both in a homogeneous population and in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.
© 2010 Wiley-Liss, Inc.