Motivation: Somatic amplification of particular genomic regions and selection of cellular lineages with such amplifications drive tumor development. However, pinpointing genes under such selection has been difficult due to the large span of these regions. Our recently developed method, the amplification distortion test (ADT), identifies specific nucleotide alleles and haplotypes that confer better survival for tumor cells when somatically amplified. In this work, we focus on evaluating ADT's power to detect such causal variants across a variety of tumor dataset scenarios.
Results: To this end, we generated multiple parameter-based synthetic datasets (derived from real data) that contain somatic copy number aberrations (CNAs) of various lengths and frequencies over germline single nucleotide polymorphisms (SNPs) genome-wide. We assigned gold-standard causal sub-regions within these CNAs and then assessed ADT's ability to detect these sub-regions. Results indicate that ADT possesses high sensitivity and specificity at large sample sizes across most parameter cases, including those that more closely reflect existing SNP and CNA cancer data.