Probability that a Two-Stage Genome-Wide Association Study will Detect a Disease-Associated SNP and Implications for Multistage Designs
Large two-stage genome-wide association studies (GWASs) have been shown to reduce required genotyping with little loss of power, compared to a one-stage design, provided a substantial fraction of cases and controls, π_sample, is included in stage 1. However, a number of recent GWASs have used π_sample < 0.2. Moreover, standard power calculations are not applicable because SNPs are selected in stage 1 by ranking their p-values, rather than comparing each SNP's statistic to a fixed critical value. We define the detection probability (DP) of a two-stage design as the probability that a given disease-associated SNP will have a p-value among the lowest ranks of p-values at stage 1, and, among those SNPs selected at stage 1, at stage 2. For 8000 cases and 8000 controls available for study and for odds ratios per allele in the range 1.1-1.3, we show that DP is substantially reduced for designs with π_sample ≤ 0.25, and that DP cannot be appreciably increased by analyzing the stage 1 and stage 2 data jointly. These results suggest that multistage designs with small first stages (e.g. π_sample ≤ 0.25) should be avoided, and that additional genotyping in earlier studies with small first stages will yield previously unselected disease-associated SNPs.
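The rank-based definition of DP can be illustrated with a small Monte Carlo sketch. This is not the authors' method. It assumes a single disease-associated SNP among otherwise independent null SNPs, a normal approximation in which the disease SNP's test statistic has mean log(OR) * sqrt(n * f * (1 - f)) for n cases, n controls, and allele frequency f, and a stage 2 ranking that uses stage 2 data only. The function name `detection_probability` and all of its parameters are hypothetical, and the SNP counts are kept far below genome-wide scale so that the sketch runs quickly.

```python
import math
import random


def detection_probability(odds_ratio, allele_freq, n_per_group, frac_stage1,
                          n_snps, top1, top2, n_reps=200, seed=0):
    """Monte Carlo estimate of DP for one disease SNP (illustrative only).

    Assumptions (not from the source): independent SNPs, one disease SNP,
    z-statistics that are N(0, 1) for null SNPs and N(ncp, 1) for the
    disease SNP, and stage 2 ranking based on stage 2 data alone.
    """
    rng = random.Random(seed)
    beta = math.log(odds_ratio)
    n1 = frac_stage1 * n_per_group   # cases (= controls) used in stage 1
    n2 = n_per_group - n1            # cases (= controls) used in stage 2
    pq = allele_freq * (1.0 - allele_freq)
    ncp1 = beta * math.sqrt(n1 * pq)  # approximate mean of stage 1 z-statistic
    ncp2 = beta * math.sqrt(n2 * pq)  # approximate mean of stage 2 z-statistic

    detected = 0
    for _ in range(n_reps):
        # Stage 1: the disease SNP must rank among the top1 smallest p-values,
        # i.e. fewer than top1 null SNPs may have a larger |z|.
        z1 = abs(rng.gauss(ncp1, 1.0))
        better1 = sum(abs(rng.gauss(0.0, 1.0)) > z1 for _ in range(n_snps - 1))
        if better1 >= top1:
            continue
        # Stage 2: it competes only with the other top1 - 1 selected SNPs,
        # all null, whose stage 2 statistics are independent of stage 1.
        z2 = abs(rng.gauss(ncp2, 1.0))
        better2 = sum(abs(rng.gauss(0.0, 1.0)) > z2 for _ in range(top1 - 1))
        if better2 < top2:
            detected += 1
    return detected / n_reps


if __name__ == "__main__":
    # Compare a small and a large first stage for 8000 cases and 8000 controls.
    for frac in (0.1, 0.25, 0.5):
        dp = detection_probability(odds_ratio=1.2, allele_freq=0.3,
                                   n_per_group=8000, frac_stage1=frac,
                                   n_snps=1000, top1=10, top2=1)
        print(f"stage 1 fraction {frac}: estimated DP = {dp:.2f}")
```

Because selection is by rank rather than by a fixed critical value, the sketch counts how many null SNPs outrank the disease SNP at each stage instead of comparing its statistic to a threshold. The estimates depend on the illustrative settings and should not be read as reproducing the paper's results.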
Document Type: Research Article
Affiliations: 1: Division of Cancer Epidemiology and Genetics, National Cancer Institute, EPS 8032, Bethesda, MD, 20892-7244, US 2: Information Management Services Incorporated, 6110 Executive Boulevard, Suite 310, Rockville, MD, 20852, US
Publication date: 01 November 2008