The current issue of Cell has some important correspondence in response to an essay published by Jon McClellan and Mary-Claire King in April. Daniel covered the original piece and hosted a guest post from Kai Wang which detailed some of the more obvious flaws in their argument. Now, Wang and his colleagues from Philadelphia have published an official response in Cell, in parallel with a similar letter from Robert Klein and colleagues from New York. Accompanying these is a further reply from McClellan and King. Read on for an overview of three contentious statements made in the original piece, and the rebuttals to each.
[In response to a comment, I've added the most representative single sentence quotation I could find from the original McClellan and King essay next to each of their claims as I've expressed them. --JB]
- Claim: Disease-predisposing alleles cannot circulate at high frequency in human populations because negative selection is constantly trying to purge them. ["In order to be maintained at polymorphic frequencies worldwide, common variants with even modest influence on disease must withstand selective pressure in every generation."]
Reply: Selection always acts within an environmental context, and the environment in which alleles identified by GWAS are deleterious (i.e. Western countries in the last hundred years) is starkly different from the environment in which nearly all of human evolution (indeed, the evolution of any species) has taken place. Furthermore, balancing (rather than strictly negative) selection might play an important role in many GWAS hits: there are numerous documented examples where one allele at a locus simultaneously increases risk of one condition and protects from another. Finally, GWAS hits have a very weak effect on disease risk, and many of the diseases in question are relatively late-onset, so they have little effect on reproductive success. These factors combine to mean that the net selective disadvantage is weak, and sweeping the alleles out of the population will take a long time even in a relatively stable environment (let alone a rapidly changing one).
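To put rough numbers on "a long time", here is a minimal sketch of a textbook deterministic model of selection against an allele. It is an illustration only, not an analysis from any of the Cell letters: the function name, the starting frequency of 30%, the 1% threshold and the two selection coefficients are all assumptions chosen for the example, and the model ignores drift, mutation, migration and changing environments.

```python
def generations_to_purge(p0, s, p_target, max_generations=1_000_000):
    """Count generations for a deleterious allele to fall from p0 to p_target.

    Deterministic haploid (genic) selection: carriers of the allele have
    relative fitness 1 - s, non-carriers have fitness 1.
    All parameter values used below are hypothetical.
    """
    p = p0
    for generation in range(max_generations):
        if p <= p_target:
            return generation
        # Standard recursion: the allele's share of the next generation
        # is its fitness-weighted share of this one.
        p = p * (1 - s) / (1 - s * p)
    return None  # threshold not reached within max_generations


# Hypothetical comparison: a strongly vs. a weakly deleterious allele,
# both starting at 30% frequency and tracked until they drop below 1%.
strong = generations_to_purge(p0=0.30, s=0.1, p_target=0.01)
weak = generations_to_purge(p0=0.30, s=0.001, p_target=0.01)
print(f"s = 0.1:   {strong} generations")
print(f"s = 0.001: {weak} generations")
```

Under these assumptions the strongly selected allele falls below 1% within a few dozen generations, while the weakly selected one needs several thousand. Since s is the net disadvantage, the small effect sizes and late onset described above both push real GWAS alleles toward the slow end of that range.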
- Claim: Most GWAS hits are intronic or intergenic, and thus can’t possibly be pointing at something functional. ["A major limitation of genome-wide association studies is the lack of any functional link between the vast majority of risk variants and the disorders they putatively influence."]
Reply: This has to be the most bizarre of McClellan and King’s claims. First, King herself posited in a famous 1975 Science paper that changes in gene regulation, rather than protein-coding changes, likely explain many phenotypic differences. Second, the fundamental design of GWAS relies on the fact that the SNPs actually studied aren’t necessarily causative, but are correlated with unknown causal alleles. We certainly haven’t pinned down all the biology underlying GWAS hits, but that to me is the most exciting part of ongoing analysis of these studies: it would be boring indeed if they had all easily mapped to nonsynonymous coding SNPs in candidate genes.
- Claim: Many (most? all?) GWAS hits are due to cryptic population structure. ["We further suggest that many GWAS findings stem from factors other than a true association with disease risk."]
Reply: Wang et al. point out that GWAS practitioners generally bend over backwards to address possible population stratification, and that well-known, widely used methods exist to correct for it. The particular example that McClellan and King harped on was, in fact, studied largely in family-based samples, which are immune to population stratification. Furthermore, the “evidence” that the SNP in question varies widely in frequency across Europe was based on absurdly small sample sizes (McClellan and King neglected to put any confidence intervals on their published estimates of allele frequency in different parts of Europe). Examination of an enlarged sample set from one such population reveals that the frequency in Tuscans is actually 41%, not the 71% McClellan and King estimated, which is remarkably similar to the 39% estimate elsewhere in Europe.
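To see why the missing confidence intervals matter, here is a minimal sketch that computes a 95% Wilson score interval for an allele frequency at two sample sizes. The counts are hypothetical: neither this post nor the quoted passages give the actual number of chromosomes sampled, so 10 of 14 and 143 of 200 are invented purely to show how the interval narrows as the sample grows, and `wilson_interval` is a helper written for this example.

```python
import math


def wilson_interval(allele_count, n_chromosomes, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    p_hat = allele_count / n_chromosomes
    denom = 1 + z**2 / n_chromosomes
    centre = (p_hat + z**2 / (2 * n_chromosomes)) / denom
    half_width = (
        z
        * math.sqrt(
            p_hat * (1 - p_hat) / n_chromosomes
            + z**2 / (4 * n_chromosomes**2)
        )
        / denom
    )
    return centre - half_width, centre + half_width


# HYPOTHETICAL counts, both close to a 71% point estimate.
lo_small, hi_small = wilson_interval(10, 14)    # small sample
lo_large, hi_large = wilson_interval(143, 200)  # larger sample

print(f"n = 14:  {lo_small:.2f} to {hi_small:.2f}")
print(f"n = 200: {lo_large:.2f} to {hi_large:.2f}")
```

With only 14 chromosomes the interval spans several tens of percentage points, so a point estimate like 71% says very little about the true population frequency. At 200 chromosomes the same point estimate is pinned down far more tightly.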