Early last year David Goldstein and colleagues published a provocative paper claiming that many GWAS associations are driven not by common variants of modest effect (the canonical common disease – common variant hypothesis underpinning GWAS) but instead by a local cluster of lower-frequency variants that have much bigger effects on disease risk. They dubbed this hypothesized phenomenon “synthetic association”, and the term quickly became a genetics buzzword. The paper was widely discussed in both the specialist and mainstream media, and caused quite a stir among academic statistical geneticists.
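To make the idea concrete, here is a minimal toy calculation of how rare, large-effect variants could show up as a modest signal at a common SNP. It is an illustrative sketch only, not the model used in any of the papers: the function name, the frequencies and the relative risk are all made up, and it assumes that every rare variant sits on a haplotype carrying the common tag allele, that no haplotype carries more than one rare variant, that risk is multiplicative across haplotypes, and that the disease is rare enough for controls to look like the general population.

```python
def synthetic_odds_ratio(tag_freq, rare_freqs, rare_rr):
    """Allelic odds ratio observed at a common tag SNP when only rare
    variants on the tag-allele background are causal.

    All inputs are hypothetical illustration values:
      tag_freq   -- population frequency of the common tag allele
      rare_freqs -- frequencies of the rare causal variants
      rare_rr    -- per-haplotype relative risk of each rare variant
    """
    carrier = sum(rare_freqs)  # tag-allele haplotypes that carry a causal variant
    if carrier > tag_freq:
        raise ValueError("rare variants cannot exceed the tag allele frequency")
    tag_clean = tag_freq - carrier  # tag-allele haplotypes with no causal variant
    other = 1.0 - tag_freq          # haplotypes without the tag allele

    # Under a multiplicative model, haplotype frequencies in cases are
    # proportional to (population frequency x relative risk).
    case_tag_weight = carrier * rare_rr + tag_clean
    case_other_weight = other

    # Odds of the tag allele in cases divided by odds in controls
    # (controls approximated by population frequencies).
    case_odds = case_tag_weight / case_other_weight
    control_odds = tag_freq / other
    return case_odds / control_odds


# Three rare variants (1% each, relative risk 4) on a 30% tag allele
print(round(synthetic_odds_ratio(0.3, [0.01, 0.01, 0.01], 4.0), 2))  # prints 1.3
```

With these made-up numbers, variants that each quadruple risk produce an odds ratio of only about 1.3 at the common SNP, which is the sort of modest effect a GWAS typically reports.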
That debate has been re-opened today by a set of Perspectives in PLoS Biology: a rebuttal by us (Carl & Jeff) and our colleagues at Sanger; a rebuttal by Naomi Wray, Shaun Purcell and Peter Visscher; a rebuttal to the rebuttals by David Goldstein; and an editorial by Robert Shields to tie it all together.
What are the messages from all this? Well, we argue in our piece that several lines of evidence suggest that synthetic associations, while plausible, aren’t very common. First, family-based linkage studies (like the ones used to identify BRCA1, CFTR and other culprits behind single-gene disorders) are well powered to pick up the kind of genetic model underlying synthetic association, yet they were remarkably unsuccessful in studying complex disease. Second, attempts to find these rarer ‘smoking gun’ mutations (e.g. by completely sequencing many patients in GWAS regions) haven’t turned up much yet. Finally, the synthetic association hypothesis would predict that GWAS hits are ancestry-specific (e.g. genes found in Europeans wouldn’t turn up in a study of Japanese), whereas nearly all GWAS results studied in sufficient depth have replicated across many populations.
Interestingly, there is a well-documented example of a synthetic association that we’ve worked on extensively: the association between NOD2 and Crohn’s disease is a GWAS hit driven by three nearby low-frequency, large-effect variants. It conveniently also illustrates all of our points above: it was originally discovered by linkage, the three coding variants were discovered by resequencing, and it is not associated with the disease in East Asia. For these reasons and more, NOD2 is an outlier from the GWAS experience, underlining the likelihood that such occurrences are rare.
The Wray et al. paper provides an even more technical critique, showing that neither the allele frequency distribution nor the number of independent associations predicted by the synthetic association model is consistent with the bulk of GWAS observations. In reply, Goldstein presents a series of logical arguments which he asserts contradict some of the data presented in the other two papers. He usefully presents a number of points where all parties agree: GWAS have been useful, valuable information about disease genetics has been learned, and synthetic associations are theoretically possible. He maintains, however, that the question of how widespread they are is still unresolved. We obviously disagree with this notion, but are glad that PLoS Biology has put together a nice series of articles arguing both sides of the question.
[Added in edit: Razib Khan has additional background on synthetic associations, and a dissection of the paper by Wray et al., over at Discover.]