Tag Archive for 'GWAS'

Size matters, and other lessons from medical genetics

Size really matters: prior to the era of large genome-wide association studies, the large effect sizes reported in small initial genetic studies often dwindled towards zero (that is, an odds ratio of one) as more samples were studied. Adapted from Ioannidis et al., Nat Genet 29:306-309.

[Last week, Ed Yong at Not Exactly Rocket Science covered a paper positing an association between a genetic variant and an aspect of social behavior called prosociality. On Twitter, Daniel and Joe dismissed this study out of hand due to its small sample size (n = 23), leading Ed to update his post. Daniel and Joe were then contacted by Alex Kogan, the first author of the study in question. He kindly shared his data with us, and agreed to an exchange here on Genomes Unzipped. In this post, we expand on our point about the importance of sample size; Alex’s reply is here.

Edit 01/12/11 (DM): The original version of this post included language that could have been interpreted as an overly broad attack on more serious, well-powered studies in psychiatric disease genetics. I've edited the post to reduce the possibility of collateral damage. To be clear: we're against over-interpretation of results from small studies, not behavioral genetics as a whole, and I apologise for any unintended conflation of the two.]

In October of 1992, genetics researchers published a potentially groundbreaking finding in Nature: a genetic variant in the angiotensin-converting enzyme gene, ACE, appeared to modify an individual’s risk of having a heart attack. This finding was notable at the time for the size of the study, which involved a total of over 500 individuals from four cohorts, and for the effect size of the identified variant: in a population initially identified as low-risk for heart attack, the variant had an odds ratio of over 3 (with a corresponding p-value less than 0.0001).

Readers familiar with the history of medical association studies will be unsurprised by what happened over the next few years: initial excitement (this same polymorphism was associated with diabetes! And longevity!) was followed by inconclusive replication studies and, ultimately, disappointment. In 2000, 8 years after the initial report, a large study involving over 5,000 cases and controls found absolutely no detectable effect of the ACE polymorphism on heart attack risk. In the meantime, the same polymorphism had turned up in dozens of other association studies for a wide range of traits ranging from obstetric cholestasis to meningococcal disease in children, virtually none of which have ever been convincingly replicated.
Continue reading ‘Size matters, and other lessons from medical genetics’
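
One way to see why small samples mislead is to compare the uncertainty around the same odds ratio at two sample sizes. The sketch below is only an illustration: the genotype counts are invented, the function name odds_ratio_ci is hypothetical, and the interval uses the standard log-odds (Woolf) approximation rather than anything taken from the studies discussed above.

```python
from math import exp, log, sqrt

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (Woolf's log method)."""
    odds_ratio = (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)
    # Standard error of log(OR) is driven by the smallest cell counts.
    se_log_or = sqrt(1 / case_carriers + 1 / case_noncarriers
                     + 1 / control_carriers + 1 / control_noncarriers)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, invented for illustration (not from any study cited here).
small_study = odds_ratio_ci(12, 8, 6, 14)            # 40 people in total
large_study = odds_ratio_ci(1200, 800, 600, 1400)    # same proportions, 4,000 people

for label, (or_, lo, hi) in [("small", small_study), ("large", large_study)]:
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With these made-up counts both studies estimate the same odds ratio, but only the small study's interval is wide enough to include one (no effect).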

Friday Links: Studying association studies, and success at last in psychiatric genetics

In PLoS Genetics this week there is a viewpoint article on data sharing in disease genetics. The authors systematically looked at 643 genome-wide association studies published between 2002 and 2010, to see how easily available the results of the studies are now. They found that the availability of full study results has gone down over time, and many groups that do share data have put more restrictions in place on its use. They put this down to fears over the privacy of research subjects, and in particular to the Homer et al study. The Homer et al result is somewhat complicated, but in essence it says that if you have stolen someone’s genotype data, you can use it to figure out if they have participated in any given research study by looking at the full results of the study.

It certainly seems possible that worries about privacy are reducing the free flow of information within the research community. However, whether on balance the decrease in information flow is worth the increase in security is an open question. For my own part, I feel that having the genome-wide results of genome-wide association studies freely available is very important to the field, and is more important than the rather esoteric risk of someone stealing someone’s DNA and using it to figure out that they once took part in a research study of inflammatory bowel disease. [LJ]
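
For readers who want a feel for how this kind of membership inference works, here is a simplified sketch of the distance idea behind it. For each SNP it asks whether a person's genotype sits closer to the study's published allele frequency than to a reference population frequency. Everything in it is an assumption made for illustration: the data are simulated, the reference frequencies are taken to be known exactly, and the real analysis adds a formal significance test on top of this score.

```python
import random

def genotype(freq, rng):
    """One person's genotype at a SNP, coded 0, 0.5 or 1 (copies of the allele / 2)."""
    return (int(rng.random() < freq) + int(rng.random() < freq)) / 2

def membership_score(person, study_freqs, reference_freqs):
    """Mean over SNPs of |Y - Pop| - |Y - M|; higher means closer to the study pool."""
    distances = [abs(y - ref) - abs(y - study)
                 for y, study, ref in zip(person, study_freqs, reference_freqs)]
    return sum(distances) / len(distances)

rng = random.Random(1)                      # fixed seed so the simulation is repeatable
n_snps, n_participants = 20_000, 50         # hypothetical study size

reference_freqs = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]
participants = [[genotype(f, rng) for f in reference_freqs]
                for _ in range(n_participants)]
# The "full results" of the study: allele frequency of every SNP among participants.
study_freqs = [sum(column) / n_participants for column in zip(*participants)]
# Someone from the same population who did not take part.
outsider = [genotype(f, rng) for f in reference_freqs]

participant_score = membership_score(participants[0], study_freqs, reference_freqs)
outsider_score = membership_score(outsider, study_freqs, reference_freqs)
print(f"participant: {participant_score:+.4f}, outsider: {outsider_score:+.4f}")
```

Each participant nudges the study frequencies slightly towards their own genotypes, so across many SNPs the participant's score should come out higher than the outsider's. The effect per SNP is tiny, which is why the approach needs genome-wide results rather than a handful of top hits.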

Genome-wide association studies have been hugely successful in identifying dozens of common genetic risk factors for a large number of common diseases. One area where GWAS has not had much success, however, is psychiatric illness, where finding common risk factors that replicate across studies has been consistently difficult. It looks like this is starting to change. The current issue of Nature Genetics has two papers from the Psychiatric GWAS Consortium, detailing some of the largest meta-analyses of schizophrenia and bipolar disease ever published.

The schizophrenia study robustly replicated two previously implicated variants, and discovered five new ones, and the bipolar disease study replicated one and discovered a new one. The new variants give us some pretty startling insights into the genetics of the diseases, in particular revealing the importance of a non-coding gene micro-RNA 137 in regulating a wide range of genes expressed in neurons. As always, these variants explain only a small proportion of the total genetic effect, but they show that psychiatric genetics has now truly entered the GWAS arena, with all the scientific benefits that this can bring to medical research. [LJ]

The images above, in order, are taken from the paper Temporal Trends in Results Availability from Genome-Wide Association Studies, and from Wikimedia Commons.

How do variants outside genes influence disease risk?

Over the last several years, the number of genetic variants unambiguously associated with disease risk has grown dramatically. However, interpreting these signals has been extremely difficult—most of the identified variants do not disrupt genes, and indeed many don’t fall anywhere near genes (this observation has even led some to discount these signals entirely). To an investigator interested in following up on these signals, this is somewhat depressing: how can we hope to explore how polymorphisms affect disease risk if they don’t seem to fall in any sort of genome annotation that we understand?

In this context, I thought I’d point to an important paper that, among many other things, gives the first systematic evidence that variants which influence disease are not just randomly scattered across the genome, but instead tend to fall in particular regions—in particular, enhancer elements (regions where DNA-binding proteins interact with DNA to influence gene expression).

The authors rely on the fact that, in the cell, DNA is wrapped around proteins called histones, which control how accessible the DNA is to things like transcription factors (see above figure). These proteins can be chemically modified, and it is now clear that particular patterns of modifications are predictive of the function of the DNA in the region—some modifications indicate transcribed genes, others regions of enhancer activity, others repressed regions, etc.

What the authors did in this study was generate genome-wide maps of several histone modifications in nine different cell types, and use this data to predict the function of each 200 base pair segment of the human genome in each cell type. There are a number of interesting analyses of these “maps” of genome function in the paper, but for our purposes here there’s one of particular interest: the authors took sets of SNPs associated with various diseases and simply asked, are these variants enriched in regions with any particular functional prediction? And indeed, for several phenotypes, there is a striking enrichment of association signals in enhancer elements in a relevant cell type. For example, SNPs which influence lipid levels are enriched in enhancers in a liver cancer cell line, and SNPs which influence the autoimmune disease lupus are enriched in enhancers in a lymphoblastoid cell line.
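
The enrichment question can be made concrete with a toy calculation. The sketch below is not the authors' procedure: the genome, the enhancer segments and the SNP positions are all invented, the function names are hypothetical, and the binomial test treats SNPs as independent, which ignores linkage disequilibrium. It simply asks whether more SNPs land in enhancer segments than the enhancer share of the genome would predict.

```python
from math import comb

SEGMENT_BP = 200  # segment size used for the functional predictions

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def enrichment(snp_positions, enhancer_segments, total_segments):
    """Fold enrichment and one-sided binomial p-value for SNPs in enhancer segments."""
    # Map each SNP position to the index of the 200 bp segment containing it.
    hits = sum(pos // SEGMENT_BP in enhancer_segments for pos in snp_positions)
    background = len(enhancer_segments) / total_segments   # share of genome that is enhancer
    observed = hits / len(snp_positions)                   # share of SNPs in enhancers
    p_value = binomial_upper_tail(hits, len(snp_positions), background)
    return observed / background, p_value

# Hypothetical toy genome: 1,000 segments, 50 of them (5%) predicted to be enhancers.
enhancers = set(range(100, 150))
snps = [20_050, 20_410, 21_999, 25_000, 29_900, 60_000, 90_123, 150_000, 3, 199_999]

fold, p_value = enrichment(snps, enhancers, total_segments=1000)
print(f"fold enrichment: {fold:.1f}, p = {p_value:.2g}")
```

In this made-up example half of the ten SNPs fall in the 5% of the genome labelled as enhancer, a ten-fold enrichment. A real analysis would repeat this per cell type and per functional class, and would need a background that accounts for where genotyped SNPs can fall in the first place.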

As these types of functional maps are generated in more cell types, I imagine there will be more stories like this. The problem with interpreting disease association studies, it seems likely, is largely due to our lack of understanding of genome function.

---
Citation: Ernst et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. doi:10.1038/nature09906

Analysing your own genome, bloggers respond to the FDA and more reporting on bogus GWAS results

Razib Khan, better known for his detailed low-downs of population biology and history, has written an important post on Gene Expression, explaining in careful detail exactly how to run some simple population genetic analysis on public genomes, as well as on your own personal genomics data. The outcome of the tutorial is an ADMIXTURE plot (like the one to the left), showing what proportion of your genome comes from different ancestral populations. This sort of analysis is not difficult, but it can often be hard to know how to start, so Razib’s post gives a good landing point for people who want to dig deeper into their own genomes.

This tutorial also ties in to some political ideas that Razib has been talking about since the recent call to allow access to genomic information only via prescription. If you are worried about losing access to your genome, one option is to ensure that you do not require companies to generate and interpret your genome. As sequencing, genotyping and computing prices fall, DIY genetics becomes more and more plausible. Learn to discover things about your own genome, and no-one will be able to take that away from you. [LJ]

Continue reading ‘Analysing your own genome, bloggers respond to the FDA and more reporting on bogus GWAS results’

Are synthetic associations a man-made phenomenon?

Early last year David Goldstein and colleagues published a provocative paper claiming that many GWAS associations are driven not by common variants of modest effect (the canonical common disease – common variant hypothesis underpinning GWAS) but instead by a local cluster of lower frequency variants that have much bigger effects on disease risk. They dubbed this hypothesized phenomenon “synthetic association” and the term quickly became a genetics buzzword. The paper was widely discussed in both the specialist and mainstream media, and caused quite a stir among academic statistical geneticists.

That debate has been re-opened today by a set of Perspectives in PLoS Biology: a rebuttal by us (Carl & Jeff) and our colleagues at Sanger, a rebuttal by Naomi Wray, Shaun Purcell and Peter Visscher, a rebuttal to the rebuttals by David Goldstein and an editorial by Robert Shields to tie it all together.

Continue reading ‘Are synthetic associations a man-made phenomenon?’

From GWAS to pathways, the consequences of DTC genetics and screening by sequencing

A paper out in PLoS Genetics this week takes a step towards using genome-wide association data to reconstruct functional pathways. Using protein-protein interaction data and tissue-specific expression data, the authors reconstruct biochemical pathways that underlie various diseases, by looking for variants that interact with genes in GWAS regions. These networks can then tell us about what systems are disrupted by GWAS variants as a whole, as well as identifying potential drug targets. The figure to the right shows the network constructed for Crohn’s disease; large colored circles are genes in GWAS loci, small grey circles are other genes in the network they constructed. As an interesting side note, the GWAS variants were taken from a 2008 study; since then, we have published a new meta-analysis, which implicated a lot of new regions. 10 genes in these regions, marked as small red circles on the figure, were also in the disease network. [LJ]

23andMe customers will be interested in a neat little Firefox plug-in that allows them to view their own genotypes for any 23andMe SNP mentioned on a web page. You can download the plug-in here (you’ll need to have an up-to-date version of Firefox), and I have a brief review of the tool here. [DM]
Continue reading ‘From GWAS to pathways, the consequences of DTC genetics and screening by sequencing’

Estimating heritability using twins

Last week, a post went up on the Bioscience Resource Project blog entitled The Great DNA Data Deficit. This is another in a long string of “Death of GWAS” posts that have appeared over the last year. The authors claim that because GWAS has failed to identify many “major disease genes”, i.e. high frequency variants with large effect on disease, it was therefore not worthwhile; this is all old stuff, that I have discussed elsewhere (see also my “Standard GWAS Disclaimer” below). In this case, the authors argue that the genetic contribution to complex disease has been massively overestimated, and in fact genetics does not play as large a part in disease as we believe.

The one particularly new thing about this article is that they actually look at the foundation for beliefs about missing heritability; the twin studies of identical and non-identical twins from which we get our estimates of the heritability of disease. I approve of this: I think all those who are interested in the genetics of disease should be fluent in the methodology of twin studies. However, in this case, the authors come to the rather odd conclusion that heritability measures are largely useless, based on a small statistical misunderstanding of how such studies are done.

I thought I would use this opportunity to explain, in relative detail, where we get our estimates of heritability from, why they are generally well-measured and robust, and what real issues need to be considered when interpreting twin study results. This post is going to contain a little bit of maths, but don’t worry if it scares you a little; you only really need to get the gist.
Continue reading ‘Estimating heritability using twins’
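
As a taste of the kind of maths involved, the classical back-of-the-envelope estimate from twin data is Falconer's formula, which compares how similar identical (MZ) and non-identical (DZ) twin pairs are. The sketch below is the textbook simplification for a quantitative trait, not necessarily the method used in the full post; the twin correlations are invented, and for disease traits the correlations are usually estimated on an underlying liability scale first.

```python
def falconer(r_mz, r_dz):
    """Split trait variance into additive genetic (h2), shared environment (c2)
    and unique environment (e2) from MZ and DZ twin correlations."""
    h2 = 2 * (r_mz - r_dz)   # MZ twins share twice as much additive genetic variance as DZ twins
    c2 = 2 * r_dz - r_mz     # similarity not explained by genetics
    e2 = 1 - r_mz            # whatever makes identical twins differ
    return h2, c2, e2

# Hypothetical twin correlations, for illustration only.
h2, c2, e2 = falconer(r_mz=0.8, r_dz=0.5)
print(f"h2 = {h2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

The three components sum to one by construction. The estimate leans on assumptions such as MZ and DZ pairs sharing their environments to the same degree.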

Friday Links

A quick note about the Reader Survey; we are going to stop taking responses at the end of Saturday (Pacific Time). If you haven’t already done so, please fill out the survey now.

A couple of interesting articles this week on the Personal Genome Project and public genomics in general. Mark Henderson at the Times has an opinion piece (behind a paywall, I’m afraid) about Misha Angrist‘s book Here Is A Human Being (see also this review from The Intersection), and in the Duke Magazine Mary Carmichael has an in-depth feature on the work of George Church, with some interesting history of the early days of the PGP.

One aspect that comes out of these articles is how those who take part in public genomics projects are starting to own the unknown unknowns. They accept that they cannot anticipate all the risks of making their data public, but are willing to take the risk of exposing themselves to these unknown risks, and in doing so turn them into knowns. Another aspect is the sheer number of individuals who want to sign up to have their data published online: 15,000 people have expressed interest in being part of the PGP, despite initial NIH concerns that no-one would want to take part at all. This also chimes with research presented at ASHG this year, showing that members of the public are more concerned with contributing to scientific knowledge, and, crucially, getting access to their own genetic data than they are about the potential risks that such data could expose them to. [LJ]

Continue reading ‘Friday Links’

Friday Links

At the risk of turning Friday Links into a self-trumpet-blowing occasion, we are happy to report that a number of GNZ contributors (Jeff, Carl and Luke) are authors on a new Crohn’s disease GWAS meta-analysis of 6000 patients that came out in Nature Genetics this week. The study brings the number of Crohn’s associations up to 71, with 30 novel, bringing the proportion of heritability explained up to about 24%; it is also worth noting that all of the associations from the previous meta-analysis were replicated in this one, showing how the cross-platform independent replication experiments that are now standard have largely obliterated false positives in GWAS. There were also 5 loci that showed evidence of a second, independent signal, which I think is a promising sign of things to come.

Continue reading ‘Friday Links’

Friday Links

The largest genome-wide association study ever undertaken was published in Nature this week. The appropriately named Genetic Investigation of ANthropometric Traits (GIANT) consortium combined data from 183,727 individuals and identified around 180 loci influencing human height. The loci were enriched with genes underlying skeletal growth and other relevant biological pathways. Interestingly, these 180 loci are estimated to only account for 10% of the phenotypic variation in height (or around 12.5% of the heritability). [CAA]
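
The two percentages quoted above are linked by the heritability of height: the share of heritability explained is the share of phenotypic variance explained divided by the heritability. The short sketch below just works backwards from the two quoted figures; the implied heritability is an inference from those numbers, not something stated in the excerpt.

```python
variance_explained = 0.10        # share of phenotypic variance (figure quoted above)
share_of_heritability = 0.125    # share of heritability (figure quoted above)

# Heritability implied by the two figures; an inference, not a value given in the post.
implied_heritability = variance_explained / share_of_heritability
print(f"implied heritability of height: {implied_heritability:.2f}")
```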

Christophe Lambert from Golden Helix has an excellent, thorough post looking at the importance of careful experimental design in large-scale genetic association studies. In particular, Lambert focuses on the need for randomising samples across experimental batches: if you have some batches containing entirely cases and others entirely controls, then the all-too-pervasive spectre of batch effects can easily create false positive associations. In many cases batch effects can be recognised and corrected for post hoc (Lambert cites a good example from the original WTCCC study), but in other cases a failure to perform the right quality controls can have devastating consequences (Lambert cites the recent longevity GWAS paper in Science). I’d be interested to hear from my more GWAS-savvy colleagues (Carl, Jeff) whether randomisation is standard procedure in most large GWAS now. [DM]
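
To make the randomisation point concrete, here is a minimal sketch of one common approach, stratified random assignment: shuffle cases and controls separately, then deal them out across batches so that every batch gets a similar mix. The function name, sample IDs and batch count are all hypothetical, and this is not necessarily the exact scheme Lambert recommends.

```python
import random
from collections import Counter

def assign_batches(samples, n_batches, seed=0):
    """Deal (sample_id, status) pairs into batches, balancing status across batches."""
    rng = random.Random(seed)
    batches = [[] for _ in range(n_batches)]
    slot = 0
    for status in sorted({status for _, status in samples}):
        group = [sample for sample in samples if sample[1] == status]
        rng.shuffle(group)                      # random order within cases / controls
        for sample in group:
            batches[slot % n_batches].append(sample)   # round-robin across batches
            slot += 1
    return batches

# Hypothetical study: 12 cases and 12 controls to be run in 4 batches.
samples = ([(f"case_{i}", "case") for i in range(12)]
           + [(f"control_{i}", "control") for i in range(12)])
batches = assign_batches(samples, n_batches=4)

for number, batch in enumerate(batches, start=1):
    print(number, dict(Counter(status for _, status in batch)))
```

Because case/control status is spread evenly over batches, any batch artefact hits both groups alike instead of masquerading as a disease association. The same idea extends to other variables worth balancing, such as collection site or DNA extraction date.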

We managed to miss this out last week, but the current issue of Nature Genetics has a strange and wonderful paper on breast cancer genetics. The study looked at 2838 individuals with BRCA1 mutations that strongly predispose to breast cancer, and looked for non-BRCA1 variants associated with breast cancer in this group. They found an associated variant on chromosome 19, and replicated it in another 5986 BRCA1 carriers (where do they find this many BRCA1 carriers?). To top it all off, they looked at this variant in another 6800 breast cancer patients without BRCA1 mutations, and found no association. However, when they stratified their samples into ER+ and ER- cancers, they found associations in both, but going in opposite directions! The variant predisposes people to ER- cancer, but is protective against ER+, and taken together they pretty much perfectly balance out. [LJ]

