Tag Archive for 'GWAS'

The undiscovered chromosome

This guest post was contributed by Taru Tukiainen, a postdoctoral research fellow in the Analytic and Translational Research Unit at Massachusetts General Hospital and the Broad Institute of MIT and Harvard.

The X chromosome contains around 5% of the DNA in the human genome, but has remained largely unexplored in genome-wide association studies (GWAS) – to date, roughly two-thirds of GWAS have thrown the X-chromosomal data out of their analyses. In a paper published in PLOS Genetics yesterday we dig into X chromosome associations and demonstrate why this stretch of DNA warrants particular attention in genetic association and sequencing studies. This post will focus on one of our key results: the possibility that some of the X chromosome loci contribute to sexual dimorphism, i.e. biological differences between men and women.
Continue reading ‘The undiscovered chromosome’

Looking closer at natural selection in inflammatory bowel disease

As I mentioned a few weeks ago, we recently published a large study into the genetics of inflammatory bowel disease (IBD), which included a number of analyses digging into the biology and evolutionary history of IBD genetic risk. Gratifyingly, our paper has stimulated a lot of discussion among other scientists, which has generated several ideas about future directions for this work. One question that was raised by several population-genetics experts at ASHG was about our natural selection analysis, and in particular our claim to have discovered an enrichment of balancing selection in IBD loci. In the paper, we found clear signals of natural selection on IBD loci, a subset of which we interpreted as balancing selection. In this post I will set out how I came to this conclusion, but then outline an alternative explanation for the results: recent local positive selection in Europeans.

Continue reading ‘Looking closer at natural selection in inflammatory bowel disease’

Dozens of new IBD genes, but can they predict disease?

Out in Nature this week is a paper by three Genomes Unzipped authors reporting 71 new genetic associations with inflammatory bowel disease (IBD). This breaks the record for the largest number of associations for any common disease, and includes many new and interesting biological insights that you should all go and read about in the paper itself (pay-to-access I’m afraid) or on the Sanger Institute’s website.

One thing that we did not discuss in the paper was genetic prediction of IBD (i.e. using the risk variants we have discovered to predict who will or will not develop the disease). In this post I want to outline some of the situations in which we have considered using genetic risk prediction of IBD, and discuss whether any of them would actually work in practice.

Continue reading ‘Dozens of new IBD genes, but can they predict disease?’

Size matters, and other lessons from medical genetics

Size really matters: prior to the era of large genome-wide association studies, the large effect sizes reported in small initial genetic studies often dwindled towards zero (that is, an odds ratio of one) as more samples were studied. Adapted from Ioannidis et al., Nat Genet 29:306-309.

[Last week, Ed Yong at Not Exactly Rocket Science covered a paper positing an association between a genetic variant and an aspect of social behavior called prosociality. On Twitter, Daniel and Joe dismissed this study out of hand due to its small sample size (n = 23), leading Ed to update his post. Daniel and Joe were then contacted by Alex Kogan, the first author of the study in question. He kindly shared his data with us, and agreed to an exchange here on Genomes Unzipped. In this post, we expand on our point about the importance of sample size; Alex’s reply is here.

Edit 01/12/11 (DM): The original version of this post included language that could have been interpreted as an overly broad attack on more serious, well-powered studies in psychiatric disease genetics. I've edited the post to reduce the possibility of collateral damage. To be clear: we're against over-interpretation of results from small studies, not behavioral genetics as a whole, and I apologise for any unintended conflation of the two.]

In October of 1992, genetics researchers published a potentially groundbreaking finding in Nature: a genetic variant in the angiotensin-converting enzyme ACE appeared to modify an individual’s risk of having a heart attack. This finding was notable at the time for the size of the study, which involved a total of over 500 individuals from four cohorts, and the effect size of the identified variant–in a population initially identified as low-risk for heart attack, the variant had an odds ratio of over 3 (with a corresponding p-value less than 0.0001).

Readers familiar with the history of medical association studies will be unsurprised by what happened over the next few years: initial excitement (this same polymorphism was associated with diabetes! And longevity!) was followed by inconclusive replication studies and, ultimately, disappointment. In 2000, eight years after the initial report, a large study involving over 5,000 cases and controls found absolutely no detectable effect of the ACE polymorphism on heart attack risk. In the meantime, the same polymorphism had turned up in dozens of other association studies for traits ranging from obstetric cholestasis to meningococcal disease in children, virtually none of which have ever been convincingly replicated.
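
To make the sample-size point concrete, here is a minimal sketch of how an odds ratio and its approximate 95% confidence interval are computed from a 2x2 table of cases and controls, using the standard log-odds (Woolf) approximation. The counts are entirely hypothetical and are not taken from the ACE studies; the function name `odds_ratio_ci` is ours, for illustration only.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's log method) for a 2x2 table."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR): sqrt of the summed reciprocal cell counts
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers +
                          1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: the same carrier proportions at two study sizes
small_study = odds_ratio_ci(15, 10, 8, 16)          # 49 people in total
large_study = odds_ratio_ci(1500, 1000, 800, 1600)  # 4,900 people in total

for label, (or_, lo, hi) in [("small", small_study), ("large", large_study)]:
    print(f"{label}: OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Both tables give the same point estimate, but the interval for the small study is wide enough to include an odds ratio of one, whereas the large study pins the estimate down far more tightly. This only illustrates estimation noise; it does not model publication bias or the "winner's curse", which also inflate early reports.
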
Continue reading ‘Size matters, and other lessons from medical genetics’

Friday Links: Studying association studies, and success at last in psychiatric genetics

In PLoS Genetics this week there is a viewpoint article on data sharing in disease genetics. The authors systematically looked at 643 genome-wide association studies published between 2002 and 2010, to see how easily available the results of the studies are now. They found that the availability of full study results has gone down over time, and many groups that do share data have put more restrictions in place on its use. They put this down to fears over the privacy of research subjects, and in particular to the Homer et al. study. The Homer et al. result is somewhat complicated, but in essence it says that if you have stolen someone’s genotype data, you can use it to figure out if they have participated in any given research study by looking at the full results of the study.

It certainly seems possible that worries about privacy are reducing the free flow of information within the research community. However, whether on balance the decrease in information flow is worth the increase in security is an open question. In my own view, having the genome-wide results of genome-wide association studies freely available is very important to the field, and is more important than the rather esoteric risk of someone stealing someone’s DNA and using it to figure out that they once took part in a research study of inflammatory bowel disease. [LJ]

Genome-wide association studies have been hugely successful in identifying dozens of common genetic risk factors for a large number of common diseases. One area where GWAS has not had much success, however, is psychiatric illness, where finding common risk factors that replicate across studies has been consistently difficult. It looks like this is starting to change. The current issue of Nature Genetics has two papers from the Psychiatric GWAS Consortium, detailing some of the largest meta-analyses of schizophrenia and bipolar disease ever published.

The schizophrenia study robustly replicated two previously implicated variants, and discovered five new ones, and the bipolar disease study replicated one and discovered a new one. The new variants give us some pretty startling insights into the genetics of the diseases, in particular revealing the importance of a non-coding gene micro-RNA 137 in regulating a wide range of genes expressed in neurons. As always, these variants explain only a small proportion of the total genetic effect, but they show that psychiatric genetics has now truly entered the GWAS arena, with all the scientific benefits that this can bring to medical research. [LJ]

The images above, in order, are taken from the paper Temporal Trends in Results Availability from Genome-Wide Association Studies, and from Wikimedia Commons.

How do variants outside genes influence disease risk?

Over the last several years, the number of genetic variants unambiguously associated with disease risk has grown dramatically. However, interpreting these signals has been extremely difficult—most of the identified variants do not disrupt genes, and indeed many don’t fall anywhere near genes (this observation has even led some to discount these signals entirely). To an investigator interested in following up on these signals, this is somewhat depressing: how can we hope to explore how polymorphisms affect disease risk if they don’t seem to fall in any sort of genome annotation that we understand?

In this context, I thought I’d point to an important paper that, among many other things, gives the first systematic evidence that variants which influence disease are not just randomly scattered across the genome, but instead tend to fall in particular regions—in particular, enhancer elements (regions where DNA-binding proteins interact with DNA to influence gene expression).

The authors rely on the fact that, in the cell, DNA is wrapped around proteins called histones, which control how accessible the DNA is to things like transcription factors (see above figure). These proteins can be chemically modified, and it is now clear that particular patterns of modifications are predictive of the function of the DNA in the region—some modifications indicate transcribed genes, others regions of enhancer activity, others repressed regions, etc.

What the authors did in this study was generate genome-wide maps of several histone modifications in nine different cell types, and use this data to predict the function of each 200 base pair segment of the human genome in each cell type. There are a number of interesting analyses of these “maps” of genome function in the paper, but for our purposes here there’s one of particular interest: the authors took sets of SNPs associated with various diseases and simply asked, are these variants enriched in regions with any particular functional prediction? And indeed, for several phenotypes, there is a striking enrichment of association signals in enhancer elements in a relevant cell type. For example, SNPs which influence lipid levels are enriched in enhancers in a liver cancer cell line, and SNPs which influence the autoimmune disease lupus are enriched in enhancers in a lymphoblastoid cell line.
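
For readers who want to see what an enrichment question of this kind looks like numerically, here is a minimal sketch using a one-sided hypergeometric test. This is a generic textbook approach and not necessarily the procedure used in the paper; the function name and all of the counts are hypothetical. A real analysis would also need to deal with linkage disequilibrium between SNPs and with matching the background set, which this sketch ignores.

```python
from math import comb

def hypergeom_enrichment_p(total_snps, snps_in_enhancers,
                           disease_snps, disease_snps_in_enhancers):
    """One-sided P(X >= observed overlap) under random sampling without replacement."""
    all_draws = comb(total_snps, disease_snps)
    max_overlap = min(disease_snps, snps_in_enhancers)
    favourable = sum(
        comb(snps_in_enhancers, k) *
        comb(total_snps - snps_in_enhancers, disease_snps - k)
        for k in range(disease_snps_in_enhancers, max_overlap + 1)
    )
    return favourable / all_draws

# Hypothetical numbers: 5% of background SNPs fall in enhancers of some cell
# type, but 10 of 50 disease-associated SNPs (20%) do.
total, in_enh, assoc, assoc_in_enh = 10_000, 500, 50, 10

fold = (assoc_in_enh / assoc) / (in_enh / total)
p_value = hypergeom_enrichment_p(total, in_enh, assoc, assoc_in_enh)
print(f"fold enrichment = {fold:.1f}, one-sided p = {p_value:.2e}")
```

Running the same test once per cell type and per annotation class, with an appropriate multiple-testing correction, is one way to ask which cell types look relevant for a given set of association signals.
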

As these types of functional maps are generated in more cell types, I imagine there will be more stories like this. The problem with interpreting disease association studies, it seems likely, is largely due to our lack of understanding of genome function.

Citation: Ernst et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. doi:10.1038/nature09906

Analysing your own genome, bloggers respond to the FDA and more reporting on bogus GWAS results

Razib Khan, better known for his detailed low-downs of population biology and history, has written an important post on Gene Expression, explaining in careful detail exactly how to run some simple population genetic analyses on public genomes, as well as on your own personal genomics data. The outcome of the tutorial is an ADMIXTURE plot (like the one to the left), showing what proportion of your genome comes from different ancestral populations. This sort of analysis is not difficult, but it can often be hard to know how to start, so Razib’s post gives a good landing point for people who want to dig deeper into their own genomes.

This tutorial also ties in to some political ideas that Razib has been talking about since the recent call to allow access to genomic information only via prescription. If you are worried about losing access to your genome, one option is to ensure that you do not require companies to generate and interpret your genome. As sequencing, genotyping and computing prices fall, DIY genetics becomes more and more plausible. Learn to discover things about your own genome, and no-one will be able to take that away from you. [LJ]

Continue reading ‘Analysing your own genome, bloggers respond to the FDA and more reporting on bogus GWAS results’

Are synthetic associations a man-made phenomenon?

Early last year David Goldstein and colleagues published a provocative paper claiming that many GWAS associations are driven not by common variants of modest effect (the canonical common disease – common variant hypothesis underpinning GWAS) but instead by a local cluster of lower-frequency variants that have much bigger effects on disease risk. They dubbed this hypothesized phenomenon “synthetic association” and the term quickly became a genetics buzzword. The paper was widely discussed in both the specialist and mainstream media, and caused quite a stir among academic statistical geneticists.

That debate has been re-opened today by a set of Perspectives in PLoS Biology: a rebuttal by us (Carl & Jeff) and our colleagues at Sanger, a rebuttal by Naomi Wray, Shaun Purcell and Peter Visscher, a rebuttal to the rebuttals by David Goldstein and an editorial by Robert Shields to tie it all together.

Continue reading ‘Are synthetic associations a man-made phenomenon?’

From GWAS to pathways, the consequences of DTC genetics and screening by sequencing

A paper out in PLoS Genetics this week takes a step towards using genome-wide association data to reconstruct functional pathways. Using protein-protein interaction data and tissue-specific expression data, the authors reconstruct biochemical pathways that underlie various diseases, by looking for variants that interact with genes in GWAS regions. These networks can then tell us about what systems are disrupted by GWAS variants as a whole, as well as identifying potential drug targets. The figure to the right shows the network constructed for Crohn’s disease; large colored circles are genes in GWAS loci, small grey circles are other genes in the network they constructed. As an interesting side note, the GWAS variants were taken from a 2008 study; since then, we have published a new meta-analysis, which implicated a lot of new regions. 10 genes in these regions, marked as small red circles on the figure, were also in the disease network. [LJ]

23andMe customers will be interested in a neat little Firefox plug-in that allows them to view their own genotypes for any 23andMe SNP mentioned on a web page. You can download the plug-in here (you’ll need to have an up-to-date version of Firefox), and I have a brief review of the tool here. [DM]
Continue reading ‘From GWAS to pathways, the consequences of DTC genetics and screening by sequencing’

Estimating heritability using twins

Last week, a post went up on the Bioscience Resource Project blog entitled The Great DNA Data Deficit. This is another in a long string of “Death of GWAS” posts that have appeared over the last year. The authors claim that because GWAS has failed to identify many “major disease genes”, i.e. high frequency variants with large effect on disease, it was therefore not worthwhile; this is all old stuff that I have discussed elsewhere (see also my “Standard GWAS Disclaimer” below). In this case, the authors argue that the genetic contribution to complex disease has been massively overestimated, and in fact genetics does not play as large a part in disease as we believe.

The one particularly new thing about this article is that they actually look at the foundation for beliefs about missing heritability; the twin studies of identical and non-identical twins from which we get our estimates of the heritability of disease. I approve of this: I think all those who are interested in the genetics of disease should be fluent in the methodology of twin studies. However, in this case, the authors come to the rather odd conclusion that heritability measures are largely useless, based on a small statistical misunderstanding of how such studies are done.

I thought I would use this opportunity to explain, in relative detail, where we get our estimates of heritability from, why they are generally well-measured and robust, and what real issues need to be considered when interpreting twin study results. This post is going to contain a little bit of maths, but don’t worry if it scares you a little; you only really need to get the gist.
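
As a taste of the kind of maths involved, here is a minimal sketch of the classical Falconer estimator, which compares trait correlations in identical (MZ) and non-identical (DZ) twin pairs. It is one standard textbook approach and may not be the exact model used in the full post; the function name and the input correlations are hypothetical. It assumes purely additive genetic effects and equally shared environments for both kinds of twins.

```python
def falconer_ace(r_mz, r_dz):
    """Split trait variance into additive genetic (h2), shared environment (c2)
    and unique environment (e2) components from twin correlations."""
    h2 = 2.0 * (r_mz - r_dz)  # MZ twins share twice the additive genetics of DZ twins
    c2 = 2.0 * r_dz - r_mz    # similarity not accounted for by genetics
    e2 = 1.0 - r_mz           # whatever makes identical twins differ
    return h2, c2, e2

# Hypothetical correlations for some trait
h2, c2, e2 = falconer_ace(r_mz=0.8, r_dz=0.5)
print(f"h2 = {h2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

The three components always sum to one by construction. For disease (yes/no) traits the correlations are usually estimated on an underlying liability scale rather than taken directly from the observed concordance, which this sketch does not attempt.
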
Continue reading ‘Estimating heritability using twins’
