Archive for the 'Journal Club' Category


Report on clinical genome sequencing

The PHG Foundation, an independent genomics think-tank, has launched a new report on next generation sequencing and its impact on health and health systems. The Report, Next steps in the sequence: the implications of whole genome sequencing for health in the UK, can be freely downloaded and aims to provide a comprehensive overview of the many and varied issues relating to clinical genome sequencing.

When planning the work, we were motivated by the astonishingly rapid development of fast, affordable whole genome sequencing (WGS) technologies, which are set to change many aspects of health care. The sheer quantity and complexity of the information generated by genome sequencing, along with ever-changing understanding of the function of genomes in health and disease, present new challenges for health systems.

The Report reviews the technologies, informatics pipeline and key clinical applications of WGS, as well as the economic, ethical, legal and social implications and organisational challenges of offering WGS within the UK NHS. The final two policy chapters outline different scenarios for testing, storing and returning results, and contain 10 key recommendations reached with the help of several expert stakeholder workshops.


Revisiting RNA-DNA sequence differences

A few months ago, I discussed a paper by Li and colleagues reporting a large number of sequence differences between mRNA and DNA from the same individual [1]. While some such differences are expected due to known mechanisms of RNA editing (e.g. A->I editing, see [2]), Li et al. reported an astonishingly high number of them, including thousands of events inconsistent with any known regulatory mechanism. These results implied at least one, and probably many, new mechanisms of gene regulation, and called into question some basic assumptions in molecular biology.

An alternative explanation for the observations of Li et al. is less exciting: imagine two genes with similar (but not identical) sequences, which produce similar (but not identical) mRNAs. If you accidentally attributed both mRNA sequences to the same gene, you could erroneously conclude that one of the two sequences arose via RNA editing of the other. According to a new paper by Schrider and colleagues [3], this banal artifact accounts for the majority of the reported RNA-DNA sequence differences in Li et al.
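To make the artifact concrete, here is a minimal toy sketch in Python. The two sequences are invented for illustration (they are not real genes), and the helper `mismatches` is a hypothetical name, not something from either paper. It only shows how a read copied faithfully from one paralog looks "edited" when compared against the other.

```python
# Toy illustration only: both sequences are made up, not real genes.
GENE_A = "ATGGCCTTAGACCGT"  # hypothetical gene
GENE_B = "ATGGCCTTGGACCGT"  # hypothetical paralog; differs from GENE_A at index 8


def mismatches(read, reference):
    """Return (position, reference_base, read_base) for every mismatch."""
    return [
        (i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


# An mRNA read transcribed faithfully (no editing) from GENE_B.
read_from_b = GENE_B

# Wrongly attributed to GENE_A, the read appears to carry an A-to-G "edit".
apparent_edits = mismatches(read_from_b, GENE_A)

# Attributed to its true source, there is no RNA-DNA difference at all.
true_edits = mismatches(read_from_b, GENE_B)

print(apparent_edits)
print(true_edits)
```

The apparent difference disappears as soon as the read is assigned to the gene it actually came from, which is the whole point of the alternative explanation.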

Schrider et al. show that RNA-DNA mismatches are enriched in genes with close paralogs or copy number variants, both of which are consistent with the technical artifact mentioned above. However, their most striking result is that, at many of the putative RNA editing sites, the “edited” base from the mRNA is actually present in genomic DNA. To show this, Schrider et al. took advantage of the fact that low-coverage DNA sequencing data is available for the individuals used in the Li et al. study. They searched through these data to find genomic sequences matching the “edited” mRNA form. If these sites were truly due to RNA editing, they shouldn’t find any. Instead, at ~75% of the tested sites, they could find a genomic match to the “edit” in at least one individual. There are some potential complications with the interpretation of this number (as they note, the genomic data could include sequencing errors that happen to be the same base as the “edit”), but this observation strongly suggests that a majority of the sites identified by Li et al. are false positives due to this single technical issue.
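The logic of that check can be sketched in a few lines of Python. This is not Schrider et al.'s actual pipeline: the data layout, the function name `fraction_explained_by_dna` and the `min_reads` threshold are all assumptions made for illustration, and the example sites are invented.

```python
def fraction_explained_by_dna(sites, min_reads=1):
    """Fraction of putative edit sites whose 'edited' base is seen in the
    genomic DNA reads of at least one individual.

    `sites` is a list of dicts (a hypothetical layout) with:
      - "edited_base": the base observed in the mRNA
      - "dna_reads": {individual: string of bases seen in DNA reads at the site}
    `min_reads` is the number of supporting DNA reads required per individual;
    raising it is one crude guard against matching sequencing errors.
    """
    explained = 0
    for site in sites:
        edited = site["edited_base"]
        if any(bases.count(edited) >= min_reads
               for bases in site["dna_reads"].values()):
            explained += 1
    return explained / len(sites)


# Invented example data: four putative edit sites, two individuals each.
toy_sites = [
    {"edited_base": "G", "dna_reads": {"ind1": "AAAG", "ind2": "AAA"}},
    {"edited_base": "C", "dna_reads": {"ind1": "TTT", "ind2": "TT"}},
    {"edited_base": "T", "dna_reads": {"ind1": "CC", "ind2": "CTC"}},
    {"edited_base": "A", "dna_reads": {"ind1": "GG", "ind2": "GG"}},
]

print(fraction_explained_by_dna(toy_sites))
print(fraction_explained_by_dna(toy_sites, min_reads=2))
```

Requiring more than one supporting read trades sensitivity for robustness; with low-coverage data a strict threshold would miss many genuine genomic matches, which is presumably why the caveat about sequencing errors matters for interpreting the ~75% figure.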


[1] Li et al. (2011) Widespread RNA and DNA Sequence Differences in the Human Transcriptome. Science. doi: 10.1126/science.1207018

[2] Levanon et al. (2004) Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nature Biotechnology. doi:10.1038/nbt996

[3] Schrider et al. (2011) Very Few RNA and DNA Sequence Differences in the Human Transcriptome. PLoS One. doi:10.1371/journal.pone.0025842

Genetic risk prediction in complex disease

I thought I’d point out a review article in Human Molecular Genetics that just came out in (open access) preprint form by Luke and myself on genetic risk prediction in complex disease. In it we discuss some of the strengths and weaknesses of genetic risk prediction compared to classical epidemiological predictors, different statistical modelling considerations, and the effect of GWAS on prediction. Readers of this space might find the conclusion of some interest, where we consider some of the societal aspects of trying to bring the interpretation of genomes into mainstream medical practice.

Notes on the evidence for extensive RNA editing in humans

UPDATE 3/17/12: A more extensive analysis of the paper discussed in this post is here. Several groups have concluded that at least 90% of the sites identified are technical artifacts.

The “central dogma” of molecular biology holds that the information present in DNA is transferred to RNA and then to protein. In a paper published online at Science yesterday, Li and colleagues report a potentially extraordinary observation: they show evidence that, within any given individual, there are tens of thousands of places where transcribed RNA does not match the template DNA from which it is derived [1]. This phenomenon, called RNA editing, is generally thought to be limited (in humans) to conversions of the base adenosine to the base inosine (which is read as guanine by DNA sequencers), and occasionally from cytosine to uracil. In contrast, these authors report that any type of base can be converted to any other type of base.

If these observations are correct, they represent a fundamental change in how we view the process of gene regulation. However, in this post I am going to point out a couple of technical issues that, if not properly taken into account, have the potential to cause a large number of false positives in this type of data. The main point can be summarized like this: RNA editing involves the production of two different RNA and/or protein sequences from a single DNA sequence. To infer RNA editing from the presence of two different RNA and/or protein sequences, then, one must be very sure that they derive from the same DNA sequence, rather than from two different copies of the DNA (due to, for example, paralogs or copy number variants). This issue has the potential to be a large source of false positives in a study like this; I will also discuss an additional technical problem that could result in false positives.


How do variants outside genes influence disease risk?

Over the last several years, the number of genetic variants unambiguously associated with disease risk has grown dramatically. However, interpreting these signals has been extremely difficult—most of the identified variants do not disrupt genes, and indeed many don’t fall anywhere near genes (this observation has even led some to discount these signals entirely). To an investigator interested in following up on these signals, this is somewhat depressing: how can we hope to explore how polymorphisms affect disease risk if they don’t seem to fall in any sort of genome annotation that we understand?

In this context, I thought I’d point to an important paper that, among many other things, gives the first systematic evidence that variants which influence disease are not just randomly scattered across the genome, but instead tend to fall in particular regions—in particular, enhancer elements (regions where DNA-binding proteins interact with DNA to influence gene expression).

The authors rely on the fact that, in the cell, DNA is wrapped around proteins called histones, which control how accessible the DNA is to things like transcription factors (see above figure). These proteins can be chemically modified, and it is now clear that particular patterns of modifications are predictive of the function of the DNA in the region—some modifications indicate transcribed genes, others regions of enhancer activity, others repressed regions, etc.

What the authors did in this study was generate genome-wide maps of several histone modifications in nine different cell types, and use this data to predict the function of each 200 base pair segment of the human genome in each cell type. There are a number of interesting analyses of these “maps” of genome function in the paper, but for our purposes here there’s one of particular interest: the authors took sets of SNPs associated with various diseases and simply asked, are these variants enriched in regions with any particular functional prediction? And indeed, for several phenotypes, there is a striking enrichment of association signals in enhancer elements in a relevant cell type. For example, SNPs which influence lipid levels are enriched in enhancers in a liver cancer cell line, and SNPs which influence the autoimmune disease lupus are enriched in enhancers in a lymphoblastoid cell line.
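For readers who want a feel for what "enriched" means here, the following is a generic sketch of one common way to frame such a question: a one-sided hypergeometric test on counts of segments. It is not the method used in the paper, the function names are hypothetical, and the counts are invented; a real analysis would also need to deal with complications such as linkage disequilibrium between SNPs.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n segments are drawn without replacement from N
    segments, K of which are annotated as enhancers."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def fold_enrichment(k, K, n, N):
    """Observed fraction of SNPs in enhancers divided by the genome-wide
    fraction of segments that are enhancers."""
    return (k / n) / (K / N)


# Invented counts, for illustration only.
N = 1000  # total genome segments considered
K = 50    # segments predicted to be enhancers in the cell type of interest
n = 20    # disease-associated SNPs (one per segment)
k = 5     # of those, how many fall in enhancer segments

print(fold_enrichment(k, K, n, N))
print(hypergeom_tail(k, K, n, N))
```

A large fold enrichment together with a small tail probability in the relevant cell type, but not in unrelated cell types, is the kind of pattern the post describes for lipid-associated SNPs and liver enhancers.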

As these types of functional maps are generated in more cell types, I imagine there will be more stories like this. The problem with interpreting disease association studies, it seems likely, is largely due to our lack of understanding of genome function.

---
Citation: Ernst et al. (2011) Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. doi:10.1038/nature09906

Are synthetic associations a man-made phenomenon?

Early last year David Goldstein and colleagues published a provocative paper claiming that many GWAS associations are driven not by common variants of modest effect (the canonical common disease – common variant hypothesis underpinning GWAS) but instead by a local cluster of lower-frequency variants that have much bigger effects on disease risk. They dubbed this hypothesized phenomenon “synthetic association” and the term quickly became a genetics buzzword. The paper was widely discussed in both the specialist and mainstream media, and caused quite a stir among academic statistical geneticists.

That debate has been re-opened today by a set of Perspectives in PLoS Biology: a rebuttal by us (Carl & Jeff) and our colleagues at Sanger, a rebuttal by Naomi Wray, Shaun Purcell and Peter Visscher, a rebuttal to the rebuttals by David Goldstein and an editorial by Robert Shields to tie it all together.


Solving Medical Mysteries Using Sequencing

There is a real “wow” paper out in pre-print at the journal Genetics in Medicine. It is a wonderful example of the application of cutting edge sequencing technology to solve a medical mystery. Even better, the authors also include an auxiliary discussion about the medical and ethical issues surrounding the diagnosis, which raises some interesting issues about the transition from research to clinical sequencing.

The Case

A child manifested severe inflammation of the bowel at 15 months; antibiotics failed to clear it up, and he started to lose weight. Standard treatments seemed to have only sporadic effects, and only severe treatment with immunosuppressants, surgery and full bowel clearing could slow down the disease, which is not a long term solution. No cause could be found; the patient’s active immune system seemed to be acting abnormally, but all tests for the known congenital immune deficiencies came back negative. The doctors could try a full bone-marrow transplant, but without knowing what was causing the disease, and where it was localised, they had no way of knowing if such an extreme intervention would be successful.

Such a severe and early onset disease is likely to be genetic, but testing immune genes at random to find the mutation could take years before it turned anything up. Meanwhile, the child was seriously malnourished, and at times required daily wound care under general anaesthetic. A few years ago this might have been the end of the story.


Our favourite papers of 2010

To celebrate the end of the blogging year here at Genomes Unzipped, we wanted to spend a bit of time reminiscing about the papers we enjoyed the most in 2010. Feel free to add your own suggestions in the comments!

Joe: Mice, men, and PRDM9. A key goal in evolutionary biology is to identify the mechanisms leading to speciation. One way to get at that goal is to identify genes that cause sterility or reduced fitness in hybrids between species or diverged populations. In mammals, exactly one such gene has been identified to date: the DNA-binding protein PRDM9. This year, three groups working on a seemingly different problem–deciphering the molecular mechanisms by which recombination shuffles genetic variation between generations–stumbled across an important gene in this process: PRDM9. Variation in this gene influences recombination patterns in both mice and humans, and is responsible for the dramatic differences in recombination patterns between humans and chimpanzees. Is it a simple coincidence that a gene which influences recombination also appears to have a role in speciation? Time will tell.

Parvanov et al. (2010) Prdm9 Controls Activation of Mammalian Recombination Hotspots. Science. DOI: 10.1126/science.1181495.

Baudat et al. (2010). PRDM9 Is a Major Determinant of Meiotic Recombination Hotspots in Humans and Mice. Science. DOI: 10.1126/science.1183439.

Myers et al. (2010). Drive Against Hotspot Motifs in Primates Implicates the PRDM9 Gene in Meiotic Recombination. Science. DOI: 10.1126/science.1182363.

Daniel: Whole-genome sequencing to develop personalised cancer assays. The area of medicine where the transforming power of new DNA sequencing technologies is moving the fastest is in cancer diagnostics and therapy. There were many studies relevant to this field in 2010 (with a fair proportion featuring on the excellent MassGenomics blog), but this paper was a simple, elegant example: the authors performed low-coverage whole-genome sequencing of four tumour samples, identified large genomic rearrangements present in the tumour cells but not in the patients’ healthy tissue, and then designed personalised, quantitative assays measuring the proportion of cells carrying these rearrangements in the patients’ blood. These assays allowed them to track, almost in real time, how the patients’ cancers responded to various therapies.

Leary et al. (2010) Development of personalized tumor biomarkers using massively parallel sequencing. Science Translational Medicine. DOI: 10.1126/scitranslmed.3000702.

The cell is a messy place: understanding alternative splicing with RNA sequencing

Though this site is largely dedicated to discussions of personal genomics, I’d like to use this post to discuss some of my recent work (done with Athma Pai, Yoav Gilad, and Jonathan Pritchard) on mRNA splicing. Our paper, in which we argue that splicing is a relatively error-prone and noisy process, has just been published in PLoS Genetics [1].


Friday Links

At the risk of turning Friday Links into a self-trumpet-blowing occasion, we are happy to report that a number of GNZ contributors (Jeff, Carl and Luke) are authors on a new Crohn’s disease GWAS meta-analysis of 6000 patients that came out in Nature Genetics this week. The study brings the number of Crohn’s associations up to 71, with 30 novel, bringing the proportion of heritability explained up to about 24%. It is also worth noting that all of the associations from the previous meta-analysis were replicated in this one, showing how the cross-platform independent replication experiments that are now standard have largely obliterated false positives in GWAS. There were also 5 loci that showed evidence of a second, independent signal, which I think is a promising sign of things to come.


