The recent announcement that the UK Government has earmarked £100 million to “sequence 100,000 whole genomes of NHS patients at diagnostic quality over the next three to five years” raises a number of questions, with which the Department of Health are no doubt grappling as I write. I’ve previously discussed the thorny issue of using targeted versus whole genome sequencing to maximize diagnostic yield and benefit patients. However, one of the great achievements of next generation sequencing technologies is to make the assay – actually sequencing a genome (or some portion of it) – one of the easier parts of clinical genomics. Although laboratories will have to be suitably equipped, staffed and flexibly managed to deal with high sample throughput and ever-changing scientific specifications, the biggest challenge will be to implement genomic knowledge in the clinic.
The first, most obvious challenge will be data. Lots of data. Decisions will need to be made about format, storage, transfer, management and access. There is little doubt that the computationally heavy work of sequencing, mapping reads and calling variants should only be done once by a suitably equipped provider; raw BAM files are large, cumbersome and of little use to most people. But the much smaller variant (VCF) files could plausibly be stored by the NHS, with a fully auditable, versioned record of the analytical packages used to create them. (What one algorithm today calls a SNP, a different algorithm may tomorrow call an indel, and that’s without the issue of changing reference genomes…) But what next? Assuming these files can be generated, stored and transferred to the relevant place, how are we going to use them to answer clinical questions?
The next challenge is variant filtering and targeting. Individual genomic data contains thousands and thousands of variants with very different properties – ranging from common to rare, and from irrelevant to disastrous – making it very difficult to interpret. Determining the clinical effect of a single variant can be challenging even for just one gene, never mind a whole genome! The only way to get from a whole genome to a diagnosis is to assess the variants based on population prevalence and disease relevance. Both require databases of genetic variation against which to compare and score each variant. But this will only get you so far. The analysis also needs to be targeted at genes or variants of relevance to each clinical question, which requires detailed phenotypic information about the patient.
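To make the two filtering criteria concrete, here is a minimal sketch in Python. It assumes each variant has already been annotated with a population allele frequency and a gene name; the `Variant` class, the `filter_variants` function, the 1% frequency threshold and the gene names are all hypothetical placeholders for illustration, not part of any real pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A single annotated variant (hypothetical, simplified record)."""
    gene: str                 # gene the variant falls in (placeholder names below)
    allele_frequency: float   # frequency in a reference population, 0.0 to 1.0
    consequence: str          # predicted effect, e.g. "missense"


def filter_variants(variants, panel_genes, max_af=0.01):
    """Keep variants that are both rare and in a clinically relevant gene.

    max_af is an assumed cut-off; the right value depends on the disease.
    """
    return [
        v for v in variants
        if v.allele_frequency <= max_af and v.gene in panel_genes
    ]


# Toy data: one common variant, one rare variant in a relevant gene,
# and one rare variant in a gene outside the clinical question.
variants = [
    Variant("GENE_A", 0.25, "missense"),
    Variant("GENE_A", 0.0001, "stop_gained"),
    Variant("GENE_B", 0.0002, "missense"),
]

candidates = filter_variants(variants, panel_genes={"GENE_A"})
for v in candidates:
    print(v.gene, v.allele_frequency, v.consequence)
```

Neither criterion is enough alone: the frequency filter leaves the rare variant in the irrelevant gene, and the gene filter leaves the common variant. Only the combination narrows the list to a single candidate.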
Importantly, the level of targeting possible is directly related to the depth of phenotyping, which will itself be a challenge. Provided with a specific phenotype like ‘ventricular septal defect’, for example, an automated variant filtering pipeline can pick out just a single causal variant; but with a broad clinical description like ‘developmental delay’, the same pipeline could potentially return hundreds of candidate variants. How should these be handled? Training is not the answer here. Instead, clinicians need to provide detailed phenotype information – using a structured vocabulary like the Human Phenotype Ontology – before they receive any test results. Deep phenotyping is even more crucial for facilitating research to tease out new and meaningful genotype-phenotype associations. However, it requires a major culture change in the way medicine is practised, and creating a working system will be a Herculean task.
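The link between phenotype depth and the number of candidates can be sketched in a few lines of Python. Everything here is an assumption for illustration: the phenotype-to-gene mapping is invented, the gene names are placeholders, and the panel sizes are arbitrary. A real system would draw the mapping from a curated resource keyed on structured phenotype terms.

```python
# Hypothetical mapping from a phenotype description to the genes worth
# examining. A specific phenotype maps to a small panel, a broad one to
# a very large panel. Sizes and gene names are invented.
PHENOTYPE_TO_GENES = {
    "ventricular septal defect": {"GENE_1", "GENE_2"},
    "developmental delay": {f"GENE_{i}" for i in range(1, 501)},
}


def candidate_genes(variant_genes, phenotype, mapping=PHENOTYPE_TO_GENES):
    """Return the patient's variant-bearing genes that fall in the panel
    implied by the given phenotype (empty if the phenotype is unknown)."""
    panel = mapping.get(phenotype, set())
    return [gene for gene in variant_genes if gene in panel]


# Genes in which one imaginary patient carries a rare variant.
patient_variant_genes = ["GENE_1", "GENE_37", "GENE_120", "GENE_499", "GENE_900"]

specific = candidate_genes(patient_variant_genes, "ventricular septal defect")
broad = candidate_genes(patient_variant_genes, "developmental delay")

print(len(specific), "candidate(s) for the specific phenotype")
print(len(broad), "candidate(s) for the broad phenotype")
```

The same patient data and the same code yield one candidate under the specific description and four under the broad one; with real genome-scale data the gap would be far larger.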
The final challenge I’m going to outline relates to defining the purpose of this endeavour, as this is key to tackling many of the ethical and legal questions that arise. Is the aim to maximize diagnostic yield, research potential, equality of access or economic benefits? These questions relate to:
- who should be sequenced? (rare intractable conditions vs. diseases with known molecular causes, and individuals vs. families);
- what should be sequenced? (targeted vs. whole genome);
- when should they be sequenced? (first-line diagnostic test vs. last resort); and
- what will the data be used for? (single diagnosis vs. repeated clinically-driven or public health investigations)
Is the aim of this first £100M to pave the way for a new type of medicine? One where genome sequencing is used as a first-line diagnostic test for numerous conditions, where the full list of variants is stored and used time and time again to answer future clinical and public health questions unrelated to the original purpose of testing? Doubtless part of the point of this initiative is to kick-start clinical genome sequencing in the NHS, by creating the necessary informatics architecture and manpower. It will also generate a rich dataset to enable future research and facilitate clinical interpretation. But since this is primarily a clinical initiative, current NHS patients should benefit too, which means ensuring both clinical utility and value for money.