From genetic association to genetic test
We’ll start by defining our genetic test. For the sake of argument, imagine we’ve done a genome-wide association study of a sample of patients with a serious polygenic medical condition called Madeupitis (thanks to Luke for drawing my attention to this important disorder), and a sample of people without the disorder. We’ve found a set of genetic variants associated with Madeupitis, and we’d like to use them as the basis of a predictive genetic test. First we need to characterize the test’s ability to classify people into two groups: those who are at high risk of Madeupitis, and those who are not. We compare our test’s classification results to a diagnostic “gold standard”, in this case, diagnosis by a clinician specialising in Madeupitis. This comparison gives us the sensitivity and specificity of our test.
Sensitivity concerns our study participants who have Madeupitis. It is the proportion of people with the disorder that our test correctly classifies as having Madeupitis. This is the proportion of “true positive” test results. Specificity concerns the people who do not have Madeupitis. It is the proportion of people without the disorder who are correctly classified by our test as not having Madeupitis. This is the proportion of “true negative” results; 1 − specificity therefore gives us the proportion of “false positive” results for the test (the proportion of people without Madeupitis who had a positive test result). If sensitivity and specificity both have a value of 1, the test correctly classifies everyone; in practice all tests have some degree of error, but we would still like these values to be as close to 1 as possible.
Here’s a diagram of how we’d estimate the sensitivity and specificity of our Madeupitis test. In our study we had 100 people with Madeupitis (coloured dark blue) and 100 healthy people (coloured green). When we administered our genetic test, 80 people with Madeupitis had a positive genetic test result (indicated by the red dot), but 40 of the healthy people also had a positive genetic test result. This gives us sensitivity of 0.8 and specificity of 0.6 for our test.
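The arithmetic behind the diagram can be sketched in a few lines of Python (the function names and counts here just mirror the example above; they aren't part of any standard library):

```python
# Sensitivity and specificity from the study counts above:
# 100 people with Madeupitis (80 of whom test positive) and
# 100 healthy people (40 of whom test positive).

def sensitivity(true_positives, total_with_disorder):
    """Proportion of people WITH the disorder who test positive."""
    return true_positives / total_with_disorder

def specificity(true_negatives, total_without_disorder):
    """Proportion of people WITHOUT the disorder who test negative."""
    return true_negatives / total_without_disorder

sens = sensitivity(80, 100)        # 80 true positives out of 100 cases
spec = specificity(100 - 40, 100)  # 60 true negatives out of 100 healthy
print(sens, spec)                  # 0.8 0.6
```

Note that each quantity is calculated within one group only: sensitivity never looks at the healthy people, and specificity never looks at the cases.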
From genetic test to population screening
Sensitivity and specificity are helpful for giving us an idea of the discriminatory power of our test, but they don’t actually tell us about its validity as a screening test in the general population (i.e. not our research sample). To get an idea of this, we actually want to look at the problem from the opposite perspective. Before, we were interested in the probability of someone having a particular genotype given that they had (or didn’t have) Madeupitis; now we want to know the probability of someone developing Madeupitis given that they have a particular genotype. This may sound like a semantic distinction, but the two probabilities are different things. Here’s an example…
We have a population of 100 people. We know from epidemiological studies that the prevalence (proportion of people with a condition) of Madeupitis in this population is 5%. This means 5 people will develop Madeupitis and the other 95 will not. Our genetic test has moderate discriminatory power, with sensitivity of 0.80 and specificity of 0.60. We can summarise the screening results in the diagram below, where people coloured blue are those who will develop Madeupitis, and those coloured green will not. As before, a red dot indicates a positive genetic test result.
Using our test we will identify 42 people as being likely to develop Madeupitis. Of these people, 4 will be true positives and the remaining 38 will be false positives. The Positive Predictive Value (PPV) is the proportion of people with a positive test result who go on to develop the condition, which is 0.095 for our test in this population. Our test will also categorise 58 people as being unlikely to develop Madeupitis, one of whom will be a false negative. The Negative Predictive Value (NPV) is the proportion of people with a negative test result who don’t develop Madeupitis, which is 0.983 in this population.
Unfortunately this means our test isn’t doing a good job of identifying cases. If you’re given a positive result, we’re not very confident about whether you’ll get Madeupitis or not (our PPV of 0.095). Put another way, for a member of this population, the pre-test probability of developing Madeupitis is 0.05 and the post-test probability (for those with a positive test result) is ~0.10 – not a great improvement in prediction. On the other hand, if you’re given a negative result, we’re pretty sure you’re not going to get Madeupitis (our NPV of 0.983). Whether the NPV or PPV of a test is more important really depends upon the consequences of a negative or positive result for that test. But let’s say that in this instance we’re concerned that those 38 out of 100 people who are given false positive results might go out and unnecessarily spend a lot of money on expensive treatments to protect themselves against Madeupitis. How can we improve the performance of the test?
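The PPV and NPV calculations above can be reproduced from just the prevalence, sensitivity, and specificity. Here's a minimal sketch in Python (the function name is hypothetical, chosen for illustration):

```python
def ppv_npv(prevalence, sens, spec, n=100):
    """PPV and NPV for a population of n people, given the test's
    sensitivity, specificity, and the disorder's prevalence."""
    cases = prevalence * n        # people who will develop the disorder
    non_cases = n - cases         # people who will not
    tp = sens * cases             # true positives
    fn = cases - tp               # false negatives
    tn = spec * non_cases         # true negatives
    fp = non_cases - tn           # false positives
    ppv = tp / (tp + fp)          # P(disorder | positive test)
    npv = tn / (tn + fn)          # P(no disorder | negative test)
    return ppv, npv

ppv, npv = ppv_npv(prevalence=0.05, sens=0.80, spec=0.60)
print(round(ppv, 3), round(npv, 3))  # 0.095 0.983
```

With 5% prevalence this gives 4 true positives against 38 false positives, hence the low PPV: most positive results come from the much larger group of people who will never develop the disorder.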
Improving PPV and NPV
The easiest way to improve PPV is to increase the prevalence of Madeupitis in the population. Obviously we can’t do that directly, but we can avoid testing people who have a low likelihood of testing positive, based on some other source of information. Let’s say we only offer our test to people who have at least one parent who has Madeupitis (knowing that Madeupitis is influenced by genetics, this seems a plausible criterion).
Going back to our diagram, we have now “pre-screened” our population based on family history. The people coloured green have no family history and are not administered the genetic test. People coloured light and dark blue all have a family history of Madeupitis and are therefore given the test, but as before only those coloured dark blue will go on to develop the disorder.
By filtering our population on family history we effectively increase the prevalence of Madeupitis in the subset of the population we’re testing (from 5% to 20%), improving our chances of correctly classifying people. Thus our PPV increases to 0.33 even though the sensitivity and specificity of our genetic test haven’t changed. However, there is a trade-off; our NPV has been reduced to 0.923. This is called “cascade testing” and it is sometimes used to determine whether people should have a genetic test for a particular disorder (e.g. Huntington’s disease).
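We can see the effect of pre-screening by recomputing PPV and NPV at different prevalences while holding sensitivity at 0.8 and specificity at 0.6. A quick sketch (the 5% → 20% step corresponds to the family-history filter above; the function name is just for illustration):

```python
def ppv_npv(prev, sens=0.8, spec=0.6):
    """PPV and NPV as a function of prevalence, for a test with
    fixed sensitivity and specificity."""
    tp = sens * prev              # true-positive fraction of population
    fp = (1 - spec) * (1 - prev)  # false-positive fraction
    tn = spec * (1 - prev)        # true-negative fraction
    fn = (1 - sens) * prev        # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)

for prev in (0.05, 0.20, 0.50):
    ppv, npv = ppv_npv(prev)
    print(f"prevalence {prev:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

As prevalence rises, PPV improves and NPV degrades, even though the test itself is unchanged: that is the trade-off described above, expressed as arithmetic.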
The other thing we could do is improve the sensitivity and/or specificity of the genetic test itself. We might do this by including additional variants in the test, or we might combine our genetic test results with information on other non-genetic risk factors. For example, if we want to estimate risk of cardiovascular disease, we might find we have more predictive power if we combine genetic risk factors with known environmental risk factors such as smoking.
What does this mean for personal genomics?
First of all, it doesn’t mean that direct-to-consumer (DTC) genetic tests aren’t useful. At the moment the sensitivity and specificity of a lot of genetic tests for complex, polygenic disorders (for which we haven’t yet identified all the genetic variants that increase risk) are unlikely to match those of standard diagnostic or screening tests. What’s likely is that the predictive capacity of these tests will improve as more variants are identified, and/or if additional non-genetic information is included in the test. The important thing is to keep concepts such as PPV and NPV in mind when you look at DTC test results, and to remember that for complex diseases, the results you get are always probabilistic, not deterministic.
The diagrams in this post were made using Inkscape, an excellent free software package. You can re-use the text and/or diagrams from this post as long as you acknowledge that they’re from Genomes Unzipped (under a Creative Commons Attribution-NonCommercial 2.5 Generic License).