
Randy Reflects


Statistics: Remembering Sensitivity and Specificity

Sep 30, 2025, 7:28 AM @ 📚 The Library

If you’re like me, you have trouble remembering all the different statistical vocabulary like sensitivity and specificity. In this article, I describe these two statistical terms through an analogy of an email spam filter.

Your email inbox comes with a spam filter. Its task is to predict whether an incoming email is spam or not. Not all spam filters are created equal, though. Let’s say you’ve got a trigger-happy spam filter that catches almost all the junk email headed for your inbox. In statistical terms, it has high sensitivity, meaning it returns few false negatives. Because it’s so sensitive, it also sometimes catches messages that aren’t actually spam. Let’s call him Rambo the Email Filter.

[Illustration: a Rambo-like character shooting a machine gun at terrified anthropomorphic envelopes]

Rambo the Email Filter is good at killing spam, but he’s not especially selective. (N.B. Graphic imagery improves memory retention!)

On the other hand, if your spam filter is more careful, it won’t call something spam unless it’s really sure (that might be a good thing—you don’t want to miss an important message because of it). In other words, the spam filter gives few false positives. This one has high specificity.

Definitions and formulas

Mathematically, sensitivity and specificity each fall between zero and one. For sensitivity, zero means the test catches no positive cases, and one means it catches all of them. The equation looks like this:

\frac{TP}{TP + FN},

where TP is the number of true positive test results and FN is the number of false negatives. True positives and false negatives together add up to all the actual positive cases, so the formula answers, “how many of the actual positive cases does my test catch?”

Conversely, specificity is defined as:

\frac{TN}{TN + FP},

where TN is the number of true negatives and FP is the number of false positives. This equation is analogous to the previous one: one is for positives, the other for negatives.

To sum up, high sensitivity means the test catches most of the actual positives, and high specificity means it catches most of the actual negatives. The two formulas are mathematically analogous, each covering one side of the test.
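If code helps it stick, here’s a minimal Python sketch of the two formulas (the function names are my own, just for illustration):

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual positive cases the test catches: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual negative cases the test catches: TN / (TN + FP)."""
    return tn / (tn + fp)


# Rambo catches every spam email (sensitivity = 1.0) but flags
# some legitimate mail along the way (specificity < 1.0).
print(sensitivity(tp=100, fn=0))   # 1.0
print(specificity(tn=810, fp=90))  # 0.9
```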

Usefulness and limitations

Sensitivity and specificity are useful terms, but they alone don’t reveal much about any particular test result. Let’s look at a numerical example to clarify.

Last month, I received 1,000 emails. Of those, 100, or ten percent, were actually spam. My spam filter has a sensitivity of 0.95, meaning it catches 95 out of every 100 spam emails. It also does well at identifying the negatives, with a specificity of 0.90. The table below, called a confusion matrix, shows how sensitivity and specificity apply to my 1,000 emails.

|                   | Predicted Negative | Predicted Positive |
|-------------------|--------------------|--------------------|
| Actually Negative | 810                | 90                 |
| Actually Positive | 5                  | 95                 |
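
If you want to check the table yourself, here’s a quick sketch that derives each cell from the example’s numbers (1,000 emails, ten percent spam, sensitivity 0.95, specificity 0.90):

```python
total = 1_000
actual_positive = 100                      # ten percent of emails are spam
actual_negative = total - actual_positive  # the other 900 are legitimate

tp = round(0.95 * actual_positive)  # sensitivity: spam correctly flagged -> 95
fn = actual_positive - tp           # spam that slipped through           -> 5
tn = round(0.90 * actual_negative)  # legitimate mail correctly kept      -> 810
fp = actual_negative - tn           # legitimate mail wrongly flagged     -> 90

print(tn, fp, fn, tp)  # 810 90 5 95
```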

When I check my spam folder and find an email marked spam, what are the chances that it’s actually spam? Sensitivity or specificity alone won’t give me the answer. Instead, we should look at positive predictive value (PPV), which shows how many of my positive predictions were accurate. Mathematically, it’s similar to sensitivity, but instead of dividing by TP + FN, we’re dividing by TP + FP:

PPV=TPTP+FP=9595+9051%PPV = \frac{TP}{TP + FP} = \frac{95}{95+90} \approx 51\%

Based on the results, I’ve got roughly a 50/50 chance that an email in my spam folder is actually spam. That tells me I need a new spam filter. It also shows how the overall prevalence of positive cases affects the precision of a test: since only ten percent of my email is spam, high sensitivity alone doesn’t guarantee that a flagged email is actually spam. PPV is the more applicable measure here because of the imbalance between positives and negatives. Refer back to the confusion matrix above: even with high sensitivity and specificity, the false positives (90) are nearly as numerous as the true positives (95).
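
Continuing the sketch, here’s the same calculation in Python, plus a hypothetical ppv() helper of my own construction that shows how prevalence drives the result:

```python
def ppv(prevalence: float, sens: float = 0.95, spec: float = 0.90) -> float:
    """Positive predictive value: TP / (TP + FP), expressed via prevalence."""
    tp = sens * prevalence              # true positives per unit of email
    fp = (1 - spec) * (1 - prevalence)  # false positives per unit of email
    return tp / (tp + fp)


print(f"{ppv(0.10):.0%}")  # 51% -- matches the confusion matrix above
print(f"{ppv(0.50):.0%}")  # 90% -- the same filter looks far better
                           #        when half the email is spam
```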


To conclude, here’s a list of the terms discussed in this article along with their definitions.

Sensitivity

How good is the test at catching the positive cases? Think “sensitive to positives,” and don’t forget Rambo the Email Filter. Also called recall or the true positive rate (TPR).

Specificity

How good is the test at predicting the negative cases? It’s the same as sensitivity, but for negatives. (Sorry I don’t have a better mnemonic for this. Try memorizing sensitivity first.)

Positive predictive value (PPV)

How many of your positive predictions are accurate? Unlike sensitivity, it divides by the test’s positive predictions, right or wrong, rather than by the actual positive cases. Also called precision.

Confusion matrix

A table that compares the results of a test’s predictions versus actual cases.

|                   | Predicted Negative | Predicted Positive |
|-------------------|--------------------|--------------------|
| Actually Negative | TN                 | FP                 |
| Actually Positive | FN                 | TP                 |
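
As an aside, if you happen to work in Python with scikit-learn (an assumption on my part; nothing in this article requires it), its confusion_matrix uses the same layout as the table above for binary 0/1 labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1]  # 0 = not spam, 1 = spam (actual)
y_pred = [0, 1, 0, 1, 1]  # the filter's predictions

# For binary labels, rows are actual classes and columns are predictions,
# so .ravel() unpacks in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 2 1 0 2
```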

In conclusion, sensitivity is the proportion of actual positive cases correctly predicted, while specificity is the proportion of actual negative cases correctly predicted. Our Rambo spam filter will do a great job of catching all the spam, but he may indiscriminately catch a few false positives. A more careful filter only catches the obvious spam, potentially letting through a few false negatives. Hopefully, the image of Rambo gunning down your email is vivid enough to help you remember what sensitivity is!
