by Carl V Phillips
I have been having an ongoing conversation with Kristin Noll-Marsh about how statistics like relative risks can be communicated in a way that allows most people to really understand their meaning. There is more there than I can cover in a dozen posts, but I thought I would at least start it. I have created the tag “methodology” for these background discussions about how to properly analyze and report statistics (“methodology” is epidemiologist-speak for “how to analyze and report data”).
Most statistics about health risks are reported in the research literature as ratio measures. That is, they are reported in terms of changes from the baseline, as in a risk ratio of 1.5, which means take the baseline level (the level if the exposures that are being discussed are absent) and multiply by 1.5 to get the new level. This is the same as saying a 50% increase in risk. It turns out that these ratios are convenient for researchers to work with, but are inherently a terrible way to report information to the public or decision makers. There is really no way for the average person to make sense of them. What does “increased risk, with an odds ratio of 1.8” mean to most people? It means “increased risk”, full stop.
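For readers who like to see the arithmetic spelled out, here is a minimal sketch of the conversion between a risk ratio and the "percent increase" form. The function name is my own invention for illustration, and the sketch assumes a true risk ratio (not an odds ratio, which only approximates a risk ratio when the outcome is uncommon).

```python
def percent_increase(risk_ratio):
    """Convert a risk ratio into the equivalent percent increase over baseline.

    Hypothetical helper for illustration only. A risk ratio of 1.0 means
    no change from the baseline.
    """
    return (risk_ratio - 1) * 100


# Risk ratios that appear in this post.
for rr in (1.5, 1.7, 2.4):
    print(f"risk ratio {rr} = {percent_increase(rr):.0f}% increase")
```

Note that the baseline does not appear anywhere in this calculation, which is exactly why neither form tells you how big the risk actually is.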
Every health reporter who puts risk ratios in the newspaper with no further context should be fired (some of you will recall my Unhealthful News series at EP-ology). But the average person should not feel bad because it is likely that the health reporter — and most supposed experts in health — cannot make any more sense of it either.
The biggest problem is that a ratio measure obviously depends on the size of the baseline. When the baseline is highly stable and relatively well understood, then the ratio measure makes sense. This is especially true when that deviation from the baseline is actually better understood than actual quantities. So, for example, we might learn that GDP increased by 2% during a year. Few people have any intuition for how big the GDP even is, so if that were reported as “increased by $X billion” rather than the ratio change, it would be useless. Of course, that 2% is not terribly informative without context, but the context is one that many people basically know or that can easily be communicated (“2% is low by historical standards, but better than the recent depression years”).
By contrast, to stay on the financial page, you might hear that a company’s profits increased by 10,000% last year. Wow! Except that might mean that they profited $1 the year before and got up to $101 last year. Or it might be $1 billion and $101 billion. The problem is that the baseline is extremely unstable and not very meaningful. This contrasts yet again with a report of revenue (total sales) increasing by 50%, which is much more useful information because a company’s sales, as opposed to profits, are relatively stable and when they change a lot (compared to baseline), that really means something concrete.
So returning to health risk, for a few statistics we might want to report, the baseline is a stable anchor point, but not for most reported statistics. It is meaningful to report that overall heart attack rates are falling by about 5% per year. The baseline is stable and meaningful in itself (the average across the whole population), and so the percentage change is useful information in itself. This is even more true because we are talking about a trend so that any little anomalies get averaged out. By contrast, telling you that some exposure increases your own risk of heart attack by about 5% per year is close to utterly uninformative, and indeed probably qualifies as disinformative.
As I mentioned, ratio measures (in forms like 1.2 or 3.5) are convenient for researchers to use. You probably also noticed me playing with percentage reporting, using numbers you seldom see like 10,000%. This brings us to the reporting of risk ratios in the form of percentages as a method of lying. Or, if it is not lying (an intentional attempt to make people believe something one knows is not true), it is a grossly negligent disregard for accurate communication.
Reporting a risk ratio of 1.7 for some disease may not mean much to most people, but at least that means it is not misleading them. There is a good way to explain it in simple terms, something like, “there is an increase in risk, though less than double”. If the baseline is low (if the outcome is relatively uncommon) then most people will recognize this to be a bad thing, but not too terribly bad. So the liars will not report it that way, but rather report it as “a 70% increase”. This is technically accurate, but we know that it is very likely to confuse most people, and thus qualifies as lying with the literal truth. Most people see the “70%” and think (consciously or subconsciously), “I know that 70% is most of 100%, and 100% is a sure thing, so this is a very big risk.”
(As a slightly more complicated observation: When these liars want to scare people about a risk, they prefer that a risk ratio come in at 1.7 rather than a much larger 2.4. This is because “70% increase” triggers this misperception, but “140% increase”, while still sounding big and scary, sends a clear reminder that the “almost a sure thing” misinterpretation cannot be correct.)
The problem here is that people — even fairly numerate people when working outside areas they think about a lot — tend to confuse a percent change and a percentage point change. When the units being talked about are percentages (which is to say, probabilities, as opposed to the quantities of money like the above examples) that are changing by some percentage of that original percentage, this is an easy source of confusion that liars can take advantage of. An increase in probability by 70 percentage points (e.g., from a 2% chance to a 72% chance) is huge. An increase of 70 percent (e.g., from 2% to 3.4%) is not, so long as the baseline probability is low, which it is for almost all diseases for almost everyone.
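To make the distinction concrete, here is a small sketch that applies "70" to a 2% baseline both ways. The function names are hypothetical, chosen just for this illustration, and probabilities are written as fractions (0.02 for 2%).

```python
def relative_increase(baseline, percent):
    """Increase a baseline probability by a percent of itself (a ratio change)."""
    return baseline * (1 + percent / 100)


def point_increase(baseline, points):
    """Increase a baseline probability by a number of percentage points."""
    return baseline + points / 100


baseline = 0.02  # a 2% chance

after_percent = relative_increase(baseline, 70)  # 70 percent increase
after_points = point_increase(baseline, 70)      # 70 percentage point increase

print(f"70 percent increase:          {after_percent:.1%}")
print(f"70 percentage point increase: {after_points:.1%}")
```

The same "70" lands at about 3.4% in one reading and 72% in the other, and a reader who confuses the two is off by a factor of more than twenty.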
There seems to be more research on this regarding breast cancer than other topics (breast cancer is characterized by an even larger industry than anti-tobacco that depends on misleading people about the risks, and there is also more interest in the statistics among the public). It is pretty clear that when you tell someone an exposure increases her risk of breast cancer by 30%, she is quite likely to freak out about it, believing that this means there will be a 1-in-3 chance she will get the disease as a result of the exposure.
Reporting the risk ratio of 1.3 will at least avoid this problem. But there are easy ways to make the statistic meaningful to someone, assuming someone genuinely wants to communicate honest information and not to lie with statistics to further a political goal or self-enrichment. The most obvious is to report the absolute risk (the actual risk probability, without reference to a baseline) rather than only the relative risk, or similarly to report the risk difference (the change in the absolute risk), rather than a ratio or percentage. This is something that anyone with a bit of expertise on a topic can do (though it is a bit tricky; it is not quite as simple as a non-expert might think).
Reporting absolute changes is what I did when I gave the example of 2% changing to 3.4% (or, for a risk ratio of 1.3, that would be changing to 2.6%). The risk difference when going from 2.0% to 3.4% would be 1.4 percentage points, or put another way, you would have a 1.4% chance of getting the outcome as a result of the exposure. Most people are still not great at intuiting what probabilities mean, but they are not terrible. At least they have a fighting chance. (Their chances are much better when the probabilities are in the 1% range or higher, rather than the 0.1% range; once we get below about 1%, intuition starts to fail badly.)
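As a rough sketch of that arithmetic, the following takes a baseline risk and a risk ratio and reports the absolute risk and the risk difference. The function names are my own, and the sketch assumes the simplest possible case: a true risk ratio applied to a known baseline for the same population and time period. Real conversions involve more judgment than this.

```python
def absolute_risk(baseline, risk_ratio):
    """Absolute risk among the exposed, given the baseline risk and a risk ratio."""
    return baseline * risk_ratio


def risk_difference(baseline, risk_ratio):
    """Change in absolute risk attributable to the exposure."""
    return absolute_risk(baseline, risk_ratio) - baseline


baseline = 0.02  # 2% baseline risk, as in the example above

for rr in (1.7, 1.3):
    exposed = absolute_risk(baseline, rr)
    diff = risk_difference(baseline, rr)
    print(
        f"risk ratio {rr}: {baseline * 100:.1f}% -> {exposed * 100:.1f}%, "
        f"a difference of {diff * 100:.1f} percentage points"
    )
```

For the 2% baseline, this gives 3.4% (a difference of 1.4 percentage points) for a risk ratio of 1.7, and 2.6% (a difference of 0.6 percentage points) for a risk ratio of 1.3.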
To finish with an on-topic example of the risk difference, what does it mean to say that smoke-free alternatives cause 1% of the risk of serious cardiovascular event (e.g., heart attack, stroke) of smoking? [Note that this comparison is yet another meaning of “percent” than those talked about above, which leaves even more room for confusion! Also, this is in the plausible range of estimates, but I am not claiming it is necessarily the best estimate.] It means that if we consider a man of late middle age whose nicotine-free baseline risk is 5% over the next decade, then his risk as a smoker is 10%. Meanwhile, his risk as a THR product user would be 5.05%. Moreover, this should still be reported as simply 5% (no measurable change) since the uncertainty around the original 5% is far greater than that 0.05% difference.
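Here is a short sketch of how those numbers fit together. It reads "1% of the risk of smoking" as 1% of the excess risk that smoking adds over the nicotine-free baseline, which is the reading the figures above imply. The variable names are mine, and the inputs are the illustrative values from this example, not estimates.

```python
baseline_risk = 0.05    # nicotine-free risk over the next decade (illustrative)
smoker_risk = 0.10      # risk as a smoker over the same decade (illustrative)
thr_fraction = 0.01     # assumed: THR products cause 1% of smoking's excess risk

# Excess risk caused by smoking, in absolute terms (5 percentage points here).
smoking_excess = smoker_risk - baseline_risk

# The THR user gets the baseline plus 1% of that excess.
thr_excess = thr_fraction * smoking_excess
thr_risk = baseline_risk + thr_excess

print(f"excess risk from smoking: {smoking_excess * 100:.2f} percentage points")
print(f"excess risk from THR:     {thr_excess * 100:.2f} percentage points")
print(f"risk as a THR user:       {thr_risk:.2%}")
```

The THR excess comes out to 0.05 percentage points, which is why the total is 5.05% and why it disappears into the uncertainty around the 5% baseline.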