Tag Archives: statistiLie

Smoking trends don’t show whether ecigs are “working”. Ever. So quit it!

by Carl V Phillips

Live by the sword….

A new study by Goniewicz et al. found that smoking and e-cigarette trialing[*] are both up in Poland. They conclude, based on that (yes, just on that — my sentence fully sums up their results), “Observed parallel increase in e-cigarette use and smoking prevalence does not support the idea that e-cigarettes are displacing tobacco cigarettes in this population.” It turns out that simple sentence is wrong in its details (the trend was not remotely parallel) while right in its conclusion. But that is only because the conclusion is basically always true: There is no conceivable data from population usage trends that could either support or refute the claim that e-cigarettes are displacing cigarettes.

The Florida Department of Health are liars (and innumerate)

by Carl V Phillips

Perhaps as a tribute to our nation’s great accredited schools of public health, the Florida Department of Health recently blasted the world with junk science claims based on incorrect research methods and basic innumeracy.  What they were trying to do was issue dire warnings about children using e-cigarettes, but mostly I think they succeeded in issuing a different warning to parents:  Do not let your children study public health!

The Florida exaggerations are already being used in anti-e-cigarette propaganda.  For example, they appear in the actual language of the bill to effectively ban e-cigarette sales in NYC (it would ban flavors, which are obviously rather critical to the product).  (See this recounting of the language.)  The claim was that 40% more high school students tried e-cigarettes in 2012 than the year before, which is not actually what the data shows.  According to the Florida DoH Fact (sic — and LOL) Sheet about this (not dated, but clearly from earlier this month since that was when it was press-released), their surveys found that 6.0% of such students had tried an e-cigarette in their 2011 survey and 8.4% in 2012, which they described as a “40.0%” increase.  Numerate readers will immediately notice that (a) the uncertainty in the survey means that there is absolutely no way they can make a claim with three significant figures and (b) the third digit is undoubtedly wrong, and probably the second too, since they apparently rounded the other results to two sig figs and then did the calculation.  (Credit to the above news story for correctly rounding this to 40%.)  So, basically, public health people lack grade 7 level math/science training.
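For readers who want to check the arithmetic, here is a minimal sketch (in Python) using only the rounded 6.0% and 8.4% figures from the fact sheet; the unrounded survey values are not public, and this ignores sampling error entirely, which would widen the plausible range even further:

ever_2011 = 6.0   # percent of high school students who had ever tried, per the 2011 survey
ever_2012 = 8.4   # percent, per the 2012 survey
increase = (ever_2012 - ever_2011) / ever_2011 * 100
print(f"{increase:.1f}%")   # 40.0%, but only because the inputs were already rounded
# Rounding alone makes the trailing digits meaningless: any true values between roughly
# 5.95-6.05 and 8.35-8.45 are consistent with the published two-significant-figure numbers.
low = (8.35 - 6.05) / 6.05 * 100
high = (8.45 - 5.95) / 5.95 * 100
print(f"anywhere from about {low:.0f}% to about {high:.0f}%")   # roughly 38% to 42%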

Rather worse in their reporting is describing these ever-use statistics as “prevalence of this behavior”, meaning they did not understand the one semester of epidemiology they took in public health school either.  The word “prevalence” is inappropriate, and thus misleading, when describing an “ever” statistic.  Since “ever” ratchets (once you are in that category, you can’t go back), it is basically inevitable that there will be an increase in a population that is 3/4 the same people from one year to the next.  This misreporting may partially explain why the equally innumerate people in the NYC Health Department misinterpreted the result (lied) by saying the number who tried e-cigarettes that year had increased by 40%.  And, of course, if they had emphasized the more useful number, that 8% of high school students reported ever having tried an e-cigarette, even just one puff, the number would not have been impressive at all.
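A minimal sketch of that ratchet, with made-up numbers chosen only to be on the same scale as the Florida figures (they are not taken from the survey): among students who are in the surveyed population in both years, nobody can leave the “ever tried” category, so any first-time trying at all pushes the statistic up, even if nobody ever takes a second puff.

continuing_students = 300             # hypothetical students captured in both years' surveys
ever_tried_year1 = 18                 # 6% had taken at least one puff by the first survey
first_time_triers_in_year2 = 7        # a handful take their one and only puff during the year
ever_tried_year2 = ever_tried_year1 + first_time_triers_in_year2   # no one can exit the category
print(f"{ever_tried_year1 / continuing_students:.1%} -> {ever_tried_year2 / continuing_students:.1%}")
# 6.0% -> 8.3%: an "increase" that says nothing about anyone using the product regularly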

The statistics are not legitimately reported even if we believe the point estimates are exactly right.  After we consider random sampling error (not reported), response bias (clearly a problem, but completely ignored), and measurement error (I know that I always gave random data when asked to do some intrusive survey like this when I was in school) the results are pretty meaningless.  The only reason that we should believe the trend at all (that there has been an increase) is that it is pretty much inevitable, not because of their data.  Needless to say, given these basic errors in reporting, we should doubt the accuracy of their numbers also.

But even if we start with their basic numbers and ignore their errors and sensationalism, what can we make of it?  How many of those who tried e-cigarettes were regular smokers?  Quite possibly all of them, since the rate of current smoking is higher than the number who have merely tried an e-cigarette, but we will never know because they suppressed that information (which they apparently do have).  How many of them had tried at least one puff of a cigarette?  I would guess approximately all.  How many of them are of legal age to use tobacco products, as many high school students are?  Again, they intentionally hid all that information.  It is difficult to see such obvious omissions as mere incompetence — they are clearly intended to mislead readers.

Similarly, they lied about current use.  They did this mainly by referring to “tried at least one puff in the last 30 days” as “current use”.  Moreover, their numbers for recent trying (which is what this really is, not “use”) are very low — half or less of the figures for “ever tried”.  So, of course, the propagandists did not mention them, hoping that sloppy readers would mistake “ever tried” for “uses”.

It is interesting to note that as a portion of those who had ever tried, those who had taken one puff within 30 days dropped substantially from 2011 to 2012.  Since many of those who have only tried e-cigarettes on a few occasions must have done so within any given month, this shows that a rather small fraction of those who have tried an e-cigarette did it very often after that (let alone qualified as genuine “current users”).  This is especially true for the 1.8% of middle school students who had tried in the last month (compared to 3.9% who had ever tried), since they would have had relatively few total months in their history that they might have tried them.  Nevertheless this was breathlessly reported as having “increased by 20.0%” (emphasis and that same sig fig error in the original), and it is that statistic that has been repeated in subsequent propaganda, rather than the low absolute numbers.

What can we make of this?  Well, we know what the ANTZ want to make of it, as quoted in the press release (attributed to the American Cancer Society):

We do know that e-cigarettes can lead to nicotine addiction, especially in young people who may be experimenting with them, and may lead kids to try other tobacco products, many of which are known to cause life-threatening diseases.

Of course, we most certainly do not know that e-cigarettes can lead to addiction.  There is not the slightest piece of evidence to support that claim.  Notice that the Florida data itself shows that most of those who try e-cigarettes have not tried one in the last month — if this is addiction, then that ANTZ word has become even more meaningless than it was before.  Nor is there any evidence that e-cigarettes cause anyone to use other tobacco products.  And, of course, only cigarettes and their minor variations (not “many” products) are known to cause life-threatening diseases (though it was amusing to see the implicit claim therein that e-cigarettes do not cause such diseases).

Honest people looking at this data can conclude almost nothing meaningful, other than that e-cigarettes exist.  Is it possible that all the students who are using e-cigarettes are current or former regular smokers using them for THR?  Yes — it is consistent with what was reported that every last one of them is.  Could experimentation with e-cigarettes be causing other net reductions in risks in this population?  Yes — those who are experimenting are the ones who are most likely experimenting with other drugs or behaviors that can do a lot of harm.  If using an e-cigarette is displacing underage drinking, it is contributing even more to harm reduction than it does when it displaces smoking.

E-cigarettes are used by people almost exclusively to replace a much more harmful behavior.  Students are people.  Why then, exactly, is the assumption that when they are using them, there is net harm?  Of all the drugs or other youthful dalliances that kids might be engaging in, it is difficult to imagine one that is less harmful than e-cigarettes (or smokeless tobacco), except maybe coffee, and even then it is not clear which is less harmful.

People who report health risks as percentage changes are (often) liars

by Carl V Phillips

I have been having an ongoing conversation with Kristin Noll-Marsh about how statistics like relative risks can be communicated in a way that allows most people to really understand their meaning.  There is more there than I can cover in a dozen posts, but I thought I would at least start it.  I have created the tag “methodology” for these background discussions about how to properly analyze and report statistics (“methodology” is epidemiologist-speak for “how to analyze and report data”).

Most statistics about health risks are reported in the research literature as ratio measures.  That is, they are reported in terms of changes from the baseline, as in a risk ratio of 1.5, which means you take the baseline level (the level when the exposure being discussed is absent) and multiply it by 1.5 to get the new level.  This is the same as saying a 50% increase in risk.  It turns out that these ratios are convenient for researchers to work with, but are inherently a terrible way to report information to the public or decision makers.  There is really no way for the average person to make sense of them.  What does “increased risk, with an odds ratio of 1.8” mean to most people?  It means “increased risk”, full stop.
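As a minimal sketch of what the ratio encodes (the baseline numbers here are hypothetical, not estimates of anything): the same risk ratio describes wildly different amounts of added risk depending on the baseline, which is exactly the information the bare ratio hides.

def risk_from_ratio(baseline_risk, risk_ratio):
    # Absolute risk among the exposed, given the unexposed (baseline) absolute risk.
    return baseline_risk * risk_ratio

for baseline in (0.001, 0.02, 0.20):             # hypothetical 0.1%, 2%, and 20% baseline risks
    exposed = risk_from_ratio(baseline, 1.5)     # a "risk ratio of 1.5", i.e. a "50% increase"
    print(f"baseline {baseline:.2%} -> exposed {exposed:.2%} "
          f"(added risk {exposed - baseline:.2%})")
# The same "50% increase" adds 0.05, 1.00, or 10.00 percentage points, depending on the baseline.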

Every health reporter who puts risk ratios in the newspaper with no further context should be fired (some of you will recall my Unhealthful News series at EP-ology).  But the average person should not feel bad because it is likely that the health reporter — and most supposed experts in health — cannot make any more sense of it either.

The biggest problem is that a ratio measure obviously depends on the size of the baseline.  When the baseline is highly stable and relatively well understood, then the ratio measure makes sense.  This is especially true when the deviation from the baseline is better understood than the absolute quantities themselves.  So, for example, we might learn that GDP increased by 2% during a year.  Few people have any intuition for how big the GDP even is, so if that were reported as “increased by $X billion” rather than the ratio change, it would be useless.  Of course, that 2% is not terribly informative without context, but the context is one that many people basically know or that can easily be communicated (“2% is low by historical standards, but better than the recent depression years”).

By contrast, to stay on the financial page, you might hear that a company’s profits increased by 10,000% last year.  Wow!  Except that might mean that they profited $1 the year before and got up to $100 last year.  Or it might be $1 billion and $100 billion.  The problem is that the baseline is extremely unstable and not very meaningful.  This contrasts yet again with a report of revenue (total sales) increasing by 50%, which is much more useful information because a company’s sales, as opposed to profits, are relatively stable and when they change a lot (compared to baseline), that really means something concrete.

So returning to health risk, for a few statistics we might want to report, the baseline is a stable anchor point, but not for most reported statistics.  It is meaningful to report that overall heart attack rates are falling by about 5% per year.  The baseline is stable and meaningful in itself (the average across the whole population), and so the percentage change is useful information in itself.  This is even more true because we are talking about a trend so that any little anomalies get averaged out.  By contrast, telling you that some exposure increases your own risk of heart attack by about 5% per year is close to utterly uninformative, and indeed probably qualifies as disinformative.

As I mentioned, ratio measures (in forms like 1.2 or 3.5) are convenient for researchers to use.  You probably also noticed me playing with percentage reporting, using numbers you seldom see like 10,000%.  This brings us to the reporting of risk ratios in the form of percentages as a method of lying — or if it is not lying (an intentional attempt to make people believe something one knows is not true), it is a grossly negligent disregard for accurate communication.

Reporting a risk ratio of 1.7 for some disease may not mean much to most people, but at least it is not misleading them.  There is a good way to explain it in simple terms, something like, “there is an increase in risk, though less than double”.  If the baseline is low (if the outcome is relatively uncommon) then most people will recognize this to be a bad thing, but not too terribly bad.  So the liars will not report it that way, but rather report it as “a 70% increase”.  This is technically accurate, but we know that it is very likely to confuse most people, and thus qualifies as lying with the literal truth.  Most people see the “70%” and think (consciously or subconsciously), “I know that 70% is most of 100%, and 100% is a sure thing, so this is a very big risk.”

(As a slightly more complicated observation:  When these liars want to scare people about a risk, they prefer that a risk ratio come in at 1.7 rather than a much larger 2.4.  This is because “70% increase” triggers this misperception, but “140% increase”, while still sounding big and scary, sends a clear reminder that the “almost a sure thing” misinterpretation cannot be correct.)

The problem here is that people — even fairly numerate people when working outside areas they think about a lot — tend to confuse a percent change and a percentage point change.  When the units being talked about are percentages (which is to say, probabilities, as opposed to the quantities of money like the above examples) that are changing by some percentage of that original percentage, this is an easy source of confusion that liars can take advantage of.  An increase in probability by 70 percentage points (e.g., from a 2% chance to a 72% chance) is huge.  An increase of 70 percent (e.g., from 2% to 3.4%) is not, so long as the baseline probability is low, which it is for almost all diseases for almost everyone.
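Spelled out as a minimal sketch, using the hypothetical 2% baseline from the examples above:

baseline = 0.02                        # a 2% baseline probability of the disease
seventy_points = baseline + 0.70       # up by 70 percentage points: 2% becomes 72%
seventy_percent = baseline * 1.70      # up by 70 percent (risk ratio 1.7): 2% becomes 3.4%
print(f"70 percentage points: {seventy_points:.1%}")    # 72.0%
print(f"70 percent:           {seventy_percent:.1%}")   # 3.4%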

There seems to be more research on this regarding breast cancer than other topics (breast cancer is characterized by an even larger industry than anti-tobacco that depends on misleading people about the risks, and there is also more interest in the statistics among the public).  It is pretty clear that when you tell someone an exposure increases her risk of breast cancer by 30%, she is quite likely to freak out about it, believing that this means there will be a 1-in-3 chance she will get the disease as a result of the exposure.

Reporting the risk ratio of 1.3 will at least avoid this problem.  But there are easy ways to make the statistic meaningful to someone — assuming someone genuinely wants to communicate honest information and not to lie with statistics to further a political goal or self-enrichment.  The most obvious is to translate the relative risk into the absolute risk (the actual resulting risk probability, without reference to a baseline), or similarly to report the risk difference (the change in the absolute risk), rather than the ratio/percentage.  This is something that anyone with a bit of expertise on a topic can do (though it is a bit tricky — it is not quite as simple as a non-expert might think).

Reporting absolute changes is what I did above with the example of 2% changing to 3.4% (or, for the case of 1.3, that would be 2% changing to 2.6%).  The risk difference when going from 2.0% to 3.4% would be 1.4 percentage points; put another way, you would have a 1.4% chance of getting the outcome as a result of the exposure.  Most people are still not great at intuiting what probabilities mean, but they are not terrible.  At least they have a fighting chance.  (Their chances are much better when the probabilities are in the 1% range or higher, rather than the 0.1% range — once we get below about 1% intuition starts to fail badly.)
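A minimal sketch of that translation, again with the hypothetical 2.0% baseline used above (not a real estimate for any disease):

def absolute_and_difference(baseline_pct, risk_ratio):
    # Convert a risk ratio into the exposed absolute risk and the risk difference,
    # both expressed in percentage points.
    exposed_pct = baseline_pct * risk_ratio
    return exposed_pct, exposed_pct - baseline_pct

for rr in (1.3, 1.7):
    exposed, diff = absolute_and_difference(2.0, rr)
    print(f"risk ratio {rr}: 2.0% -> {exposed:.1f}% "
          f"(risk difference {diff:.1f} percentage points)")
# risk ratio 1.3: 2.0% -> 2.6% (risk difference 0.6 percentage points)
# risk ratio 1.7: 2.0% -> 3.4% (risk difference 1.4 percentage points)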

To finish with an on-topic example of the risk difference, what does it mean to say that smoke-free alternatives cause 1% of the risk of serious cardiovascular events (e.g., heart attack, stroke) that smoking causes?  [Note that this comparison is yet another meaning of “percent” than those talked about above — even more room for confusion!  Also, this is in the plausible range of estimates, but I am not claiming it is necessarily the best estimate.]  It means that if we consider a man of late middle age whose nicotine-free baseline risk is 5% over the next decade, then his risk as a smoker is 10%.  Meanwhile, his risk as a THR product user would be 5.05%, since he takes on 1% of the 5 percentage points of excess risk that smoking would add, or 0.05 points.  Moreover, this should still be reported as simply 5% (no measurable change) since the uncertainty around the original 5% is far greater than that 0.05% difference.
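The same arithmetic as a minimal sketch (the 5%, 10%, and 1% figures are the illustrative numbers from the example above, not estimates of anything):

baseline_risk = 5.0    # % chance of a serious cardiovascular event over the next decade, nicotine-free
smoker_risk = 10.0     # % chance for the same man as a smoker
thr_share = 0.01       # assumption from the example: the smoke-free product carries 1% of smoking's excess risk
smoking_excess = smoker_risk - baseline_risk          # 5 percentage points of excess risk from smoking
thr_user_risk = baseline_risk + thr_share * smoking_excess
print(f"{thr_user_risk:.2f}%")   # 5.05%, a difference far smaller than the uncertainty in the 5% baseline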

StatistiLie – ANTZ cleverly choosing the wrong year to report

posted by Carl V Phillips

Since I am still trying to recover from my travels, though have about ten more complicated posts I want to write, I will temporize again — this time outsourcing the lie of the day to Dick Puddlecote.

He points out how several anti-smoking liars tried to mislead the public by reporting statistics from the wrong years.  In particular, in response to the many recent claims that rising taxes are driving more smokers to the black market and this will continue to increase, they compared two past years instead of the current data.  They cleverly (obviously intentionally) ignored the facts that (a) most of the taxes that inspired the flurry of concern took effect after the period they reported and (b) there are newer statistics available — and indeed even reported in the newspaper.  It will come as no surprise that the ANTZ concluded that their taxes do not drive people to the black market, but the newer statistics support the opposite conclusion.

It is even a bit worse than that, so I recommend you read the whole thing.

Though this is about cigarettes, it is very much an anti-THR lie.  It touches on both of the fundamental lies of anti-THR.  It implies that tobacco/nicotine use is somehow so unlike every other consumption choice that consumers will not behave rationally.  That is, unlike what economics tells us about …well, about every choice people make, for some mysterious reason, tobacco consumers will not gravitate toward low-cost competitors when someone increases prices.  If you can be tricked into believing that, then you can be tricked into other departures from the obvious simple economics, like believing that people who choose to consume nicotine/tobacco do not really get any benefit from that choice.  Thus, the lie goes, there is no value in THR because no one really wants to be consuming these products rather than being abstinent.

In addition, the absurd claim that consumers will not shift toward the black market means that taxes can be raised, without bound, and smokers will respond only by quitting.  (Note that a ban is basically an attempt to raise the tax to infinity, though incomplete enforcement means that it is always actually finite.  Thus, punitive taxes are effectively partial bans.)  For any other good, a ban or price increase will cause substitute markets — in particular, black market supply chains — to gain market share.  But if smokers are the exception to that, and will just obey like they are supposed to, then universal cessation is just a few tax increases away.  And so, their lie goes, there is no reason to pursue THR.

I am adding the new tag to this blog, statistiLie, to try to identify lies that are based on intentionally using the wrong statistics.  (As always, keeping in mind that when an author knows so little that he does not know a claim is wrong, it does not change the existence of the lie, but merely its nature.  An author who unintentionally uses clearly wrong statistics is lying about his knowledge.)  A large portion of all anti-THR lies involve statistics, of course.  I will try to reserve this for the particular case when it is clear that there are statistics that are useful for addressing a particular point, but other numbers are chosen instead, and moreover that the author hides the existence of the more useful numbers.

I did not want to merely use the tag “statistics” because too many people observe that statistics are used to lie and jump to the conclusion that those quoting statistics should not be trusted.  But statistics are the only way we move toward the truth in these matters, and I want to push back against this sullying of the word.  It is interesting to note that in this case the ANTZ were claiming that the various commentators who based their conclusions on the right statistics were lying.  There is a danger that the real liars will accuse someone else of lying with statistics; indeed, it is a typical ploy by the liars.

Finally, it is useful to note that some anti-smoking lies are pro-THR (e.g., exaggerations of the risk from ETS) — still lies with everything that implies, but they do tend to encourage THR rather than discouraging it.  But the statistiLies about the economics (black markets, plain packaging, advertising, bans, etc.) very often try to support the core anti-THR assumptions.  Thus, those lies about smoking turn out to be an anti-THR issue.