by Carl V Phillips
A few months ago, Borderud, Li, Burkhalter, Sheffer, and Ostroff, from Memorial Sloan-Kettering Cancer Center, published a paper in the peer-reviewed journal, Cancer, that they claimed showed that using e-cigarettes did not help — and indeed hindered — attempts to quit smoking by cancer patients who enrolled in a smoking cessation program. The problem is that it showed no such thing. Instead, what it shows quite clearly is just how bad journal peer review really is in this field.
Shortly after the paper appeared in September, Brad Rodu mentioned to me that he had noticed something very strange about their Table 2 (note that most everything is paywalled, so I am providing download links to copies of the relevant material). We are not talking about some obscure detail here. He noticed that the table showed their main result to be exactly the opposite of what the text said it was. Upon reviewing the results, I shared his confusion about that point and noticed a slightly more subtle (but still extremely obvious) error: their entire result stemmed not from the data they did have, but from one obviously incorrect assumption about data they did not have.
We, along with Rodu’s colleague, Nantaporn Plurphanswat, wrote a letter to the journal (it appears in the journal here; you can read it here). It appeared along with an erratum (not paywalled) that says, “The authors discovered some errors regarding reference group labels in Table 2. The corrected table appears below.” Yeah, right, the authors discovered it. Rodu discusses this and presents some of the basic parameters of the study in his post on the matter.
Let’s think about what this means. There is no realistic chance that this error (inverting the labels in a table) was introduced in the typesetting phase. There is equally little chance that the authors introduced the error themselves between versions. This means that the version of the paper that was reviewed and approved by journal reviewers and editors contained an error such that the numbers in the table showed exactly the opposite of what they were claiming in the text.
Now you might suggest that failing to catch what is basically a typo is not the biggest failure of peer review, and you would obviously be right. On the other hand, this was not as simple as it might sound from what I have written. It was not one of those typos that you glance at and immediately correct in your head because it was obvious what the authors meant. It took me a while of mulling it over before I agreed with Brad’s assessment (reviewing my contemporaneous notes, I notice that I went back and forth on my assessment of what was going on, because you have to parse the table title, the table notes, and the prose to figure out whether their endpoint is smoking or smoking cessation, among other things). Thus it was not something that simply went unnoticed because it was so obviously wrong that everyone mentally corrected it (not that it still should not have been corrected; and as an aside, in the erratum for another paper, which appears immediately before this one, the authors apologize for inverting two of their tables — the journal does not apologize for any of these patent errors in its pages). Also recall that this was the main result, in the main results table that any reader would focus on, not some obscure minor point. Of course, journal reviewers are supposed to be looking at the minor points too, but we certainly know better than to believe that.
Given that reviewers and editors failed to notice this problem, it is no big surprise that they also failed to notice that the main result was not only misreported, but was basically just made up by the authors. We mention that in our letter too, though the word limits in what passes for debate in the public health sciences are absurd (it takes 10 times as many words to debunk junk science as to write it, not 1/10th as many), so you have to read this post to understand the full extent of their errors.
If you look at their first result in Table 2 (you can see it in the erratum, the left-hand columns) they show that smokers in the cessation program who had used e-cigarettes (at all, within the previous 30 days) had a slightly higher smoking cessation rate than those who had not, with just under half of each group abstinent from smoking at the time of follow-up.
But we know — though the authors fail to point out — that older smokers who want to quit and try to switch to e-cigarettes are different from the average smokers who want to quit. They are likely to have tried and failed to quit using other methods. Most important, they are less likely to be among those who have just decided that they would genuinely prefer to just not smoke and merely need some focusing event to allow them to make the switch to abstinence, those we call Category 1 in our recent paper on the topic. People in that category are far more likely to successfully quit smoking than those who need some active assistance like e-cigarettes, which biases any comparisons about successful cessation by method, as we explain in our paper. Thus, the fact that those who were trying e-cigarettes did about as well — indeed a bit better — than those who did not feel like they needed such aid suggests that the e-cigarettes were helpful, though it is obviously not the strongest support for that claim that exists.
But now look to the right, the second set of columns. All of a sudden the authors are claiming that those who had recently used an e-cigarette were only half as likely to have quit smoking. This claim (which appears in the abstract of the paper) is rather different, to say the least. So how did data that showed no measurable difference morph into that? Via an absurd assumption.
Almost half the subjects who enrolled dropped out of the study. This always introduces major complications in interpreting the data. For example, we would expect that those dropping out of a cessation program and study would be either (a) those who successfully quit smoking and so had no reason to stick around or (b) those who decided it was a waste of time. Or both — you could imagine that a lot of the e-cigarette users had been through such programs before and found themselves saying “same old waste of time” but thanks to e-cigarettes (and the shock of being treated for cancer) they did quit smoking. If that were the case, the main results that showed nearly equal quit rates would be underestimating the benefits of e-cigarettes.
But it is worse than that. Far worse. And you would never know that from reading what the authors emphasized and discussed. If you comb through the text you discover that one-third of those who had not recently used an e-cigarette dropped out of the treatment program and study, but a full two-thirds of those who had used an e-cigarette dropped out. The latter drop-out rate basically invalidates any attempt to use this study to assess whether e-cigarette users quit smoking, which is what this paper was about. They simply do not know. They lost almost all of them before they could find out. Perhaps it would have been worthwhile for the authors to just report their data from this study, but the calculations they did based on it were entirely inappropriate; their study produced ignorance, but they dressed it up to look like knowledge.
Rather than admit that, or even merely do the inappropriate calculations based on what data they had, the authors decided to report a result based on the assumption that everyone who dropped out was still smoking. See any problem there? If every e-cigarette user had dropped out of the study, rather than just most of them dropping out, they would have concluded from their assumption that no one who uses e-cigarettes to try to quit ever succeeds. Their assumption would be indefensible even if the drop-out rates had been similar for the reasons noted above — it seems quite possible that the e-cigarette users dropped out because of success in quitting, not failure. But given that the drop-out rates were so radically different (something that the authors should not have buried, but rather should have tried to explain — it is more interesting than the mostly-missing outcome data they had) this assumption simply creates the reported result.
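To see how the assumption manufactures the result, here is a minimal sketch with hypothetical round numbers. I am assuming an equal observed quit rate of 45% in both groups purely for illustration; the only figures taken from the paper are the drop-out proportions (one-third vs. two-thirds), so treat the specific outputs as illustrative, not as the paper's actual counts.

```python
# Illustrative only: hypothetical numbers chosen to mirror the reported
# proportions (equal observed quit rates; drop-out of 1/3 vs 2/3).

def quit_rate_if_dropouts_smoke(observed_quit_rate, dropout_rate):
    """Recode every lost subject as a continuing smoker (the authors' assumption)."""
    return observed_quit_rate * (1 - dropout_rate)

OBSERVED = 0.45  # hypothetical: same quit rate among followed-up subjects in both groups

non_users = quit_rate_if_dropouts_smoke(OBSERVED, 1/3)
ecig_users = quit_rate_if_dropouts_smoke(OBSERVED, 2/3)

print(f"non-users: {non_users:.2f}, e-cig users: {ecig_users:.2f}")
# prints "non-users: 0.30, e-cig users: 0.15" -- the identical observed rate
# becomes a spurious halving, created entirely by the differential drop-out
```

The point of the sketch is that the "half as likely" result falls straight out of the drop-out rates alone: with this assumption, the reported ratio is fixed by who went missing, no matter what the observed data said.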
Needless to say, the peer-review process let the authors get away with all of this: they wrote a paper about e-cigarettes’ effects on quitting attempts even though they had such loss-to-followup that they should never have attempted it; they then reported and emphasized a result that was entirely an artifact of an extreme and unlikely assumption; they failed to highlight and analyze the radically different loss-to-followup rate, a far more interesting result than what they reported; and, indeed, they failed to analyze the importance of loss-to-followup and sample selection bias at all.
The authors’ excuse for making their absurd and radical assumption is the claim that it represents an “intention to treat” (ITT) analysis. They could not be more wrong. ITT refers to a data analysis option of comparing the outcomes for two groups in an experiment (RCT) based on what treatment they were assigned, not what treatment they actually got (if, e.g., they were assigned to take a course of a drug but refused to take it after having some side effects, they would be included in the treatment group rather than the non-treatment group when calculating ITT statistics). There are reasons for doing such an analysis when analyzing experimental data (though there are also reasons for analyzing the actual treatment rather than the assignment — they answer different questions, so it is not as if one is right and the other is wrong). So, do you see how ITT applies to making assumptions about the outcomes for missing subjects in an observational study? Neither do I. There are no intentions to do anything in an observational study, and dealing with missing outcome data is unrelated to the options for dealing with cases where realized treatment differs from assigned treatment.
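For contrast, here is a toy sketch of what ITT actually is. The records are entirely hypothetical (nothing here comes from the paper): in an RCT you can compute the cure rate by assigned arm (ITT) or by the treatment actually taken (as-treated). Note that in either analysis every subject still has an observed outcome; neither option is a rule for inventing outcomes for missing subjects.

```python
# Toy RCT records (entirely hypothetical). 'assigned' is the randomized arm;
# 'took' is what the subject actually did; 'cured' is the observed outcome.
subjects = [
    {"assigned": "drug",    "took": "drug",    "cured": True},
    {"assigned": "drug",    "took": "placebo", "cured": False},  # quit drug after side effects
    {"assigned": "drug",    "took": "drug",    "cured": True},
    {"assigned": "placebo", "took": "placebo", "cured": False},
    {"assigned": "placebo", "took": "placebo", "cured": True},
    {"assigned": "placebo", "took": "placebo", "cured": False},
]

def cure_rate(records, key, group):
    """Cure rate within the subjects whose `key` field matches `group`."""
    grp = [r for r in records if r[key] == group]
    return sum(r["cured"] for r in grp) / len(grp)

itt = cure_rate(subjects, "assigned", "drug")      # ITT: analyze by assignment
as_treated = cure_rate(subjects, "took", "drug")   # as-treated: by actual use
# ITT counts the subject who refused the drug against the drug arm, so here
# itt (2/3) is lower than as_treated (1.0) -- but both use observed outcomes only.
```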
We know that reviewers for medical journals are often just medics rather than scientists, and so are probably in over their heads when reviewing something that, like this, is not a simple RCT of medical treatments. Still, the ITT concept is solidly within the knowledge base of anyone who is qualified to review a paper on a simple RCT. So either the reviewers did not even read the text closely enough to notice this gaffe or they were not even that marginally qualified.
What is worse, the authors refer to their not-actually-ITT assumption — pretty much the most favorable assumption one could make about the missing data to support their anti-ecig claims — as “more conservative” than just analyzing the data they had. Seriously? A genuinely conservative assumption, thanks to those differential drop-out rates, would be to assume that everyone who was lost to follow-up had quit smoking. That would have produced the adjusted estimate that the e-cigarette group was twice as likely to have quit smoking as the others. To go really conservative, they could have assumed that all the e-cigarette users who dropped out had quit while all the others still smoked. While it is difficult to estimate the numerical results for that from what is reported, it would clearly favor e-cigarettes by even more than that factor of two, something in the range of four times as likely.
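Continuing the same illustrative setup as before (a hypothetical 45% observed quit rate in both groups, with only the 1/3 vs. 2/3 drop-out proportions taken from the paper), here is a sketch of how the opposite blanket coding of the missing subjects reverses the direction of the result. The exact magnitudes depend on the paper's actual counts, so no specific ratio is asserted here.

```python
# Hypothetical numbers again (illustrative only, not the paper's counts).
OBSERVED = 0.45  # assumed quit rate among followed-up subjects, equal in both groups

def quit_rate_if_dropouts_quit(observed_quit_rate, dropout_rate):
    # retained subjects quit at the observed rate; every dropout counts as a quitter
    return observed_quit_rate * (1 - dropout_rate) + dropout_rate

non_users = quit_rate_if_dropouts_quit(OBSERVED, 1/3)   # roughly 0.63
ecig_users = quit_rate_if_dropouts_quit(OBSERVED, 2/3)  # roughly 0.82
# The same data now favor the e-cigarette users: with differential drop-out,
# any blanket assumption about the missing subjects drives the reported result.
```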
[To get more technical: It is not entirely clear that the authors were intentionally lying about doing a conservative analysis, despite actually making an anti-conservative assumption that hugely favored their biases. Another explanation is that their understanding of research is limited to drug trials, where the object can be to show that the treatment works even if you make worst-case assumptions. In that case, unlike the present one, “conservative” means “if you are missing data, assume it is all whatever would make the treatment look worst”. It so happens that they did not even get that right — the more extreme assumption would be that all the lost e-cigarette users were still smoking but all the others had quit. But this is not conservative in the present context because this was not a study where it was appropriate to take an extreme hypothesis (“this does not work at all”) and see if the data is completely incompatible with it even under extreme assumptions. That approach is fine in some cases, so long as you do not (as they did) report the assumption-based results as if they were meaningful other than as extreme sensitivity tests. But you can never do that for a sloppy study like this, as evidenced by the fact that the assumptions simply created the grossly misleading results. The authors basically turned the sloppiness of their study (losing most of the subjects) into their reported result. This was a clear lie/error on the part of the authors that the peer-review process did nothing to correct.
It turns out that such non-conservative “conservative” assumptions have some relationship to the actual ITT concept (e.g., when analyzing by ITT, if the drug does cure the disease but many people stop using it because of side effects, then the analysis understates the effectiveness of the drug if taken, because those who stopped and did not get its benefits are counted against it). But it still should be completely obvious that it is not the same as the ITT concept.]
Also notable is the fact that 75% of the eligible subjects refused to enroll in the smoking cessation program. This introduces huge potential for selection bias which the authors do not acknowledge. The minimal proper response to this is to report whatever is known about those who did not enroll (age distribution, etc.) to see if they look even superficially similar to those who did. Almost certainly they did not, but we will never know because the authors hid this information and the reviewers failed to take the obvious step of telling them to add it. We do know that one-fourth of the subjects who enrolled in the stop smoking program had used an e-cigarette, at least once in the previous 30 days. This is so much higher than the population average that it suggests major selection bias.
Given all that, of course, it is no shock that the paper is rife with other problems. Reading the introduction, you would think that e-cigarettes were medicines whose role in the world is to be imposed on people, and people are just machines to be fixed. No real shock there in a paper by medics (see also), but it spills over into ignoring the overwhelming evidence about the role of e-cigarettes in the world in favor of a few inconsequential RCTs.
Once again, the lesson is that the journal review process does almost nothing to prevent errors, ranging from utterly invalid analyses to out-and-out misreporting in the text. Once again, that is why truth-seeking sciences circulate papers for real peer review, which looks like what appears in this post, rather than just churning them into journals based on the obviously non-useful pabulum that was submitted by the reviewers for Cancer.
[Update P.S.: This is somewhat tangential to the points of this post, but it is worth noting and does add to the content here.
It was called to my attention (h/t Gregory Conley) that Stanton Glantz called Borderud the “best study on e-cigs for cessation so far” and interpreted it as showing that they hurt cessation attempts. So, basically, he is impressed by a study of an extremely non-representative population (high age; narrow geography; suffering from cancer and its treatment with all the effects that has), in an extremely unnatural setting, with biases from only 1/4 agreeing to enroll and 1/2 lost to followup, and that did not even really measure the effectiveness of e-cigarettes (it only measured if someone happened to have recently used one, not whether they were using them to try to quit; also note that it would miss those who already succeeded in quitting using e-cigarettes). And the result he highlighted was just made up. I did not go into most of that in this post because I wanted to focus on the bright-line errors of the authors, not the many ways that the study was uninformative about the world.
Glantz explicitly believed that what the authors did was an ITT analysis and showed no sign of noticing that the numbers did not match the prose, in addition to failing to understand the many reasons that the study is generally useless. This is the type of person who “peer-reviews” journal articles about tobacco — someone who cannot even recognize the most glaring errors. Presumably neither Glantz nor the authors had any inkling that perhaps a study of a few self-selected cancer patients in a clinical setting is not the best way to understand people’s consumer choices. Public health research in general is bad, but anti-tobacco “research” is worse still. The one thing that is accurate is that it does, in fact, undergo peer review — the reviewers genuinely are peers of the authors who have little expertise in how to do or understand research. Of course, people think “peer review” means “expert review” and do not see the truth that is hidden in plain sight.]