Peer review of: Dunbar et al. (Rand Corp), Disentangling Within- and Between-Person Effects of Shared Risk Factors on E-cigarette and Cigarette Use Trajectories From Late Adolescence to Young Adulthood, Nicotine & Tobacco Research, 2018

by Carl V Phillips

For an explanation of what this post is, please see this brief footnote post.

The paper reviewed here is available at Sci-Hub. The paywalled link is here.


The typical “gateway” paper consists of observing the exposure of whether subjects (typically teenagers) have, at baseline, engaged in a particular behavior (vaping, in this case), and then observing the association with an outcome behavior (in this case, smoking). There is also an even worse collection of papers that do not even assess the order of events and simply look at whether prevalent ever-exposed status is associated with prevalent smoking. All of these suffer from the obvious fatal problem that a positive association is inevitable because inclination to ever vape is associated with inclination to ever smoke. In a counterfactual world in which vapor products did not exist, someone who vaped in the real world would be more likely to smoke than average, and this would obviously not be caused by (nonexistent) vaping. In short, since a positive association is inevitable, regardless of whether the hypothesis “vaping causes smoking” is true, observing a positive association obviously tells us nothing about the hypothesis.
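The inevitability of the association can be illustrated with a toy calculation. The sketch below is not from the paper and uses entirely made-up numbers: it assumes a single binary latent trait ("inclined to use tobacco products") that raises the chance of both vaping and smoking, and it assumes vaping has no effect whatsoever on smoking. The names `P_INCLINED`, `P_VAPE`, and `P_SMOKE` are hypothetical parameters chosen for illustration.

```python
from itertools import product

# All numbers are hypothetical, chosen only to illustrate the logic.
P_INCLINED = 0.30                     # share of people inclined to use tobacco products
P_VAPE = {True: 0.50, False: 0.02}    # P(vapes | inclined?)
P_SMOKE = {True: 0.40, False: 0.02}   # P(smokes | inclined?); vaping is NOT an input


def joint():
    """Yield (inclined, vapes, smokes, probability) for every combination."""
    for inclined, vapes, smokes in product([True, False], repeat=3):
        p = P_INCLINED if inclined else 1 - P_INCLINED
        p *= P_VAPE[inclined] if vapes else 1 - P_VAPE[inclined]
        p *= P_SMOKE[inclined] if smokes else 1 - P_SMOKE[inclined]
        yield inclined, vapes, smokes, p


def smoking_risk(vapes_value):
    """P(smokes | vapes == vapes_value), computed exactly from the joint distribution."""
    num = sum(p for _, v, s, p in joint() if v == vapes_value and s)
    den = sum(p for _, v, _, p in joint() if v == vapes_value)
    return num / den


risk_vapers = smoking_risk(True)
risk_nonvapers = smoking_risk(False)
relative_risk = risk_vapers / risk_nonvapers

print(f"smoking risk among vapers:     {risk_vapers:.3f}")
print(f"smoking risk among non-vapers: {risk_nonvapers:.3f}")
print(f"relative risk:                 {relative_risk:.2f}")
```

Under these assumed numbers, vapers come out roughly four times as likely to smoke as non-vapers, even though the model contains no causal path from vaping to smoking. Different assumed probabilities would give a different ratio, but any setup in which the same trait raises both behaviors produces a relative risk above 1.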

The present paper attempts to improve upon the standard worthless analysis. This is a commendable goal, and there is information value in what was done (unlike most gateway papers). However, the contributed information is very modest and does not actually support the authors’ conclusions. In particular, they claim that their results support the gateway hypothesis, and that they do so in ways that the usual longitudinal studies do not. This is simply false.

Though a few dishonest actors in this space pretend otherwise, everyone knows that different people have different propensities to use a tobacco product, and the degree of propensity is going to be highly correlated across products, creating an obvious confounding problem. There are often covariates available that are associated with the latent propensity variable, and the more honest (but still fatally flawed) attempts at assessing gateway effects attend to them. However, the authors usually make the mistake common throughout health research: “We threw all the variables we happened to have into a single model, and therefore there must be no remaining confounding.”

The present authors go in a different direction and focus on how well the covariates they happen to have (mental health scores, other substance use, and the usual simple demographic information) predict the exposure. Their basic results show the usual expected patterns: Inclination to use one drug or product is associated with greater inclination to use another, as are what society judges to be inferior states of mental health.

The bells and whistles added by these authors consist of separate within- and between-person analyses. Between- vs within-subject analysis can do various things. But in this case there is really only a single somewhat interesting question they tease out, whether the variables (this particular set of variables that they happen to have!) predict whether someone starts smoking once you already know she started vaping. The answer is no, based on the data they happened to have (three waves from 2015 to 2017, recruited students attending school near Los Angeles, a total of 2039, mostly age 19 at the final wave) and the model they used.

This is interesting as a normal incremental bit of science. However, it suffers from the limits faced by all epidemiology. There are no constants in epidemiology, and this will probably be different for a different population (place and time) and also, critically in this case, for a different set of collected covariates. It is notable that the authors indicate no awareness of the fact that their result may not generalize, even though this is far more true for the topic at hand than on average: Tobacco use behavior varies hugely across cultures, and they studied it in a single population (and studied a behavior that was first exploding during the study period and will, of course, not remain novel in the future).

But mildly interesting does not imply revelatory (despite reports that the academic tobacco controllers who fancy themselves real scientists are going gaga about it). It most certainly does not support the gateway hypothesis as the authors claim. The authors seem to think that their methods do something to overcome the problem of not being able to observe the latent “inclined to use (or increase use of over time) a tobacco product” variable. They do not. There is simply no reason whatsoever (and the authors present no argument that there is such a reason) to believe that these methods do not suffer from the exact same problem as the various other analyses that lack the particular bells and whistles.

The authors also seem to be oblivious to the difference between predictors and causes. In particular, regarding the observation that the covariates lose their predictive value about smoking once it is known someone vapes, they suggest that if this were not the case then trying to reduce cannabis use could reduce smoking, but since it is the case then discouraging vaping will reduce smoking. This is fatally flawed reasoning.

To fully understand the flaws in their reasoning, consider an analogy (not realistic, obviously, but illustrative): Is eating at a fast-food salad place a gateway to eating at the burger-and-fries place in the same neighborhood? As covariates we have measures of the transport routes someone uses (say, public transit routes or roads they commonly drive) and their job description. The covariates are pretty good predictors of between-person variation: People who travel routes that access that neighborhood are far more likely to go to the burger place than those who do not; people with jobs with wages associated with eating out at the fast food level are also more likely, especially if the jobs probably include a decent lunch break and might be located in that neighborhood. Now you consider people who have eaten at the salad place (the main exposure of interest) and observe how much more likely than average they are to later go to the burger place. And, lo, the covariates no longer help you predict that. Of course they do not. They have been screened (in the predictive logic sense of the word) by the salad data: You now know for sure that they are sometimes in the neighborhood and that they eat fast food. The covariates that roughly predict each of these are now uninformative, just as “has vaped” renders rough predictions about whether someone would ever use a tobacco product uninformative.

Does that somehow suggest that the salad place [analogously: vaping] is a gateway? Of course not. If it did not exist, they would still be in the neighborhood and in the market for fast food [still be inclined to use tobacco products]. Indeed, the salad place [vaping] is probably protective against eating at the burger place [smoking] (if it did not exist they would have gone somewhere else [consumed a different product] from the start). These are exactly the same problems with the standard gateway longitudinal analyses. Despite their rhetoric to the contrary, the authors have done nothing to improve on that.

In fact, what they have demonstrated is that (for this particular dataset), the latent variable “propensity to use any tobacco product”, the confounder that renders all gateway papers to date uninformative, is such a strong predictor that if you have a very strong proxy for it (having vaped) then all the covariates that supposedly control for confounding are uninformative. Contrary to how the authors interpreted it, this observation tends to support the conclusion that those other variables have limited value as deconfounders and thus the main criticism of all the gateway claim literature is actually a bit stronger than it was before. That is, we now can be more confident in the existing belief (among experts) that the deconfounder variables are not actually very good at controlling for the confounder of interest.
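The screening logic can be made concrete with a small exact calculation. This is a sketch under stated assumptions, not a reconstruction of the authors' model: a binary latent propensity, a covariate that is a weak proxy for it (think of other substance use), vaping as a strong proxy, and smoking that depends only on the propensity. All probabilities and names (`P_COVARIATE`, `P_VAPE`, and so on) are hypothetical.

```python
from itertools import product

# Hypothetical numbers for illustration; none come from the paper.
P_INCLINED = 0.30
P_COVARIATE = {True: 0.60, False: 0.20}  # weak proxy for the propensity
P_VAPE = {True: 0.50, False: 0.02}       # strong proxy for the propensity
P_SMOKE = {True: 0.40, False: 0.02}      # depends only on the propensity


def prob(inclined, covariate, vapes, smokes):
    """Joint probability of one combination; all three observables are
    independent of one another once the propensity is fixed."""
    p = P_INCLINED if inclined else 1 - P_INCLINED
    for table, value in ((P_COVARIATE, covariate), (P_VAPE, vapes), (P_SMOKE, smokes)):
        p *= table[inclined] if value else 1 - table[inclined]
    return p


def smoking_risk(covariate, vapes=None):
    """P(smokes | covariate), additionally conditioned on vaping status if given."""
    num = den = 0.0
    for inclined, v, s in product([True, False], repeat=3):
        if vapes is not None and v != vapes:
            continue
        p = prob(inclined, covariate, v, s)
        den += p
        if s:
            num += p
    return num / den


# How much does the covariate tell us about smoking, before and after
# we learn that the person vapes?
rr_overall = smoking_risk(True) / smoking_risk(False)
rr_among_vapers = smoking_risk(True, vapes=True) / smoking_risk(False, vapes=True)

print(f"covariate relative risk, everyone:    {rr_overall:.2f}")
print(f"covariate relative risk, vapers only: {rr_among_vapers:.2f}")
```

With these assumed values, the covariate's relative risk for smoking is well above 2 in the whole population and falls to only slightly above 1 among vapers. Nothing in the model makes vaping a cause of smoking; the covariate loses predictive value simply because "has vaped" already reveals most of what the covariate could have revealed about the propensity.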

This is a somewhat interesting result, and it would move the science forward incrementally if the authors had actually reported it accurately and authors in this space cared about improving their work.

Another mildly interesting result (again, with very limited implications due to the use of one set of measurements for one population) is that poorer mental health status is associated with smoking uptake but not vaping. The authors fail to note this among their conclusions, but this is plausibly indicative of a recognition that smoking is a self-destructive act, while vaping is not.

Many of the results are presented using test statistics and uninterpretable measures of association. This is not appropriate for a paper of this kind. It is not difficult to translate those statistics into a relative risk (or, better, absolute risk) measure that the reader can make sense of without doing their own calculations. This suggests the authors are not genuinely trying to communicate their results, but only their (erroneous) conclusions. Or perhaps they are just indulging in using interesting methods to crank out a lot of results, seemingly unaware that since these results do not extrapolate across populations all that well, overly precise details like this are pointless.

The methods are fairly well described, with references to question banks or explicit statements of what questions were asked. There appear to be no serious problems with misrepresenting important variables or simply not having good measures, as is common in other papers (e.g., only measuring “ever tried a single puff” and characterizing this as being a vaper).

The model fishing that is probably present in the paper is that typically found in economics papers, rather than epidemiology papers. That is, instead of playing with which variables and functional forms to use (which looks pretty clean) they used unnecessarily complicated models to examine a fairly simple relationship. (Any additional precision provided by the complicated models is just window-dressing, since the precision of estimates is not really a relevant issue here.) They undoubtedly tried various versions of these models, without acknowledging this fact, and reported only one. It seems likely these authors picked the model with the statistically “best” result, rather than the politically “best” result as is common in this space.

The Introduction suffers from the problem that conclusions from previous studies in the space are presented uncritically, even when they have obvious fatal flaws or are total junk. Indeed, elsewhere in this paper, the authors demonstrate sophistication that is not typical in public health research, and thus presumably know many of the prior papers are obvious junk. Yet the literature review still seems like it was written by an undergraduate, as is typical. This is a major problem, but otherwise the Introduction is far better than what is typical in the space (an amateurish politicized essay about the broad subject matter that usually just demonstrates the authors’ incompetence in the field). It actually introduces the reader to the relevant background and concepts for understanding the research, and includes almost nothing more.

The Discussion is a proper analysis of the results, not a tangential political rant. However, the content of it is fatally flawed for the reasons noted above.

The authors falsely declare they have no conflicts of interest. They work for a company that depends on U.S. government health agency grants, in particular for this and similar work, and those agencies have clearly indicated that they endorse gateway claims, and they make a practice of funding those who endorse them. This is a clear financial conflict of interest, even if the authors have no personal political preferences that also favor the conclusions they reached.

8 responses to “Peer review of: Dunbar et al. (Rand Corp), Disentangling Within- and Between-Person Effects of Shared Risk Factors on E-cigarette and Cigarette Use Trajectories From Late Adolescence to Young Adulthood, Nicotine & Tobacco Research, 2018”

  1. Hello Carl,

    This is a good informative analysis, but one bit is unclear to me. You say that the ‘within person’ analysis examined ‘whether the variables predict whether someone starts smoking once you already know she started vaping’. I am not sure this is what they did. This is the authors’ explanation: ‘For example, more frequent marijuana use than one’s peers was associated with more frequent EC and cigarette use than one’s peers; however, reporting higher marijuana use from 1 year to the next (ie, relative to “typical” use) did not affect subsequent EC or cigarette use’. So it appears that they examined whether changes in the covariates during the year of the study generated changes in use of nicotine products (they also do not seem to be talking about ‘yes/no’ use and only for EC-cig direction, but about frequency of use of both products). Do correct me if I am getting it wrong. But if this is what they did, it seems wholly irrelevant to the question of whether vaping causes smoking.

    Peter Hajek

    • Carl V Phillips

      Hi, Peter. Thanks for the feedback.

      So part of the issue here is one of my (in development) protocol for these pieces, trying to find a balance among focus on a critical analysis of key take-away points, summarizing the work beyond that, and length, as well as between feeling like a journal review vs feeling like an essay. I will put you down as a vote for “more detail summarizing the paper”.

      You are correct that they did not focus on yes-no questions (like initiation), as most papers do. They do emphasize this in the article. I thought about it and could not figure out any reason that this was particularly important to discuss, and so did not. However, I do realize that in glossing over that difference from the norm, what I wrote is a bit misleading about the analysis (not about what is wrong with it, which is still valid, but about what exactly they did). Before porting these to their new home, I will probably edit them some (since I am still developing protocol and want them to all fit it) and so I will accept the amendment and correct that omission.

      The cruxes of the analysis that I focused on were twofold. The more important one, really, is the simple repetition of the usual error: suggesting that vaping-smoking association was causal rather than being easily explained (perhaps not fully, but they offer no reason to doubt that it is fully) by the latent propensity variable I discuss. In that sense, there is nothing special about this paper at all: they make the same stupid mistake that every tobacco control paper in this space makes. Ironically, even as they have passages that show sufficient sophistication that they must know confounding can explain the association, they have other passages that basically declare, without basis, that the association is causal.

      The more technical point, the bells and whistles that you address, is emphasized in the Abstract (“The shared risk factors examined here did not affect escalations in e-cigarette or cigarette use over time within individuals, but likely influence which youths use these products.”) with more detail in the Discussion (in and around: “But at the within-person level, we observed no effects of third variables on either the direct associations between use of EC and later use of cigarette, or direct associations between use of cigarettes and later use of ECs.”). It is those statistics that seem to be at the heart of their worldly analysis of what this means and are what they (and others, based on my very limited observations) think is interesting about this. E.g., they suggest that therefore there is no point in trying to intervene on other drug use, but only on vaping, as if every association in sight were causal (*SMH*).

      The result is somewhat interesting, just not for the reason they think it is, because of that mistake.

      • “Ironically, even as they have passages that show sufficient sophistication that they must know confounding can explain the association, they have other passages that basically declare, without basis, that the association is causal.”

        Because without those declarations and the conclusions they lead to, the researchers will not get any further grants from those funders. One area you might want to look into is whether you can get hold not just of the grants that funded research but the grant PROPOSALS that won those grants. I believe you will OFTEN find statements within those proposals reassuring the funders that the findings will be sculpted to fit the desired “message.”

        Here’s an example that I used in TobakkoNacht, p. 125:

        Here is an excerpt from the original research grant proposal to Clearway’s legal predecessor, the Minnesota Partnership Acting Against Tobacco (MPAAT), a very openly antismoking organization (Just note its name!) also funded by MSA tax dollars.

        “We believe that this research will provide public health officials and tobacco control advocates with information that can help shape adoption and implementation of CIA [Clean Indoor Air] policies, and prevent their repeal [and] contribute to MPAAT’s overall mission by providing information that enables adoption and successful implementation of policies to protect employees and the general public from secondhand smoke exposure.”

        Note that the ONLY “policies to protect… etc” that were acceptable to MPAAT were total bans: considerations of ventilation alternatives and such things were simply dismissed.

        The research in question of course found (Surprise! Surprise!) that bans had no adverse consequences for bar employment!

        Of course a closer look reveals the statistical trick used: they lumped in all restaurant employment, most of which had already banned smoking before the ban, and thus the hit to bar employment was simply swallowed up in the far larger general growth of the restaurant industry during the period studied.

        The grant proposals can be VERY revealing if you can get hold of them!

        – MJM

  2. I’m a simple country lawyer, so please bear with me. As I understand it, the authors of the study recognize that people who vape are more inclined to smoke because the behaviors are similar and have similar appeal. Therefore, the same kind of people who engage in one behavior are more likely to engage in the other behavior. Moreover, people who smoke are more likely to vape because many of them perceive vaping as a means to quit smoking. Evidently, the authors understand these things and tried to account for them in their study. But, frankly, I’ve read your analysis twice and still don’t understand how they went about that or how it would even be possible.

    In any event, in a gateway study, it seems critically important at the outset to know which behavior came first. So, of the 2,039 subjects, how many non-smokers vaped and subsequently took up smoking?

    Par. 7, line 1: delete either “are” or “despite”
    Par. 7 line 6: “no” should be “not”
    Par 9, line 5: “the” should be “they”

    • Carl V Phillips

      Your confusion is legitimate. They do seem to understand about the same types of people (not necessarily the intentional switching, but at least the common propensities). All the studies in this space suffer from not doing anything to seriously address that. Thus my point that the associations will exist whether or not the causal hypothesis (gateway) is true. This study is no exception. Do they seem to think it is? Yes, apparently. On what basis? I too could not figure that out for sure. The implicit claim just sort of appears without explanation. Presumably they think it comes from that within-subject result they emphasize in their conclusions and I emphasize in the critique. But it does not follow from that.

      Re order and such, see my exchange with Prof Hajek: They actually focused on changes in behaviors rather than initiation events. This is actually a somewhat better practice. (Someone can ever smoke before ever vaping, but vaping could still be what causes them to become a smoker or heavier smoker.) Order matters some, but it is not nearly as important as some of the rhetoric (from “our side”) implies it is.

      Thanks for the typo fixes. Changed. I look forward to employing an editor!

  3. Wish I could edit my comment because it may not be entirely clear. I don’t understand how the authors rationalized their conclusion about the gateway effect because I don’t understand “covariates lose their predictive value about smoking once it is known someone vapes.” Could you walk me through that please?

    BTW: Par. 10, line 8: “noting” should be “nothing”

    • Carl V Phillips

      It’s never all the typos, is it.

      See my reply to Hajek for what they specifically said and that I paraphrased. Your confusion is well-earned; you are not being naive. They fail to defend that conclusion, or even present a real argument. Here is what I infer they are thinking from their innuendo: “When doing a simple ‘throw in everything without hierarchy’ analysis, or the between-subject analysis, it looks like other drug and psych factors *cause* [error] both smoking and vaping. But once someone vapes (or smokes), those factors no longer are associated with increases in tobacco product usage, so they are not what is causing it. Instead, since vaping predicts more smoking in the future, within-subject, it is only vaping that is *causing* [error] more smoking, so it is a gateway.” That seems to be it, and it is complete nonsense. As I explain, the covariates are rough proxies for the latent (unmeasurable) variable “propensity to use some tobacco product”, not actually causes (the psych measures may actually be causes, but it is better to think of them as causes of the propensity, and only causes of product use via that). Actually using a product is a much better proxy for that, but again that does not make it a cause.

  4. Many thanks. It’s comforting to know that I didn’t understand it, not because I’m an idiot, but rather because, as you say, it’s “complete nonsense.” However, I’ve read enough of your posts to recognize that the latter reason does not necessarily rule out the existence of the former deficit.
