# Sunday Science Lesson: Serious sampling bias

by Carl V Phillips

Sorting out truths from lies requires an understanding of the underlying science.  Since I am, arguably more than anything else in my professional life, a science teacher, I thought it might be worth posting a few focused lessons on scientific points that are key to understanding our topic area.  Someone who reads all of my posts would probably pick up these same points en passant, but I thought I would see if there is some value in some periodic posts about a particular lesson that is not buried in the specifics of a particular topical discussion.

Most consumers of social science reports, a category that includes epidemiology, are vaguely aware that what they see is based on some sample of the total population.  They are seldom aware of quite how much sampling properties affect the results.  This is understandable, since those who conduct such science are often equally unaware of it, and even when they understand it in theory, they ignore it when presenting their results.

The error statistics that you commonly see (confidence intervals and such) are based on the potential for purely random sampling error.  That is, they offer a rough measure of about how likely it is that luck of the draw produced a misleading result, such as if you flipped a coin 100 times and got a non-representative result of 60 heads.   (Note that the numbers you see are really just that — a rough measure.  Contrary to common belief, the exact borders of the confidence interval mean nothing of importance, but that is a lesson for another day.)  Those statistics are only valid if we assume that the only error is random.  Other types of error (non-random sampling bias, measurement error, etc.) ought to be represented in summary error statistics too.  My fame in epidemiology is largely due to my work that argues this point and inspired some efforts to create partial solutions, but such efforts have been a failure to date, and so the reader is left having to recognize the unreported error.
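The coin-flip example above can be made concrete with a small simulation (the numbers are purely illustrative): estimate how often pure luck of the draw produces a result as lopsided as 60 heads in 100 fair flips.

```python
import random

random.seed(0)  # fixed seed so the demonstration is reproducible

def heads_in_trial(n=100):
    """Flip a fair coin n times and return the number of heads."""
    return sum(random.random() < 0.5 for _ in range(n))

# How often does purely random sampling error give 60 or more heads
# out of 100 fair flips?
trials = 100_000
extreme = sum(heads_in_trial() >= 60 for _ in range(trials))
print(f"P(60+ heads in 100 fair flips) ~ {extreme / trials:.3f}")  # roughly 0.03
```

This is the only kind of error the usual reported statistics capture: the answer (about a 3% chance) says nothing about whether the coin-flipping procedure itself was rigged, which is the analogue of non-random sampling bias.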

In some cases this error represents a relatively minor adjustment in the results.  If a study attempts to get a representative sample but seems like it might have failed to do so (e.g., because people with a particular characteristic seem slightly more likely to refuse to participate in a study), the estimated effect will be biased away from the true value.  There are ways to try to adjust for this, at least roughly.
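A toy simulation, with entirely made-up numbers, sketches the kind of rough adjustment alluded to here: if people in one group are less likely to participate, and you know (or can guess) by how much, reweighting respondents by the inverse of their response probability pulls the estimate back toward the true value.

```python
import random

random.seed(1)

# Hypothetical population, for illustration only: half "type A"
# (true outcome rate 0.30) and half "type B" (true outcome rate 0.10),
# so the true overall rate is 0.20.
population = ([("A", random.random() < 0.30) for _ in range(50_000)]
              + [("B", random.random() < 0.10) for _ in range(50_000)])

# Type B people are only half as likely to agree to participate,
# so the sample is non-random in a known way.
respond_prob = {"A": 0.8, "B": 0.4}
sample = [(g, y) for g, y in population if random.random() < respond_prob[g]]

# Naive estimate from the biased sample: too high, because the
# high-outcome group is over-represented.
naive = sum(y for _, y in sample) / len(sample)

# Rough adjustment: weight each respondent by 1 / P(respond | group)
# (inverse-probability weighting), which only works because we assumed
# we know who under-responded and by how much.
adjusted = (sum(y / respond_prob[g] for g, y in sample)
            / sum(1 / respond_prob[g] for g, _ in sample))

print(f"naive estimate: {naive:.3f}, reweighted estimate: {adjusted:.3f}")
```

The adjustment depends entirely on being able to describe who is missing from the sample, which is exactly what is impossible in the convenience-sample cases discussed below.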

But in other cases the sample is so clearly and completely unrepresentative that it is just nonsense to even calculate some of the statistics you see.  A good example of that is a recent paper by anti-THR activists that has provoked several comment threads.  The authors mined comments in sections of e-cigarette message boards (possibly in violation of terms-of-service) that reported adverse effects that the posters think may have been caused by their e-cigarette use.  This is not an entirely illegitimate or even useless exercise.  There is substantial value in compiling adverse event reports to give some idea of what possible outcomes to look for in further research.  Indeed, a robust enough collection of the right kind of adverse event reports, if they find a consistent problem, is good evidence the problem is real.

What is not legitimate, however, is to calculate statistics based on that extremely unrepresentative sample.  Instead of sticking to what was useful, the authors engaged in various bits of fancy intellectual masturbation with their paltry data, and reported such statistics as what percentage of the reported results were negative rather than positive.  Um, yeah.  If you search the forum pages that discuss possible adverse events, you are going to find mostly adverse events.  I doubt I have to explain why this sampling method cannot produce any useful estimates about how often particular events occur.  If you sampled people sitting in the waiting rooms of medical clinics, you would also find them reporting mostly negative health conditions.  It would not even make sense to try to estimate the distribution of which negative health states are most common from that sample, since people with some conditions are far more likely to be there than others.

The same principle applies to the many surveys of e-cigarette users.  Existing surveys, and all those that are likely to be reported in the near future, consist of convenience samples of users who are highly motivated to respond, and are either customers of online merchants or are politically/socially active in the vaping community and respond to postings.  Surveys that are representative of the population, like those conducted by the US government, have not yet reported data on enough people who tried or used e-cigarettes to draw conclusions about the overall population.  Thus, we only know about the practices, habits, history, and success of people who are dedicated to vaping.

Despite this, the results are frequently reported as if they can provide information like how often use of e-cigarettes leads to successful smoking cessation (e.g., the recent report touted by the UK NHS).  They cannot do this for obvious reasons:  Anyone who found e-cigarettes unappealing and did not continue to buy them and does not frequent vaping social media is not going to be in the sample.  Even among current consumers of e-cigarettes, those motivated to respond to the survey will be biased toward those who are happiest about their experience and most excited about having found a good way to quit smoking.  There will be very few responses from, for example, casual vapers who were never regular smokers.
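The size of the distortion is easy to demonstrate with invented numbers. Suppose (purely hypothetically) that 25% of everyone who tried e-cigarettes quit smoking, but successful quitters are twelve times more likely to answer a vaping-community survey than everyone else:

```python
import random

random.seed(2)

# Hypothetical numbers for illustration only: 25% of those who tried
# e-cigarettes quit smoking (True = quit).
tried = [random.random() < 0.25 for _ in range(200_000)]

# Convenience sample: the probability of responding depends on the
# outcome itself (60% for happy quitters, 5% for everyone else).
respondents = [quit for quit in tried
               if random.random() < (0.60 if quit else 0.05)]

pop_rate = sum(tried) / len(tried)
survey_rate = sum(respondents) / len(respondents)
print(f"true quit rate {pop_rate:.2f}, "
      f"quit rate among respondents {survey_rate:.2f}")
```

With these assumed response rates, a true 25% success rate shows up as roughly 80% among respondents, and since the real response probabilities are unknown, there is no way to reweight the survey back to the truth.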

It is even worse than this.  We can generally make a guess about who responds to the surveys, as I just did, but we cannot even say with certainty that they are representative of the happiest and most dedicated vapers.  In the jargon, we simply do not know the sampling properties.  When using a convenience sample of highly motivated volunteers, you really have no idea who your study population represents.  Thus it is not really even appropriate to make claims like “among dedicated vapers, X% have completely quit smoking.”

This is not to say that the surveys are uninformative.  There is a lot of useful information to be gleaned.  Nor is it difficult to see the temptation to report some statistics that may be hopelessly biased by the weird sample.  (I did not look back to see whether our report — the first published survey of e-cigarette users — was guilty of that.  I am pretty confident I did not allow any glaring problems, but it is easy to not be sufficiently careful about acknowledging the limits of the sampling, so I am sure you could call me on something.)

So, if a study sample is a random draw from some identifiable population, but it is really not quite random in important ways, the results are biased but might still be in the right neighborhood.  Even if the sample is not random but is based on an identifiable population, there is some hope.  But when it is not even possible to describe who the population is that the non-random sample is drawn from, it is pretty much impossible to make sense of any statistics other than in reference to the specific study respondents, which is not very interesting and is pretty much never the way the results are presented.

### 12 responses to “Sunday Science Lesson: Serious sampling bias”

1. It’s strange to know they’re reading our forums and yet have not acknowledged all that is positive about our vaping experience, which is the bulk of every e-cig forum, i.e., the thousands that have stopped smoking.

2. Konstantinos Farsalinos

Since I am currently performing a huge survey of users (more than 11,000 already participated), Carl gives me a good chance to comment on that.

I absolutely agree with you that, inevitably, such surveys do not recruit a random sample and thus some of the information cannot be representative of the whole population of users. However, extremely important messages may come out.
Health benefits are not biased by motivation. Nicotine levels needed to stop or significantly reduce smoking, possible side effects and their duration, reasons for starting to use the e-cigarette, and several other kinds of information can be really valuable and are not subject to bias any more than a random sample of vapers recruited for personal interviews would be.
I agree that data on smoking cessation rates cannot be objective and will probably overestimate the potential of e-cigarettes, but still surveys should not be discredited.

• Carl V Phillips

I mostly agree. A survey is indeed a nice way to collect what are basically a collection of case studies which can be very useful for figuring out “how to make this work for you like others have” advice. Adverse events have a chance of being reasonably representative for minor problems, but major negative reactions will be under-represented because those who experienced them will have exited the survey population. We can be pretty sure that health benefits are substantially associated with motivation (having health benefits causes motivation, and being happy about something improves health, or at least perceived health) and so that result is very unlikely to be representative.

• Konstantinos Farsalinos

No direct proof of any health benefits could ever be derived from surveys, even if you could make sure that the sample is somewhat random. I absolutely agree with you on that. Surveys are useful for getting to know the studied group better, usually the proportion of people who are most motivated (as you have already stated). In the case of smoking and e-cigarettes, however, motivation is certainly one of the major causes of success. Without being motivated, no method could lead to smoking reduction or cessation.

• Carl V Phillips

I will circle back and argue the other side a bit. A survey can definitely demonstrate that self-perceived health benefits exist as well as any other study design. If you have 10,000 people describing health benefits, that is definitely evidence that the benefits exist and, moreover, that (unlike one-off reports or smaller collections) is evidence that they are quite common. There is great value there. What cannot be done is to say how probable such benefits are for the average person who switches to e-cigarettes (let alone everyone who merely tries them) — those are the statistics that are hopelessly biased by the sampling method that I was talking about. It is difficult, but it is best to resist the temptation to report, for example, something like “92% reported health benefits”.

• Konstantinos Farsalinos

I totally agree with you, Carl. And as I mentioned before, it all has to do with the way you present it. I think you should report what you have found, but make clear that this cannot be representative of the whole population of potential users….

3. (Please allow me to comment on this, as a non-scientist, as I was the manager of the largest forum used for collection of data by the researchers under discussion, and am more than familiar with the health-based issues illustrated there.)

Studies that are based on user-community surveys are useful, and probably valuable in a situation where there is little research, because they give us some information, or clues to information, that was not previously available. Where they fall down is the strict application of such data to the gen. pop., because the data is not applicable to that larger group. It is also debatable whether small-cohort gen pop studies where the results are multiplied up and then claimed as representative are even accurate, as anomalies may be noticed. There will always be sampling issues.

But the main problem with some surveys of the ecig user community is that there is an extreme financial bias to begin with, and a preconceived agenda. In that climate, such studies are clearly propaganda and nothing else. The recent study based on forum posts related to health issues is the prime example of this: almost every datum could be viewed as opposite to the true case. In fact, a perfect example of irony in studies. I won’t dignify such piffle with the name ‘clinical study’.

• Konstantinos Farsalinos

The problem is not the research, it is the way it is presented. Every kind of research has limitations, and every kind of research also has applications. It is the way you present it (and the message you try to convey from it) that can be misleading for both the public and even scientists (or in our case public health authorities).