Misleading systematic review of mindfulness studies used to promote Benson-Henry Institute for Mind-Body Medicine services

A seriously flawed “systematic review” of systematic reviews and meta-analyses of the effects of mindfulness on health and well-being alerts readers to how skeptical they need to be of what they are told about the benefits of mindfulness.

Especially when the information comes from those benefiting enormously from promoting the practice.

The glowing evaluation of the benefits of mindfulness presented in a PLOS One review is contradicted by a more comprehensive and systematic review which was cited but summarily dismissed. As we will see, the PLOS One article sidesteps substantial confirmation bias and untrustworthiness in the mindfulness literature.

The review was prepared by authors associated with the Benson-Henry Institute for Mind-Body Medicine, which is tied to Massachusetts General Hospital and Harvard Medical School. The institute directly markets mindfulness treatment to patients and training to professionals and organizations. Its website provides links to research articles such as this one, which are used to market a wide range of programs.

Recently PLOS One published corrections to five articles from this group concerning previous statements about the authors having no conflicts of interest to declare. The corrections acknowledged extensive conflicts of interest.

The Competing Interests statement is incorrect. The correct Competing Interests statement is: The following authors hold or have held positions at the Benson-Henry Institute for Mind-Body Medicine at Massachusetts General Hospital, which is paid by patients and their insurers for running the SMART-3RP and related relaxation/mindfulness clinical programs, markets related products such as books, DVDs, CDs and the like, and holds a patent pending (PCT/US2012/049539 filed August 3, 2012) entitled “Quantitative Genomics of the Relaxation Response.”

While the review we will be discussing was not corrected, it should have been.

The same conflicts of interest should have been disclosed to readers evaluating the trustworthiness of what is being presented to them.

Probing this review will demonstrate just how hard it is to uncover the bias and distortions that are routinely provided by promoters of mindfulness wanting to demonstrate the evidence base for what they offer.

The article is

Gotink, R.A., Chu, P., Busschbach, J.J., Benson, H., Fricchione, G.L. and Hunink, M.M., 2015. Standardised mindfulness-based interventions in healthcare: an overview of systematic reviews and meta-analyses of RCTs. PLOS One, 10(4), p.e0124344.

The abstract offers the conclusion:

The evidence supports the use of MBSR and MBCT to alleviate symptoms, both mental and physical, in the adjunct treatment of cancer, cardiovascular disease, chronic pain, depression, anxiety disorders and in prevention in healthy adults and children.

This evaluation is more emphatically stated near the end of the article:

This review provides an overview of more trials than ever before and the intervention effect has thus been evaluated across a broad spectrum of target conditions, most of which are common chronic conditions. Study settings in many countries across the globe contributed to the analysis, further serving to increase the generalizability of the evidence. Beneficial effects were mostly seen in mental health outcomes: depression, anxiety, stress and quality of life improved significantly after training in MBSR or MBCT. These effects were seen both in patients with medical conditions and those with psychological disorders, compared with many types of control interventions (WL, TAU or AT). Further evidence for effectiveness was provided by the observed dose-response relationship: an increase in total minutes of practice and class attendance led to a larger reduction of stress and mood complaints in four reviews [18,20,37,54].

Are you impressed? “More than ever before”? “Generalizability of the evidence”? Really?

And in the wrap-up summary comments:

Although there is continued scepticism in the medical world towards MBSR and MBCT, the evidence indicates that MBSR and MBCT are associated with improvements in depressive symptoms, anxiety, stress, quality of life, and selected physical outcomes in the adjunct treatment of cancer, cardiovascular disease, chronic pain, chronic somatic diseases, depression, anxiety disorders, other mental disorders and in prevention in healthy adults and children.

Compare and contrast these conclusions with a more balanced and comprehensive review.

The US Agency for Healthcare Research and Quality (AHRQ) commissioned a report from the Johns Hopkins University Evidence-based Practice Center.

The 439-page report is publicly available:

Goyal M, Singh S, Sibinga EMS, Gould NF, Rowland-Seymour A, Sharma R, Berger Z, Sleicher D, Maron DD, Shihab HM, Ranasinghe PD, Linn S, Saha S, Bass EB, Haythornthwaite JA. Meditation Programs for Psychological Stress and Well-Being. Comparative Effectiveness Review No. 124. (Prepared by Johns Hopkins University Evidence-based Practice Center under Contract No. 290-2007-10061–I.) AHRQ Publication No. 13(14)-EHC116-EF. Rockville, MD: Agency for Healthcare Research and Quality; January 2014.

A less detailed companion article was also published in JAMA Internal Medicine:

Goyal, M., Singh, S., Sibinga, E.M., Gould, N.F., Rowland-Seymour, A., Sharma, R., Berger, Z., Sleicher, D., Maron, D.D., Shihab, H.M. and Ranasinghe, P.D., 2014. Meditation programs for psychological stress and well-being: a systematic review and meta-analysis. JAMA Internal Medicine, 174(3), pp.357-368.

Consider how the conclusions of this article were characterized in the Benson-Henry PLOS One article. The article is mentioned only briefly, without any detail about its methods and conclusions.

Recently, Goyal et al. published a review of mindfulness interventions compared to active control and found significant improvements in depression and anxiety[7].


A recent review compared meditation to only active control groups, and although lower, also found a beneficial effect on depression, anxiety, stress and quality of life. This review was excluded in our study for its heterogeneity of interventions [7].

What the Goyal et al. JAMA Internal Medicine article actually said:

After reviewing 18 753 citations, we included 47 trials with 3515 participants. Mindfulness meditation programs had moderate evidence of improved anxiety (effect size, 0.38 [95% CI, 0.12-0.64] at 8 weeks and 0.22 [0.02-0.43] at 3-6 months), depression (0.30 [0.00-0.59] at 8 weeks and 0.23 [0.05-0.42] at 3-6 months), and pain (0.33 [0.03-0.62]) and low evidence of improved stress/distress and mental health–related quality of life. We found low evidence of no effect or insufficient evidence of any effect of meditation programs on positive mood, attention, substance use, eating habits, sleep, and weight. We found no evidence that meditation programs were better than any active treatment (ie, drugs, exercise, and other behavioral therapies).

The review also notes that evidence of the effectiveness of mindfulness interventions is largely limited to trials in which it is compared to no treatment, wait list, or a usually ill-defined treatment as usual (TAU).

In our comparative effectiveness analyses (Figure 1B), we found low evidence of no effect or insufficient evidence that any of the meditation programs were more effective than exercise, progressive muscle relaxation, cognitive-behavioral group therapy, or other specific comparators in changing any outcomes of interest. Few trials reported on potential harms of meditation programs. Of the 9 trials reporting this information, none reported any harms of the intervention.

This solid JAMA Internal Medicine review explains why its conclusions may differ from past reviews:

Reviews to date report a small to moderate effect of mindfulness and mantra meditation techniques in reducing emotional symptoms (eg, anxiety, depression, and stress) and improving physical symptoms (eg, pain).7– 26 These reviews have largely included uncontrolled and controlled studies, and many of the controlled studies did not adequately control for placebo effects (eg, waiting list– or usual care–controlled studies). Observational studies have a high risk of bias owing to problems such as self-selection of interventions (people who believe in the benefits of meditation or who have prior experience with meditation are more likely to enroll in a meditation program and report that they benefited from one) and use of outcome measures that can be easily biased by participants’ beliefs in the benefits of meditation. Clinicians need to know whether meditation training has beneficial effects beyond self-selection biases and the nonspecific effects of time, attention, and expectations for improvement.27,28

Basically, this article insists that mindfulness be evaluated in a head-to-head comparison to an active treatment. Without such a comparison, we cannot rule out that the apparent effects of mindfulness are nonspecific, i.e., not due to any active ingredient of the practice.

An accompanying editorial commentary raised troubling issues about the state of the mindfulness literature. It noted that limiting inclusion to RCTs with an active control condition and a patient population experiencing mental or physical health problems left only 3% (47/18,753) of the citations that had been retrieved. Furthermore:

The modest benefit found in the study by Goyal et al begs the question of why, in the absence of strong scientifically vetted evidence, meditation in particular and complementary measures in general have become so popular, especially among the influential and well educated…What role is being played by commercial interests? Are they taking advantage of the public’s anxieties to promote use of complementary measures that lack a base of scientific evidence? Do we need to require scientific evidence of efficacy and safety for these measures?

How did the Benson-Henry review arrive at a more favorable assessment?

The issue that dominated the solid Goyal et al. systematic review and meta-analysis is not prominent in the Benson-Henry review. The latter article hardly mentions the importance of whether mindfulness is compared to an active treatment. It doesn’t mention whether any difference in effect size for mindfulness can be expected when the comparison is an active treatment.

The Benson-Henry review stated that it excluded systematic reviews and meta-analyses if they did not focus on MBCT or MBSR. One has to search the supplementary materials to find that Goyal et al. was excluded because it did not calculate separate effect sizes for mindfulness-based stress reduction (MBSR).

However, the Benson-Henry review included narrative systematic reviews that did not calculate effect sizes at all. Furthermore, the excluded Goyal et al. JAMA Internal Medicine article summarized MBSR separately from other forms of meditation, and the more comprehensive AHRQ report provided detailed forest plots of effect sizes for MBSR with specific outcomes and patient populations.

Hmm, keeping out evidence that does not fit with the sell-job story?

We need to keep in mind the poor manner in which MBSR was specified, particularly in the early studies that dominate the reviews covered by the Benson-Henry article. Many of the treatments were not standardized and certainly not manualized. They sometimes, but not always, incorporated psychoeducation, other cognitive behavioral techniques, and varying types of yoga.

The Benson-Henry authors claimed to have performed quality assessments of the included reviews using a checklist based on the validated PRISMA guidelines. However, PRISMA evaluates the quality of reporting in reviews, not the quality of how the review was done. The checklist used by the Benson-Henry authors was highly selective in which PRISMA items it included, was never validated, and was simply eccentric. For instance, one item evaluated a review favorably if it interpreted studies “independent of funding source.”

A lack of independence of a study from its funding source is generally considered a high risk of bias. There is ample documentation of industry-funded studies and reviews exaggerating the efficacy of interventions supported by industry.

Our group received the Bill Silverman Prize from the Cochrane Collaboration for identifying funding source as an overlooked source of bias in many meta-analyses and, in particular, in Cochrane reviews. The Benson-Henry checklist scores a review ignoring funding source as a virtue, not a vice! These authors are letting trials and reviews from promoters of mindfulness off the hook for potential conflict of interest, including their own studies and this review.

Examination of the final sample of reviews included in the Benson-Henry analysis reveals that some are narrative reviews and could not contribute effect sizes. Some are older reviews that depend on a less developed literature. While optimistic about the promise of mindfulness, authors of these reviews frequently complained about the limits on the quantity and quality of available studies, calling for larger and better quality studies. When integrated and summarized by the Benson-Henry authors, these reviews were given a more positive glow than the original authors conveyed.

Despite claims of being an “overview of more trials than ever before”, Benson-Henry excluded all but 23 reviews. Some of those included do not appear to be recent or rigorous, particularly when contrasted with the quality and rigor of the excluded Goyal et al.:

Ott MJ, Norris RL, Bauer-Wu SM (2006) Mindfulness meditation for oncology patients: A discussion and critical review. Integr Cancer Ther 5: 98–108. pmid:16685074

Shennan C, Payne S, Fenlon D (2011) What is the evidence for the use of mindfulness-based interventions in cancer care? A review. Psycho-Oncology 20: 681–697.

Veehof MM, Oskam MJ, Schreurs KMG, Bohlmeijer ET (2011) Acceptance-based interventions for the treatment of chronic pain: A systematic review and meta-analysis. Pain 152: 533–542

Coelho HF, Canter PH, Ernst E (2007) Mindfulness-Based Cognitive Therapy: Evaluating Current Evidence and Informing Future Research. J Consult Clin Psychol 75: 1000–1005.

Ledesma D, Kumano H (2009) Mindfulness-based stress reduction and cancer: A meta-analysis. Psycho-Oncology 18: 571–579.

Burke CA (2009) Mindfulness-Based Approaches with Children and Adolescents: A Preliminary Review of Current Research in an Emergent Field. J Child Fam Stud.

Do we get the most authoritative reviews of mindfulness from Holist Nurs Pract, Integr Cancer Ther, and Psycho-Oncology?

To cite just one example of the weakness of evidence being presented as strong, take the bold Benson-Henry conclusion:

Further evidence for effectiveness was provided by the observed dose-response relationship: an increase in total minutes of practice and class attendance led to a larger reduction of stress and mood complaints in four reviews [18,20,37,54].

“Observed dose-response relationship”? This claim is based on Ott et al. [18], Smith et al. [20], Burke [37], and Proulx [54] (check these against the citations just above), which makes the evidence neither recent nor systematic. I am confident that other examples will not hold up if scrutinized.

Further contradiction of the too-perfect picture of mindfulness therapy conveyed by the Benson-Henry review.

A more recent PLOS One review of mindfulness studies exposed the confirmation bias in the published mindfulness literature. It suggested that a too-perfect picture of uniformly positive studies has been created.

Coronado-Montoya, S., Levis, A.W., Kwakkenbos, L., Steele, R.J., Turner, E.H. and Thombs, B.D., 2016. Reporting of positive results in randomized controlled trials of mindfulness-based mental health interventions. PLOS One, 11(4), p.e0153220.

A systematic search yielded 124 RCTs of mindfulness-based treatments:

108 (87%) of 124 published trials reported ≥1 positive outcome in the abstract, and 109 (88%) concluded that mindfulness-based therapy was effective, 1.6 times greater than the expected number of positive trials based on effect size d = 0.55 (expected number positive trials = 65.7). Of 21 trial registrations, 13 (62%) remained unpublished 30 months post-trial completion.


None of the 21 registrations, however, adequately specified a single primary outcome (or multiple primary outcomes with an appropriate plan for statistical adjustment) and specified the outcome measure, the time of assessment, and the metric (e.g., continuous, dichotomous). When we removed the metric requirement, only 2 (10%) registrations were classified as adequate.

And finally:

There were only 3 trials that were presented unequivocally as negative trials without alternative interpretations or caveats to mitigate the negative results and suggest that the treatment might still be an effective treatment.

What we have is a picture of trials of mindfulness-based treatment having an excess of positive studies, given the study sample sizes. Selective reporting of positive outcomes likely contributed to this excess of published positive findings in the published literature. Most of the trials were not preregistered and so it’s unclear whether the positive outcomes that were reported were hypothesized to be the primary outcomes of interest. Most of the trials that were preregistered remained unpublished 30 months after the trials were completed.
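The logic behind an "expected number of positive trials" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method Coronado-Montoya et al. actually used: it assumes two equal arms per trial, a two-sided z-test approximation, and a made-up list of per-arm sample sizes. Only the figures 109, 65.7, and d = 0.55 come from the quoted abstract.

```python
from statistics import NormalDist

def power_two_arm(d: float, n_per_arm: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test to detect a standardized
    mean difference d with two equal arms of n_per_arm participants."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = d * (n_per_arm / 2) ** 0.5  # expected z statistic under effect d
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)

def expected_positive_trials(d: float, arm_sizes: list[int]) -> float:
    """Expected count of statistically significant trials if every trial
    estimates the same true effect d: the sum of the per-trial powers."""
    return sum(power_two_arm(d, n) for n in arm_sizes)

# Hypothetical per-arm sample sizes, for illustration only.
hypothetical_arm_sizes = [15, 20, 25, 30, 40, 60]
expected = expected_positive_trials(0.55, hypothetical_arm_sizes)
print(f"expected positives among {len(hypothetical_arm_sizes)} trials: {expected:.1f}")

# Figures reported in the quoted abstract: 109 observed vs 65.7 expected.
excess_ratio = 109 / 65.7
print(f"observed / expected = {excess_ratio:.2f}")
```

The point of the sketch is that small trials have low power even when the treatment works as well as claimed, so a literature of mostly small trials should contain a good number of nonsignificant results. When nearly all of them come out positive, something other than the treatment is contributing.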

The Goyal et al. study originally planned to conduct quantitative analyses of publication bias, but abandoned the effort when they couldn’t find a sufficient number of the 47 studies that reported most of the outcomes they evaluated.


The Benson-Henry review produces a glowing picture of the quality of RCTs evaluating MBSR and the consistency of positive findings across diverse outcomes and populations. This is consistent with the message that they want to promote in marketing their products to patients, clinicians, and institutions. In this blog post I’ve uncovered substantial problems internal to the Benson-Henry review in terms of the studies that were included and the manner in which they were evaluated. But now we also have external evidence: two reviews without obvious conflicts of interest come to markedly different appraisals of a literature that lacks appropriate control groups and seems to be reporting findings with a distinct confirmation bias.

I could have gone further, but what I found about the Benson-Henry review seems sufficient for a serious challenge to the validity of its conclusions. Investigation of the claims made about dose-response relationships between amount of mindfulness practice and outcomes should encourage probing of other specific claims.

The larger issue is that we should not rely on promoters of MBSR products to provide unbiased estimates of their efficacy. This issue recalls very similar problems in the evaluation of Triple P Parenting Programs. Evaluations in which promoters were involved produce markedly more positive results than independent evaluations. Exposure by my colleagues and me led to over 50 corrections and corrigenda to articles that had previously declared no conflicts of interest. But the process did not occur without fierce resistance from those whose livelihood was being challenged.

A correction to the Benson-Henry PLOS One review is in order to clarify the obvious conflicts of interest of the authors. But the problem is not limited to reviews or original studies from the Benson-Henry Institute for Mind-Body Medicine. It’s time that authors be required to answer more explicit questions about conflict of interest. Ruling out a conflict of interest should be based on authors having to explicitly endorse having no conflicts, rather than on the basis of their not disclosing a conflict and then being able to claim it was an oversight that they did not report one.

Postscript: Who was watching at PLOS One to keep out infomercials from promoters associated with Massachusetts General Hospital and Harvard Medical School? To avoid the appearance of a conflict of interest, should the Academic Editor have recused himself from serving as editor?

This is another flawed paper for which I’d love to see the reviews.

COBRA study would have shown homeopathy can be substituted for cognitive behavior therapy for depression

If The Lancet COBRA study had evaluated homeopathy rather than behavioural activation (BA), homeopathy would likely have similarly been found “non-inferior” to cognitive behavior therapy.

This is not an argument for treating depression with homeopathy, but an argument that the 14 talented authors of The Lancet COBRA study stacked the deck for their conclusion that BA could be substituted for CBT in routine care for depression without loss of effectiveness. Conflict of interest and catering to politics intruded on science in the COBRA trial.

If a study like COBRA produces phenomenally similar results with treatments based on distinct mechanisms of change, one possibility is that background nonspecific factors are dominating the results. Insert homeopathy, a bogus treatment with strong nonspecific effects, in place of BA, and non-inferiority may well be shown.

Why homeopathy?

Homeopathy involves diluting a substance so thoroughly that no molecules are likely to be present in what is administered to patients. The original substance is diluted to one part per 100 parts alcohol or distilled water, and the diluted solution is then diluted again in the same way. Two such steps already give one part per 10,000. Carrying out the step six times in all leaves the original material diluted by a factor of $100^{-6} = 10^{-12}$.
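For readers who want to check the arithmetic, here is a minimal Python sketch of serial dilution. The function name is my own invention for illustration, and the sketch assumes each step dilutes by the same factor of 100.

```python
from fractions import Fraction

def serial_dilution(factor_per_step: int, steps: int) -> Fraction:
    """Fraction of the original substance left after diluting by
    factor_per_step at each of the given number of steps."""
    return Fraction(1, factor_per_step) ** steps

two_steps = serial_dilution(100, 2)   # one part per 10,000
six_steps = serial_dilution(100, 6)   # 100^-6 = 10^-12

print(two_steps)
print(six_steps == Fraction(1, 10**12))
```

Exact fractions are used rather than floating point so that the comparison with 10^-12 is not affected by rounding.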

Nonetheless, a super diluted and essentially inert substance is selected and delivered within a complex ritual. The choice of the particular substance being diluted and the extent of its dilution is determined by detailed questioning of patients about their background, lifestyle, and personal functioning. Naïve and unskeptical patients are likely to perceive themselves as receiving exceptionally personalized medicine delivered by a sympathetic and caring provider. Homeopathy thus has potentially strong nonspecific (placebo) elements that may be lacking in the briefer and less attentive encounters of routine medical care.

As an academic editor at PLOS One, I received considerable criticism for having accepted a failed trial of homeopathy for depression. The study had been funded by the German government and had fallen miserably short in efforts to recruit the intended sample size. I felt the study should be published in PLOS One to provide evidence on whether such worthless studies should be undertaken in the future. But I also wanted readers to have the opportunity to see what I had learned from the article about just how ritualized homeopathy can be, with a strong potential for placebo effects.

Presumably, readers would then be better equipped to evaluate claims made in other contexts that homeopathy is effective when those claims rest on clinical trials with inadequate control of nonspecific effects. But that is also a pervasive problem in psychotherapy trials [1, 2] that do not have a suitable comparison/control group.

I have tried to reinforce this message in the evaluation of complementary or integrative treatments in Relaxing vs Stimulating Acupressure for Fatigue Among Breast Cancer Patients: Lessons to be Learned.

The Lancet COBRA study

The Lancet COBRA study has received extraordinary promotion as evidence for the cost-effectiveness of substituting behavioural activation therapy (BA) delivered by minimally trained professionals for cognitive behaviour therapy (CBT) for depression. The study is serving as the basis for proposals to cut costs in the UK National Health Service by replacing more expensive clinical psychologists with less trained and experienced providers.

Coached by the Science Media Centre, the authors of The Lancet study focused our attention on their finding no inferiority of BA to CBT. They are distracting us from the more important question of whether either treatment had any advantage over nonspecific interventions in the unusual context in which they were evaluated.

The editorial accompanying the COBRA study suggests that BA involves a simple message delivered by providers with very little training:

“Life will inevitably throw obstacles at you, and you will feel down. When you do, stay active. Do not quit. I will help you get active again.”

I encourage readers to stop and think how depressed persons suffering substantial impairment, including reduced ability to experience pleasure, would respond to such suggestions. It sounds all too much like the “Snap out of it, Debbie” they may have already heard from people around them or in their own self-blame.

Snap out of it, Debbie (from South Park)

 BA by any other name…

Actually, this kind of activation is routinely provided in primary care in some countries as a first-stage treatment in a stepped care approach to depression.

In such a system, when emergent mild to moderate depressive symptoms are uncovered in a primary medical care setting, providers are encouraged neither to initiate an active treatment nor even make a formal psychiatric diagnosis of a condition that could prove self-limiting with a brief passage of time. Rather, providers are encouraged to defer diagnosis and schedule a follow-up appointment. This is more than simple watchful waiting. Until the next appointment, providers encourage patients to undertake some guided self-help, including engagement in pleasant activities of their choice, much as apparently done in the BA condition in the COBRA study. Increasingly, they may encourage Internet-based therapy.

In a few parts of the UK, general practitioners may refer patients to a green gym.

It’s now appreciated that to have any effectiveness, such prescriptions have to be made in a relationship of supportive accountability. For patients to adhere adequately to such prescriptions and not feel they are simply being dismissed by the provider and sent away, they need to have a sense that the prescription is occurring within the context of a relationship with someone who cares whether they carry out and benefit from the prescription.

Used in this way, this BA component of stepped care could possibly be part of reducing unnecessary medication and the need for more intensive treatment. However, evaluation of cost effectiveness is complicated by the need for a support structure in which treatment can be monitored, including any antidepressant medication that is subsequently prescribed. Otherwise, the needs of a substantial number of patients needing more intensive, quality care for depression would be neglected.

The shortcomings of COBRA as an evaluation of BA in context

COBRA does not provide an evaluation of any system offering BA to the large pool of patients who do not require more intensive treatment in a system where they would be provided appropriate timely evaluation and referral onwards.

It is the nature of mild to moderate depressive symptoms being presented in primary care, especially when patients are not specifically seeking mental health treatment, that the threshold for a formal diagnosis of major depression is often met by the minimum or only one more than the five required symptoms. Diagnoses are of necessity unreliable, in part because the judgment of particular symptoms meeting a minimal threshold of severity is unreliable. After a brief passage of time and in the absence of formal treatment, a substantial proportion of patients will no longer meet diagnostic criteria.

COBRA also does not evaluate BA versus CBT in the more select population that participates in clinical trials of treatment for depression. Sir David Goldberg is credited with first describing the filters that operate on the pathway of patients from presenting a complex combination of problems in living and psychiatric symptoms in primary medical care to treatment in specialty settings.

Results of the COBRA study cannot be meaningfully integrated into the existing literature concerning BA as a component of stepped care or treatment for depression that is sufficient in itself.

More recently, I reviewed The Lancet COBRA study in detail, highlighting how one of the most ambitious and heavily promoted psychotherapy studies ever was noninformative. The authors’ claim that it would be wise to substitute BA delivered by minimally trained providers for cognitive behavior therapy delivered by clinical psychologists was unwarranted.

I refer readers to that blog post for further elaboration of some points I will be making here. For instance, some readers might want to refresh their sense of how a noninferiority trial differs from a conventional comparison of two treatments.

Risk of bias in noninferiority trial

 Published reports of clinical trials are notoriously unreliable and biased in terms of the authors’ favored conclusions.

With the typical evaluation of an active treatment versus a control condition, the risk of bias is that reported results will favor the active treatment. However, the issue of bias in a noninferiority trial is more complex. The investigators’ interest is in demonstrating that within certain limits, there are no significant differences between two treatments. Yet, although it is not always tested directly, the intention is to show that this lack of difference is due to both treatments being effective, rather than both being ineffective.
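To make the logic concrete, here is a minimal Python sketch of a noninferiority check. Everything in it is hypothetical: the function name, the margin, and the numbers are mine for illustration and are not taken from COBRA. It assumes a continuous outcome where higher is better, a normal approximation, and the common rule that the new treatment is declared noninferior when the lower bound of the confidence interval for (new minus standard) lies above minus the margin.

```python
from statistics import NormalDist

def noninferiority_check(diff: float, se: float, margin: float,
                         alpha: float = 0.05) -> tuple[bool, tuple[float, float]]:
    """diff is the new-minus-standard difference in mean outcome (higher is
    better), se its standard error. Returns whether the lower bound of the
    two-sided (1 - alpha) confidence interval exceeds -margin, plus the CI."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = diff - z_crit * se, diff + z_crit * se
    return lower > -margin, (lower, upper)

# Hypothetical numbers, not COBRA data.
close_result, close_ci = noninferiority_check(diff=-0.1, se=0.5, margin=1.9)
worse_result, worse_ci = noninferiority_check(diff=-1.5, se=0.5, margin=1.9)
print(close_result, close_ci)
print(worse_result, worse_ci)

# Two treatments that are identical in outcome pass the same check,
# whether both work or neither does.
identical_result, _ = noninferiority_check(diff=0.0, se=0.5, margin=1.9)
print(identical_result)
```

Note what the check never looks at: how either treatment compares with a nonspecific control. Two equally ineffective treatments differ by roughly zero and sail through.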

In COBRA, the authors’ clear intention was to show that less expensive BA was not inferior to CBT, with the assumption that both were effective. Biases can emerge from building in features of the design, analysis, and interpretation of the study that minimize differences between these two treatments. But bias can also arise from a study design in which nonspecific effects are distributed across interventions so that any difference in active ingredients is obscured by shared features of the circumstances in which the interventions are delivered. As in Alice in Wonderland [https://en.wikipedia.org/wiki/Dodo_bird_verdict], the race is rigged so that almost everybody can get a prize.

Why COBRA could have shown almost any treatment with nonspecific effects was noninferior to CBT for depression

1. The investigators chose a population and a recruitment strategy that increased the likelihood that patients participating in the trial would get better with the minimal support and contact available in either of the two conditions, BA or CBT.

The recruited patients were not actively seeking treatment. They were identified from GP records as having had a diagnosis of depression, but were required not to be currently in psychotherapy.

GP recording of a diagnosis of depression has poor concordance with a formal, structured interview-based diagnosis, with considerable overdiagnosis and overtreatment.

A recent Dutch study found that persons meeting interview-based criteria for major depression in the community who do not have a past history of treatment mostly are not found to be depressed upon re-interview.

To be eligible for participation in the study, the patients also had to meet criteria for major depression in a semi structured research interview with (Structured Clinical Interview for the Diagnostic and Statistical Manual of  Mental Disorders, Fourth Edition [SCID]. Diagnoses with the SCID obtained under these circumstances are also likely to have a considerable proportion of false positives.

A dirty secret from someone who has supervised thousands of SCID interviews of medical patients. The developers of the SCID recognized that it yielded a lot of false positives and inflated rates of disorder among patients who are not seeking mental health care.

They attempted to compensate by requiring that respondents not only endorse symptoms, but indicate that the symptoms are a source of impairment. This is the so-called clinical significance criterion. Respondents automatically meet the criterion if they are seeking mental health treatment. Those who are not seeking treatment are asked directly whether the symptoms impair them. This is a particularly unvalidated aspect of the SCID, and patients typically do not endorse their symptoms as a source of impairment.

When we asked breast cancer patients who otherwise met criteria for depression with the SCID whether the depressive symptoms impaired them, they uniformly said something like ‘No, my cancer impairs me.’ When we conducted a systematic study of the clinical significance criterion, we found that whether or not it was endorsed substantially affected individual and overall rates of diagnosis. Robert Spitzer, who developed the SCID interview along with his wife Janet Williams, conceded to me in a symposium that application of the clinical significance criterion was a failure.

What is the relevance in a discussion of the COBRA study? I would wager that the authors, like most investigators who use the SCID, did not inquire about the clinical significance criterion, and as a result they had a lot of false positives.

The population being sampled in the recruitment strategy used in COBRA is likely to yield a sample unrepresentative of patients participating in the usual trials of psychotherapy and medication for depression.

2. Most patients participating in COBRA reported already receiving antidepressants at baseline, but adherence and follow-up are unknown and likely to be inadequate.

Notoriously, patients receiving a prescription for an antidepressant in primary care actually take the medication inconsistently and for only a short time, if at all. They receive inadequate follow-up and reassessment. Their depression outcomes may actually be poorer than for patients receiving a pill placebo in the context of a clinical trial, where there is blinding and a high degree of positive expectations, attention and support.

Studies, including one by an author of the COBRA study, suggest that augmenting adequately managed antidepressant treatment with psychotherapy is unlikely to improve outcomes.

We’re stumbling upon one of the messier features of COBRA. Most patients had already been prescribed medication at baseline, but their adherence and follow-up are left unreported and are likely to be poor. The prescription is likely to have been made up to two years before baseline.

It would not be cost-effective to introduce psychotherapy to such a sample without reassessing whether they were adequately receiving medication. Such a sample would also be highly susceptible to nonspecific interventions providing positive expectations, support, and attention that they are not receiving in their antidepressant treatment. There are multiple ways in which nonspecific effects could improve outcomes – perhaps by improving adherence, but perhaps because of the healing effects of support on mild depressive symptoms.

3. The COBRA authors’ way of dealing with co-treatment with antidepressants blocked readers’ ability to independently evaluate main effects and interactions with BA versus CBT.

The authors used antidepressant treatment as a stratification factor, ensuring that the 70% of patients receiving them were evenly distributed across the BA and CBT conditions. This strategy made it more difficult to separate effects of antidepressants. However, the problem is compounded by the authors’ failure to provide subgroup analyses based on whether patients had received an antidepressant prescription, as well as their failure to provide any description of the extent to which patients received management of their antidepressants at baseline or during active psychotherapy and follow-up. The authors incorporated data concerning the cost of medication into their economic analyses, but did not report the data in a way that could be scrutinized.

I anticipate requesting these data from the authors to find out more, although they have not responded to my previous query concerning anomalies in the reporting of how long since patients had first received a prescription for antidepressants.

4. The 12-month assessment designated as the primary outcome capitalized on natural recovery patterns, unreliability of initial diagnosis, and simple regression to the mean.

Depression identified in the community and in primary care patient populations is variable in its course, but typically resolves in nine months. Scheduling the assessment of primary outcomes at 12 months increases the likelihood that effects of active ingredients of the two treatments would be lost in a natural recovery process.
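
To make the regression-to-the-mean point concrete, here is a minimal simulation sketch. It is not based on COBRA data: the severity scale, the noise level, the cutoff, and the function name `simulate_regression_to_mean` are all assumptions chosen for illustration. Nobody in the simulation receives treatment, yet people enrolled because of a high baseline score look better on average when measured again.

```python
import random
import statistics


def simulate_regression_to_mean(n=20000, cutoff=15.0, seed=0):
    """Return (mean baseline, mean follow-up, count) for untreated people
    who were selected because their baseline score met the cutoff.

    Every number here is an illustrative assumption, not COBRA data.
    """
    rng = random.Random(seed)
    baseline_scores = []
    followup_scores = []
    for _ in range(n):
        true_severity = rng.gauss(10.0, 3.0)            # stable underlying severity
        baseline = true_severity + rng.gauss(0.0, 3.0)  # noisy screening score
        followup = true_severity + rng.gauss(0.0, 3.0)  # noisy later score, no treatment
        if baseline >= cutoff:                          # enrolled on the baseline score alone
            baseline_scores.append(baseline)
            followup_scores.append(followup)
    return (statistics.mean(baseline_scores),
            statistics.mean(followup_scores),
            len(baseline_scores))


mean_baseline, mean_followup, n_selected = simulate_regression_to_mean()
print(f"selected on high baseline: {n_selected}")
print(f"mean baseline score:       {mean_baseline:.1f}")
print(f"mean follow-up score:      {mean_followup:.1f}")
```

The drop between the two means comes entirely from selecting on a noisy measurement. Any arm of a trial that enrolls this way would show it, whatever the treatment contains.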

5. The intensity of treatment (an allowable 20 sessions plus four additional sessions) offered in the study exceeded what is available in typical psychotherapy trials and exceeded what was actually accessed by patients.

Allowing this level of intensity of treatment generates a lot of noise in any interpretation of the resulting data. Offering so much treatment encourages patients dropping out, with the loss of their follow-up data. We can’t tell if they simply dropped out because they had received what they perceived as sufficient treatment or if they were dissatisfied. This intensity of offered treatment reduces generalizability to what actually occurs in routine care and complicates comparing and contrasting results of the COBRA study with the existing literature.

 6. The low rate of actual uptake of psychotherapy and retention of patients for follow-up present serious problems for interpreting the results of the COBRA study.

Intent-to-treat analyses with imputation of missing data are simply voodoo statistics when so much data is missing. Imputation and other multivariate techniques assume that data are missing at random, but as I just noted, this is an improbable assumption. [I refer readers who want to learn more about intent-to-treat versus per-protocol analyses back to my previous blog post.]
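
A small simulation sketch shows why the missingness assumption matters. Everything in it is hypothetical: the score scale, the dropout probabilities, and the function name `completer_means` are assumptions, and a simple completers-only mean stands in for the more elaborate imputation models used in trials. It compares dropout that is unrelated to outcome with dropout that depends on the unobserved outcome itself.

```python
import random
import statistics


def completer_means(n=20000, seed=1):
    """Return (true mean, completers' mean under random dropout,
    completers' mean under outcome-dependent dropout).

    Scores are hypothetical follow-up symptom scores; higher means worse.
    """
    rng = random.Random(seed)
    all_scores = []
    random_dropout_completers = []
    outcome_dropout_completers = []
    for _ in range(n):
        score = rng.gauss(12.0, 5.0)
        all_scores.append(score)
        # Scenario 1: everyone has the same 35% chance of being lost.
        if rng.random() >= 0.35:
            random_dropout_completers.append(score)
        # Scenario 2: patients doing worse are much more likely to be lost.
        p_lost = 0.55 if score > 12.0 else 0.15
        if rng.random() >= p_lost:
            outcome_dropout_completers.append(score)
    return (statistics.mean(all_scores),
            statistics.mean(random_dropout_completers),
            statistics.mean(outcome_dropout_completers))


true_mean, random_mean, outcome_mean = completer_means()
print(f"true mean for everyone randomized:      {true_mean:.2f}")
print(f"completers, dropout unrelated to score: {random_mean:.2f}")
print(f"completers, dropout tied to score:      {outcome_mean:.2f}")
```

When dropout is unrelated to outcome, the completers still describe the whole sample. When dropout depends on how patients are doing, the completers look healthier than the sample really is, and no method that uses only the observed data can detect this from the data alone.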

The authors cite past literature in their choice to emphasize the per-protocol analyses. That means that they based their interpretation of the results on 135 of the 221 patients originally assigned to BA and 151 of the 219 patients originally assigned to CBT. This is a messy approach and precludes generalizing back to original assignment. That’s why intent-to-treat analyses are emphasized in conventional evaluations of psychotherapy.
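
The arithmetic behind those per-protocol numbers is worth spelling out. The sketch below uses only the four counts given in the paragraph above; the helper names `percent` and `per_protocol_summary` are mine.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# (analyzed per protocol, originally randomized), as given in the text above
ARMS = {"BA": (135, 221), "CBT": (151, 219)}


def per_protocol_summary(arms):
    """Return, per arm and overall, (percent analyzed, number excluded)."""
    summary = {}
    total_analyzed = 0
    total_randomized = 0
    for name, (analyzed, randomized) in arms.items():
        summary[name] = (round(percent(analyzed, randomized), 1),
                         randomized - analyzed)
        total_analyzed += analyzed
        total_randomized += randomized
    summary["overall"] = (round(percent(total_analyzed, total_randomized), 1),
                          total_randomized - total_analyzed)
    return summary


for arm, (pct_analyzed, n_excluded) in per_protocol_summary(ARMS).items():
    print(f"{arm}: {pct_analyzed}% analyzed, {n_excluded} excluded")
```

That works out to roughly 61% of the BA arm and 69% of the CBT arm, so about a third of the randomized patients sit outside the analysis on which the interpretation rests.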

A skeptical view of what will be done with the COBRA data

The authors’ clear intent was to produce data supporting an argument that more expensive clinical psychologists could be replaced by less trained clinicians providing a simplified treatment. The striking lack of differences between BA and CBT might be seen as strong evidence that BA could replace CBT. Yet, I am suggesting that the striking lack of differences could also indicate features built into the design that swamped any differences and limited any generalizability to what would happen if all depressed patients were referred to BA delivered by clinicians with little training versus CBT. I’m arguing that homeopathy would have done as well.

BA is already being implemented in the UK and elsewhere as part of stepped care initiatives for depression. Inclusion of BA is inadequately evaluated, as is the overall strategy of stepped care. See here for an excellent review of stepped care initiatives and a tentative conclusion that they are moderately effective, but that many questions remain.

If the COBRA authors were most committed to improving the quality of depression care in the UK, they would’ve either designed their study as a fairer test of substituting BA for CBT or they would have tackled the more urgent task of evaluating rigorously whether stepped care initiatives work.

Years ago, collaborative care programs for depression were touted as reducing overall costs. These programs, which were found to be robustly effective in many contexts, involved placing depression managers in primary care to assist the GPs in improved monitoring and management of treatment. Often the most immediate and effective improvement was that patients got adequate follow-up, where previously they were simply being ignored. Collaborative care programs did not prove to be cheaper, which is not surprising, because better care is often more expensive than ineptly provided inadequate care.

We should be extremely skeptical of experienced investigators who claim that they demonstrate that they can cut costs and maintain quality with a wholesale reduction in the level of training of providers treating depression, a complex and heterogeneous disorder, especially when their expensive study fails to deal with this complexity and heterogeneity.


Relaxing vs Stimulating Acupressure for Fatigue Among Breast Cancer Patients: Lessons to be Learned

  • A chance to test your rules of thumb for quickly evaluating clinical trials of alternative or integrative  medicine in prestigious journals.
  • A chance to increase your understanding of the importance of  well-defined control groups and blinding in evaluating the risk of bias of clinical trials.
  • A chance to understand the difference between merely evidence-based treatments versus science-based treatments.
  • Lessons learned can be readily applied to many wasteful evaluations of psychotherapy with shared characteristics.

A press release from the University of Michigan about a study of acupressure for fatigue in cancer patients was churnaled  – echoed – throughout the media. It was reproduced dozens of times, with little more than an editor’s title change from one report to the next.

Fortunately, the article that inspired all the fuss was freely available from the prestigious JAMA Oncology. But when I gained access, I quickly saw that it was not worth my attention, based on what I already knew or, as I often say, my prior probabilities. Rules of thumb is a good enough term.

So the article became another occasion for us to practice our critical appraisal skills, including, importantly, being able to make reliable and valid judgments that some attention in the media is worth dismissing out of hand, even when tied to an article in a prestigious medical journal.

The press release is here: Acupressure reduced fatigue in breast cancer survivors: Relaxing acupressure improved sleep, quality of life.

A sampling of the coverage:

[Image: sample of media coverage]

As we’ve come to expect, the UK Daily Mail editor added its own bit of spin:

[Image: Daily Mail headline]

Here is the article:

Zick SM, Sen A, Wyatt GK, Murphy SL, Arnedt J, Harris RE. Investigation of 2 Types of Self-administered Acupressure for Persistent Cancer-Related Fatigue in Breast Cancer Survivors: A Randomized Clinical Trial. JAMA Oncol. Published online July 07, 2016. doi:10.1001/jamaoncol.2016.1867.

Here is the Trial registration:

All I needed to know was contained in a succinct summary at the Journal website:

[Image: Key Points summary from the journal website]

This is a randomized clinical trial (RCT) in which:

  • Two active treatments lacked credible scientific mechanisms.
  • Both were predictably shown to be better than a routine care condition that lacked positive expectations and support.
  • A primary outcome assessed by subjective self-report amplified the illusory effectiveness of the treatments.

But wait!

The original research appeared in a prestigious peer-reviewed journal published by the American Medical Association, not a disreputable journal on Beall’s List of Predatory Publishers.

Maybe this means publication in a peer-reviewed prestigious journal is insufficient to erase our doubts about the validity of claims.

The original research was performed with a $2.65 million peer-reviewed grant from the National Cancer Institute.

Maybe NIH is wasting scarce money on useless research.

What is acupressure?

 According to the article

Acupressure, a method derived from traditional Chinese medicine (TCM), is a treatment in which pressure is applied with fingers, thumbs, or a device to acupoints on the body. Acupressure has shown promise for treating fatigue in patients with cancer,23 and in a study24 of 43 cancer survivors with persistent fatigue, our group found that acupressure decreased fatigue by approximately 45% to 70%. Furthermore, acupressure points termed relaxing (for their use in TCM to treat insomnia) were significantly better at improving fatigue than another distinct set of acupressure points termed stimulating (used in TCM to increase energy).24 Despite such promise, only 5 small studies24– 28 have examined the effect of acupressure for cancer fatigue.

[Image: acupuncture point Hegu (LI 4)]

You can learn more about acupressure here. It is a derivative of acupuncture that does not involve needles but uses the same pressure points, or acupoints, as acupuncture.

Don’t be fooled by references to traditional Chinese medicine (TCM) as a basis for claiming a scientific mechanism.

See Chairman Mao Invented Traditional Chinese Medicine.

Chairman Mao is quoted as saying “Even though I believe we should promote Chinese medicine, I personally do not believe in it. I don’t take Chinese medicine.”


Alan Levinovitz, author of the Slate article further argues:


In truth, skepticism, empiricism, and logic are not uniquely Western, and we should feel free to apply them to Chinese medicine.

After all, that’s what Wang Qingren did during the Qing Dynasty when he wrote Correcting the Errors of Medical Literature. Wang’s work on the book began in 1797, when an epidemic broke out in his town and killed hundreds of children. The children were buried in shallow graves in a public cemetery, allowing stray dogs to dig them up and devour them, a custom thought to protect the next child in the family from premature death. On daily walks past the graveyard, Wang systematically studied the anatomy of the children’s corpses, discovering significant differences between what he saw and the content of Chinese classics.

And nearly 2,000 years ago, the philosopher Wang Chong mounted a devastating (and hilarious) critique of yin-yang five phases theory: “The horse is connected with wu (fire), the rat with zi (water). If water really conquers fire, [it would be much more convincing if] rats normally attacked horses and drove them away. Then the cock is connected with ya (metal) and the hare with mao (wood). If metal really conquers wood, why do cocks not devour hares?” (The translation of Wang Chong and the account of Wang Qingren come from Paul Unschuld’s Medicine in China: A History of Ideas.)

Trial design

A 10-week randomized, single-blind trial comparing self-administered relaxing acupressure with stimulating acupressure once daily for 6 weeks vs usual care with a 4-week follow-up was conducted. There were 5 research visits: at screening, baseline, 3 weeks, 6 weeks (end of treatment), and 10 weeks (end of washout phase). The Pittsburgh Sleep Quality Index (PSQI) and Long-Term Quality of Life Instrument (LTQL) were administered at baseline and weeks 6 and 10. The Brief Fatigue Inventory (BFI) score was collected at baseline and weeks 1 through 10.

Note that the trial was “single-blind.” It compared two forms of acupressure, relaxing versus stimulating. Only the patient was blinded to which of these two treatments was being provided, except patients clearly knew whether or not they were randomized to usual care. The providers were not blinded and were carefully supervised by the investigators and provided feedback on their performance.

The combination of providers not being blinded, patients knowing whether they were randomized to routine care, and subjective self-report outcomes together are the makings of a highly biased trial.


Usual care was defined as any treatment women were receiving from health care professionals for fatigue. At baseline, women were taught to self-administer acupressure by a trained acupressure educator.29 The 13 acupressure educators were taught by one of the study’s principal investigators (R.E.H.), an acupuncturist with National Certification Commission for Acupuncture and Oriental Medicine training. This training included a 30-minute session in which educators were taught point location, stimulation techniques, and pressure intensity.

Relaxing acupressure points consisted of yin tang, anmian, heart 7, spleen 6, and liver 3. Four acupoints were performed bilaterally, with yin tang done centrally. Stimulating acupressure points consisted of du 20, conception vessel 6, large intestine 4, stomach 36, spleen 6, and kidney 3. Points were administered bilaterally except for du 20 and conception vessel 6, which were done centrally (eFigure in Supplement 2). Women were told to perform acupressure once per day and to stimulate each point in a circular motion for 3 minutes.

Note that the control/comparison condition was an ill-defined usual care in which it is not clear that patients received any attention and support for their fatigue. As I have discussed before, we need to ask just what was being controlled by this condition. There is no evidence presented that patients had similar positive expectations and felt similar support in this condition to what was provided in the two active treatment conditions. There is no evidence of equivalence of time with a provider devoted exclusively to the patients’ fatigue. Unlike patients assigned to usual care, patients assigned to one of the acupressure conditions received a ritual delivered with enthusiasm by a supervised educator.

Note the absurdity of the naming of the acupressure points, for which the authority of traditional Chinese medicine is invoked, not evidence. This absurdity is reinforced by a look at a diagram of acupressure points provided as a supplement to the article.

[Images: relaxation acupressure points; stimulation acupressure points]


Among the many problems with “acupuncture pressure points” is that sham stimulation generally works as well as actual stimulation, especially when the sham is delivered with appropriate blinding of both providers and patients. Another is that targeting places of the body that are not defined as acupuncture pressure points can produce the same results. For more elaborate discussion see Can we finally just say that acupuncture is nothing more than an elaborate placebo?

 Worth looking back at credible placebo versus weak control condition

In a recent blog post I discussed an unusual study in the New England Journal of Medicine that compared an established active treatment for asthma to two credible control conditions: one an inert spray that was indistinguishable from the active treatment, and the other acupuncture. Additionally, the study involved a no-treatment control. For subjective self-report outcomes, the active treatment, the inert spray and acupuncture were indistinguishable, but all were superior to the no-treatment control condition. However, for the objective outcome measure, the active treatment was more effective than all three comparison conditions. The message is that credible placebo control conditions are superior to control conditions lacking positive expectations, including no treatment and, I would argue, ill-defined usual care that lacks positive expectations. A further message is ‘beware of relying on subjective self-report measures to distinguish between active treatments and placebo control conditions’.


At week 6, the change in BFI score from baseline was significantly greater in relaxing acupressure and stimulating acupressure compared with usual care (mean [SD], −2.6 [1.5] for relaxing acupressure, −2.0 [1.5] for stimulating acupressure, and −1.1 [1.6] for usual care; P < .001 for both acupressure arms vs usual care), and there was no significant difference between acupressure arms (P  = .29). At week 10, the change in BFI score from baseline was greater in relaxing acupressure and stimulating acupressure compared with usual care (mean [SD], −2.3 [1.4] for relaxing acupressure, −2.0 [1.5] for stimulating acupressure, and −1.0 [1.5] for usual care; P < .001 for both acupressure arms vs usual care), and there was no significant difference between acupressure arms (P > .99) (Figure 2). The mean percentage fatigue reductions at 6 weeks were 34%, 27%, and −1% in relaxing acupressure, stimulating acupressure, and usual care, respectively.

These are entirely expectable results. Nothing new was learned in this study.

The bottom line for this study is that there was absolutely nothing to be gained by comparing one inert placebo condition with another inert placebo condition and with an uninformative control condition that offered no clear control of nonspecific factors: positive expectations, support, and attention. This was a waste of patient time and effort, as well as government funds, and produced results that were potentially misleading to patients. Namely, results are likely to be misinterpreted as showing that acupressure is an effective, evidence-based treatment for cancer-related fatigue.

How the authors explained their results

Why might both acupressure arms significantly improve fatigue? In our group’s previous work, we had seen that cancer fatigue may arise through multiple distinct mechanisms.15 Similarly, it is also known in the acupuncture literature that true and sham acupuncture can improve symptoms equally, but they appear to work via different mechanisms.40 Therefore, relaxing acupressure and stimulating acupressure could elicit improvements in symptoms through distinct mechanisms, including both specific and nonspecific effects. These results are also consistent with TCM theory for these 2 acupoint formulas, whereby the relaxing acupressure acupoints were selected to treat insomnia by providing more restorative sleep and improving fatigue and the stimulating acupressure acupoints were chosen to improve daytime activity levels by targeting alertness.

How could acupressure lead to improvements in fatigue? The etiology of persistent fatigue in cancer survivors is related to elevations in brain glutamate levels, as well as total creatine levels in the insula.15 Studies in acupuncture research have demonstrated that brain physiology,41 chemistry,42 and function43 can also be altered with acupoint stimulation. We posit that self-administered acupressure may have similar effects.

Among the fallacies of the authors’ explanation is the key assumption that they are dealing with a specific, active treatment effect rather than a nonspecific placebo intervention. Supposed differences between relaxing versus stimulating acupressure arise in trials with a high risk of bias due to unblinded providers of treatment and inadequate control/comparison conditions. ‘There is no there there’ to be explained, to paraphrase a quote attributed to Gertrude Stein.

How much did this project cost?

According to the NIH Research Portfolio Online Reporting Tools website, this five-year project involved support by the federal government of $2,265,212 in direct and indirect costs. The NCI program officer for investigator-initiated R01CA151445 is Ann O’Mara, who serves in a similar role for a number of integrative medicine projects.

How can expenditure of this money be justified for determining whether so-called stimulating acupressure is better than relaxing acupressure for cancer-related fatigue?

 Consider what could otherwise have been done with these monies.

 Evidence-based versus science based medicine

Proponents of unproven “integrative cancer treatments” can claim on the basis of the study that acupressure is an evidence-based treatment. Future Cochrane Collaboration Reviews may even cite this study as evidence for this conclusion.

I normally label myself as an evidence-based skeptic. I require evidence for claims of the efficacy of treatments and am skeptical of the quality of the evidence that is typically provided, especially when it comes from enthusiasts of particular treatments. However, in other contexts, I describe myself as a science-based medicine skeptic. The stricter criterion for this term is that I require not only evidence of efficacy for treatments but also evidence for the plausibility of the science-based claims of mechanism. Acupressure might be defined by some as an evidence-based treatment, but it is certainly not a science-based treatment.

For further discussion of this important distinction, see Why “Science”-Based Instead of “Evidence”-Based?

Broader relevance to psychotherapy research

The efficacy of psychotherapy is often overestimated because of overreliance on RCTs that involve inadequate comparison/control groups. Adequately powered studies of the comparative efficacy of psychotherapy that include active comparison/control groups are infrequent and uniformly provide lower estimates of just how efficacious psychotherapy is. Most psychotherapy research includes subjective patient self-report measures as the primary outcomes, although some RCTs provide independent, blinded interview measures. A dependence on subjective patient self-report measures amplifies the bias associated with inadequate comparison/control groups.

I have raised these issues with respect to mindfulness-based stress reduction (MBSR) for physical health problems and for prevention of relapse and recurrence in patients being tapered from antidepressants.

However, there is a broader relevance to trials of psychotherapy provided to medically ill patients with a comparison/control condition that is inadequate in terms of positive expectations and support, along with a reliance on subjective patient self-report outcomes. The relevance is particularly important to note for conditions in which objective measures are appropriate, but not obtained, or obtained but suppressed in reports of the trial in the literature.

Mindfulness-based stress reduction versus cognitive behavior therapy for chronic back pain

The most interesting things to be learned from a recent clinical trial comparing mindfulness-based stress reduction to cognitive behavior therapy for chronic back pain are not what the authors intend.

Noticing that some key information is missing from the study illustrates why we don’t need more studies like it.

  • We need more studies of mindfulness-based therapies with meaningful comparison/control groups.
  • We need evidence that patients assigned to mindfulness-based treatments actually practice mindfulness in their everyday lives.
  • We need to demonstrate that any efficacy of mindfulness depends upon patients assigned to it actually showing up.
  • We need to be alert to how boundaries of the concept of mindfulness-based therapies are expanding. Reviewers should be cautious in integrating results from different studies claiming to evaluate “mindfulness.” There is growing clinical heterogeneity (different interventions, sometimes with very different components) that should be distinguished.

For only the second time in its history, the flagship journal of the American Medical Association, JAMA, has published a clinical trial of mindfulness. [Apparently the only other trial of mindfulness was one for PTSD among veterans, with only modest differences over a present-centered group therapy comparison/control group].

The importance of this study was underscored by (1) an accompanying editorial commentary, (2) free access and continuing education credit for reading it, and (3) three multimedia links: a JAMA Report on the study, and audio and video interviews with the author.

The article is

Cherkin DC, Sherman KJ, Balderson BH, Cook AJ, Anderson ML, Hawkes RJ, Hansen KE, Turner JA. Effect of Mindfulness-Based Stress Reduction vs Cognitive Behavioral Therapy or Usual Care on Back Pain and Functional Limitations in Adults With Chronic Low Back Pain: A Randomized Clinical Trial. JAMA. 2016 Mar 22;315(12):1240-9.

The trial registration is Comparison of CAM and Conventional Mind-Body Therapies for Chronic Back Pain.

The protocol is available here.

The editorial commentary by Madhav Goyal and Jennifer Haythornthwaite asked:

Is It Time to Make Mind-Body Approaches Available for Chronic Low Back Pain?

My recent discussions [1]  [2]  of articles in JAMA network journals that are accompanied by editorial commentaries have contemplated why particular studies were chosen for JAMA journals and the conflicts of interest that characterize editorial commentaries. This discussion will be somewhat different.

This commentary is definitely written by authors who have reasons to promote mindfulness. The commentary ends with a predictable non sequitur:

High-quality studies such as the clinical trial by Cherkin et al create a compelling argument for ensuring that an evidence-based health care system should provide access to affordable mind-body therapies.

Not exactly, if you stick to the evidence.

I will eventually comment on my usual questions of:

  • Why was this article published in a prestigious, generalist medical journal?
  • Why was it accompanied by an invited editorial commentary?
  • Why were the particular authors chosen for the commentary?

But the commentary isn’t that bad. It makes some reasonable points that might be overlooked. I will mainly focus on the article itself.


Importance. Mindfulness-based stress reduction (MBSR) has not been rigorously evaluated for young and middle-aged adults with chronic low back pain.

Objective. To evaluate the effectiveness for chronic low back pain of MBSR vs cognitive behavioral therapy (CBT) or usual care.

Design, Setting, and Participants.  Randomized, interviewer-blind, clinical trial in an integrated health care system in Washington State of 342 adults aged 20 to 70 years with chronic low back pain enrolled between September 2012 and April 2014 and randomly assigned to receive MBSR (n = 116), CBT (n = 113), or usual care (n = 113).

Interventions. CBT (training to change pain-related thoughts and behaviors) and MBSR (training in mindfulness meditation and yoga) were delivered in 8 weekly 2-hour groups. Usual care included whatever care participants received.

Main Outcomes and Measures. Coprimary outcomes were the percentages of participants with clinically meaningful (≥30%) improvement from baseline in functional limitations (modified Roland Disability Questionnaire [RDQ]; range, 0-23) and in self-reported back pain bothersomeness (scale, 0-10) at 26 weeks. Outcomes were also assessed at 4, 8, and 52 weeks.

Results. There were 342 randomized participants, the mean (SD) [range] age was 49.3 (12.3) [20-70] years, 224 (65.7%) were women, mean duration of back pain was 7.3 years (range, 3 months-50 years), 123 (53.7%) attended 6 or more of the 8 sessions, 294 (86.0%) completed the study at 26 weeks, and 290 (84.8%) completed the study at 52 weeks. In intent-to-treat analyses at 26 weeks, the percentage of participants with clinically meaningful improvement on the RDQ was higher for those who received MBSR (60.5%) and CBT (57.7%) than for usual care (44.1%) (overall P = .04; relative risk [RR] for MBSR vs usual care, 1.37 [95% CI, 1.06-1.77]; RR for MBSR vs CBT, 0.95 [95% CI, 0.77-1.18]; and RR for CBT vs usual care, 1.31 [95% CI, 1.01-1.69]). The percentage of participants with clinically meaningful improvement in pain bothersomeness at 26 weeks was 43.6% in the MBSR group and 44.9% in the CBT group, vs 26.6% in the usual care group (overall P = .01; RR for MBSR vs usual care, 1.64 [95% CI, 1.15-2.34]; RR for MBSR vs CBT, 1.03 [95% CI, 0.78-1.36]; and RR for CBT vs usual care, 1.69 [95% CI, 1.18-2.41]). Findings for MBSR persisted with little change at 52 weeks for both primary outcomes.

Conclusions and Relevance. Among adults with chronic low back pain, treatment with MBSR or CBT, compared with usual care, resulted in greater improvement in back pain and functional limitations at 26 weeks, with no significant differences in outcomes between MBSR and CBT. These findings suggest that MBSR may be an effective treatment option for patients with chronic low back pain.

Among the interesting things to note in the abstract is that there were only modest differences (overall P = .04 for functional limitations, P = .01 for pain bothersomeness) between either MBSR or CBT and usual care, which was described as “whatever care participants received.” The MBSR was augmented by yoga, so we cannot distinguish the effects of mindfulness from this added component.
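To make the size of these differences concrete, here is a rough sketch that recomputes simple ratios and absolute differences from the percentages quoted in the abstract. It is arithmetic only: the published relative risks come from the trial's own intent-to-treat models, and the names used here (`improved_pct`, `compare_to_usual_care`) are my own, not anything from the article.

```python
# Percentages with clinically meaningful improvement at 26 weeks,
# copied from the abstract quoted above.
improved_pct = {
    "RDQ (function)": {"MBSR": 60.5, "CBT": 57.7, "usual care": 44.1},
    "pain bothersomeness": {"MBSR": 43.6, "CBT": 44.9, "usual care": 26.6},
}


def compare_to_usual_care(rates):
    """Return {arm: (unadjusted ratio, absolute difference in percentage points)}."""
    control = rates["usual care"]
    return {
        arm: (round(pct / control, 2), round(pct - control, 1))
        for arm, pct in rates.items()
        if arm != "usual care"
    }


results = {outcome: compare_to_usual_care(r) for outcome, r in improved_pct.items()}

for outcome, by_arm in results.items():
    for arm, (ratio, diff) in by_arm.items():
        print(f"{outcome}: {arm} vs usual care: ratio {ratio}, difference {diff} points")
```

The unadjusted ratios agree with the relative risks reported in the abstract to two decimals, and the absolute differences from usual care run from about 14 to 18 percentage points.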

Unfortunately, if you do a search for “usual care” or “yoga” in the article itself or in the trial registration or protocol, you won’t learn much about the nature of the usual care or the yoga. You will learn, however, in the article that:

Thirty of the 103 (29%) participants attending at least 1 MBSR session reported an adverse event (mostly temporarily increased pain with yoga). Ten of the 100 (10%) participants who attended at least 1 CBT session reported an adverse event (mostly temporarily increased pain with progressive muscle relaxation). No serious adverse events were reported.

Secondary outcomes

Some outcomes that would be of interest to policy makers, clinicians, and patients are relegated to a secondary status: whether medication was used in the past week, whether back exercises were done for at least three days, and whether there was general exercise for more than three days.

There were no consistent effects of these interventions versus routine care for these variables.

Intensity of treatment

Unless a study is focusing simply on differences in intensity of treatment, comparisons of treatments should ensure that the conditions being compared are equivalent in the intensity and frequency of clinical contact. In this trial:

The interventions were comparable in format (group), duration (2 hours/week for 8 weeks, although the MBSR program also included an optional 6-hour retreat), frequency (weekly), and number of participants per group.

Only about a quarter of the patients assigned to MBSR attended the six-hour retreat, compounding the problems of adherence (around half of patients assigned to either MBSR or CBT attended at least six group sessions). This also suggests that the loss of 20% of patients to follow-up may not be random. That poses issues for the fancy statistical techniques used to compensate for attrition, which assume the missing data are random.
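A toy illustration of why non-random loss matters, using entirely hypothetical numbers rather than data from the trial: if the people who drop out are disproportionately those who did not improve, the improvement rate among those still observed overstates the true rate. This sketch looks only at completers; it does not reproduce the trial's actual methods for handling missing data, and `completer_rate` is a made-up helper.

```python
def completer_rate(n_improved, n_not_improved, dropped_improved, dropped_not_improved):
    """Share of improvers among participants still observed at follow-up."""
    seen_improved = n_improved - dropped_improved
    seen_not_improved = n_not_improved - dropped_not_improved
    return seen_improved / (seen_improved + seen_not_improved)


# Hypothetical arm: 100 people, 50 truly improve, 20 are lost to follow-up.
true_rate = 50 / 100

# Loss unrelated to outcome: improvers and non-improvers drop out equally.
random_loss = completer_rate(50, 50, dropped_improved=10, dropped_not_improved=10)

# Loss related to outcome: only non-improvers drop out.
selective_loss = completer_rate(50, 50, dropped_improved=0, dropped_not_improved=20)

print(f"true rate: {true_rate:.1%}")
print(f"observed, random loss: {random_loss:.1%}")
print(f"observed, selective loss: {selective_loss:.1%}")
```

With random loss the observed rate stays at the true 50%. With selective loss it rises to 62.5% even though nothing about the treatment has changed.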

But the bigger issue is that the interventions provide a lot more contact than is typically available in routine care for chronic pain. There are lots of opportunities for important differences between the interventions and control group in nonspecific factors, like supportive accountability.

More contact communicates to patients that they matter more. Getting more interaction with providers means patients have more of a sense that their adherence matters (i.e., they are accountable) to someone besides themselves for activities like daily back exercises. The more intensive treatment also influences self-reported subjective outcomes, even when effects are not shown for other important variables, like decreased use of medication.

Distinguishing MBSR from CBT

MBSR is described as

MBSR was modeled closely after the original MBSR program—adapted from the 2009 MBSR instructor’s manual by a senior MBSR instructor. The MBSR program does not focus specifically on a particular condition such as pain. All classes included didactic content and mindfulness practice (body scan, yoga, meditation [attention to thoughts, emotions, and sensations in the present moment without trying to change them, sitting meditation with awareness of breathing, and walking meditation]).

The original manual that is cited comes from the University of Massachusetts Medical School. If you go to the website you can find Mindfulness-Based Stress Reduction (MBSR): Standards of Practice.

The standards describe the yoga component as

“Formal” Mindfulness Meditation Methods

Body Scan Meditation – a supine meditation

Gentle Hatha Yoga – practiced with mindful awareness of the body

Sitting Meditation – mindfulness of breath, body, feelings, thoughts, emotions, and choiceless awareness

Walking Meditation

My concern is that an RCT published in JAMA concludes that a combined mindfulness and yoga treatment “may be an effective treatment option for patients with chronic low back pain.” Past research by some of the authors of this JAMA article suggests that yoga by itself provides only short-term benefits for patients with chronic pain. This particular study had worrisome adverse effects from the yoga component. Why add something unnecessary to treatments if it may have adverse effects?

Although the providers of MBSR are described as having training in MBSR, there is no mention of training specifically for yoga for patients with chronic back pain.

Practitioners of yoga who have intermittent chronic pain tell me that it has been very important for them to find yoga instructors who are competent to deal with pain. A single, ill-chosen exercise can inflict long-term damage on a patient who already has chronic back pain.

CBT is described as

The CBT protocol included CBT techniques most commonly applied and studied for chronic low back pain. The intervention included (1) education about chronic pain, relationships between thoughts and emotional and physical reactions, sleep hygiene, relapse prevention, and maintenance of gains; and (2) instruction and practice in changing dysfunctional thoughts, setting and working toward behavioral goals, relaxation skills (abdominal breathing, progressive muscle relaxation, and guided imagery), activity pacing, and pain-coping strategies. Between-session activities included reading chapters of The Pain Survival Guide: How to Reclaim Your Life. Mindfulness, meditation, and yoga techniques were proscribed in CBT; methods to challenge dysfunctional thoughts were proscribed in MBSR.

Many stripped-down versions of CBT offered in primary care do not have all these components, leaving out the abdominal breathing, progressive muscle relaxation, and guided imagery. Many eclectic versions of mindfulness training incorporate progressive muscle relaxation.

Given that only about 50% of patients attended at least six sessions, and given the modest uptake of the mindfulness retreat, I’m not sure that these two interventions offered distinctly different experiences. It’s doubtful that questions of whether these two treatments are characterized by distinctly different mechanisms could be addressed in this trial.

Routine Care for Chronic Pain in the US

Routine care for chronic back pain differs widely in the United States. Episodes of care – a clustering of visits around a complaint – do not typically extend beyond a month or a couple of visits.

Routine care can be no care at all after an initial evaluation in which a diagnosis of chronic back pain is recorded.

But routine care for chronic back pain that is guideline-congruent can ironically prove iatrogenic. It can involve overtreatment, unnecessary exposure to opioids and antidepressants without adequate evaluation or follow up, and unnecessary surgeries.

We are living in the aftermath of pain being identified as the Fifth Vital Sign. In some settings, every patient has to be assessed with a simple rating scale of pain, regardless of the reason for visit. Providers have to document that they asked about pain and what procedures or referrals they provided if the patient reported other than “no pain.” Providers are penalized for not recording interventions when there is any pain indicated. They may lose insurance reimbursement for the visit.

There is currently a campaign to overturn these ridiculous and harmful guidelines, which are not evidence-based. One effect of the guidelines is that prescribed opioid pain medications have come to rival heroin in their negative public health impact. There has also been an epidemic of unnecessary back surgery, sometimes with crippling adverse effects.

But the guidelines have also induced despair and an unwillingness to address a condition that often must be endured with minimal intervention, rather than burdening clinicians and patients with the unrealistic expectation that it will be cured or eliminated. Clinicians are not good at dealing with conditions for which they do not have solutions.

I suspect that many of the patients in this study who remained assigned to routine care were getting minimal or no care. They were being provided little or no monitoring or reassessment of pain medications; little encouragement to engage in back exercises with the regularity needed for them to be effective; and little support in the face of the successes and failures of getting on with their lives in the face of chronic back pain.

Once again, we have an expensive study of mindfulness that does not address the question of whether any apparent effectiveness is simply due to increased intensity and frequency of contact with the medical system and support.

We don’t know if the intervention is simply correcting the inadequacies or lack of routine care.

We cannot determine whether a better use of funds would be to improve the overall quality of routine care for chronic pain, including for the bulk of patients who have no interest in devoting the necessary time in their daily lives to practicing mindfulness.

The editorial commentary

The intended answer to the question posed by the title is obviously yes: Is It Time to Make Mind-Body Approaches Available for Chronic Low Back Pain?

The assessment provided by the commentary is:

A compelling argument for ensuring that an evidence-based health care system should provide access to affordable mind-body therapies.

Like the authors of the trial itself, the commentators are trying to get reimbursement for treatment that is provided through a designated mind-body center. Whether or not mind-body centers improve patient outcomes, they are useful for the intensive competitive marketing of medical centers.

Like the authors, the commentators are not only competing for funds from the National Center for Complementary and Integrative Health [NCCIH], formerly known as the National Center for Complementary and Alternative Medicine [NCCAM]; they are also hoping to get more funds directed to this center of the National Institutes of Health.

The authors of the trial are connected. They have previously co-authored a study of acupuncture for chronic back pain with NCCAM program officers who are listed in the article as influencing and revising interpretations of the data. We have ample evidence that acupuncture is not a science-based intervention for chronic back pain. Any apparent effects are nonspecific. An illusion of effectiveness is likely to emerge in a comparison with routine care that lacks these nonspecific effects. I can’t believe the authors don’t know that.

So we’ve come in another route, but we’ve arrived at the same old story.

  • Authors with connections get their articles into prestigious, generalist medical journals.
  • Even though the evidence does not support the strong claims that are made, they are amplified with goodies like the article being freely available, having free continuing education, and other promotions like audio and video links.
  • Invited commentaries are written by persons with similar connections and similar vested interests.

I don’t think this article should have made it into JAMA. I don’t think it deserved an editorial commentary. If one were nonetheless provided, it should interpret for a general medical audience issues of the inadequacies of routine care, and inadequacy of routine care as a comparison group, and the practical issues of allocating scarce resources. An accompanying editorial should be reserved for articles more special than this one, and should offer a more detached, objective assessment of the strengths and weaknesses of a study and their implications.

MBSR straddles New Age religion and science, as well as evidence-based and alternative, non-evidence-based treatments. The new agey aspect is emphasized in the titling of the trial registration, which includes a designation as “CAM [complementary and alternative medicine] and Conventional Mind-Body Therapies.”

We must be alert to MBSR being hyped and promoted beyond what is justified by available evidence and, now, leading the charge of non-evidence-based treatments into reimbursement and competition for scarce resources in an already overly expensive and malfunctioning health system.