Headspace mindfulness training app no better than a fake mindfulness procedure for improving critical thinking, open-mindedness, and well-being.

The Headspace app increased users’ critical thinking and open-mindedness. So did practicing a sham mindfulness procedure. Participants simply sat with their eyes closed, but thought they were meditating.

Results call into question claims about Headspace coming from other studies that did not have such a credible, active control group comparison.

Results also call into question the widespread use of standardized self-report measures of mindfulness to establish whether someone is in the state of mindfulness. These measures don’t distinguish between the practice of standard versus fake mindfulness.

Results can be seen as further evidence that practicing mindfulness depends on nonspecific factors (AKA placebo), rather than any active, distinctive ingredient.

Hopefully this study will prompt better studies evaluating the Headspace App, as well as evaluations of mindfulness training more generally, using credible active treatments, rather than no treatment or waitlist controls.

Maybe it is time for a moratorium on trials of mindfulness without such an active control, or at least a tempering of claims based on poorly controlled trials.

This study points to the need for development of more psychometrically sophisticated measures of mindfulness that are not so vulnerable to expectation effects and demand characteristics.

Until the accumulation of better studies with better measures, claims about the effects of practicing mindfulness ought to be recognized as based on relatively weak evidence.

The study

Noone, C., & Hogan, M. Randomised active-controlled trial of effects of online mindfulness intervention on executive control, critical thinking and key thinking dispositions. BMC Psychology, 2018.

Trial registration

The study was initially registered in the AEA Social Science Registry before the recruitment was initiated (RCT ID: AEARCTR-0000756; 14/11/2015) and retrospectively registered in the ISRCTN registry (RCT ID: ISRCTN16588423) in line with requirements for publishing the study protocol.

Excerpts from the Abstract

The aim of this study was…investigating the effects of an online mindfulness intervention on executive function, critical thinking skills, and associated thinking dispositions.

Method

Participants recruited from a university were randomly allocated, following screening, to either a mindfulness meditation group or a sham meditation group. Both the researchers and the participants were blind to group allocation. The intervention content for both groups was delivered through the Headspace online application, an application which provides guided meditations to users.

And

Primary outcome measures assessed mindfulness, executive functioning, critical thinking, actively open-minded thinking, and need for cognition. Secondary outcome measures assessed wellbeing, positive and negative affect, and real-world outcomes.

Results

Significant increases in mindfulness dispositions and critical thinking scores were observed in both the mindfulness meditation and sham meditation groups. However, no significant effects of group allocation were observed for either primary or secondary measures. Furthermore, mediation analyses testing the indirect effect of group allocation through executive functioning performance did not reveal a significant result and moderation analyses showed that the effect of the intervention did not depend on baseline levels of the key thinking dispositions, actively open-minded thinking, and need for cognition.

The authors conclude

While further research is warranted, claims regarding the benefits of mindfulness practice for critical thinking should be tempered in the meantime.

The active control condition

The sham treatment control condition was embarrassingly straightforward and simple. But as we will see, participants found it credible.

This condition presented the participants with guided breathing exercises. Each session began by inviting the participants to sit with their eyes closed. These exercises were referred to as meditation but participants were not given guidance on how to control their awareness of their body or breath. This approach was designed to control for the effects of expectations surrounding mindfulness and physiological relaxation to ensure that the effect size could be attributed to mindfulness practice specifically. This content was also delivered by Andy Puddicombe and was developed based on previous work by Zeidan and colleagues [55, 57, 58].

What can we conclude about the standard self-report measures of the state of mindfulness?

The study used the Five Facet Mindfulness Questionnaire, which is widely used to assess whether people are in a state of mindfulness. It has been cited almost 4000 times.

Participants assigned to the mindfulness condition had significant changes on all five facets from baseline to follow-up: observing, non-reactivity, non-judgment, acting with awareness, and describing. In the absence of a comparison with change in the sham mindfulness group, these pre-post results would seem to suggest that the measure was sensitive to whether participants had practiced mindfulness. However, the changes observed for participants assigned to mindfulness did not differ from those for participants who were simply asked to sit with their eyes closed.
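
To see why such uncontrolled pre-post change proves so little, consider a minimal simulation (my own illustration with invented numbers, not the trial’s data): when both arms improve by the same nonspecific amount, each arm shows a highly “significant” pre-post gain, yet the between-group comparison is null.

```python
# Minimal simulation: both arms improve by the same nonspecific amount,
# so pre-post tests look impressive while the between-group test is null.
# All numbers are invented for illustration; this is not the trial's data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 30                                       # participants per arm (invented)
nonspecific_gain = 0.5                       # same expectancy effect in both arms
pre = rng.normal(3.0, 0.5, size=(2, n))      # row 0: mindfulness, row 1: sham
post = pre + nonspecific_gain + rng.normal(0, 0.3, size=(2, n))

for label, g in [("mindfulness", 0), ("sham", 1)]:
    t, p = stats.ttest_rel(post[g], pre[g])
    print(f"{label}: pre-post p = {p:.2g}")  # tiny p in both arms

change = post - pre
t, p = stats.ttest_ind(change[0], change[1])
print(f"between-group p = {p:.2g}")          # typically non-significant
```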

I asked Chris Noone about the questionnaires his group used to assess mindfulness:

The participants genuinely thought they were meditating in the sham condition so I think both non-specific and demand characteristics were roughly equivalent across both groups. I’m also skeptical regarding the ability of the Five-Facet Mindfulness Questionnaire (or any mindfulness questionnaire for that matter) to capture anything other than “perceived mindfulness”. The items used in these questionnaires feature similar content to the scripts used by the people delivering the mindfulness (and sham) guided meditations. The improvement in critical thinking across both groups is just a mix of learning across a semester and habituation to the task (as the same problems were posed at both measurements).

What I like about this trial

The trial provides a critical test of a key claim for mindfulness:

Mindfulness should facilitate critical thinking in higher education, based on early Buddhist conceptualizations of mindfulness as clarity of thought.

The trial was registered before recruitment and departures from protocol were noted.

Sample size was determined by power analysis.

The study had a closely matched, active control condition, a sham mindfulness treatment.

The credibility and equivalence of this sham condition versus the active treatment under study was repeatedly assessed.

“Manipulation checks were carried out to assess intervention acceptability, technology acceptance and meditation quality 2 weeks after baseline and 4 weeks after baseline.”

The study tested some a priori hypotheses about mediation and moderation.

Analyses were intention to treat.

How the study conflicts with past studies

Previous studies claimed to show positive effects of mindfulness on aspects of executive functioning [25, 26].

How the contradiction of past studies by these results is resolved

 “There are many studies using guided meditations similar to those in our mindfulness meditation condition, delivered through smartphone applications [49, 50, 52, 90, 91], websites [92, 93, 94, 95, 96, 97] and CDs [98, 99], which show effects on measures of outcomes reliably associated with increases in mindfulness such as depression, anxiety, stress, wellbeing and compassion. There are two things to note about these studies – they tend not to include a measure of dispositional mindfulness (e.g. only 4% of all mindfulness intervention studies reviewed in a recent meta-analysis include such measures at baseline and follow-up; [54]) and they usually employ a weak form of control group such as a no-treatment control or waitlist control [54]. Therefore, even when change in mindfulness is assessed in mindfulness meditation intervention studies, it is usually overestimated and this must be borne in mind when comparing the results of this study with those of previous studies. This combined with generally only moderate correlations with behavioural outcomes [54] suggests that when mindfulness interventions are effective, dispositional measures do not fully capture what has changed.”

The broader take away messages

“Our results show that, for most outcomes, there were significant changes from baseline to follow-up but none which can be specifically attributed to the practice of mindfulness.”

This creative use of a sham mindfulness control condition is a breakthrough that should be widely followed. First, it allowed a fair test of whether mindfulness is any better than another active, credible treatment. Second, because the active treatment was a sham, results provide a challenge to the notion that apparent effects of mindfulness on critical thinking are anything more than a placebo effect.

The Headspace App is enormously popular and successful, based on claims about what benefits its use will provide. Some of these claims may need to be tempered, not only in terms of critical thinking, but effects on well-being.

The Headspace App platform lends itself to such critical evaluations with respect to a sham treatment with a degree of standardization that is not readily possible with face-to-face mindfulness training. This opportunity should be exploited further with other active control groups constructed on the basis of specific hypotheses.

There is far too much research on the practice of mindfulness being done that does not advance understanding of what works or how it works. We need far fewer studies, and more of them with adequate control/comparison groups.

Perhaps we should have a moratorium on evaluations of mindfulness without adequate control groups.

Perhaps articles aimed at general audiences that make enthusiastic claims for the benefits of mindfulness should routinely note whether those claims are based on adequately controlled studies. Most are not.

Creating TED talks from peer-reviewed growth mindset research papers with colored brain pictures

The TED talk fallacy – When you confuse what presenters say about a peer-reviewed article – the breathtaking, ‘breakthrough’ strength of findings demanded for a TED talk – with what a transparent, straightforward analysis and reporting of relevant findings would reveal. 

A reminder that consumers, policymakers, and other stakeholders should not rely on TED talks for their views of what constitutes solid “science” or “best evidence,” even when presenters are established scientists.

The authors of this modest but overhyped paper do not give TED talks. But this article became the basis for a number of TED and TED-related talks by a psychologist who integrated a story of its findings with stories about her own publications. She has a booking agent for expensive talks and a line of self-help products. This raises the question: Should such information routinely be reported as a conflict of interest in publications?

We will contrast the message of the paper under discussion in this post, along with the TED talks, with a new pair of comprehensive meta-analyses. The meta-analyses show that the association between growth mindset and academic achievement is weak and that interventions to improve mindset are ineffectual.

The study

 Moser JS, Schroder HS, Heeter C, Moran TP, Lee YH. Mind your errors: Evidence for a neural mechanism linking growth mind-set to adaptive posterror adjustments. Psychological Science. 2011 Dec;22(12):1484-9.

 Key issues with the study.

The abstract is uninformative as a guide to what was done and what was found in this study. It ends with a rousing promotion of growth mind set as a way of understanding and improving academic achievement.

A study with N = 25 is grossly underpowered for most purposes and should not be used to generate estimates of associations.

Key details of methods and results needed for independent evaluation are not available in the article.

The colored brain graphics in the article were labeled “for illustrative purposes only.”

Where would you find such images of the brain, not tied to the data, in a credible neuroscience journal? Articles in actual neuroscience journals are increasingly retracted because of the discovery of suspected pasted-in or altered brain graphics.

The discussion has a strong confirmation bias, ignoring relevant literature and overselling the use of event-related potentials for monitoring and evaluating the determinants of academic achievement.

The press release issued by the Association for Psychological Science.

How Your Brain Reacts To Mistakes Depends On Your Mindset

Concludes:

The research shows that these people are different on a fundamental level, Moser says. “This might help us understand why exactly the two types of individuals show different behaviors after mistakes.” People who think they can learn from their mistakes have brains that are tuned to pay more attention to mistakes, he says. This research could help in training people to believe that they can work harder and learn more, by showing how their brain is reacting to mistakes.

The abstract.

The abstract does not report basic details of methods and results, except what is consistent with the authors’ intended message. The crucial final sentence is quote worthy and headed for clickbait. When we look at what was done and what was found in this study, this conclusion is grossly overstated.

How well people bounce back from mistakes depends on their beliefs about learning and intelligence. For individuals with a growth mind-set, who believe intelligence develops through effort, mistakes are seen as opportunities to learn and improve. For individuals with a fixed mind-set, who believe intelligence is a stable characteristic, mistakes indicate lack of ability. We examined performance-monitoring event-related potentials (ERPs) to probe the neural mechanisms underlying these different reactions to mistakes. Findings revealed that a growth mind-set was associated with enhancement of the error positivity component (Pe), which reflects awareness of and allocation of attention to mistakes. More growth-minded individuals also showed superior accuracy after mistakes compared with individuals endorsing a more fixed mind-set. It is critical to note that Pe amplitude mediated the relationship between mind-set and posterror accuracy. These results suggest that neural mechanisms indexing on-line awareness of and attention to mistakes are intimately involved in growth-minded individuals’ ability to rebound from mistakes.

The introduction.

The introduction opens with:

Decades of research by Dweck and her colleagues indicate that academic and occupational success depend not only on cognitive ability, but also on beliefs about learning and intelligence (e.g., Dweck, 2006).

This sentence echoes the Amazon blurb for the pop psychology book that is being cited:

After decades of research, world-renowned Stanford University psychologist Carol S. Dweck, Ph.D., discovered a simple but groundbreaking idea: the power of mindset. In this brilliant book, she shows how success in school, work, sports, the arts, and almost every area of human endeavor can be dramatically influenced by how we think about our talents and abilities.

Nowhere in the introduction are there balancing references to studies investigating Carol Dweck’s theory independently, from outside her group, nor any citing of inconsistent findings. This is a selective, strongly confirmation-driven review of the relevant literature. (Contrast this view with an independent assessment from a recent comprehensive meta-analysis at the end of this post.)

The method.

Twenty-five native-English-speaking undergraduates (20 female, 5 male; mean age = 20.25 years) participated for course credit.

There is no discussion of why a sample of only 25 participants was chosen or any mention of a power analysis.

If we stick to simple bivariate correlations with the full sample of N = 25:

r = .40, p < .05 (p = 0.0475)

r = .51, p < .01 (p = 0.0092)

N = 25 does not allow reliable detection of a small-to-moderate relationship where one exists.

Any significant findings will of necessity be large: r > .40 for p < .05 and r > .51 for p < .01.
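
These thresholds are easy to verify with standard formulas. The following sketch (my own check, not code from the paper) computes the smallest correlation that can reach two-tailed significance with N = 25, and the p-values quoted above:

```python
# Sketch (standard formulas, not from the paper): the smallest correlation
# that can reach two-tailed significance with N = 25, and the p-values for
# the r = .40 and r = .51 thresholds quoted above.
import numpy as np
from scipy import stats

n = 25
df = n - 2

def critical_r(alpha):
    """Smallest |r| reaching two-tailed significance at `alpha` with n - 2 df."""
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return t_crit / np.sqrt(t_crit**2 + df)

def p_for_r(r):
    """Two-tailed p-value for an observed correlation r."""
    t = r * np.sqrt(df / (1 - r**2))
    return 2 * stats.t.sf(abs(t), df)

print(critical_r(0.05), critical_r(0.01))  # ~0.396 and ~0.505
print(p_for_r(0.40), p_for_r(0.51))        # ~0.0475 and ~0.0092
```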

As has been noted elsewhere:

In systematic studies of psychological and biomedical effect sizes (e.g., Meyer et al., 2001)  one rarely encounters correlations greater than .4.

How growth mindset scores were calculated is crucially important, but the information that is presented about the measure is inadequate. There is no reference to an established scale with psychometric data and cross validation. Rather:

Following the flanker task [a noise-letter version of the Eriksen flanker task (Eriksen & Eriksen, 1974)], participants completed a TOI scale that asked respondents to rate the extent to which they agreed with four fixed-mind-set statements on a 6-point Likert-type scale (1 = strongly disagree, 6 = strongly agree). These statements (e.g., “You have a certain amount of intelligence and you really cannot do much to change it”) were drawn from previous studies measuring TOI (e.g., Hong, Chiu, Dweck, Lin, & Wan, 1999). TOI items were reverse-scored so that higher scores indicated more endorsement of a growth mind-set, and lower scores indicated more of a fixed mind-set.

Details in the referenced Hong et al (1999) study are difficult to follow, but the paper lays out the following requirement:

Those participants who believe that intelligence is fixed (entity theorists) should consistently endorse responses at the lower (agree) end of the scale (yielding a mean score of 3.0 or lower), whereas participants who believe that intelligence is malleable (incremental theorists) should consistently endorse responses at the upper (disagree) end of the scale (yielding a mean score of 4.0 or above).

If this distribution occurred naturally, it would be an extraordinary set of questions. In the Hong et al (1999) study, this distribution was achieved by throwing away data in the middle of the distribution that didn’t fit the investigators’ preconceived notion.
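
For concreteness, here is the quoted rule expressed as code. This is a minimal sketch restating the thresholds above; it is not anything from the authors’ materials:

```python
# A minimal sketch of the Hong et al. (1999) rule as quoted above:
# mean TOI scores of 3.0 or lower count as entity theorists, 4.0 or above
# as incremental theorists, and everything in between is discarded.
# The function and names are mine; only the thresholds come from the quote.
def classify_toi(mean_score):
    if mean_score <= 3.0:
        return "entity theorist (fixed mind-set)"
    if mean_score >= 4.0:
        return "incremental theorist (growth mind-set)"
    return None  # middle of the distribution: excluded from analysis

print(classify_toi(2.5), classify_toi(3.5), classify_toi(4.5))
# -> entity theorist, None (thrown away), incremental theorist
```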

Excluding the middle third of a distribution of scores with only N = 25 compounds the errors associated with this practice in larger samples. With the small number of scores now reduced to N = 17, the influence of a single outlier participant is increased, and any generalization to the larger population becomes even more problematic. We cannot readily evaluate whether scores in the present sample were neatly and naturally bimodal. We are not provided the basic data, not even means and standard deviations in text or a table. However, as we will see, one graphic representation leaves some doubts.
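
A toy demonstration (fabricated numbers, not the study’s data) shows how much leverage a single participant gains at this sample size: sixteen points with exactly zero correlation, plus one extreme case, yield a “large” correlation.

```python
# Toy demonstration (not the study's data): with 16 points and exactly
# zero correlation, adding one extreme participant manufactures a large r.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8], float)
y = np.array([5, 1, 8, 3, 7, 2, 6, 4, 4, 6, 2, 7, 3, 8, 1, 5], float)
print(np.corrcoef(x, y)[0, 1])      # exactly 0.0 by construction

x17 = np.append(x, 20.0)            # one outlier, now n = 17
y17 = np.append(y, 20.0)
print(np.corrcoef(x17, y17)[0, 1])  # ~0.73
```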

Overview of data analyses.

Repeated measures analyses of variance (ANOVAs) were first conducted on behavioral and ERP measures without regard to individual differences in TOIs in order to establish baseline experimental effects. ANOVAs conducted on behavioral measures and the ERN included one 2-level factor: accuracy (error vs. correct response). The Pe [error positivity component] was analyzed using a 2 (accuracy: error vs. correct response) × 2 (time window: 150–350 ms vs. 350–550 ms) ANOVA. Subsequently, TOI scores were entered into ANOVAs as covariates to assess the main and interactive effects of mind-set on behavioral and ERP measures. When significant effects of TOI score were detected, we conducted follow-up correlational analyses to aid in the interpretation of results.

Thus, multiple post hoc analyses examined the effects of growth mindset (TOI), based on whether significant main and interaction effects were obtained in other analyses; these, in turn, were followed up with correlational analyses.

Highlights of the results.

Only a few of the numerous analyses produced significant results for TOI. Given the sample size and multiple tests without correction, we probably should not attach substantive interpretations to them.

Behavioral data.

Overall accuracy was not correlated with TOI (r = .06, p > .79).

[Speed on error vs. correct trials] When TOI was entered into the ANOVA as a covariate, there were no significant effects (Fs < 1.78, ps > .19, ηp²s < .08) [where ‘ps’ and ‘no significant effects’ refer to either main or interaction effects].

[Posterror adjustments] When TOI was entered into the ANOVA as a covariate, there were no significant effects (Fs < 1.15, ps > .29, ηp²s < .05).

When entered into the ANOVA as a covariate, however, TOI scores interacted with postresponse accuracy, F(1, 23) = 5.22, p < .05, ηp² = .19. Correlational analysis showed that as TOI scores increased, indicating a growth mind-set, so did accuracy on trials immediately following errors relative to accuracy on trials immediately following correct responses (i.e., posterror accuracy – postcorrect-response accuracy; r = .43, p < .05).

ERPs (event-related potentials).

As expected, the ANOVA confirmed greater ERP negativity on error trials (M = –3.43 μV, SD = 4.76 μV) relative to correct trials (M = –0.23 μV, SD = 4.20 μV), F(1, 24) = 24.05, p < .001, ηp² = .50, in the 0- to 100-ms postresponse time window. This result is consistent with the presence of an ERN. There were no significant effects involving TOI (Fs < 1.24, ps > .27, ηp²s < .06).

When entered as a covariate, TOI showed a significant interaction with accuracy, F(1, 23) = 8.64, p < .01, ηp² = .27. Correlational analysis demonstrated that as TOI scores increased so did positivity on error trials relative to correct trials averaged across both time windows (i.e., error activity – correct-response activity; r = .52, p < .01).

Mediation analysis.

As Figure 2 illustrates, controlling for Pe amplitude significantly attenuated the relationship between TOI scores and posterror accuracy. The 95% confidence intervals derived from the bootstrapping test did not include zero (.01–.04), and thus indicated significant mediation.

So, the a priori condition for testing for significant mediation was met because a statistical test barely excluded zero (.01–.04), with no correction for the many tests of TOI in the study. But what are we doing exploring mediation with N = 25?
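
For readers unfamiliar with the procedure, the generic logic of such a percentile-bootstrap test of an indirect effect can be sketched as follows. This is my own illustration of the general technique with synthetic data; it is not the authors’ code, model, or data:

```python
# Generic percentile-bootstrap test of an indirect effect a*b in x -> m -> y.
# Illustrative sketch only; variable names and data are invented.
import numpy as np

def boot_indirect_ci(x, m, y, n_boot=5000, seed=0):
    """Percentile-bootstrap 95% CI for the indirect effect a*b."""
    rng = np.random.default_rng(seed)
    n = len(x)
    ab = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                     # resample with replacement
        xs, ms, ys = x[idx], m[idx], y[idx]
        a = np.polyfit(xs, ms, 1)[0]                    # a path: m ~ x
        X = np.column_stack([np.ones(n), xs, ms])
        b = np.linalg.lstsq(X, ys, rcond=None)[0][2]    # b path: y ~ m + x
        ab[i] = a * b
    return np.percentile(ab, [2.5, 97.5])               # "significant" if 0 excluded

# Synthetic demo mirroring the study's sample size of 25:
rng = np.random.default_rng(42)
x = rng.normal(size=25)
m = 0.4 * x + rng.normal(size=25)
y = 0.4 * m + rng.normal(size=25)
print(boot_indirect_ci(x, m, y))
```

Run on synthetic data with N = 25, the bootstrap distribution of a×b is noisy, which is one more reason a confidence interval that barely excludes zero deserves little weight.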

Distribution of TOI [growth mindset] scores.

Let’s look at the distribution of TOI scores in a graph available as the x-axis in Figure 1.

Any dichotomization of these continuous scores would be arbitrary. Close scores clustered around different sides of the median would be considered different, but diverging scores on the same side of the median would be treated as the same. Any association between TOI and ERPs (event-related potentials) could be due to one or a few interindividual differences in brains or intraindividual variability of ERP over occasions. These are not the kind of data from which generalizable estimates of effects can be obtained.
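
A toy example (my own, with made-up numbers) makes the median-split problem concrete:

```python
# Toy illustration of the median-split problem: 3.4 and 3.6 end up in
# different "groups," while 3.6 and 5.8 end up in the same one.
import numpy as np

scores = np.array([3.4, 3.6, 1.2, 5.8])
median = np.median(scores)          # 3.5
high_group = scores > median
print(list(zip(scores, high_group)))
# -> [(3.4, False), (3.6, True), (1.2, False), (5.8, True)]
```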

The depiction of brains with fixed versus growth mind sets.

The one picture of brains in the main body of this article supposedly contrasts fixed versus growth mindsets. The differences appear dramatic, in sharply contrasting colors. But in the article itself, no such dichotomization is discussed. Nor should it be. Furthermore, the simulation is based on isolating one of the few significant effects of TOI. Readers are cautioned that the picture is “for illustrative purposes only.”

The discussion.

Similar to the introduction, there is a selective citation of the literature with a strong confirmation bias. There is no reference to weak or null findings or any controversy concerning growth mindset that might have accumulated over a decade of research. There is no acknowledgment of the folly of making substantive interpretations of significant findings from such a small, underpowered study. Results of the mediation analysis are confidently presented, with no indication of doubts whether they should even have been conducted. Or that, even under the best of circumstances, such mediational analyses remain correlational  and provide only weak evidence of causal mechanisms. Event-related evoked potentials are proposed as biomarkers and as surrogate outcomes in implementations of growth mindset interventions. A lot of misunderstanding and neurononsense are crammed into a few sentences. There is no mention of any limitations to the study.

The APS Observer press release revisited.

Why was this article recognized with a special press release by the APS? The press release is tied much more to the authors’ claims about their study than to their actual methods and results. It provides an opportunity to publicize the study with further exaggeration of what it accomplished.

This is an unfortunate message to authors about what they need to do to be promoted by APS. Your intended message can override your actual results if you strategically emphasize the message and downplay any discrepancy with the results. Don’t mention any limitations of your study.

The TED talks.

A number of TED and TED-related talks incorporate a discussion of the study, with its picture of fixed versus growth mindset brains. There is remarkable overlap among these talks. I have chosen the TEDxNorrkoping talk “The power of believing that you can improve” because it had a handy transcript available.

On the left, you see the fixed-mindset students. There’s hardly any activity. They run from the error. They don’t engage with it. But on the right, you have the students with the growth mindset, the idea that abilities can be developed. They engage deeply. Their brain is on fire with yet. They engage deeply. They process the error. They learn from it and they correct it.

“On fire”? The presenter exploits the arbitrary red color chosen for the for-illustrative-purposes-only picture.

The brain graphic is reduced to a cartoon in a comic-book-level account of action heroes engaging their errors deeply, learning from them, and correcting their next response, while ordinary mortals run away like cowards.

The presenter soon introduces another cartoon for her comic book depiction of the effects of growth mindset on the brain. But first, here is an overview of how this talk fits the predictable structure of a TED talk.

The TED talk begins with a personal testimony concerning “a critical event early in my career, a real turning point.” It is recognizable to TED talk devotees as an epiphany (an “epiphimony,” if you like) through which the speaker shares a personal journey of insight and realisation, its triumphs and tribulations. In telling the story, the presenter introduces an epic struggle between the children of the darkness (the “now” of a fixed mindset) and the children of the light (the “yet” or “not yet” of a growth mindset).

There is much more the sense of a televangelist than of an academic presenting an accurate summary of her research to a lay audience. Sure, the live audience and the millions of viewers of this and related talks were not seeking a colloquium or even a Cafe Scientifique. The audience came to be entertained with a good story. But how much license can be taken with the background science? After all, the information being discussed is relevant to their personal decisions as parents and as citizens, and to communities making important choices about how to improve academic performance. The issue becomes more serious when the presenter gets to claims of dramatic transformations of impoverished students in economically deprived school settings.

The presenter cites one of her studies for an account of what students “gripped with the tyranny of now” did in difficult learning experiences:

So what do they do next? I’ll tell you what they do next. In one study, they told us they would probably cheat the next time instead of studying more if they failed a test. In another study, after a failure, they looked for someone who did worse than they did so they could feel really good about themselves.

We are encouraged to think ‘Students with a fixed mind set cheat instead of studying more. How horrible!’ But I looked up the study:

Blackwell LS, Trzesniewski KH, Dweck CS. Implicit theories of intelligence predict achievement across an adolescent transition: A longitudinal study and an intervention. Child Development. 2007 Jan 1;78(1):246-63.

I searched for “cheat” and found one mention:

Students rated how likely they would be to engage in positive, effort-based strategies (e.g., ‘‘I would work harder in this class from now on’’ ‘‘I would spend more time studying for tests’’) or negative, effort-avoidant strategies (e.g., ‘‘I would try not to take this subject ever again’’ ‘‘I would spend less time on this subject from now on’’ ‘‘I would try to cheat on the next test’’). Positive and negative items were combined to form a mean Positive Strategies score.

All subsequent reporting of results was in terms of this composite Positive Strategies score, so I was unable to evaluate how commonly “I would try to cheat…” was endorsed.

Three minutes into the talk, the speaker introduces an element of moral panic about a threat to Western civilization as we know it:

How are we raising our children? Are we raising them for now instead of yet? Are we raising kids who are obsessed with getting As? Are we raising kids who don’t know how to dream big dreams? Their biggest goal is getting the next A, or the next test score? And are they carrying this need for constant validation with them into their future lives? Maybe, because employers are coming to me and saying, “We have already raised a generation of young workers who can’t get through the day without an award.”

Less than a minute later, the presenter gets ready to roll out her solution.

So what can we do? How can we build that bridge to yet?

Praising performance in terms of fixed characteristics like IQ or ability is ridiculed. However, great promises are made for praising process, regardless of outcome.

Here are some things we can do. First of all, we can praise wisely, not praising intelligence or talent. That has failed. Don’t do that anymore. But praising the process that kids engage in, their effort, their strategies, their focus, their perseverance, their improvement. This process praise creates kids who are hardy and resilient.

“Yet” or “not yet” becomes a magical incantation. The presenter builds on her comic book science of the effects of growth mindset by introducing a cartoon of a synapse (mislabeled as a neuron), linked to her own research only by some wild speculation.

Just the words “yet” or “not yet,” we’re finding, give kids greater confidence, give them a path into the future that creates greater persistence. And we can actually change students’ mindsets. In one study, we taught them that every time they push out of their comfort zone to learn something new and difficult, the neurons in their brain can form new, stronger connections, and over time, they can get smarter.

I found no relevant measurements of brain activity in Dweck’s studies, but let’s not ruin a good story.

Look what happened: In this study, students who were not taught this growth mindset continued to show declining grades over this difficult school transition, but those who were taught this lesson showed a sharp rebound in their grades. We have shown this now, this kind of improvement, with thousands and thousands of kids, especially struggling students.

Up until now, we have had disappointingly hyped and inaccurate accounts of how to foster academic achievement. But this soon turns into a cruel hoax when claims are made about improving the performance of underprivileged children in under-resourced settings.

So let’s talk about equality. In our country, there are groups of students who chronically underperform, for example, children in inner cities, or children on Native American reservations. And they’ve done so poorly for so long that many people think it’s inevitable. But when educators create growth mindset classrooms steeped in yet, equality happens. And here are just a few examples. In one year, a kindergarten class in Harlem, New York scored in the 95th percentile on the national achievement test. Many of those kids could not hold a pencil when they arrived at school. In one year, fourth-grade students in the South Bronx, way behind, became the number one fourth-grade class in the state of New York on the state math test. In a year, to a year and a half, Native American students in a school on a reservation went from the bottom of their district to the top, and that district included affluent sections of Seattle. So the Native kids outdid the Microsoft kids.

This happened because the meaning of effort and difficulty were transformed. Before, effort and difficulty made them feel dumb, made them feel like giving up, but now, effort and difficulty, that’s when their neurons are making new connections, stronger connections. That’s when they’re getting smarter.

“So the Native kids outdid the Microsoft kids.” There is some kind of poetic license being taken here in describing the results of an intervention. The message is that subjective mindset can trump entrenched structural inequalities and accumulated deficits in skills and knowledge, as well as limits on ability. All school staff and parents need to do is wave the magic wand and recite the incantation “Not yet.” How reassuring to those in politics who control resources and don’t want to adequately fund these schools: they just need to exhort anyone who wants to improve outcomes to recite the magic words.

And what do we say when we don’t witness dramatic improvements? Who is to blame when such failures need to be explained? The cruel irony is that school boards will blame principals, who blame teachers, and parents will blame schools and their children. All will be held to unrealistic expectations.

But it gets worse. The presenter ends with a call to action arguing that not buying into her program would violate the human rights of vulnerable children.

Let’s not waste any more lives, because once we know that abilities are capable of such growth, it becomes a basic human right for children, all children, to live in places that create that growth, to live in places filled with “yet”.

Paradox: Do poor kids with a growth mindset suffer negative consequences?

Maybe so, suggests some recent research concerning the longer term outcomes of disadvantaged African American children.

A newly published study in the peer-reviewed journal Child Development …finds traditionally marginalized youth who grew up believing in the American ideal that hard work and perseverance naturally lead to success show a decline in self-esteem and an increase in risky behaviors during their middle-school years. The research is considered the first evidence linking preteens’ emotional and behavioral outcomes to their belief in meritocracy, the widely held assertion that individual merit is always rewarded.

“If you’re in an advantaged position in society, believing the system is fair and that everyone could just get ahead if they just tried hard enough doesn’t create any conflict for you … [you] can feel good about how [you] made it,” said Erin Godfrey, the study’s lead author and an assistant professor of applied psychology at New York University’s Steinhardt School. But for those marginalized by the system—economically, racially, and ethnically—believing the system is fair puts them in conflict with themselves and can have negative consequences.

We know surprisingly little about the adverse events associated with growth mindset interventions or their negative unintended consequences for children and school systems. Cost/benefit analyses should compare mindset interventions with academic interventions known to be effective when delivered with equivalent resources, not with no treatment.

Overall associations of growth mind set with academic achievement are weak and interventions are not effective.

Sisk VF, Burgoyne AP, Sun J, Butler JL, Macnamara BN. To What Extent and Under Which Circumstances Are Growth Mind-Sets Important to Academic Achievement? Two Meta-Analyses. Psychological Science. 2018 Mar 1:0956797617739704.

This newly published article in Psychological Science starts by noting the influence of growth mindset.

These ideas have led to the establishment of nonprofit organizations (e.g., Project for Education Research that Scales [PERTS]), for-profit entities (e.g., Mindset Works, Inc.), schools purchasing mind-set intervention programs (e.g., Brainology), and millions of dollars in funding to individual researchers, nonprofit organizations, and for-profit companies (e.g., Bill and Melinda Gates Foundation,1 Department of Education,2 Institute of Educational Sciences3).

In our first meta-analysis (k = 273, N = 365,915), we examined the strength of the relationship between mind-set and academic achievement and potential moderating factors. In our second meta-analysis (k = 43, N = 57,155), we examined the effectiveness of mind-set interventions on academic achievement and potential moderating factors. Overall effects were weak for both meta-analyses.

The first meta-analysis integrated 273 effect sizes. The overall effect was very weak by conventional standards, and hardly consistent with the TED talks.

The meta-analytic average correlation (i.e., the average of various population effects) between growth mind-set and academic achievement is r̄ = .10, 95% confidence interval (CI) = [.08, .13], p < .001.
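
To put that number on a familiar scale (a standard conversion, offered as my own aside rather than the authors’):

```python
# A standard conversion: what an average correlation of r = .10 means in
# terms of variance explained and an equivalent Cohen's d. My aside, not
# part of the meta-analysis itself.
import math

r = 0.10
print(r**2)                           # 0.01 -> mind-set explains ~1% of variance
print(2 * r / math.sqrt(1 - r**2))    # ~0.20 -> the equivalent of a "small" d
```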

The data set on the effects of growth mindset interventions integrated 43 effect sizes; 37 of the 43 (86%) were not significantly different from zero.

The authors conclude:

Some researchers have claimed that mind-set interventions can “lead to large gains in student achievement” and have “striking effects on educational achievement” (Yeager & Walton, 2011, pp. 267 and 268, respectively). Overall, our results do not support these claims. Mind-set interventions on academic achievement were nonsignificant for adolescents, typical students, and students facing situational challenges (transitioning to a new school, experiencing stereotype threat). However, our results support claims that academically high-risk students and economically disadvantaged students may benefit from growth-mind-set interventions (see Paunesku et al., 2015; Raizada & Kishiyama, 2010), although these results should be interpreted with caution because (a) few effect sizes contributed to these results, (b) high-risk students did not differ significantly from non-high-risk students, and (c) relatively small sample sizes contributed to the low-SES group.

Part of the reshaping effort has been to make funding mind-set research a “national education priority” (Rattan et al., 2015, p. 723) because mind-sets have “profound effects” on school achievement (Dweck, 2008, para. 2). Our meta-analyses do not support this claim.

And

From a practical perspective, resources might be better allocated elsewhere than mind-set interventions. Across a range of treatment types, Hattie, Biggs, and Purdie (1996) [https://www.teachertoolkit.co.uk/wp-content/uploads/2014/04/effect-of-learning-skills.pdf ] found that the meta-analytic average effect size for a typical educational intervention on academic performance is 0.57. All meta-analytic effects of mind-set interventions on academic performance were < 0.35, and most were null. The evidence suggests that the “mindset revolution” might not be the best avenue to reshape our education system.

The presenter’s speaker fees.

Presenters of TED talks are not paid, but a successful talk can lead to lucrative speaking engagements. It is informative to Google the speaking fees of the presenters of highly accessed TED talks. In the case of Carol Dweck, I found the booking agency, All American Speakers.

Mindsetonline provides products for sale as well as success stories about people and organizations adopting a growth mindset.

There is even a 4-item measure of mindset you can complete online. Each of the items is some paraphrase of ‘you can’t change your intelligence very much,’ either stated straightforwardly or reversed (‘you can’).

Consumers beware! TED talks are not a reliable dissemination of best evidence.

TED talks are to best evidence what historical fiction is to history.

Even TED talks by eminent psychologists are often little more than infomercials for self-help products and lucrative speaking engagements and workshops.

Academics are under increasing pressure to demonstrate that there is more to the impact of their work than citations of publications in prestigious journals. Social impact is being used to balance journal impact factors.

It is also being recognized that outreach involves the need to equip lay audiences to be able to grasp what are initially difficult or confusing concepts.

But pictures of colored brains can be used to dumb down consumers and to disarm their intuitive skepticism about behavioral science working magic and miracles. Even PhD psychologists are inclined to be overly impressed when references to neuroscience and pictures of colored brains are introduced into a discussion. The vulnerability of lay audiences to neurononsense or neurobollocks is even greater.

False and exaggerated claims about academic interventions harm school systems, teachers, and ultimately, students. In communicating with lay audiences, psychologists need to be sensitive to the possible misunderstandings they are reinforcing. They have an ethical responsibility to do their best to strengthen the critical thinking skills of their audiences, not damage them.

TED talks and declarations of potential conflicts of interest.

Personally, I have found that calling out the pseudoscience behind claims for unproven medicine like acupuncture or homeopathy does not produce much blowback, except from proponents of these treatments. Similarly, campaigning for better disclosure of potential conflicts of interest does not meet much resistance when the focus is on pharmaceutical companies.

However, it’s a whole different matter to call out the pseudoscience behind self-help and exaggerated, outright false claims about behavioral science being able to work miracles and magic. There seems to be a double standard in psychology by which it is inappropriate to exaggerate the strength of findings when communicating with other professionals, but perfectly okay when communicating with lay audiences.

We need to think about TED talks more like we think about talks by opinion leaders with ties to the pharmaceutical industry. Presenters should start with a standard slide disclosing financial interests that may influence opinions offered about specific products mentioned in the talk. Given the pressure to get findings that will fit into the next TED talk, presenters should routinely disclose in their peer-reviewed articles that they give TED talks or have a booking agent.

 

A science-based medicine skeptic struggles with his as-yet medically unexplained pain and resists alternative quack treatments

Paul: “For three years I kept my faith that relief had to be just around the corner, but my disappointment is now as chronic as my pain. Hope has become a distraction.”

Chronic pain and tragic irony…

Paul Ingraham is quite important in the Science-Based Skeptics movement and in my becoming involved in it. He emailed me after a long spell without contact. He wanted to explain how he had been out of touch. His life had been devastated by as-yet medically unexplained pain and other mysterious symptoms.

Paul modestly describes himself at his blog site as “a health writer in Vancouver, Canada, best known for my work debunking common myths about treating common pain problems on PainScience.com. I actually make a living doing that. On this blog, I just mess around.  ~ Paul Ingraham (@painsci, Facebook).”

Paul’s Big Self-Help Tutorials for Pain Problems are solidly tied to the best peer-reviewed evidence.

Detailed, readable tutorials about common stubborn pain problems & injuries, like back pain or runner’s knee.

Many common painful problems are often misunderstood, misdiagnosed, and mistreated. Made for patients, but strong enough for professionals, these book-length tutorials are crammed with tips, tricks, and insights about what works, what doesn’t, and why. No miracle cures are for sale here — just sensible information, scientifically current, backed up by hundreds of free articles and a huge pain and injury science bibliography.

 

Paul offered me invaluable assistance and support when I began blogging at the prestigious Science-Based Medicine. See, for instance, my:

Systematic Review claims acupuncture as effective as antidepressants: Part 1: Checking the past literature

And

Is acupuncture as effective as antidepressants? Part 2. Blinding readers who try to get an answer

I have not consistently blogged there, because my topics don’t always fit. Whenever I do blog there, I learn a lot from the wealth of thoughtful comments I receive.

I have great respect for Science-Based Medicine’s authoritative, well-documented, and evidence-based analyses. I highly recommend the blog for those who are looking for sophistication delivered in a way that an intelligent layperson can understand.

What’s the difference between science-based medicine (SBM) and evidence-based medicine (EBM)?

I get some puzzlement every time I bring up this important distinction. Bloggers at SBM frequently make a distinction between science-based and evidence-based medicine. They offer careful analyses of unproven treatments like acupuncture and homeopathy. Proponents of these treatments increasingly sell them as evidence-based, citing randomized trials that do not involve an active treatment comparison. The illusion of efficacy is often created by the positive expectations and mysterious rituals with which these treatments are delivered. Comparison treatments in these studies often lack this boost, particularly when tested in unblinded comparisons.

The SBM bloggers like to point out that there are no plausible, tested scientific mechanisms by which these treatments might conceivably work. The name of the blog, Science-Based Medicine, calls attention to their higher standard for considering treatments efficacious: to count as science-based medicine, treatments have to be proven as effective as evidence-based active treatments, and they have to have a mechanism beyond nonspecific, placebo effects.

Paul Ingraham reappears from a disappearance.

Paul mysteriously disappeared for a while. Now he’s reemerged with a tale that is getting a lot of attention. He gave me permission to blog excerpts, and I include a link to the full story, which I strongly recommend.

http://www.paulingraham.com/chronic-pain-tragic-irony.html

A decade ago I devoted myself to helping people with chronic pain, and now it’s time to face my ironic new reality: I have serious unexplained chronic pain myself. It may never stop, and I need to start learning to live with it rather than trying to fix it.

I have always been “prone” to aches and pains, and that’s why I became a massage therapist and then moved on to publishing PainScience.com. But that tendency was a pain puppy humping my leg compared to the Cerberus of suffering that’s mauling me now. I’ve graduated to the pain big leagues.

For three years I kept my faith that relief had to be just around the corner, but my disappointment is now as chronic as my pain. Hope has become a distraction. I’ve been like a blind man waiting for my sight to return instead of learning braille. It’s acceptance time.

Paul describes how his pain drove him into hiding.

… why I’ve become one of those irritating people who answers every invitation with a “maybe” and bails on half the things I commit to. I never know what I’m going to be able to cope with on a given day until it’s right in front of me.

He struggled to define the problem:

Mostly widespread soreness and joint pain like the early stages of the flu, a parade of agonizing hot spots that are always on the verge of breaking my spirit, and a lot of sickly fatigue. All of which is easily provoked by exercise.

But there was a dizzying array of other symptoms…

Any diagnosis would be simply a label, not an explanation.

Nothing turned up in a few phases of medical investigation in 2015 and 2016. My “MS hug” is not caused by MS. My thunderclap headaches are not brain bleeds. My tremors are not Parkinsonian. I am not deficient in vitamins B or D. There is no tumour lurking in my chest or skull, nor any markers of inflammation in my blood. My heart beats as steadily as an atomic clock, and my nerves conduct impulses like champs.

Paul was not seriously tempted by alternative and complementary medicine

I am not tempted to try alternative medicine. The best of alt-med is arguably not alternative at all — e.g. nutrition, mindfulness, relaxation, massage, and so on — and the rest of what alt-med offers ranges from dubious at best to insane bollocks at the worst. You can’t fool a magician with his own tricks, and you can’t give false hope to an alt-med apostate like me: I’ve seen how the sausage is made, and I feel no surge of false hope when someone tells me (and they have) “it’s all coming from your jaw, you should see this guy in Seattle, he’s a Level 17 TMJ Epic Master, namaste.” Most of what sounds promising to the layperson just sounds like a line of bull to me.

Fascinating how many people clearly think Paul’s story was almost identical to their own.

All these seemingly “identical” cases have got me pondering: syndromes consist of non-specific symptoms by definition, and batches of such symptoms will always seem more similar than they actually are… because blurry pictures look more alike than sharp and clear ones. Non-specific symptoms are generalized biological reactions to adversity. Anxiety can cause any of them, and so can cancer. Any complex cases without pathognomic (specific, defining) symptoms are bound to have extensive overlap of their non-specific symptoms.

There are many ways to be sick, and relatively few ways to feel bad.

Do check out his full blog post. http://www.paulingraham.com/chronic-pain-tragic-irony.html

Flawed meta-analysis reveals just how limited the evidence is for mapping meditation onto specific regions of the brain

The article put meaningless but reassuring effect sizes into the literature, where these numbers will be widely and uncritically cited.

“The only totally incontrovertible conclusion is that much work remains to be done…”.

Authors of a systematic review and meta-analysis of functional neuroanatomical studies (fMRI and PET) of meditation were exceptionally frank in acknowledging problems relating the practice of meditation to differences in specific regions of the brain. However, they did not adequately deal with problems hiding in plain sight. These problems should have discouraged integration of this literature into a meta-analysis and the authors’ expressing the strength of the association between meditation and the brain in terms of a small set of moderate effect sizes.

An amazing set of overly small studies with evidence that null findings are being suppressed.

Many in the multibillion-dollar mindfulness industry are naive about, or simply indifferent to, what constitutes quality evidence. Their false confidence that “meditation changes the brain*” can be bolstered by selective quotes from this review seemingly claiming that the associations are well-established and practically significant. Readers who are more sophisticated may nonetheless be misled by this review, unless they read beyond the abstract and with appropriate skepticism.

Read on. I suspect you will be surprised as I was about the small quantity and poor quality of the literature relating the practice of meditation to specific areas of the brain. The colored pictures of the brain widely used to illustrate discussions of meditation are premature and misleading.

As noted in another article:

Brightly coloured brain scans are a media favourite as they are both attractive to the eye and apparently easy to understand but in reality they represent some of the most complex scientific information we have. They are not maps of activity but maps of the outcome of complex statistical comparisons of blood flow that unevenly relate to actual brain function. This is a problem that scientists are painfully aware of but it is often glossed over when the results get into the press.

The article is

Fox KC, Dixon ML, Nijeboer S, Girn M, Floman JL, Lifshitz M, Ellamil M, Sedlmeier P, Christoff K. Functional neuroanatomy of meditation: A review and meta-analysis of 78 functional neuroimaging investigations. Neuroscience & Biobehavioral Reviews. 2016 Jun 30;65:208-28.

Abstract.

Keep in mind how few readers go beyond an abstract in forming an impression of what an article shows. Far more readers “know” what the meta-analysis found solely from reading the abstract than from reading both the article and the supplementary material.

Meditation is a family of mental practices that encompasses a wide array of techniques employing distinctive mental strategies. We systematically reviewed 78 functional neuroimaging (fMRI and PET) studies of meditation, and used activation likelihood estimation to meta-analyze 257 peak foci from 31 experiments involving 527 participants. We found reliably dissociable patterns of brain activation and deactivation for four common styles of meditation (focused attention, mantra recitation, open monitoring, and compassion/loving-kindness), and suggestive differences for three others (visualization, sense-withdrawal, and non-dual awareness practices). Overall, dissociable activation patterns are congruent with the psychological and behavioral aims of each practice. Some brain areas are recruited consistently across multiple techniques—including insula, pre/supplementary motor cortices, dorsal anterior cingulate cortex, and frontopolar cortex—but convergence is the exception rather than the rule. A preliminary effect-size meta-analysis found medium effects for both activations (d = 0.59) and deactivations (d = −0.74), suggesting potential practical significance. Our meta-analysis supports the neurophysiological dissociability of meditation practices, but also raises many methodological concerns and suggests avenues for future research.

The positive claims in the abstract

“…Found reliably dissociable patterns of brain activation and deactivation for four common styles of meditation.”

“Dissociable activation patterns are congruent with the psychological and behavioral aims of each practice.”

“Some brain areas are recruited consistently across multiple techniques”

“A preliminary effect-size meta-analysis found medium effects for both activations (d = 0.59) and deactivations (d = −0.74), suggesting potential practical significance.”

“Our meta-analysis supports the neurophysiological dissociability of meditation practices…”

 And hedges and qualifications in the abstract

“Convergence is the exception rather than the rule”

“[Our meta-analysis] also raises many methodological concerns and suggests avenues for future research.”

Why was this systematic review and meta-analysis undertaken now?

A figure provided in the article showed a rapid accumulation of studies of meditation and the brain in the past few years, with over 100 studies now available.

However, the authors’ systematic search yielded “78 functional neuroimaging (fMRI and PET) studies of meditation, and used activation likelihood estimation to meta-analyze 257 peak foci from 31 experiments involving 527 participants.” Only about a third of the studies identified in the search provided usable data.

What did the authors want to accomplish?

Taken together, our central aims were to: (i) comprehensively review and meta-analyze the existing functional neuroimaging studies of meditation (using the meta-analytic method known as activation likelihood estimation, or ALE), and compare consistencies in brain activation and deactivation both within and across psychologically distinct meditation techniques; (ii) examine the magnitude of the effects that characterize these activation patterns, and address whether they suggest any practical significance; and (iii) articulate the various methodological challenges facing the emerging field of contemplative neuroscience (Caspi and Burleson, 2005; Thompson, 2009; Davidson, 2010; Davidson and Kaszniak, 2015), particularly with respect to functional neuroimaging studies of meditation.

Said elsewhere in the article:

Our central hypothesis was a simple one: meditation practices distinct at the psychological level (Ψ) may be accompanied by dissociable activation patterns at the neurophysiological level (Φ). Such a model describes a ‘one-to-many’ isomorphism between mind and brain: a particular psychological state or process is expected to have many neurophysiological correlates from which, ideally, a consistent pattern can be discerned (Cacioppo and Tassinary, 1990).

The assumption is that meditating versus not-meditating brains should be characterized by distinct, observable neurophysiological patterns. There should also be distinct, enduring changes in the brains of people who have been practicing meditation for some time.

I would wager that many meditation enthusiasts believe that links to specific regions are already well established. Confronted with evidence to the contrary, they would suggest that links between the experience of meditating and changes in the brain are predictable and are waiting to be found. It is that kind of confidence that leads to the significance chasing and confirmatory bias currently infecting this literature.

Types of meditation available for study

Quantitative analyses focused on four types of meditation. Additional types of meditation did not have sufficient studies and so were examined qualitatively. Some studies of the four provided within-group effect sizes, whereas other studies provided between-group effect sizes.

Focused attention (7 studies)

Directing attention to one specific object (e.g., the breath or a mantra) while monitoring and disengaging from extraneous thoughts or stimuli (Harvey, 1990, Hanh, 1991, Kabat-Zinn, 2005, Lutz et al., 2008b, Wangyal and Turner, 2011).

Mantra recitation (8 studies)

Repetition of a sound, word, or sentence (spoken aloud or silently in one’s head) with the goals of calming the mind, maintaining focus, and avoiding mind-wandering.

Open monitoring (10 studies)

Bringing attention to the present moment and impartially observing all mental contents (thoughts, emotions, sensations, etc.) as they naturally arise and subside.

Loving-kindness/compassion (6 studies)

Loving-kindness involves:

Generating feelings of kindness, love, and joy toward themselves, then progressively extend these feelings to imagined loved ones, acquaintances, strangers, enemies, and eventually all living beings (Harvey, 1990, Kabat-Zinn, 2005, Lutz et al., 2008a).

Similar but not identical, compassion meditation

Takes this practice a step further: practitioners imagine the physical and/or psychological suffering of others (ranging from loved ones to all humanity) and cultivate compassionate attitudes and responses to this suffering.

In addition to these four types of meditation, three others can be identified, but so far have only limited studies of the brain: Visualization, Sense-withdrawal and Non-dual awareness practices.

A dog’s breakfast: A table of the included studies quickly reveals a meta-analysis in deep trouble

studies included

This is not a suitable collection of studies to enter into a meta-analysis with any expectation that a meaningful, generalizable effect size will be obtained.

Most studies (14) furnish only pre-post, within-group effects for meditation practiced by long-time practitioners. Of these 14 studies, there are two outliers with 20 and 31 practitioners. Otherwise the sample sizes range from 4 to 14.

There are 11 studies furnishing between-group comparisons of experienced and novice meditators. What matters for the power of a between-group comparison is the number of participants in the smaller cell, not the overall sample size. In these 11 studies, the smaller cell ranged from 10 to 22 participants.

It is well known that one should not combine within- and between-group effect sizes in a meta-analysis. Pre/post within-group differences capture not only the effects of the active ingredients of an intervention, but also nonspecific effects of the conditions under which data are gathered, including regression to the mean. These within-group differences will typically overestimate between-group differences. Adding a comparison group and calculating between-group differences has the potential to control for nonspecific effects, if the comparison condition is appropriate.

The effect sizes based on between-group differences in these studies have their own problems as estimates of the effects of meditation on the brain. Participants were not randomized to the groups, but were selected because they were already either experienced or novice meditators. Yet these two groups could differ on many variables that cannot be controlled: meditation could be confounded with other lifestyle variables, such as sleeping better or having a better diet. There might be pre-existing differences in the brain that made it easier for the experienced meditators to commit to long-term practice. The authors acknowledge these problems late in the article, but only after discussing the effect sizes they obtained as having substantive importance.

There is good reason to be skeptical that these poorly controlled between-group differences are directly comparable to whatever changes would occur in experienced meditators’ brains in the course of practicing meditation.

It has been widely appreciated that neuroimaging studies are typically grossly underpowered, and that the result is low reproducibility of findings. Having too few participants in a study will likely yield false negatives, because there is insufficient power to detect effects of plausible size: the smaller the sample, the stronger an association must be to reach statistical significance.

Yet whatever positive (i.e., statistically significant) findings are obtained will of necessity be large, likely exaggerated, and not reproducible with a larger sample.

Another problem with such small cell sizes is that it cannot be ruled out that effects are due to one or a few participants’ differences in brain size or anatomy. One outlier, or a small subgroup of outliers, could drive all the significant findings in an already small sample. The assumption that statistical techniques can smooth out these interindividual differences depends on having much larger samples.

It has been noted elsewhere:

Brains are different so the measure in corresponding voxels across subjects may not sample comparable information.

How did the samples get so small? Neuroanatomical studies are expensive, but why did Lazar et al. (2000) have 5 rather than 6 participants, and why did Davanger et al. have only 4? Were some participants dropped after a peek at the data? Were studies compromised by authors not being able to recruit the intended number of participants and having to relax entry criteria? What selection bias is there in these small samples? We just don’t know.

I am reminded of the contentious debate that occurred when psychoanalysts insisted on mixing uncontrolled case series with randomized trials in the same meta-analyses of psychotherapy. My colleagues and I showed that this introduces great distortion into the literature. Undoubtedly, the same is occurring in these studies of meditation, but there is so much else wrong with this meta-analysis.

The authors acknowledge that in calculating effect sizes, they combined studies measuring cerebral blood flow (positron emission tomography; PET) and blood oxygenation level (functional magnetic resonance imaging; fMRI). Furthermore, the meta-analyses combined studies that varied in the experimental tasks for which neuroanatomical data were obtained.

One problem is that even studies examining a similar form of meditation might be comparing a meditation practice to very different baseline or comparison tasks and conditions. However, collapsing across numerous different baselines or control conditions is a common (in fact, usually inevitable) practice in meta-analyses of functional neuroimaging studies…

So, there are other important sources of heterogeneity between these studies.

Generic forest plot
A generic forest plot. This article did not provide one.

It’s a pity that the authors did not provide a forest plot [How to read a forest plot] graphically showing the confidence intervals around the effect sizes entered into the meta-analysis.

But the authors did provide a funnel plot, and I found it shocking. [Recommendations for examining and interpreting funnel plots] I have never seen one like it, except when someone has constructed an artificial funnel plot to make a point.

funnel plot

Notice two things about this funnel plot. Rather than a smooth, unbroken distribution, studies with effect sizes between -.45 and +.45 are entirely missing. Studies with smaller sample sizes have the largest effect sizes, whereas the smallest effect sizes all come from the larger samples.

For me, this adds to the overwhelming evidence that something has gone wrong in this literature and that its effect sizes should be ignored. There must have been considerable suppression of null findings, so the large effects from the smaller studies will not generalize. Yet the authors find the differences between small and larger studies encouraging:

This suggests, encouragingly, that despite potential publication bias or inflationary bias due to neuroimaging analysis methods, nonetheless studies with larger samples tend to converge on similar and more reasonable (medium) effect sizes. Although such a conclusion is tentative, the results to date (Fig. 6) suggest that a sample size of approximately n = 25 is sufficient to reliably produce effect sizes that accord with those reported in studies with much larger samples (up to n = 46).

I and others have long argued that psychotherapy studies with samples this small should be left as pilot feasibility studies and not used to generate effect sizes. I think the same logic applies to this literature.

Distinctive patterns of regional activation and deactivation

The first part of the results section is devoted to studies examining particular forms of meditation. In weighing the apparent consistency of results, one needs to keep in mind the small number of studies being examined and the considerable differences among them. For instance, results presented for focused attention combine three between-group comparisons with four within-group studies, ranging from pre-post meditation differences in experienced Tibetan Buddhist practitioners to differences between novice and experienced practitioners of mindfulness-based stress reduction (MBSR). In almost all cases, statistically significant differences are found in both activation and deactivation, in regions that make sense in terms of the functions known to be associated with them. There is little noting of anomalous brain regions reaching significance, and the ratio of significant findings to number of participants is implausibly high. There is little discussion of anomalies.

Meta-analysis of focused attention studies resulted in 2 significant clusters of activation, both in prefrontal cortex (Table 3; Fig. 2). Activations were observed in regions associated with the voluntary regulation of thought and action, including the premotor cortex (BA 6; Fig. 2b) and dorsal anterior cingulate cortex (BA 24; Fig. 2a). Slightly sub-threshold clusters were also observed in the dorsolateral prefrontal cortex (BA 8/9; Fig. 2c) and left mid-insula (BA 13; Fig. 2e); we display these somewhat sub-threshold results here because of the obvious interest of these findings in practices that involve top-down focusing of attention, typically focused on respiration. We also observed clusters of deactivation in regions associated with episodic memory and conceptual processing, including the ventral posterior cingulate cortex (BA 31; Fig. 2d) and left inferior parietal lobule (BA 39; Fig. 2f).

How can such meaningful, practically significant findings be obtained when so many conditions militate against finding them? John Ioannidis once remarked that in hot areas of research, consistency of positive findings from small studies often reflects only the strength of the bias with which they are sought. The strength of findings will decrease when larger, more methodologically sophisticated studies become available, conducted by investigators less committed to obtaining confirmation.

The article concludes:

Many have understandably viewed the nascent neuroscience of meditation with skepticism (Andresen, 2000; Horgan, 2004), but recent years have seen an increasing number of high-quality, controlled studies that are suitable for inclusion in meta-analyses and that can advance our cumulative knowledge of the neural basis of various meditation practices (Tang et al., 2015). With nearly a hundred functional neuroimaging studies of meditation now reported, we can conclude with some confidence that different practices show relatively distinct patterns of brain activity, and that the magnitude of associated effects on brain function may have some practical significance. The only totally incontrovertible conclusion, however, is that much work remains to be done to confirm and build upon these initial findings.

“Increasing number of high-quality, controlled studies that are suitable for inclusion in meta-analyses”? “Conclude with some confidence”? “Relatively distinct patterns”? “Some practical significance”?

In all of this premature enthusiasm about findings relating the practice of meditation to activation of particular regions of the brain and deactivation of others, we should not lose track of some other issues.

Although the authors talk about mapping relationships between psychological states and regions of the brain, none of the studies is of sufficient size to document such relationships, given the expected size of the associations, based on what is typically found between psychological states and other biological variables.

Many differences between techniques could be artifactual, due to a technique altering breathing, involving verbalization, or requiring focused attention. Observed differences in the brain regions activated and deactivated might simply reflect these features without being related to psychological functioning.

Even if an association were found, it would be a long way from establishing that the association reflects a causal mechanism, rather than simply being correlational or even artifactual. Think of the analogy of observing a relationship between the amount of sweat during exercise and weight loss, and concluding that the weight loss came from sweating it out.

We still have not established that meditation has more psychological and physical health benefits than other active interventions with presumably different mechanisms. After lots of studies, we still don’t know whether mindfulness meditation is anything more than a placebo. While I was finishing up this blog post, I came across a new study:

The limited prosocial effects of meditation: A systematic review and meta-analysis. 

Although we found a moderate increase in prosociality following meditation, further analysis indicated that this effect was qualified by two factors: type of prosociality and methodological quality. Meditation interventions had an effect on compassion and empathy, but not on aggression, connectedness or prejudice. We further found that compassion levels only increased under two conditions: when the teacher in the meditation intervention was a co-author in the published study; and when the study employed a passive (waiting list) control group but not an active one. Contrary to popular beliefs that meditation will lead to prosocial changes, the results of this meta-analysis showed that the effects of meditation on prosociality were qualified by the type of prosociality and methodological quality of the study. We conclude by highlighting a number of biases and theoretical problems that need addressing to improve quality of research in this area. [Emphasis added].


When psychotherapy trials have multiple flaws…

mind the brain logo

Multiple flaws pose more threats to the validity of psychotherapy studies than would be inferred when the individual flaws are considered independently.

We can learn to spot features of psychotherapy trials that are likely to lead to exaggerated claims of efficacy for treatments, or to claims that will not generalize beyond the sample being studied in a particular clinical trial. We can look to the adequacy of sample size, and spot what the Cochrane Collaboration has defined as risk of bias in its handy assessment tool.

We can look at the case mix in the particular sites where patients were recruited. We can examine the adequacy of the diagnostic criteria used for entering patients into a trial. We can examine how blinded the trial was: who assigned patients to particular conditions, but also whether the patients, the treatment providers, and the outcome evaluators knew to which condition particular patients were assigned.

And so on. But what about combinations of these factors?

We typically do not pay enough attention to multiple flaws in the same trial. I include myself among the guilty. We may suspect that flaws are seldom simply additive in their effects, but we don’t consider whether there may even be synergy in their negative effects on the validity of a trial. As we will see in this analysis of a clinical trial, multiple flaws can pose more threats to the validity of a trial than we might infer when the individual flaws are considered independently.

The particular paper we are probing is described in its discussion section as the “largest RCT to date testing the efficacy of group CBT for patients with CFS.” It also takes on added importance because two of the authors, Gijs Bleijenberg and Hans Knoop, are considered leading experts in the Netherlands. The treatment protocol was developed over time by the Dutch Expert Centre for Chronic Fatigue (NKCV, http://www.nkcv.nl; Knoop and Bleijenberg, 2010). Moreover, these senior authors dismiss any criticism and even ridicule critics. This study is cited as support for their overall assessment of their own work. Gijs Bleijenberg claims:

Cognitive behavioural therapy is still an effective treatment, even the preferential treatment for chronic fatigue syndrome.

But

Not everybody endorses these conclusions, however their objections are mostly baseless.

Spoiler alert

This is a long-read blog post. I will offer a summary for those who don’t want to read all the way through but still want the gist of what I will be saying. However, as always, I encourage readers to be skeptical of what I say, to look at my evidence and arguments, and to decide for themselves.

The authors of this trial stacked the deck to demonstrate that their treatment is effective. They are striving to support the extraordinary claim that group cognitive behavior therapy fosters not only better adaptation, but actual recovery from what is internationally considered a physical condition.

There are some obvious features of the study that contribute to the likelihood of a positive effect, but these features need to be considered collectively, in combination, to appreciate the strength of this effort to guarantee positive results.

This study represents the perfect storm of design features that operate synergistically:

perfect storm

Referral bias – trial conducted in a single specialized treatment setting known for advocating that psychological factors maintain physical illness.

Strong self-selection bias, with a minority of patients enrolling in the trial to seek a treatment they otherwise could not get.

Broad, overinclusive diagnostic criteria for entry into the trial.

An active treatment condition carrying a strong message about how patients should respond to outcome assessment, namely with reports of improvement.

An unblinded trial with a waitlist control lacking the nonspecific elements (placebo) that confound the active treatment.

Subjective self-report outcomes.

Specifying a clinically significant improvement that required only that a primary outcome score be lower than what was needed for entry into the trial.

Deliberate exclusion of relevant objective outcomes.

Avoidance of any recording of negative effects.

Despite the prestige attached to this trial in Europe, the US Agency for Healthcare Research and Quality (AHRQ) excludes this trial from providing evidence for its database of treatments for chronic fatigue syndrome/myalgic encephalomyelitis. We will see why in this post.

The take-away message: although not many psychotherapy trials incorporate all of these factors, most trials have some. We should be more sensitive to when multiple factors occur in the same trial: bias in the site for patient recruitment, lack of blinding, lack of balance between active treatment and control condition in terms of nonspecific factors, and subjective self-report measures.

The article reporting the trial is

Wiborg JF, van Bussel J, van Dijk A, Bleijenberg G, Knoop H. Randomised controlled trial of cognitive behaviour therapy delivered in groups of patients with chronic fatigue syndrome. Psychotherapy and Psychosomatics. 2015;84(6):368-76.

Unfortunately, the article is currently behind a pay wall. Perhaps readers could contact the corresponding author Hans.knoop@radboudumc.nl  and request a PDF.

The abstract

Background: Meta-analyses have been inconclusive about the efficacy of cognitive behaviour therapies (CBTs) delivered in groups of patients with chronic fatigue syndrome (CFS) due to a lack of adequate studies. Methods: We conducted a pragmatic randomised controlled trial with 204 adult CFS patients from our routine clinical practice who were willing to receive group therapy. Patients were equally allocated to therapy groups of 8 patients and 2 therapists, 4 patients and 1 therapist or a waiting list control condition. Primary analysis was based on the intention-to-treat principle and compared the intervention group (n = 136) with the waiting list condition (n = 68). The study was open label. Results: Thirty-four (17%) patients were lost to follow-up during the course of the trial. Missing data were imputed using mean proportions of improvement based on the outcome scores of similar patients with a second assessment. Large and significant improvement in favour of the intervention group was found on fatigue severity (effect size = 1.1) and overall impairment (effect size = 0.9) at the second assessment. Physical functioning and psychological distress improved moderately (effect size = 0.5). Treatment effects remained significant in sensitivity and per-protocol analyses. Subgroup analysis revealed that the effects of the intervention also remained significant when both group sizes (i.e. 4 and 8 patients) were compared separately with the waiting list condition. Conclusions: CBT can be effectively delivered in groups of CFS patients. Group size does not seem to affect the general efficacy of the intervention which is of importance for settings in which large treatment groups are not feasible due to limited referral

The trial registration

http://www.isrctn.com/ISRCTN15823716

Who was enrolled into the trial?

Who gets into a psychotherapy trial is a function of the particular treatment setting of the study, the diagnostic criteria for entry, and patient preferences for getting their care through a trial, rather than what is being routinely provided in that setting.

We need to pay particular attention when patients enter psychotherapy trials hoping they will receive a treatment they prefer and not be assigned to the other condition. Patients may be in a clinical trial for the betterment of science, but in some settings they are willing to enroll because of the probability of getting treatment they otherwise could not get. This in turn affects their evaluation both of the condition in which they get the preferred treatment and of the condition in which they are denied it. Simply put, they register being pleased if they got what they wanted and displeased if they did not.

The setting is relevant to evaluating who was enrolled in a trial.

The authors’ own outpatient clinic at the Radboud University Medical Center was the site of the study. The group has an international reputation for promoting the biopsychosocial model, in which psychological factors are assumed to be the decisive factor in maintaining somatic complaints.

All patients were referred to our outpatient clinic for the management of chronic fatigue.

There is thus a clear referral bias, or case-mix bias, but we are not provided a ready basis for quantifying it or even estimating its effects.

The diagnostic criteria.

The article states:

In accordance with the US Center for Disease Control [9], CFS was defined as severe and unexplained fatigue which lasts for at least 6 months and which is accompanied by substantial impairment in functioning and 4 or more additional complaints such as pain or concentration problems.

Actually, the US Centers for Disease Control and Prevention would now reject this trial, because these entry criteria are considered obsolete, overinclusive, and not sufficiently exclusive of other conditions that might be associated with chronic fatigue.*

There is a real paradigm shift happening in America. Both the 2015 IOM report and the Centers for Disease Control and Prevention (CDC) website emphasize post-exertional malaise, getting more ill after any exertion, in M.E. CBT is no longer recommended by the CDC as treatment.

cdc criteria

The only mandatory symptom for inclusion in this study is fatigue lasting 6 months. Properly speaking, this trial targets chronic fatigue [period], and not the condition chronic fatigue syndrome.

Current US CDC recommendations (see box 7-1 from the IOM document, above) require post-exertional malaise for a diagnosis of myalgic encephalomyelitis (ME). See below.

pem

Patients meeting the current American criteria for ME would be eligible for enrollment in this trial, but it’s unclear what proportion of the patients enrolled actually met those criteria. Because of the overinclusiveness of the entry diagnostic criteria, it is doubtful whether the results would generalize to an American sample. A look at patient flow into the study will be informative.

Patient flow

Let’s look at what is said in the text, but also in the chart depicting patient flow into the trial for any self-selection that might be revealed.

In total, 485 adult patients were diagnosed with CFS during the inclusion period at our clinic (fig. 1). One hundred and fifty-seven patients were excluded from the trial because they declined treatment at our clinic, were already asked to participate in research incompatible with inclusion (e.g. research focusing on individual CBT for CFS) or had a clinical reason for exclusion (i.e. they received specifically tailored interventions because they were already unsuccessfully treated with individual CBT for CFS outside our clinic or were between 18 and 21 years of age and the family had to be involved in the therapy). Of the 328 patients who were asked to engage in group therapy, 99 (30%) patients indicated that they were unwilling to receive group therapy. In 25 patients, the reason for refusal was not recorded. Two hundred and four patients were randomly allocated to one of the three trial conditions. Baseline characteristics of the study sample are presented in table 1. In total, 34 (17%) patients were lost to follow-up. Of the remaining 170 patients, 1 patient had incomplete primary outcome data and 6 patients had incomplete secondary outcome data.

flow chart

We see that the investigators invited two thirds of the patients attending the clinic to enroll in the trial. Of these, 41% refused. We don’t know the reason for some of the refusals, but almost a third of the patients approached declined because they did not want group therapy. The authors were left able to randomize 42% of the patients coming to the clinic, or less than two thirds of the patients they actually asked. Of these patients, a little more than two thirds received the treatment to which they were randomized and were available for follow-up.

These patients, who received the treatment to which they were randomized and who were available for follow-up, are a self-selected minority of the patients coming to the clinic. This self-selection process likely reduced the proportion of patients with myalgic encephalomyelitis. It is estimated that 25% of patients meeting the American criteria are housebound and 75% are unable to work. It’s reasonable to infer that patients meeting the full criteria would opt out of a treatment that requires regular attendance at group sessions.

The trial is biased toward ambulatory patients with fatigue, not ME. Their fatigue is likely due to some combination of factors such as multiple comorbidities, as-yet-undiagnosed medical conditions, drug interactions, and the common mild and subsyndromal anxiety and depressive symptoms that characterize primary care populations.

The treatment being evaluated

Group cognitive behavior therapy for chronic fatigue syndrome, either delivered in a small (4 patients and 1 therapist) or larger (8 patients and 2 therapists) group format.

The intervention consisted of 14 group sessions of 2 h within a period of 6 months followed by a second assessment. Before the intervention started, patients were introduced to their group therapist in an individual session. The intervention was based on previous work of our research group [4,13] and included personal goal setting, fixing sleep-wake cycles, reducing the focus on bodily symptoms, a systematic challenge of fatigue-related beliefs, regulation and gradual increase in activities, and accomplishment of personal goals. A formal exercise programme was not part of the intervention.

Patients received a workbook with the content of the therapy. During sessions, patients were explicitly invited to give feedback about fatigue-related cognitions and behaviours to fellow patients. This aspect was introduced to facilitate a pro-active attitude and to avoid misperceptions of the sessions as support group meetings which have been shown to be insufficient for the treatment of CFS.

And note:

In contrast to our previous work [4], we communicated recovery in terms of fatigue and disabilities as general goal of the intervention.

Some impressions of the intensity of this treatment: this is a rather intensive treatment, with patients having considerable opportunities for interaction with providers. This factor alone distinguishes being assigned to the intervention group from being left in the waitlist control group and could prove powerful. It will be difficult to distinguish intensity of contact from any content or active ingredients of the therapy.

I’ll leave for another time a fuller discussion of the extent to which what was labeled as cognitive behavior therapy in this study is consistent with cognitive therapy as practiced by Aaron Beck and other leaders of the field. However, a few comments are warranted. What is offered in this trial does not sound like cognitive therapy as Americans practice it. It seems to emphasize challenging beliefs and pushing patients to get more active, along with psychoeducational activities. I don’t see indications of the supportive, collaborative relationship in which patients are encouraged to work on what they want to work on, engage in outside activities (homework assignments), and get feedback.

What is missing in this treatment is what Beck calls collaborative empiricism, “a systemic process of therapist and patient working together to establish common goals in treatment, has been found to be one of the primary change agents in cognitive-behavioral therapy (CBT).”

Importantly, in Beck’s approach, the therapist does not assume cognitive distortions on the part of the patient. Rather, in collaboration with the patient, the therapist introduces alternatives to the interpretations that the patient has been making and encourages the patient to consider the difference. In contrast, rather than eliciting goal statements from patients, the therapists in this study impose the goal of increased activity. Therapists in this study also seem ready to impose their view that the patients’ fatigue-related beliefs are maladaptive.

The treatment offered in this trial is complex, with multiple components making multiple assumptions that seem quite different from what is called cognitive therapy or cognitive behavioral therapy in the US.

The authors’ communication of recovery from fatigue and disability seems a radical departure not only from cognitive behavior therapy for anxiety and depression and pain, but for cognitive behavior therapy offered for adaptation to acute and chronic physical illnesses. We will return to this “communication” later.

The control group

Patients not randomized to group CBT were placed on a waiting list.

Think about it! What do patients think about having taken on all the inconvenience and burden of a clinical trial in the hope of getting treatment, and then being assigned to the control group to just wait? Not only are they going to be disappointed and register that in their subjective evaluations at the outcome assessments; patients may also worry about jeopardizing their right to the treatment they are waiting for if they endorse overly positive outcomes. There is a potential for a nocebo effect, compounding the placebo effect of assignment to the CBT active treatment groups.

What are informative comparisons between active treatments and control conditions?

We need to ask more often what inclusion of a control group accomplishes for the evaluation of a psychotherapy. In doing so, we need to keep in mind that psychotherapies do not have effect sizes; only comparisons of psychotherapies with control conditions have effect sizes.

A pre-post evaluation of psychotherapy from baseline to follow-up includes the effects of any active ingredient in the psychotherapy, a host of nonspecific (placebo) factors, and any changes that would have occurred in the absence of the intervention. These include regression to the mean: patients are more likely to enter a clinical trial now, rather than earlier or later, when there has been an exacerbation of their symptoms.

So, a proper comparison/control condition includes everything that the patients randomized to the intervention group get except for the active treatment. Ideally, the intervention and the comparison/control group are equivalent on all these factors, except the active ingredient of the intervention.

That is clearly not what is happening in this trial. Patients randomized to the intervention group get the intervention, the added intensity and frequency of contact with professionals that the intervention provides, and all the support that goes with it; and the positive expectations that come with getting a therapy that they wanted.

Attempts to evaluate group CBT against the waitlist control group thus confound the active ingredients of the CBT with all of these nonspecific effects. The deck is clearly being stacked in favor of CBT.

This may be a randomized trial, but properly speaking, this is not a randomized controlled trial, because the comparison group does not control for nonspecific factors, which are imbalanced.

The unblinded nature of the trial

In RCTs of psychotropic drugs, the ideal is to compare the psychotropic drug to an inert pill placebo, with providers, patients, and evaluators blinded as to whether a patient received the psychotropic drug or the comparison pill.

While it is difficult to achieve a comparable level of blindness in a psychotherapy trial, more of an effort to achieve blindness was desirable. For instance, in this trial the authors took pains to distinguish the CBT from what would have happened in a support group. A much more adequate comparison would therefore be CBT versus either a professional- or peer-led support group with equivalent amounts of contact time. Further blinding would be possible if patients were told only that two forms of group therapy were being compared. If that had been the information available to patients contemplating consenting to the trial, it would not have been so obvious from the outset which assignment was preferable.

Subjective self-report outcomes.

The primary outcomes for the trial were the fatigue subscale of the Checklist Individual Strength; the physical functioning subscale of the Short Form Health Survey (SF-36); and overall impairment as measured by the Sickness Impact Profile (SIP).

Realistically, self-report outcomes are often all that is available in many psychotherapy trials. Commonly these are self-report assessments of anxiety and depressive symptoms, although these may be supplemented by interviewer-based assessments. We don’t have objective biomarkers with which to evaluate psychotherapy.

These three self-report measures are relatively nonspecific, particularly in a population that is not characterized by ME. Self-reported fatigue in a primary care population lacks discriminative validity with respect to pain, anxiety and depressive symptoms, and general demoralization.  The measures are susceptible to receipt of support and re-moralization, as well as gratitude for obtaining a treatment that was sought.

Self-report entry criteria included a score of 35 or higher on the fatigue severity subscale. Yet a score of less than 35 on this same scale at follow-up is part of what was defined as a clinically significant improvement, within a composite score from combined self-report measures.

We know from medical trials that differences can be observed with subjective self-report measures that will not be found with objective measures. Thus, mildly asthmatic patients will fail to distinguish in their subjective self-reports between the effective inhalant albuterol, an inert inhalant, and sham acupuncture, though they will rate any of these as better than getting no intervention. However, albuterol will show a strong advantage over the other three conditions on an objective measure, maximum forced expiratory volume in 1 second (FEV1), as assessed with spirometry.

The suppression of objective outcome measures

We cannot let the authors of this trial off the hook for their dependence on subjective self-report outcomes. They instructed patients that recovery was the goal, which implies that it is an attainable goal. We can reasonably be skeptical about a claim of recovery based on changes in self-report measures. Were the patients actually able to exercise? What was their exercise capacity, as objectively measured? Did they return to work?

These authors have included such objective measurements in past studies, but not included them as primary outcomes, nor, even in some cases, reported them in the main paper reporting the trial.

Wiborg JF, Knoop H, Stulemeijer M, Prins JB, Bleijenberg G. How does cognitive behaviour therapy reduce fatigue in patients with chronic fatigue syndrome? The role of physical activity. Psychol Med. 2010 Jan 5:1

The senior authors’ review fails to mention their three studies using actigraphy that did not find effects for CBT. I am unaware of any studies that did find enduring effects.

Perhaps this is what they mean when they say the protocol has been developed over time – they removed what they found to be threats to the findings that they wanted to claim.

Dismissing of any need to consider negative effects of treatment

Most psychotherapy trials fail to assess any adverse effects of treatment, but this is usually done discreetly, without mention. In contrast, this article states:

Potential harms of the intervention were not assessed. Previous research has shown that cognitive behavioural interventions for CFS are safe and unlikely to produce detrimental effects.

Patients who meet stringent criteria for ME would be put at risk by pressure to exert themselves. By definition, they are vulnerable to post-exertional malaise (PEM). Any trial of this nature needs to assess that risk. Maybe no adverse effects would be found. If that were so, it would strongly suggest the absence of patients with appropriate diagnoses.

Timing of assessment of outcomes varied between intervention and control group.

I at first did not believe what I was reading when I encountered this statement in the results section.

The mean time between baseline and second assessment was 6.2 months (SD = 0.9) in the control condition and 12.0 months (SD = 2.4) in the intervention group. This difference in assessment duration was significant (p < 0.001) and was mainly due to the fact that the start of the therapy groups had to be frequently postponed because of an irregular patient flow and limited treatment capacities for group therapy at our clinic. In accordance with the treatment manual, the second assessment was postponed until the fourteenth group session was accomplished. The mean time between the last group session and the second assessment was 3.3 weeks (SD = 3.5).

So, outcomes were assessed for the intervention group shortly after completion of therapy, when nonspecific (placebo) effects would be stronger, but a mean of six months later than for patients assigned to the control condition.

Post-hoc statistical controls are not sufficient to rescue the study from this important group difference, and it compounds other problems in the study.

Take away lessons

Pay more attention to how the limitations of a clinical trial may compound each other, leading the trial to provide exaggerated estimates of the effects of treatment or estimates that will not generalize to other settings.

Be careful of loose diagnostic criteria, because a trial may not generalize to the same criteria applied in settings that differ in patient population or in the availability of other treatments. This is particularly important when a treatment setting has a bias in referrals and only a minority of the patients invited to participate in the trial actually agree and are enrolled.

Ask questions about just what information is obtained in comparing the active treatment group in a study to its control/comparison group. For a start, just what is being controlled, and how might that affect estimates of the effectiveness of the active treatment?

Pay particular attention to the potent combination of an unblinded trial, a weak comparison/control group, and an active treatment that is not otherwise available to patients.

Note

*The means of determining whether the six months of fatigue might be accounted for by other medical factors was specific to the setting. Note that a review of medical records was deemed sufficient for an unknown proportion of patients, with no further examination or medical tests.

The Department of Internal Medicine at the Radboud University Medical Center assessed the medical examination status of all patients and decided whether patients had been sufficiently examined by a medical doctor to rule out relevant medical explanations for the complaints. If patients had not been sufficiently examined, they were seen for standard medical tests at the Department of Internal Medicine prior to referral to our outpatient clinic. In accordance with recommendations by the Centers for Disease Control, sufficient medical examination included evaluation of somatic parameters that may provide evidence for a plausible somatic explanation for prolonged fatigue (for a list, see [9]). When abnormalities were detected in these tests, additional tests were made based on the judgement of the clinician of the Department of Internal Medicine who ultimately decided about the appropriateness of referral to our clinic. Trained therapists at our clinic ruled out psychiatric comorbidity as potential explanation for the complaints in unstructured clinical interviews.

workup

Is Donald Trump suffering from Pick’s Disease (frontotemporal dementia)?

Changing the conversation about Donald Trump’s fitness for office from whether he has a personality disorder to whether he has an organic brain disorder.

mind the brain logo

For a long while there has been an ongoing debate about whether Donald Trump suffers from a personality disorder that might contribute to his being unfit to be President of the United States. Psychiatrists face ethical constraints on what they say because of the so-called Goldwater rule, barring them from commenting on the mental health of political figures whom they have not personally interviewed.

I am a clinical psychologist, not a psychiatrist. I feel the need to speak out that the behavior of Donald Trump is abnormal and that we should caution against normalizing it. The problem with settling on his behavior being simply that of a bad person or con man is that it doesn’t prepare us for just how erratic his behavior can be.

I’ll refrain from making a formal psychiatric diagnosis. I actually think that in clinical practice, a lot of mental health professionals too casually make diagnoses of personality disorders for patients (or privately, even for colleagues) they find difficult or annoying. If they ever gave these people a structured interview, I suspect they would be found to fall below the threshold for any particular personality disorder.

Changing the conversation

But now an article in Stat has changed the conversation from whether Donald Trump suffers from a personality disorder to whether he is developing an organic brain disorder.

I’m a brain specialist. I think Trump should be tested for a degenerative brain disease

When President Trump slurred his words during a news conference this week, some Trump watchers speculated that he was having a stroke. I watched the clip and, as a physician who specializes in brain function and disability, I don’t think a stroke was behind the slurred words. But having evaluated the chief executive’s remarkable behavior through my clinical lens for almost a year, I do believe he is displaying signs that could indicate a degenerative brain disorder.

As the president’s demeanor and unusual decisions raise the potential for military conflict in two regions of the world, the questions surrounding his mental competence have become urgent and demand investigation.

And

I see worrisome symptoms that fall into three main categories: problems with language and executive function; problems with social cognition and behavior; and problems with memory, attention, and concentration. None of these are symptoms of being a bad or mean person. Nor do they require spelunking into the depths of his psyche to understand. Instead, they raise concern for a neurocognitive disease process in the same sense that wheezing raises the alarm for asthma.

In addition to being a medical journalist, the author of the article, Ford Vox, is a board-certified physical medicine and rehabilitation physician with additional subspecialty board certification in brain injury medicine.

I was alerted to the possibility of a diagnosis of frontotemporal dementia by a tweet from Barney Carroll. He is a senior psychiatrist whom I have come to trust as a mentor on social media, even though we have never overlapped in the same department at the same time.

barney forget psychoanalysis

And then there was this tweet about the Stat story, though I could not judge its credibility because I did not know the tweeter or her source:

trump's disease

I followed up with a Google search and came across an article from August 2016, before the election:

Finally figured out Trump’s medical diagnosis after watching this:

It’s called Pick’s Disease, or frontotemporal dementia

Look at the symptoms, all of these which fit Trump quite closely:

  • Impulsivity and poor judgment
  • Extreme restlessness (early stages)
  • Overeating or drinking to excess
  • Sexual exhibitionism or promiscuity
  • Decline in function at work and home
  • Repetitive or obsessive behavior

And especially these, listed earlier in the article:

Excess protein build-up causes the frontal and temporal lobes of the brain, which control speech and personality, to slowly atrophy. 

Then I followed up with more Google searches, hitting MedlinePlus, the National Institutes of Health’s website for patients and their families and friends, produced by the National Library of Medicine.

Pick disease

Pick disease is a rare form of dementia that is similar to Alzheimer disease, except that it tends to affect only certain areas of the brain.

Causes

People with Pick disease have abnormal substances (called Pick bodies and Pick cells) inside nerve cells in the damaged areas of the brain.

Pick bodies and Pick cells contain an abnormal form of a protein called tau. This protein is found in all nerve cells. But some people with Pick disease have an abnormal amount or type of this protein.

The exact cause of the abnormal form of the protein is unknown. Many different abnormal genes have been found that can cause Pick disease. Some cases of Pick disease are passed down through families.

Pick disease is rare. It can occur in people as young as 20. But it usually begins between ages 40 and 60. The average age at which it begins is 54.

Symptoms

The disease gets worse slowly. Tissues in parts of the brain shrink over time. Symptoms such as behavior changes, speech difficulty, and problems thinking occur slowly and get worse.

Early personality changes can help doctors tell Pick disease apart from Alzheimer disease. (Memory loss is often the main, and earliest, symptom of Alzheimer disease.)

People with Pick disease tend to behave the wrong way in different social settings. The changes in behavior continue to get worse and are often one of the most disturbing symptoms of the disease. Some persons have more difficulty with decision making, complex tasks, or language (trouble finding or understanding words or writing).

The website notes

A brain biopsy is the only test that can confirm the diagnosis.

However, some alternative diagnoses can be ruled out:

Your doctor might order tests to help rule out other causes of dementia, including dementia due to metabolic causes. Pick disease is diagnosed based on symptoms and results of tests, including:

Assessment of the mind and behavior (neuropsychological assessment)

Brain MRI

Electroencephalogram (EEG)

Examination of the brain and nervous system (neurological exam)

Examination of the fluid around the central nervous system (cerebrospinal fluid) after a lumbar puncture

Head CT scan

Tests of sensation, thinking and reasoning (cognitive function), and motor function

Back to Ford Vox in his Stat article:

In Trump’s case, we have no relevant testing to review. His personal physician issued a thoroughly unsatisfying letter before the election that didn’t contain much in the way of hard data. That’s a situation many people want to correct via an independent medical panel that can objectively evaluate the president’s fitness to serve. But the prospects for getting Congress to use the 25th Amendment in this way seem poor at the moment.

What we do have are a growing array of signs and symptoms displayed in public for all to see. It’s time to discuss these issues in a clinical context, even if this is a very atypical form of examination. It’s all we have. And even if the president has a physical exam early next year and releases the records, as announced by the White House, what he really needs is thorough cognitive testing.

So?

Before biting the bullet, I also spoke with Dr. Dennis Agliano, who chairs the AMA’s Council on Ethical and Judicial Affairs, the panel that wrote the new ethical guidance. He advised me to be careful: “You can get yourself into hot water, since there are people who like Trump, and they may submit a complaint to the AMA,” the Tampa otolaryngologist told me. Ultimately, he reassured me that I should just do what I think is right.

Which is to warn the president that he needs to be evaluated for a brain disease.

Good luck, Dr Vox, but at least we have a reasonable hypothesis on the table. As Barney Carroll says “Time will tell.”

slurred speech

Using F1000 “peer review” to promote politics over evidence about delivering psychosocial care to cancer patients

The F1000 platform allowed authors and the reviewers whom they nominated to collaborate in crafting more of the special interest advocacy that they have widely disseminated elsewhere. Nothing original in this article, and certainly not best evidence!

mind the brain logo

A newly posted article on the F1000 website raises questions about what the website claims is a “peer-reviewed” open research platform.

Infomercial? The F1000 platform allowed authors and the reviewers whom they nominated to collaborate in crafting more of their special interest advocacy that they have widely disseminated elsewhere. Nothing original in this article and certainly not best evidence!

I challenge the authors and the reviewers they picked to identify something said in the F1000 article that they have not said numerous times before either alone or in papers co-authored by some combination of authors and the reviewers they picked for this paper.

F1000 makes the attractive but misleading claim that the versions of articles posted on its website reflect responses to reviewers.

Readers should beware of uncritically accepting articles on the F1000 website as having been peer-reviewed in any conventional sense of the term.

Will other special interests groups exploit this opportunity to brand their claims as “peer-reviewed” without the risk of having to tone down their claims in peer review? Is this already happening?

In the case of this article, reviewers were all chosen by the authors and have a history of co-authoring papers with the authors of the target paper in active advocacy of a shared political perspective, one that is contrary to available evidence.

Cynically, future authors might be motivated to divide their team, with some remaining authors and others dropping off to be nominated as reviewers. The reviewers could then suggest content that it had already been agreed would be included, but that was left out precisely so it could be introduced during the review process.

F1000

F1000Research bills itself as

An Open Research publishing platform for life scientists, offering immediate publication of articles and other research outputs without editorial bias. All articles benefit from transparent refereeing and the inclusion of all source data.

Material posted on this website is labeled as having received rapid peer-review:

Articles are published rapidly as soon as they are accepted, after passing an in-house quality check. Peer review by invited experts, suggested by the authors, takes place openly after publication.

A recent Google Scholar alert called my attention to an article posted on F1000:

Advancing psychosocial care in cancer patients [version 1; referees: 3 approved]

Who were the reviewers?

open peer review of Advancing psychosocial care

Google the names of the authors and reviewers. You will discover a pattern of co-authorship; leadership positions in the International Psycho-Oncology Society, a group promoting the mandating of specialty mental health services for cancer patients; and lots of jointly and separately authored articles making a pitch for increased involvement of mental health professionals in routine cancer care. This article adds almost nothing to what is already available elsewhere in highly redundant publications.

Given a choice of reviewers, these authors would be unlikely to nominate me. Nonetheless, here is my review of the article.

As I might do in a review of a manuscript, I’m not providing citations for these comments, but support can readily be found by searching the blog posts at my website, CoyneoftheRealm.com, and by a Google Scholar search of my publications. I welcome queries from anybody seeking documentation of the points below.

Fighting Spirit

The notion that a fighting spirit improves cancer patients’ survival is popular in the lay press and in promotion of the power of the mind over cancer, but it has been thoroughly discredited.

Early on, the article identifies fighting spirit as an adaptive coping style. In actuality, fighting spirit was initially thought to predict mortality on the basis of a small, methodologically flawed study. But that is no longer claimed.

Even one of the authors of the original study, Maggie Watson, expressed relief when her own larger, better designed study failed to confirm the impression that a fighting spirit extended life after a diagnosis of cancer. Why? Dr. Watson was concerned that the concept was being abused to blame dying cancer patients for a personal deficiency: not having enough fighting spirit.

Fighting spirit is rather useless as a measure of psychological adaptation. It confounds the severity of cancer and related dysfunction with efforts to cope with cancer.

Distress as the sixth vital sign for cancer patients

distress thermometer

Beware of a marketing slogan posing as an empirical statement. Its emptiness is similar to that of “Pepsi is the one.” Can you imagine anyone conducting a serious study in which they conclude “Pepsi is not the one”?

Once again in this article, a vacuous marketing slogan is presented in impressive but pseudo-medical terms. Distress cannot be a vital sign in the conventional sense. The vital signs are objective measurements that do not depend on patient self-report: body temperature, pulse rate, and respiration rate (rate of breathing). (Blood pressure is not considered a vital sign, but is often measured along with the vital signs.)

Pain was declared a fifth vital sign, with physicians mandated by guidelines to provide routine self-report screening of patients, regardless of their reason for visit. Pain being the fifth vital sign seems to have been the inspiration for declaring distress the sixth vital sign for cancer patients. However, policy makers’ declaring pain the fifth vital sign did not result in improved patient levels of pain. Their subsequently making intervention mandatory for any report of pain led to a rise in unnecessary back and knee surgery, with a substantial rise in associated morbidity and loss of function. The next shift, to prescription of opioids that were claimed not to be addictive, was the beginning of the current epidemic of addiction to prescription opioids. Making pain the fifth vital sign killed a lot of patients and turned others into addicts craving drugs on the street because they had lost their prescriptions for the opioids that addicted them.

pain as 5th vital sign

Cancer as a mental health issue

There is a lack of evidence that cancer carries a greater risk of psychiatric disorder than other chronic and catastrophic illnesses. However, the myth that there is something unique or unusual about cancer’s threat to mental health is commonly invoked by mental health professional advocacy groups to justify increased resources for specialized services.

The article provides an inflated estimate of psychiatric morbidity by counting adjustment disorders as psychiatric disorders. Essentially, a cancer patient who seeks mental health intervention for distress qualifies for the diagnosis by virtue of help-seeking being defined as impairment.

The conceptual and empirical muddle of “distress” in cancer patients

The article repeats the standard sloganeering definition of distress that the authors and reviewers have circulated elsewhere.

It has been very broadly defined as “a multifactorial, unpleasant, emotional experience of a psychological (cognitive, behavioural, emotional), social and/or spiritual nature that may interfere with the ability to cope effectively with cancer, its physical symptoms and its treatment and that extends along a continuum, ranging from common normal feelings of vulnerability, sadness and fears to problems that can become disabling, such as depression, anxiety, panic, social isolation and existential and spiritual crisis.”5

[You might try googling this. I’m sure you’ll discover an amazing number of repetitions in similar articles advocating increasing psychosocial services for cancer patients organized around this broad definition.]

Distress is so broadly defined and all-encompassing that there can be no meaningful independent validation of distress measures except by other measures of distress, rather than by conventional measures of adaptation or mental health. I have discussed this in a recent blog post.

If we restrict “distress” to the more conventional meaning of stress or negative affect, we find that the elevation in distress (seen in 35% or so of patients) associated with a diagnosis of cancer tends to follow a natural trajectory of decline without formal intervention. For most cancer patients, elevations in distress resolve within 3 to 6 months without intervention. The residual 9 to 11% of cancer patients with elevated distress is likely attributable to pre-existing psychiatric disorder.

Routine screening for distress

The slogan “distress is the sixth vital sign” is used to justify mandatory routine screening of cancer patients for distress. In the United States, surgeons cannot close the electronic medical record for one patient and go on to the next without recording whether they have screened the patient for distress and, if the patient reports distress, what intervention has been provided. Simply asking patients informally whether they are distressed, and responding to a “yes” by providing an antidepressant without further follow-up, allows surgeons to close the medical record.

As I have done before, I challenge advocates of routine screening of cancer patients for distress to produce evidence that simply introducing routine screening, without additional resources, leads to better patient outcomes.

Routine screening for distress as uncovering unmet needs among cancer patients

Studies in the Netherlands suggest that there is not a significant increase in the need for services from mental health or allied health professionals associated with a diagnosis of cancer. There is some disruption of services that patients were already receiving before diagnosis. It does not take screening and discussion to suggest to those patients that they resume services at some point if they wish. There is also some increased need for physical therapy and nutritional counseling.

If patients are simply asked whether they want a discussion of the available services (in Dutch, “Zou u met een deskundige willen praten over uw problemen?”; in English, “Would you like to talk with a professional about your problems?”), many patients will decline.

Much of the demand for supportive services like counseling and support groups, especially among breast cancer patients, is not from the most distressed patients. One of the problems with clinical trials of psychosocial interventions is that most of the patients who seek enrollment are not distressed, unless they are prescreened. This poses a dilemma: if we require elevated distress on a screening instrument, we end up rationing services and excluding many of the patients who would otherwise be receiving them.

I welcome clarification from F1000 of just what it offers over other preprint repositories. When one downloads a preprint from some other repositories, it is clearly labeled “not yet peer-reviewed.” F1000 carries the advantage of the “peer-reviewed” label, but that label does not seem to be hard earned.

Notes

Slides are from two recent talks at the Dutch International Congress on Insurance Medicine, Thursday, November 9, 2017, Almere, Netherlands:

Will primary care be automated screening and procedures or talking to patients and problem-solving? Invited presentation

and

Why you should not routinely screen your patients for depression and what you should do instead. Plenary Presentation