Stop using the Adverse Childhood Experiences Checklist to make claims about trauma causing physical and mental health problems

Scores on the adverse childhood experiences (ACE) checklist (or ACC) are widely used in making claims about the causal influence of childhood trauma on mental and physical health problems. Does anyone making these claims bother to look at how the checklist is put together and consider what a summary score might mean?


In this issue of Mind the Brain, we begin taking a skeptical look at the ACE checklist. We ponder some of the assumptions implicit in what items were included and how summary scores of the number of items checked are interpreted. Readers will be left with profound doubts that the ACE is suitable for making claims about trauma.

This blog post will eventually be followed by another that presents the case that scores on the ACC do not represent a risk factor for health problems, only a relatively uninformative risk marker. In contrast to potentially modifiable risk factors, risk markers are best interpreted as calling attention to the influence of some combination of other risk factors, many of them as yet unspecified, but undoubtedly of an entirely different nature than what is being studied. What?!! You will have to stay tuned, but I’ll give some hints about what I am talking about in the current blog post.

Summary of key points

 The ACE checklist is a collection of very diverse and ambiguous items that cannot be presumed to necessarily represent traumatic experiences.

Items variously

  • Represent circumstances that are not typically traumatic.
  • Reflect the respondent’s past or current psychopathology.
  • Treat as equivalent, and as traumatic, vastly different experiences, many of them neutral and some positive.
  • Represent a personal vulnerability due to familial transmission of psychopathology, either direct or indirect, rather than simply an exposure to events.
  • Ignore crucial contextual information, including timing of events.

There is reason not to assume that higher summed scores for the ACE represent more exposure to trauma than lower scores.

Are professionals who misinterpret the ACE checklist just careless, or are they ideologues selectively identifying “evidence” for positions which don’t depend on evidence at all?

Witness claims based on research with the ACE that migraines are caused by sexual abuse and that psychotherapy addressing that abuse should be first-line treatment. Or claims that childhood trauma is as strong a risk factor for psychosis and schizophrenia as smoking is for lung cancer [*] and so psychotherapy is equivalent to medication in its effects. Or claims that myalgic encephalomyelitis, formerly known as chronic fatigue syndrome, is caused by childhood trauma and that psychological treatments can be recommended as the treatment of choice. These claims share a speculative, vague, neo-cryptic pseudopsychoanalytic set of assumptions that is seldom articulated or explicitly confronted with evidence. Authors typically leap from claims about childhood trauma causing later problems to non sequitur claims about the efficacy of psychological intervention in treating these problems by addressing trauma. These claims about the efficacy of trauma-focused treatment are not borne out when effects observed in randomized controlled trials are actually examined.

Rather than attempting to address a provocative question about investigator motivation without a ready way of answering it, I will show that most claims about trauma causing mental and physical health problems are, at best, based on very weak evidence if they depend solely on the ACE checklist.

I will leave it for my readers to decide whether some authors who make such a fuss about the ACE have bothered to look at the instrument, or care that it is so inappropriate for the purposes to which they put it.

The ACE is reproduced at the bottom of this post and it is a good idea to compare what I’m saying about it to the actual checklist.

What “science” is behind such speculations?

The ACE was originally intended for educational purposes, not as a scientific instrument. Perhaps that explains its gross deficiencies as a key measure of psychological and epidemiological constructs.

The ACE checklist is a collection of very different and ambiguous items that cannot be presumed to represent traumatic experiences.

The ACE consists of ten dichotomous items for which the respondent is asked to indicate yes/no whether an experience occurred before the age of 18. However, for six of the 10 items, the respondent is given further choices that often differ greatly in the kind of experience to which the items refer. Scoring of the instrument does not take into account which of these experiences is the basis of a response. For example,

5. Did you often feel that … You didn’t have enough to eat, had to wear dirty clothes, and had no one to protect you? or

Your parents were too drunk or high to take care of you or take you to the doctor if you needed it?

Yes   No     If yes enter 1     ________

This item treats some very different circumstances as equivalent. The first half is complex, but largely covers the experience of living in poverty, combined with “having no one to protect you.” In contrast, the second half refers to substance abuse on the part of parents. In neither case is there any room for considering what mitigating circumstances in the respondent’s life might have influenced effects of exposure. Presumably, the timing of this exposure would be important. If the exposure only occurred at the end of the 18-year period covered by the checklist, effects could be mitigated by other individual and social resources the respondent had.

These single items are added together in a summary score. We have to ask whether there is an equivalency between the two halves of an item that will be treated as the same. This will be an accumulating concern as we go through the 10-item questionnaire.

The items vary greatly in the likelihood that they refer to an experience that was traumatic. Seldom do any of the researchers who use the ACE explain what they mean by trauma. If they did, I doubt that they could make a good argument that endorsing many of these items would indicate that a respondent had faced a trauma.

From the third edition of the American Psychiatric Association Diagnostic and Statistical Manual (DSM-III) onward to DSM-5, the assumption has been that a traumatic event is a catastrophic stressor outside the range of usual human experience.

With that criterion in mind, we have to ask whether items are likely to represent a traumatic experience for most people. In answering this question, we also have to ask how willing we are to consider a particular item equivalent to other items in arriving at an overall score reflecting exposure to trauma before age 18. Yet, if summary scores are to be meaningful, the assumption has to be made that items contribute equally if they are endorsed.

6. Were your parents ever separated or divorced?

Yes   No     If yes enter 1     ________

The item refers to a highly prevalent and complex event, the nature and consequences of which are likely to unfold over time. Importantly, we need a sense of context to judge whether the event is traumatic and, if so, how severe. Presumably, it would matter greatly when, across the 18-year span, the event occurred. No timing or other information is asked of the respondent, only whether or not this event occurred. Neither the respondent nor anyone interpreting a score on the inventory has further information as to what is meant.

Other problems with ambiguous items.

Questions can be raised about the validity of all the individual items and the wisdom of combining them as equivalent in creating a summary score.

Items 1 and 2: These items raise questions about what role the respondent played in eliciting the event.

 Did an event simply befall a respondent? Was it related to some pre-existing characteristic of the respondent? Or did the respondent have an active role in generating the event?

Did a parent or other adult member of the household often…

Swear at you, insult you, put you down, or humiliate you?

or

Act in a way that made you afraid that you might be physically hurt?

Yes   No     If yes enter 1     ________

And

Did a parent or other adult in the household often …

Push, grab, slap, or throw something at you?

or

Ever hit you so hard that you had marks or were injured?

Yes   No     If yes enter 1     ________

Here, as throughout the rest of the checklist, questions can be raised about whether these items refer simply to an environmental exposure in epidemiological terms, say, equivalent to asbestos or tobacco. We don’t know the frequency, intensity, or context of the behavior in question, all of which may be crucial in evaluating whether a trauma occurred. For instance, it matters greatly if the behavior happened frequently when the respondent was a toddler or was limited to a struggle that occurred when the respondent was a teen high on drugs attempting to take the car keys and go for an after-midnight drive.

As with most of the rest of the questionnaire, there is also the question of timing.

Item 3: There is so much ambiguity in endorsements of (ostensible) sexual abuse. Maybe it was a positive, liberating experience.

This is a crucial item, and discussions of the ACE often assume that its endorsement represents a traumatic experience:

Did an adult or person at least 5 years older than you ever…

Touch or fondle you or have you touch their body in a sexual way?

or

Try to or actually have oral, anal, or vaginal sex with you?

Note that this is a complex item for which endorsement could be on the basis of a single instance of a person at least 5 years older touching or fondling the respondent. What if the presumed “perpetrator” is the 20-year-old boyfriend or girlfriend of a 14-year-old?

Are we willing to treat as equivalent “touch or fondle you” and “have anal sex” in all instances?

Arguably, the event construed as trauma could actually be quite positive, as in the respondent forming a secure attachment with a somewhat older, but nonetheless appropriate partner. All that is unconventional is not traumatic. What if the respondent and alleged “perpetrator” were in a deeply intimate relationship or already married?

The research that attempts to link endorsement of such an item to lasting mental and physical health problems is remarkably contradictory and inconsistent.

Item 4: Does this item reflect the respondent’s serious clinical depression or other mental disorder before age 18 or currently, when the checklist is being completed?

Did you often feel that …  No one in your family loved you or thought you were important or special?    or

Your family didn’t look out for each other, feel close to each other, or support each other?

Yes   No     If yes enter 1     ________

As elsewhere in the checklist, there is no place for the respondent or someone interpreting a “yes” response to take into account timing or contextual factors that might mitigate or compound effects of this “exposure.”

Item 5: Is this a traumatic exposure or an enduring set of circumstances conferring multiple known risks to mental and physical health?

Did you often feel that …

You didn’t have enough to eat, had to wear dirty clothes, and had no one to protect you?

or

Your parents were too drunk or high to take care of you or take you to the doctor if you needed it?

Yes   No     If yes enter 1     ________

This item has already been discussed above, but it is worth revisiting in terms of raising the issue of whether particular items refer either directly or indirectly to enduring sets of circumstances that pose their own enduring threat. The relevant question is whether items which ostensibly represent “traumatic events” and risk for subsequent problems are not risk factors, but only risk indicators, and not particularly informative ones.

Item 7: Could an ostensibly traumatic exposure actually involve no exposure at all?

Was your mother or stepmother:

Often pushed, grabbed, slapped, or had something thrown at her?    or

Sometimes or often kicked, bitten, hit with a fist, or hit with something hard?    or

Ever repeatedly hit over at least a few minutes or threatened with a gun or knife?

Yes   No     If yes enter 1     ________

Like item 3, which refers to ostensible sexual abuse, this item seems to be one of the least ambiguous in terms of representing exposure to risk. But does it? We don’t know the timing, duration, or context. For instance, the mother might no longer be in the home and the respondent might not have known what happened at the time. There is even the possibility that the respondent was the “perpetrator” of such violence against the mother.

Items 8 and 9: Traumatic exposures or indications of familial transmission of psychopathology?

Did you live with anyone who was a problem drinker or alcoholic or who used street drugs?

Yes   No

If yes enter 1     ________

And

Was a household member depressed or mentally ill or did a household member attempt suicide?    Yes   No     If yes enter 1     ________

These items are highly ambiguous. They don’t take into consideration whether the person was a biological relative, a parent, a sibling, or someone not biologically related. They don’t take into account timing. There may not even have been any direct exposure to the substance misuse or the attempted suicide; the respondent may only later have learned of something that was closeted.

Item 10: traumatic exposure or relief from exposure?

Did a household member go to prison?

Yes   No

If yes enter 1     ________

The implications of endorsement of this item depend greatly on who the household member was and the circumstances of their going to prison.

There may be a familial relationship with this person, but it could have been an abusive stepparent or stepsibling, with the incarceration representing lasting relief from an oppressive situation. Or the person who became incarcerated was not an immediate family member, but someone more transient, maybe someone who was just renting a room or given a place to stay. We just don’t know.

Does adding up all these endorsements in a summary score clarify or confuse further?

Now add up your “Yes” answers:   _______   This is your ACE Score

 It would be useful to briefly review the assumptions involved in summing across items of a checklist and entering the summary score as a continuous variable in statistical analyses.

Classical test theory recognizes that the individual items may imperfectly reflect the underlying construct, in this case, traumatic exposure. However, in constructing a sum, the expectation is that the imperfections or errors of measurement in particular items cancel each other out. The summed score becomes a purer representation of the underlying construct than any of the original items. Thus, the summary score will be more reliable and valid than any of the individual items would be.

There are a number of problems in applying this assumption to a summary ACE score. The items are quite heterogeneous, i.e., they vary wildly in whether they are likely to represent a traumatic exposure and, if so, in the severity of that exposure. More importantly, there is a huge amount of variation in what these brief items would represent for particular individuals in the contexts in which they found themselves during the first 18 years of their lives. Undoubtedly, most endorsements of these items would represent false positives, if we hold ourselves to any strict definition of trauma. If we don’t do so, we risk equating merely normative experiences that may have neutral or even positive effects on the respondent with serious exposures to traumatic events with lasting consequences.

We are not in a position to know whether a score of five or even eight necessarily represents more traumatic exposure than a score of one.
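
To make this concern concrete, here is a minimal sketch in Python. The impact weights are invented purely for the sake of argument, nothing below is estimated from actual ACE data; the point is only that an equal-weight sum can rank a respondent with several low-impact endorsements as more trauma-exposed than one with a single severe exposure.

```python
# Hypothetical illustration only: these "true impact" weights are invented
# to dramatize item heterogeneity, not estimated from any ACE research.
true_impact = {
    "emotional_abuse": 0.6,
    "physical_abuse": 0.8,
    "sexual_abuse": 1.0,
    "felt_unloved": 0.3,
    "neglect_poverty": 0.5,
    "parents_divorced": 0.1,
    "mother_treated_violently": 0.9,
    "household_substance_abuse": 0.4,
    "household_mental_illness": 0.3,
    "household_member_in_prison": 0.2,
}

# Respondent A endorses four items argued above to be low impact;
# respondent B endorses a single severe item.
respondent_a = ["parents_divorced", "felt_unloved",
                "household_mental_illness", "household_member_in_prison"]
respondent_b = ["sexual_abuse"]

for name, items in [("A", respondent_a), ("B", respondent_b)]:
    ace_score = len(items)                          # what the checklist records
    impact = sum(true_impact[i] for i in items)     # what severity might look like
    print(f"Respondent {name}: ACE score = {ace_score}, weighted impact = {impact:.1f}")

# Respondent A: ACE score = 4, weighted impact = 0.9
# Respondent B: ACE score = 1, weighted impact = 1.0
# The summary score ranks A as four times more exposed; the weights disagree.
```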

Moreover, there is important empirical research on the clustering of events. We certainly cannot consider them random and unrelated. One classic study found:

In our data, total CCA was related to depressive symptoms, drug use, and antisocial behavior in a quadratic manner. Without further elucidation, this higher order relationship could have been interpreted as support for a sensitization process in which the long-term impact of each additional adversity on mental health compounds as childhood adversity accumulates. However, further analysis revealed that this acceleration effect was an artifact of the confounding of high cumulative adversity scores with the experience of more severe events. Thus, respondents with higher total CCA had disproportionately poorer emotional and behavioral functioning because of both the number and severity of the adversities they were exposed to, not the cumulative number of different types of adversities experienced.

And

Because low-impact adversities did not present a cumulative hazard to young adult mental health, they functioned as suppressor events in the total sum score, consistent with Turner and Wheaton’s (1997) expectation. Their inclusion increased the “noise” in the score and greatly watered down the influence of high-impact events. Thus, in addition to decreasing efficiency, total scores may seriously underestimate the cumulative effects of severe forms of childhood adversity, such as abuse and serious neglect.

But what if many or most of the high scores in a particular sample represent only a clustering of low- or no-impact adversities?
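
That question can be illustrated with a toy simulation of the suppressor effect the quoted study describes. Everything here is assumed for illustration: only two “high-impact” items bear any relationship to the outcome, and folding eight unrelated items into the sum dilutes the observable association.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Assumed setup: two rarer high-impact items actually affect the outcome;
# eight more common low-impact items are pure noise with respect to it.
high_impact = rng.binomial(1, 0.10, size=(n, 2))
low_impact = rng.binomial(1, 0.30, size=(n, 8))

outcome = high_impact.sum(axis=1) + rng.normal(0, 1, n)

severe_score = high_impact.sum(axis=1)
total_score = severe_score + low_impact.sum(axis=1)  # the ACE-style sum

print(np.corrcoef(severe_score, outcome)[0, 1])  # ~0.39
print(np.corrcoef(total_score, outcome)[0, 1])   # ~0.12, diluted by the noise items
```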

Another large-sample, key study cautioned:

Significant effects of parental separation/divorce in predicting subsequent mood disorders and addictive disorders are powerfully affected by whether or not there was parental violence and psychopathology in the household prior to the break-up and whether exposure to these adversities was reduced as a result of the separation (Kessler et al. 1997a). There are some situations – such as one in which the father was a violent alcoholic – where our data suggest that parental divorce and subsequent removal of the respondent from exposure to the father might actually be associated with a significant improvement in the respondent’s subsequent disorder risk profile, a possibility that has important social policy implications.

[The “Finding Your ACE Score” checklist is reproduced here as an image.]

NOTE

*Richard Bentall commonly interprets summed ACE scores in peer-reviewed articles as having a traditional dose-response association with mental health outcomes, and therefore as representing a modifiable causal factor in psychosis. In books and on social media, his claims become simply absurd.


I don’t think his interpretations withstand a scrutiny of the items and what a summed score might conceivably represent.

Preorders are being accepted for e-books providing skeptical looks at mindfulness and positive psychology, and arming citizen scientists with critical thinking skills.

I will also be offering scientific writing courses on the web as I have been doing face-to-face for almost a decade. I want to give researchers the tools to get into the journals where their work will get the attention it deserves.

Sign up at my website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.

Did a placebo affect allergic reactions to a pin prick or only in the authors’ minds?

Can placebo effects be harnessed to improve treatment outcomes? Stories of a placebo changing bodily function are important in promoting mind-body medicine, but mostly turn out to be false positives. Was this one an exception?


A lesson in critical appraisal: How to screen complicated studies in order to decide whether to put the time and energy into a closer look.

The study:

Howe LC, Goyer JP, Crum AJ. Harnessing the Placebo Effect: Exploring the Influence of Physician Characteristics on Placebo Response. Health Psychology Vol 36(11), Nov 2017, 1074-1082 http://dx.doi.org/10.1037/hea0000499

From the Abstract:

After inducing an allergic reaction in participants through a histamine skin prick test, a health care provider administered a cream with no active ingredients and set either positive expectations (cream will reduce reaction) or negative expectations (cream will increase reaction).

The provider demonstrated either high or low warmth, or either high or low competence.

Results: The impact of expectations on allergic response was enhanced when the provider acted both warmer and more competent and negated when the provider acted colder and less competent.

Conclusion: This study suggests that placebo effects should be construed not as a nuisance variable with mysterious impact but instead as a psychological phenomenon that can be understood and harnessed to improve treatment outcomes.

Why I dismissed this study

The small sample size was set in a power analysis based on the authors’ hopes of finding a moderate effect size, not on any existing results. With only 20 participants per cell, most significant findings are likely to be false positives.

The authors had a complicated design with multiple manipulations and time points. They examined two physiological measures, but only reported results for one of them in the paper, the one with stronger results.

The authors did not report a key overall test of whether there was a significant main or interaction effect. Without such a finding, jumping down to significant comparisons between groups is likely to yield false positives.

The authors did not adjust for multiple comparisons, despite making a huge number of them (a rough sketch following this list quantifies this and the power problem).

The authors did not report raw mean differences for comparisons, only adjusted differences at two time points controlling for gender, race, and measurements at the first two time points. No rationale is given.

The authors used language like “marginally significant” and “different, but not significantly so,” which might suggest they were chasing and selectively reporting significant findings.

The phenomenon under study was a mild allergic reaction in the short term: three time points spanning 9-15 minutes, with data for the 2 earlier time points not reported as outcomes. It is unclear by what mechanism an experimental manipulation could have an observable effect on such a mild reaction in such a short period of time.
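
Two of these complaints are easy to quantify. Here is a rough sketch using Python and statsmodels; the n = 20 per cell comes from the paper, while the count of 20 uncorrected comparisons is my assumption for illustration.

```python
from statsmodels.stats.power import TTestIndPower

# With 20 participants per group, what effect size can a simple two-group
# comparison detect with 80% power at alpha = .05?
d = TTestIndPower().solve_power(nobs1=20, alpha=0.05, power=0.80, ratio=1.0)
print(f"Minimum detectable effect: d = {d:.2f}")  # ~0.91, a very large effect

# Familywise error rate if k independent tests are each run at alpha = .05
# with no correction (k = 20 is an assumed, illustrative count).
k = 20
fwer = 1 - (1 - 0.05) ** k
print(f"P(at least one false positive) = {fwer:.2f}")  # ~0.64
```

In other words, a study this size can only reliably detect effects far larger than a “moderate” one, while dozens of uncorrected tests make at least one spurious “significant” finding more likely than not.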

Overview

Claims of placebo effects figure heavily in discussions of the power of the mind over the body. Yet this power is greatly exaggerated by laypersons and in the lay press and social media. Effects of a placebo manipulation on objective physiological measures, as opposed to subjective self-report measures, are uncommon and usually turn out to be false positives.

A New England Journal of Medicine review of 130 clinical trials found:

Little evidence in general that placebos had powerful clinical effects. Although placebos had no significant effects on objective or binary outcomes, they had possible small benefits in studies with continuous subjective outcomes and for the treatment of pain. Outside the setting of clinical trials, there is no justification for the use of placebos.

I often cite another great NEJM study showing the sharp contrast between positive results obtained with subjective self-report measures and negative results with objective physical functioning measures.

That is probably the case with a recent report of effects of expectancies and interpersonal relationship on a mild allergic reaction induced by a histamine skin prick test (SPT). The study involved manipulation of the perceived warmth and competence of a provider, as well as whether research participants were told that an inert cream being applied would have a positive or negative effect.

The authors invoke this study in claiming support for the idea that psychological variables do indeed influence a mild allergic reaction. Examining all of the numerous pairwise comparisons would be a long and tedious task. However, I decided from some details of the design and analysis of the study that I would not proceed.

Some notable features of the study.

The key manipulations of high versus low warmth and high versus low competence were in the behavior of a single unblinded experimenter.

The design is described as 2x2x2 with a cell size of n= 20 (19 in one cell).

It is more properly described as 2x2x2x(5) because of the 5 time points after the provider administered the skin prick:

(T1 = 3 min post-SPT; T2 = 6 min post-SPT, with the cream administered directly afterward; T3 = 9 min post-SPT and 3 min post-cream; T4 = 12 min post-SPT and 6 min post-cream; T5 = 15 min post-SPT and 9 min post-cream).

The small number of participants per cell was set in a power analysis based on the hope that a moderate effect size could be shown, not on past results.

The physiological reaction was measured in terms of size of a wheal (raised bump) and size of the flare (redness surrounding the bump).

Numerous other physiological measures were obtained, including blood pressure and pre-post session saliva samples. It is not stated what was done with these data, but they could have been used to evaluate further the manipulation of experimenter behavior.

No simple correlation between participants’ perceptions of warmth and competence is reported, which would have been helpful in interpreting the 2×2 crossing of warmth and competence.

In the supplementary materials, readers are told ratings of itchiness and mood were obtained after the skin prick. No effects of the experimental manipulation were observed, which would seem not to support the effectiveness of the intervention.

No overall ANOVA or test for significance of interactions is presented.

Instead, numerous paired comparisons are presented without correction for post hoc multiplicity.

Further comparisons were conducted with a sample that was constructed post hoc:

To better understand the mechanism by which expectations differed, within a setting of high warmth and high competence, we compared the wheal and flare size for the positive and negative expectations conditions to a follow-up sample who received neutral expectations. This resulted in a total sample of N=62.

Differences arising using this sample were discussed, despite significance levels being p = .095 and p = .155.

Raw mean scores are not presented nor discussed. Instead, all comparisons controlled for gender and race and size of the wheal at Times 1 and 2.

Only the size of the wheal is reported in the body of the paper, but the authors stated:

The results on the flare of the reaction were mostly similar (see the supplemental material available online).

Actually, the results reported in the supplemental material were considerably weaker, with claims of differences being marginally significant and favoring results that were only significant at particular time points.

So, what do you think? If you are interested, take a look at the study and let me know if I was premature to dismiss it.

Preorders are being accepted for e-books providing skeptical looks at mindfulness and positive psychology, and arming citizen scientists with critical thinking skills. Right now there is a special offer for free access to a Mindfulness Master Class. But hurry, it won’t last.

I will also be offering scientific writing courses on the web as I have been doing face-to-face for almost a decade. I want to give researchers the tools to get into the journals where their work will get the attention it deserves.

Sign up at my website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.

“It’s certainly not bareknuckle:” Comments to a journalist about a critique of mindfulness research

We can’t assume authors of mindfulness studies are striving to do the best possible science, including being prepared for the possibility of being proven incorrect by their results.


I recently had a Skype interview with science journalist Peter Hess concerning an article in Psychological Science.

Peter was exceptionally prepared, had a definite point of view, but was open to what I said. In the end, he seemed to be persuaded by me on a number of points. The resulting article in Inverse faithfully conveyed my perspective and juxtaposed quotes from me with those from an author of the Psych Science piece in a kind of debate.

My point of view

When evaluating an article about mindfulness in a peer-reviewed journal, we need to take into account that authors may not necessarily be striving to do the best science, but to maximally benefit their particular brand of mindfulness, their products, or the settings in which they operate. Many studies of mindfulness are little more than infomercials, weak research intended only to get mindfulness promoters’ advertisement of themselves into print or to allow the labeling of claims as “peer-reviewed”. Caveat Lector.

We cannot assume authors of mindfulness studies are striving to do the best possible science, including being prepared for the possibility of being proven incorrect by their results. Rather, they may simply be trying to get the strongest possible claims through peer review, ignoring best research practices and best publication practices.

Psychologists Express Growing Concern With Mindfulness Meditation

“It’s not bare-knuckle, that’s for sure.”

There was much from the author of the Psych Science article with which I would agree:

“In my opinion, there are far too many organizations, companies, and therapists moving forward with the implementation of ‘mindfulness-based’ treatments, apps, et cetera before the research can actually tell us whether it actually works, and what the risk-reward ratio is,” corresponding author and University of Melbourne research fellow Nicholas Van Dam, Ph.D. tells Inverse.

Bravo! And

“People are spending a lot of money and time learning to meditate, listening to guest speakers about corporate integration of mindfulness, and watching TED talks about how mindfulness is going to supercharge their brain and help them live longer. Best case scenario, some of the advertising is true. Worst case scenario: very little to none of the advertising is true and people may actually get hurt (e.g., experience serious adverse effects).”

But there were some statements that renewed the discomfort and disappointment I experienced when I read the original article in Psychological Science:

 “I think the biggest concern among my co-authors and I is that people will give up on mindfulness and/or meditation because they try it and it doesn’t work as promised,” says Van Dam.

“There may really be something to mindfulness, but it will be hard for us to find out if everyone gives up before we’ve even started to explore its best potential uses.”

So, how long before we “give up” on thousands of studies pouring out of an industry? In the meantime, should consumers act on what seem to be extravagant claims?

The Inverse article segued into some quotes from me after delivering another statement from the author with which I could agree:

The authors of the study make their attitudes clear when it comes to the current state of the mindfulness industry: “Misinformation and poor methodology associated with past studies of mindfulness may lead public consumers to be harmed, misled, and disappointed,” they write. And while this comes off as unequivocal, some think they don’t go far enough in calling out specific instances of quackery.

“It’s not bare-knuckle, that’s for sure. I’m sure it got watered down in the review process,” James Coyne, Ph.D., an outspoken psychologist who’s extensively criticized the mindfulness industry, tells Inverse.

Coyne agrees with the conceptual issues outlined in the paper, specifically the fact that many mindfulness therapies are based on science that doesn’t really prove their efficacy, as well as the fact that researchers with copyrights on mindfulness therapies have financial conflicts of interest that could influence their research. But he thinks the authors are too concerned with tone policing.

“I do appreciate that they acknowledged other views, but they kept out anybody who would have challenged their perspective,” he says.

Regarding Coyne’s criticism about calling out individuals, Van Dam says the authors avoided doing that so as not to alienate people and stifle dialogue.

“I honestly don’t think that my providing a list of ‘quacks’ would stop people from listening to them,” says Van Dam. “Moreover, I suspect my doing so would damage the possibility of having a real conversation with them and the people that have been charmed by them.” If you need any evidence of this, look at David “Avocado” Wolfe, whose notoriety as a quack seems to make him even more popular as a victim of “the establishment.” So yes, this paper may not go so far as some would like, but it is a first step toward drawing attention to the often flawed science underlying mindfulness therapies.

To whom is the dialogue directed about unwarranted claims from the mindfulness industry?

As one of the authors of an article claiming to be an authoritative review from a group of psychologists with diverse expertise, Van Dam says he is speaking to consumers. Why won’t he and his co-authors provide citations and name names so that readers can evaluate for themselves what they are being told? Is the risk of reputational damage and embarrassment to the psychologists so great as to cause Van Dam to protect them rather than protecting consumers from the exaggerated and even fraudulent claims of psychologists hawking their products branded as ‘peer-reviewed psychological and brain science’?

I use the term ‘quack’ sparingly outside of discussing unproven and unlikely-to-be-proven products supposed to promote physical health and well-being or to prevent or cure disease and distress.

I think Harvard psychologist Ellen Langer deserves the term “quack” for her selling of expensive trips to spas in Mexico to women with advanced cancer so that they can change their mind set to reverse the course of their disease. Strong evidence, please! Given that this self-proclaimed mother of mindfulness gets her claims promoted through the Association for Psychological Science website, I think it particularly appropriate for Van Dam and his coauthors to name her in their publication in an APS journal. Were they censored or only censoring themselves?

Let’s put aside psychologists who can be readily named as quacks. How about Van Dam and co-authors naming names of psychologists claiming to alter the brains and immune systems of cancer patients with mindfulness practices so that they improve their physical health and fight cancer, not just cope better with a life-altering disease?

I simply don’t buy Van Dam’s suggestion that to name names promotes quackery any more than I believe exposing anti-vaxxers promotes the anti-vaccine cause.

Is Van Dam only engaged in a polite discussion with fellow psychologists that needs to be strictly tone-policed to avoid offense or is he trying to reach, educate, and protect consumers as citizen scientists looking after their health and well-being? Maybe that is where we parted ways.

Science Media Centre concedes negative reaction from scientific community to coverage of Esther Crawley’s SMILE trial.

“It was the criticism from within the scientific community that we had not anticipated.”


Editorial from the Science Media Centre

SEPTEMBER 28, 2017

Inconvenient truths

http://www.sciencemediacentre.org/inconvenient-truths/


“It was the criticism from within the scientific community that we had not anticipated.”

“This time the SMC also came under fire from our friends in science…Quack buster extraordinaire David Colquhoun tweeted, ‘More reasons to be concerned about @SMC_London?’

Other friends wrote to us expressing concern about the unintended consequences of SMC briefings – with one saying that policy makers were furious at having to deal with the fallout from our climate briefing and others worried that the briefing on the CFS/ME trial would allow the only private company offering the treatment to profit by over-egging preliminary findings.

Those of us who are accustomed to the Science Media Centre UK (SMC)’s highly slanted coverage of select topics can detect a familiar defensive, yet self-congratulatory tone in an editorial put out by the SMC in reaction to its broad coverage of Esther Crawley’s SMILE trial of the quack treatment, Phil Parker’s Lightning Process. Once again, critics, both patients and professionals, of ineffectual treatments being offered for chronic fatigue syndrome/myalgic encephalomyelitis are lumped with climate change deniers. Ho-hum, this comparison is getting so clichéd.

Perhaps even better, the SMC editorial’s concessions of poor coverage of the SMILE trial drew sharp amplifications from commentators that SMC had botched the job.

Here are some comments below, with emphases added. But let’s not be lulled by SMC into assuming that these intelligent, highly articulate comments are necessarily from the professional community. I wouldn’t be surprised if hiding behind the pseudonyms are some of the excellent citizen scientists that the patient community has had to grow in the face of vilification and stigmatization led by SMC.

I actually think I recognize a spokesperson from the patient community writing under the pseudonym ‘Scary vocal critic.’

Scary vocal critic says:

September 29, 2017 at 5:59 am

The way that this blog glosses over important details in order to promote a simplistic narrative is just another illustration of why so many are concerned by Fiona Fox’s work, and the impact [of] the Science Media Centre.

Let’s look in a bit more detail at the SMILE trial, from Esther Crawley at Bristol University. This trial was intended to assess the efficacy of Phil Parker’s Lightning Process©. Phil Parker has a history of outlandish medical claims about his ability to heal others, selling training in “the use of divination medicine cards and tarot as a way of making predictions” and providing a biography which claimed: “Phil Parker is already known to many as an inspirational teacher, therapist, healer and author. His personal healing journey began when, whilst working with his patients as an osteopath. He discovered that their bodies would suddenly tell him important bits of information about them and their past, which to his surprise turned out to be factually correct! He further developed this ability to step into other people’s bodies over the years to assist them in their healing with amazing results. After working as a healer for 20 years, Phil Parker has developed a powerful and magical program to help you unlock your natural healing abilities. If you feel drawn to these courses then you are probably ready to join.” https://web.archive.org/web/20070615014926/http://www.healinghawk.com/prospectushealing.htm

While much of the teaching materials for the Lightning Process are not available for public scrutiny (LP being copyrighted and controlled by Phil Parker), it sells itself as being founded on neurolinguistic programming and osteopathy, which are themselves forms of quackery. Those who have been on the course have described a combination of strange rituals, intensive positive affirmations, and pseudoscientific neuro-babble; all adding up to promote the view that an individual’s ill-health can be controlled if only they are sufficiently committed to the Lightning Programme. Bristol University appears to have embraced the neurobabble, and in their press release about the SMILE results they describe LP thus: “It is a three-day training programme run by registered practitioners and designed to teach individuals a new set of techniques for improving life and health, through consciously switching on health promoting neurological pathways.”

https://www.bristol.ac.uk/news/2017/september/lightning-process.html

Unsurprisingly, many patients have complained about paying for LP and receiving manipulative quackery. This can have unpredictable consequences. This article reports a child attempting to kill themselves after going on the Lightning Process:  Before conducting a trial, the researchers involved had a responsibility to examine the course and training materials and remove all pseudo-science, yet this was not done. Instead, those patient groups raising concerns about the trial were smeared, and presented as being opposed to science.

The SMILE trial was always an unethical use of research funding, but if it had followed its original protocol, it would have been less likely to generate misleading results and headlines. The Skeptics Dictionary’s page on the Lightning Process features a contribution which explains that: “the Lightning Process RCT being carried out by Esther Crawley changed its primary outcome measure from school attendance to scores on a self-report questionnaire. Given that LP involves making claims to patients about their own ability to control symptoms in exactly the sort of way likely to lead to response bias, it seems very likely that this trial will now find LP to be ‘effective’. One of the problems with EBM is that it is often difficult to reliably measure the outcomes that are important to patients and account for the biases that occur in non-blinded trials, allowing for exaggerated claims of efficacy to be made to patients.”

The SMILE trial was a nonblinded, A vs A+B design, testing a ‘treatment’ which included positive affirmations, and then used subjective self-report questionnaires as a primary outcome. This is not a sensible way of conducting a trial, as anyone who has looked at how junk-science can be used to promote quackery will be aware.

You can see the original protocol for the SMILE trial here (although this protocol refers to merely a feasibility study, this is the same research, with the same ethical review code, the feasibility study having seemingly been converted to a full trial a year into the research):

The protocol stated that: “The primary outcome measure for the interventions will be school attendance/home tuition at 6 months.” It is worth noting that the new SMILE paper reported that there was no significant difference between groups for what was the trial’s primary outcome. There was a significant difference at 12 months, but by this point data on school attendance was missing for one third of the participants of the LP arm. The SMC failed to inform journalists of this outcome switching, instead presenting Prof Crawley as a critic converted by a rigorous examination of the evidence, despite her having told the ethics review board in 2010 that “she has worked before with the Bath [LP] practitioner who is good”. https://meagenda.wordpress.com/2011/01/06/letter-issued-by-nres-following-scrutiny-of-complaints-in-relation-to-smile-lighting-process-pilot-study/

Also, while the original protocol, and a later analysis plan, refer to verifying self-reported school attendance with school records, I could see no mention of this in the final paper, so it may be that even this more objective outcome measure has been rendered less useful and more prone to problems with response bias.

Back to Fiona Fox’s blog: “If you had only read the headlines for the CFS/ME story you may conclude that the treatment tested at Bristol might be worth a try if you are blighted by the illness, when in truth the author said repeatedly that the findings would first have to be replicated in a bigger trial.”

How terrible of sloppy headline writers to misrepresent research findings. This is from the abstract of Esther Crawley’s paper: “Conclusion The LP is effective and is probably cost-effective when provided in addition to SMC for mild/moderately affected adolescents with CFS/ME.” http://adc.bmj.com/content/early/2017/09/20/archdischild-2017-313375

Fox complains of “vocal critics of research” in the CFS and climate change fields. There has been a prolonged campaign from the SMC to smear those patients and academics who have been pointing out the problems with poor quality UK research into CFS, attempting to lump them with climate change deniers, anti-vaccinationists and animal rights extremists. The SMC used this campaign as an example of when they had “engineered the coverage” by “seizing the agenda”:

http://www.sciencemediacentre.org/wp-content/uploads/2013/03/Review-of-the-first-three-years-of-the-mental-health-research-function-at-the-Science-Media-Centre.pdf

Despite dramatic claims of a fearsome group of dangerous extremists (“It’s safer to insult the Prophet Mohammed than to contradict the armed wing of the ME brigade”), a Freedom of Information request helped us gain some valuable information about exactly what behaviour most concerned victimised researchers such as Esther Crawley:

“Minutes from a 2013 meeting held at the Science Media Centre, an organisation that played an important role in promoting misleading claims about the PACE trial to the UK media, show these CFS researchers deciding that “harassment is most damaging in the form of vexatious FOIs [Freedom of Information requests]”.[13,16, 27-31] The other two examples of harassment provided were “complaints” and “House of Lords debates”.[13] It is questionable whether such acts should be considered forms of harassment.

http://www.centreforwelfarereform.org/news/major-breaktn-pace-trial/00296.html

[A full copy of the minutes is included at the above address.]

Since then, a seriously ill patient managed to win a legal battle against researchers attempting to withhold key trial data, picking apart the prejudices that were promoted and leaving the Judge to state that “assessment of activist behaviour was, in our view, grossly exaggerated and the only actual evidence was that an individual at a seminar had heckled Professor Chalder.” http://www.informationtribunal.gov.uk/DBFiles/Decision/i1854/Queen%20Mary%20University%20of%20London%20EA-2015-0269%20(12-8-16).PDF

So why would there be an attempt to present requests for information, complaints, and mere debate as forms of harassment? Rather embarrassingly for Fiona and the SMC, it has since become clear. Following the release of (still only some of) the data from the £5 million PACE trial it is now increasingly recognised within the academic community that patients were right to be concerned about the quality of these researchers’ work, and the way in which people had been misled about the trial’s results. The New York Times reported on calls for the retraction of a key PACE paper (Robin Murray, the journal’s editor and a close friend of Simon Wessely’s, does not seem keen to discuss and debate the problems with this work): https://www.nytimes.com/2017/03/18/opinion/sunday/getting-it-wrong-on-chronic-fatigue-syndrome.html The Journal of Health Psychology has published a special issue devoted to the PACE trial debacle: http://journals.sagepub.com/doi/full/10.1177/1359105317722370 The CDC has dropped promotion of CBT and GET: https://www.statnews.com/2017/09/25/chronic-fatigue-syndrome-cdc/ And NICE has decided that a full review of its guidelines for CFS is necessary, citing concerns about research such as PACE as one of the key reasons for this: https://www.nice.org.uk/guidance/cg53/resources/surveillance-report-2017-chronic-fatigue-syndromemyalgic-encephalomyelitis-or-encephalopathy-diagnosis-and-management-2007-nice-guideline-cg53-4602203537/chapter/how-we-made-the-decision https://www.thetimes.co.uk/edition/news/mutiny-by-me-sufferers-forces-a-climbdown-on-exercise-treatment-npj0spq0w

The SMC’s response to this has not been impressive.

Fox writes: “Both briefings fitted the usual mould: top quality scientists explaining their work to smart science journalists and making technical and complex studies accessible to readers.”

I’d be interested to know how it was Fox decided that Crawley was a top quality scientist. Also, it is worrying that the culture of UK science journalism seems to assume that making technical and complex studies (like SMILE?!) accessible for readers is their highest goal. It is not a surprise that it is foreign journalists who have produced more careful and accurate coverage of the PACE trial scandal.

Unlike the SMC and some CFS researchers, I do not consider complaints or debate to be a form of harassment, and would be quite happy to respond to anyone who disagrees with the concerns I have laid out here. I have had to simplify things, but believe that I have not done so in a way which favours my case. It seems that there are few people willing to try to publicly defend the PACE trial anymore, and I have never seen anyone from the SMC attempt to respond to anything other than a straw-man representation of their critics. Let’s see what response these inconvenient truths receive.


Michael Emmans-Dean says:

October 2, 2017 at 8:22 am

The only point I would add to this excellent post is to ask why on earth the SMC decided to feature such a small, poorly-designed trial as SMILE. The most likely explanation is that it was intended as a smokescreen for an inconvenient truth. NICE’s retrieval of their CFS guideline from the long grass (the “static list”) is a far bigger story and it was announced in the same week that SMILE was published.


Fiona Roberts says:

September 29, 2017 at 9:03 am

Hear hear!

Power pose: I. Demonstrating that replication initiatives won’t salvage the trustworthiness of psychology

An ambitious multisite initiative showcases how inefficient and ineffective replication is in correcting bad science.


Bad publication practices keep good scientists unnecessarily busy, as in replicability projects. – Bjoern Brembs

An ambitious multisite initiative showcases how inefficient and ineffective replication is in correcting bad science. Psychologists need to reconsider pitfalls of an exclusive reliance on this strategy to improve lay persons’ trust in their field.

Despite the consistency of null findings across seven attempted replications of the original power pose study, editorial commentaries in Comprehensive Results in Social Psychology left some claims intact and called for further research.

Editorial commentaries on the seven null studies set the stage for continued marketing of self-help products, mainly to women, grounded in junk psychological pseudoscience.

Watch for repackaging and rebranding in next year’s new and improved model. Marketing campaigns will undoubtedly include direct quotes from the commentaries as endorsements.

We need to re-examine basic assumptions behind replication initiatives. Currently, these efforts suffer from prioritizing the reputations and egos of those misusing psychological science to market junk and quack claims over protecting the consumers whom these gurus target.

In the absence of a critical response from within the profession to these persons prominently identifying themselves as psychologists, it is inevitable that the void will be filled by those outside the field who have no investment in preserving the image of psychology research.

In the case of power posing, watchdog critics might be recruited from:

Consumer advocates concerned about just another effort to defraud consumers.

Science-based skeptics who see in the marketing of power posing familiar quackery, in the same category as hawkers using pseudoscience to promote homeopathy, acupuncture, and detox supplements.

Feminists who decry the message that women need to get some balls (testosterone) if they want to compete with men and overcome gender disparities in pay. Feminists should be further outraged by the marketing of junk science to vulnerable women with an ugly message of self-blame: It is so easy to meet and overcome social inequalities that they have only themselves to blame if they do not do so by power posing.

As reported in Comprehensive Results in Social Psychology, a coordinated effort to examine the replicability of results reported in Psychological Science concerning power posing left the phenomenon a candidate for future research.

I will be blogging more about that later, but for now let’s look at a commentary from three of the over 20 authors that reveals an inherent limitation of such ambitious initiatives in tackling the untrustworthiness of psychology.

Cesario J, Jonas KJ, Carney DR. CRSP special issue on power poses: what was the point and what did we learn? Comprehensive Results in Social Psychology. 2017.


Let’s start with the wrap up:

The very costly expense (in terms of time, money, and effort) required to chip away at published effects, needed to attain a “critical mass” of evidence given current publishing and statistical standards, is a highly inefficient use of resources in psychological science. Of course, science is to advance incrementally, but it should do so efficiently if possible. One cannot help but wonder whether the field would look different today had peer-reviewed preregistration been widely implemented a decade ago.

We should consider the first sentence with some recognition of just how much untrustworthy psychological science is out there. Must we mobilize similar resources in every instance, or can we develop some criteria to decide what is worthy of replication? As I have argued previously, there are excellent reasons for deciding that the original power pose study could not contribute a credible effect size to the literature. There is no there to replicate.

The authors assume preregistration of the power pose study would have solved problems. In clinical and health psychology, long-standing recommendations to preregister trials are acquiring new urgency. But the record is that motivated researchers routinely ignore requirements to preregister and depart from the primary outcomes and analytic plans to which they have committed themselves. Editors and journals let them get away with it.

What measures do the replicationados have to ensure the same things are not being said about bad psychological science a decade from now? Rather than urging uniform adoption and enforcement of preregistration, replicationados urged the gentle nudge of badges for studies which are preregistered.

Just prior to the last passage:

Moreover, it is obvious that the researchers contributing to this special issue framed their research as a productive and generative enterprise, not one designed to destroy or undermine past research. We are compelled to make this point given the tendency for researchers to react to failed replications by maligning the intentions or integrity of those researchers who fail to support past research, as though the desires of the researchers are fully responsible for the outcome of the research.

There are multiple reasons not to give the authors of the power pose paper such a break. There is abundant evidence of undeclared conflicts of interest in the huge financial rewards for publishing false and outrageous claims. Psychological Science allowed the abstract of the original paper to leave out any embarrassing details of the study design and results and to end with a marketing slogan:

That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.

 Then the Association for Psychological Science gave a boost to the marketing of this junk science with a Rising Star Award to two of the authors of this paper for having “already made great advancements in science.”

As seen in this special issue of Comprehensive Results in Social Psychology, the replicationados share responsibility with Psychological Science and APS for keeping this system of perverse incentives intact. At least they are guaranteeing plenty of junk science in the pipeline to replicate.

But in the next installment on power posing I will raise the question of whether early career researchers are hurting their prospects for advancement by getting involved in such efforts.

How many replicationados does it take to change a lightbulb? Who knows, but a multisite initiative can be combined with a Bayesian meta-analysis to give a tentative and unsatisfying answer.

Coyne JC. Replication initiatives will not salvage the trustworthiness of psychology. BMC Psychology. 2016 May 31;4(1):28.

The following can be interpreted as a declaration of financial interests or a sales pitch:

I will soon be offering e-books providing skeptical looks at positive psychology and mindfulness, as well as scientific writing courses on the web, as I have been doing face-to-face for almost a decade.

Sign up at my website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.

 

“ACT: The best thing [for pain] since sliced bread or the Emperor’s new clothes?”

Reflections on the debate with David Gillanders about Acceptance and Commitment Therapy at the British Pain Society, Glasgow, September 15, 2017



David Gillanders and I held our debate “ACT: the best thing since sliced bread or the Emperor’s new clothes?” at the British Pain Society meeting on September 15, 2017 in Glasgow. We will eventually make our slides and a digital recording of the debate available.

I enjoyed hanging out with David Gillanders. He is a great guy who talks the talk, but also walks the walk. He lives ACT as a life philosophy. He was an ACT trainer speaking before a sympathetic audience, many of whom had been trained by him.

Some reflections from a few days later.

I was surprised how much Acceptance and Commitment Therapy (along with #mindfulness) has taken over UK pain services. A pre-debate poll showed most of the audience came convinced that, indeed, ACT was the best thing since sliced bread.

I was confident that my skepticism was firmly rooted in the evidence. I don’t think there is debate about that. David Gillanders agreed that higher quality studies were needed.

But in the end, even if I did not convert many, I came away quite pleased with the debate.

Standards for evaluating the evidence for ACT for pain

 I recently wrote that ACT may have moved into a post-evidence phase, with its chief proponents switching from citing evidence to making claims about love, suffering, and the meaning of life. Seriously.

Steve Hayes prompted me on Twitter to take a closer look at the most recent evidence for ACT. As reported in an earlier blog, I took that close look. I was not impressed: proponents of ACT are not making much progress in developing evidence anywhere near as strong as their claims. We need a lot less ACT research that adds no quality evidence while being promoted enthusiastically as if it does. We need more sobriety from the promoters of ACT, particularly those in academia, like Steve Hayes and Kelly Wilson, who know something about how to evaluate evidence. They should not patronize workshop goers with fanciful claims.

David Gillanders talked a lot about the philosophy and values that are expressed in ACT, but he also made claims about its research base, echoing the claims made by Steve Hayes and other prominent ACT promoters.

Standards for evaluating research exist independent of any discussion of ACT

There are standards for interpreting clinical trials and integrating their results in meta-analyses that exist independent of the ACT literature. It is not a good idea to challenge these standards in the context of defending ACT against unfavorable evaluations, although that is exactly how Hayes and his colleagues often respond. I will get around to blogging about the most recent example of this.

Atkins PW, Ciarrochi J, Gaudiano BA, Bricker JB, Donald J, Rovner G, Smout M, Livheim F, Lundgren T, Hayes SC. Departing from the essential features of a high quality systematic review of psychotherapy: A response to Öst (2014) and recommendations for improvement. Behaviour Research and Therapy. 2017 May 29.

Within-group (pre-post) differences in outcome. David Gillanders echoed Hayes in using within-group effect sizes to describe the effectiveness of ACT. Results presented this way may look impressive, but they are exaggerated compared to results obtained between groups. I am not making that up. Changes within the group of patients who received ACT reflect the specific effects of ACT plus whatever nonspecific factors were operating. That is why we need an appropriate comparison-control group and between-group differences, which are always more modest than within-group effects.
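
To make the arithmetic concrete, here is a minimal simulation in Python. All numbers are invented for illustration and are not taken from any actual ACT trial; the point is only that bundling nonspecific improvement into a pre-post change makes the within-group effect size dwarf the between-group one.

```python
# Hypothetical simulation: within-group (pre-post) effect sizes look
# impressive even when the specific effect of treatment is small,
# because nonspecific improvement is bundled into the pre-post change.
import numpy as np

rng = np.random.default_rng(42)
n = 100
nonspecific_gain = 0.6  # improvement everyone shows (attention, regression to the mean)
specific_effect = 0.2   # the small extra gain attributable to the treatment itself

pre_tx = rng.normal(0, 1, n)
post_tx = pre_tx + rng.normal(nonspecific_gain + specific_effect, 1, n)
pre_ctrl = rng.normal(0, 1, n)
post_ctrl = pre_ctrl + rng.normal(nonspecific_gain, 1, n)

def cohens_d(a, b):
    """Standardized mean difference using a pooled standard deviation."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

print("within-group d (treatment, pre vs post): %.2f" % cohens_d(post_tx, pre_tx))
print("between-group d (change scores, treatment vs control): %.2f"
      % cohens_d(post_tx - pre_tx, post_ctrl - pre_ctrl))
```

With these made-up but plausible inputs, the within-group d comes out roughly three times the between-group d, which is exactly the pattern described above.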

Compared to what? Most randomized trials of ACT involve a wait list, no-treatment, or ill-described standard care (which often represents no treatment). Such comparisons are methodologically weak, especially when patients and providers know what is going on (an unblinded trial) and when outcomes are subjective self-report measures.

A clever study in the New England Journal of Medicine showed that with such subjective self-report measures, one cannot distinguish between a proven effective inhaled medication for asthma, an inert substance simply inhaled, and sham acupuncture. In contrast, objective measures of breathing clearly distinguished the medication from the comparison-control conditions.

So, it is not an exaggeration to say that most evaluations of ACT are conducted under circumstances in which even sham acupuncture or homeopathy would look effective.
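
A toy simulation can make the logic of that asthma study concrete. The numbers below are invented for illustration only, not taken from the NEJM trial; they just show how a subjective outcome with a strong placebo response cannot separate an active treatment from a sham, while an objective outcome can.

```python
# Invented illustration of the asthma-study logic: on a subjective
# self-report outcome, the ritual of being treated produces most of
# the apparent benefit, so active treatment and sham look alike;
# an objective physiological outcome separates them.
import numpy as np

rng = np.random.default_rng(1)
n = 60
placebo_response = 1.0  # SD units of subjective improvement from ritual alone
specific_effect = 0.8   # real physiological effect of the active treatment

subjective = {
    "active": rng.normal(placebo_response + 0.1, 1, n),  # barely above sham
    "sham": rng.normal(placebo_response, 1, n),
    "no treatment": rng.normal(0.0, 1, n),
}
objective = {
    "active": rng.normal(specific_effect, 1, n),
    "sham": rng.normal(0.0, 1, n),
    "no treatment": rng.normal(0.0, 1, n),
}

for label, outcomes in (("subjective", subjective), ("objective", objective)):
    means = {arm: round(float(x.mean()), 2) for arm, x in outcomes.items()}
    print(label, means)
```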

Not superior to other treatments. There are no trials comparing ACT to a credible active treatment in which ACT proves superior, either for pain or for other clinical problems. So, we are left saying ACT is better than doing nothing, at least in trials where any nonspecific effects are concentrated among the patients receiving ACT.

Rampant investigator bias. A lot of trials of ACT are conducted by researchers having an investment in showing that ACT is effective. That is a conflict of interest. Sometimes it is called investigator allegiance, or a promoter or originator bias.

Regardless, when drugs are being evaluated in a clinical trial, it is recognized that there will be a bias toward the drug favored by the manufacturer conducting the trial. It is increasingly recognized that meta-analyses conducted by promoters should also be viewed with extra skepticism, and that trials conducted by researchers with such conflicts of interest should be considered separately to see if they produced exaggerated estimates of effectiveness.
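
One standard way to take such conflicts of interest into account is to pool effect sizes separately for allegiant and independent trials in a meta-analysis. Here is a minimal sketch using ordinary inverse-variance weighting; the trial numbers are entirely hypothetical.

```python
# Hypothetical sketch: fixed-effect, inverse-variance pooling of trial
# effect sizes, split by whether the trial team had an allegiance to
# the treatment being tested. All numbers invented for illustration.
import numpy as np

# (effect size d, standard error, conducted by treatment promoters?)
trials = [
    (0.65, 0.20, True), (0.55, 0.25, True), (0.70, 0.22, True),
    (0.15, 0.18, False), (0.25, 0.21, False),
]

def pool(subset):
    d = np.array([t[0] for t in subset])
    se = np.array([t[1] for t in subset])
    w = 1.0 / se**2  # inverse-variance weights
    return (w * d).sum() / w.sum(), np.sqrt(1.0 / w.sum())

for label, flag in (("allegiant trials", True), ("independent trials", False)):
    est, se = pool([t for t in trials if t[2] == flag])
    print(f"{label}: pooled d = {est:.2f} (SE = {se:.2f})")
```

If the two pooled estimates diverge the way these invented numbers do, that is a red flag that allegiance, not the treatment, is driving part of the apparent effect.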

ACT desperately needs randomized trials conducted by researchers who don’t have a dog in the fight, who lack the motivation to torture findings to give positive results when they are simply not present. There is a strong confirmation bias in current ACT trials, with promoter/researchers embarrassing themselves in their maneuvers to show strong, positive effects when only weak or null findings are available. I have documented [1, 2] how this trend started with Steve Hayes dropping two patients from his study with Patricia Bach of the effects of brief ACT on rehospitalization of inpatients. One patient had died by suicide and another was in jail, and so they could not be rehospitalized and were dropped from the analyses. The deed could only be noticed by comparing the published paper with Patricia Bach’s dissertation. It turned an otherwise nonsignificant finding in a small trial into a significant one.

Trials that are too small to matter. A lot of ACT trials have too few patients to produce a reliable, generalizable effect size. Lots of us, in situations far removed from ACT trials, have shown justification for the rule of thumb that we should distrust effect sizes from trials having fewer than 35 patients per treatment or comparison cell. Even this standard is quite liberal. Even if a moderate effect would be significant in a larger trial, there is less than a 50% probability that it would be detected in a trial this small. To be significant with such a small sample size, differences between treatments have to be large, and they are then probably due either to chance or to something dodgy that the investigators did.
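
The arithmetic behind that rule of thumb is easy to verify with a standard power calculation. This sketch uses the statsmodels power routines for a two-sided, two-sample t-test and a moderate effect of d = 0.5; the sample sizes are just illustrative grid points.

```python
# Power of a two-arm trial to detect a moderate effect (d = 0.5)
# at various per-cell sample sizes, alpha = .05 two-sided.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_cell in (20, 35, 64, 100):
    power = analysis.power(effect_size=0.5, nobs1=n_per_cell,
                           alpha=0.05, alternative="two-sided")
    print(f"n = {n_per_cell:3d} per cell: power = {power:.2f}")
```

Around 35 patients per cell, power for a moderate effect hovers near 50%; at 20 per cell it falls to roughly a third. A “significant” result from such a trial is therefore more likely to be an inflated, lucky estimate than a reliable one.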

Many claims for the effectiveness of ACT for particular clinical problems come from trials too small to generate a reliable effect size. I invite readers to undertake the simple exercise of looking at the sample sizes in any study cited as support for the effectiveness of ACT. If you exclude such small studies, there is not much research left to talk about.

Too much flexibility in what researchers report in publications. Many trials of ACT involve researchers administering a whole battery of outcome measures and then emphasizing those that make ACT look best, while downplaying or never mentioning the rest. Similarly, many trials of ACT deemphasize whether the time × treatment interaction is significant, simply ignoring it when it is not and focusing on within-group differences instead. I know, we’re getting a bit technical here. But another way of saying this is that many trials of ACT give researchers too much latitude in choosing which variables to report and which statistics are used to evaluate them.
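
A quick simulation shows what this flexibility buys. Assume a completely null treatment, a battery of ten outcome measures, and a researcher free to headline whichever outcome “works” (the setup and numbers are hypothetical):

```python
# Simulation: chance of at least one "significant" outcome in a null
# trial when a battery of outcome measures is collected and the best
# one is emphasized. Outcomes here are independent; real batteries
# are correlated, which dampens but does not remove the inflation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_arm, n_outcomes, n_sims = 50, 10, 2000

hits = 0
for _ in range(n_sims):
    treatment = rng.normal(0, 1, (n_per_arm, n_outcomes))  # no true effect
    control = rng.normal(0, 1, (n_per_arm, n_outcomes))
    pvals = stats.ttest_ind(treatment, control).pvalue  # one test per outcome
    hits += (pvals < 0.05).any()

print(f"Rate of >=1 significant outcome: {hits / n_sims:.2f} (nominal: 0.05)")
```

With ten independent outcomes, the chance of at least one p < .05 under the null is about 40%, roughly eight times the advertised error rate.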

Exploiting similar flexibility, researchers showed that listening to the Beatles song “When I’m Sixty-Four” left undergraduates 18 months younger than listening to the song “Kalimba.” Of course, the researchers knew damn well that the Beatles song didn’t have this effect, but they were demonstrating what lots of investigators do to get significant results, what they call p-hacking.

Many randomized trials of ACT are conducted with the same researcher flexibility that would allow a demonstration that listening to a Beatles song drops the age of undergraduates by 18 months.

Many of the problems with ACT research could be avoided if researchers were required to publish ahead of time their primary outcome variables and their plans for analyzing them. Such preregistration is increasingly recognized as best research practice, including by NIMH. There is no excuse not to do it.

My takeaway message?

ACT gurus have been able to dodge the need to develop quality data to support their claims that their treatment is effective (and their sometime claim that it is more effective than other approaches). A number of them are university-based academics and have ample resources to develop better-quality evidence.

Workshop and weekend retreat attendees are convinced that ACT works on the strength of experiential learning and a lot of theoretical mumbo jumbo.

The ACT promoters also make a lot of dodgy claims that there is strong evidence that the specific ingredients of ACT, its techniques and values, account for its power. But some of the ACT gurus, Steve Hayes and Kelly Wilson at least, are academics and should limit their claims of being “evidence-based” to what is supported by strong, quality evidence. They don’t. I think they are being irresponsible in throwing in “evidence-based” with all the mumbo jumbo.

What should I do as an evidence-based skeptic wanting to improve the conversation about ACT?

Earlier in my career, I spent six years in live supervision with some world-renowned therapists behind the one-way mirror, including John Weakland, Paul Watzlawick, and Dick Fisch. I gave workshops worldwide on how to do brief strategic therapies with individuals, couples, and families. I chose not to continue because (1) I didn’t like the pressure for drama and exciting interventions when I interviewed patients in front of large groups; (2) even when there was a logic and appearance of effectiveness to what I did, I didn’t believe it could be manualized; and (3) my group didn’t have the resources to conduct proper outcome studies.

But I got it that workshop attendees like drama, exciting interventions, and emotional experiences. They go to trainings expecting to be entertained as much as informed. I don’t think I can change that.

Many therapists have not had the training to evaluate claims about research, even if they accept that being backed by research findings is important. They depend on presenters to tell them about research and tend to trust what they say. Even therapists who know something about research can lose their critical judgment when caught up in the emotionality of some training experiences. Experiential learning can be powerful, even when it is used to promote interventions that are not supported by evidence.

I can’t change the training of therapists or the culture of workshops and training experiences. But I can reach out to therapists who want to develop skills to evaluate research for themselves. I think some of the things I point out in this blog post are quite teachable as things to look for.

I hope I can connect with therapists who want to become citizen scientists: skeptical about what they hear, equipped to think for themselves, and able to find effective resources when they don’t know how to interpret claims.

This is certainly not all therapists, and it may be only a minority. But such opinion leaders can be champions for the others in facilitating intelligent discussions of research concerning the effectiveness of psychotherapies. And they can prepare their colleagues to appreciate that most change in psychotherapy is not as dramatic or immediate as what is seen in therapy workshops.
