“It’s certainly not bareknuckle:” Comments to a journalist about a critique of mindfulness research

We can’t assume authors of mindfulness studies are striving to do the best possible science, including being prepared for the possibility of being proven incorrect by their results.

mind the brain logo

I recently had a Skype interview with science journalist Peter Hess concerning an article in Psychological Science.

Peter was exceptionally prepared and had a definite point of view, but was open to what I said. In the end, he seemed persuaded by me on a number of points. The resulting article in Inverse faithfully conveyed my perspective and juxtaposed quotes from me with those from an author of the Psych Science piece in a kind of debate.

My point of view

When evaluating an article about mindfulness in a peer-reviewed journal, we need to take into account that authors may not necessarily be striving to do the best science, but to maximally benefit their particular brand of mindfulness, their products, or the settings in which they operate. Many studies of mindfulness are little more than infomercials, weak research intended only to get mindfulness promoters’ advertisement of themselves into print or to allow the labeling of claims as “peer-reviewed”. Caveat Lector.

We cannot assume authors of mindfulness studies are striving to do the best possible science, including being prepared for the possibility of being proven incorrect by their results. Rather, they may simply be trying to get the strongest possible claims through peer review, ignoring best research practices and best publication practices.

Psychologists Express Growing Concern With Mindfulness Meditation

“It’s not bare-knuckle, that’s for sure.”

There was much from the author of the Psych Science article with which I would agree:

“In my opinion, there are far too many organizations, companies, and therapists moving forward with the implementation of ‘mindfulness-based’ treatments, apps, et cetera before the research can actually tell us whether it actually works, and what the risk-reward ratio is,” corresponding author and University of Melbourne research fellow Nicholas Van Dam, Ph.D. tells Inverse.

Bravo! And

“People are spending a lot of money and time learning to meditate, listening to guest speakers about corporate integration of mindfulness, and watching TED talks about how mindfulness is going to supercharge their brain and help them live longer. Best case scenario, some of the advertising is true. Worst case scenario: very little to none of the advertising is true and people may actually get hurt (e.g., experience serious adverse effects).”

But there were some statements that renewed the discomfort and disappointment I experienced when I read the original article in Psychological Science:

 “I think the biggest concern among my co-authors and I is that people will give up on mindfulness and/or meditation because they try it and it doesn’t work as promised,” says Van Dam.

“There may really be something to mindfulness, but it will be hard for us to find out if everyone gives up before we’ve even started to explore its best potential uses.”

So, how long before we “give up” on thousands of studies pouring out of an industry? In the meantime, should consumers act on what seem to be extravagant claims?

The Inverse article segued into some quotes from me after delivering another statement from the author with which I could agree:

The authors of the study make their attitudes clear when it comes to the current state of the mindfulness industry: “Misinformation and poor methodology associated with past studies of mindfulness may lead public consumers to be harmed, misled, and disappointed,” they write. And while this comes off as unequivocal, some think they don’t go far enough in calling out specific instances of quackery.

“It’s not bare-knuckle, that’s for sure. I’m sure it got watered down in the review process,” James Coyne, Ph.D., an outspoken psychologist who’s extensively criticized the mindfulness industry, tells Inverse.

Coyne agrees with the conceptual issues outlined in the paper, specifically the fact that many mindfulness therapies are based on science that doesn’t really prove their efficacy, as well as the fact that researchers with copyrights on mindfulness therapies have financial conflicts of interest that could influence their research. But he thinks the authors are too concerned with tone policing.

“I do appreciate that they acknowledged other views, but they kept out anybody who would have challenged their perspective,” he says.

Regarding Coyne’s criticism about calling out individuals, Van Dam says the authors avoided doing that so as not to alienate people and stifle dialogue.

“I honestly don’t think that my providing a list of ‘quacks’ would stop people from listening to them,” says Van Dam. “Moreover, I suspect my doing so would damage the possibility of having a real conversation with them and the people that have been charmed by them.” If you need any evidence of this, look at David “Avocado” Wolfe, whose notoriety as a quack seems to make him even more popular as a victim of “the establishment.” So yes, this paper may not go so far as some would like, but it is a first step toward drawing attention to the often flawed science underlying mindfulness therapies.

To whom is the dialogue directed about unwarranted claims from the mindfulness industry?

As one of the authors of an article claiming to be an authoritative review from a group of psychologists with diverse expertise, Van Dam says he is speaking to consumers. Why won’t he and his co-authors provide citations and name names so that readers can evaluate for themselves what they are being told? Is the risk of reputational damage and embarrassment to the psychologists so great as to cause Van Dam to protect them rather than protecting consumers from the exaggerated and even fraudulent claims of psychologists hawking their products branded as ‘peer-reviewed psychological and brain science’?

I use the term ‘quack’ sparingly outside of discussing unproven and unlikely-to-be-proven products supposed to promote physical health and well-being or to prevent or cure disease and distress.

I think Harvard psychologist Ellen Langer deserves the term “quack” for her selling of expensive trips to spas in Mexico to women with advanced cancer so that they can change their mind set to reverse the course of their disease. Strong evidence, please! Given that this self-proclaimed mother of mindfulness gets her claims promoted through the Association for Psychological Science website, I think it particularly appropriate for Van Dam and his coauthors to name her in their publication in an APS journal. Were they censored or only censoring themselves?

Let’s put aside psychologists who can be readily named as quacks. How about Van Dam and co-authors naming names of psychologists claiming to alter the brains and immune systems of cancer patients with mindfulness practices so that they improve their physical health and fight cancer, not just cope better with a life-altering disease?

I simply don’t buy Van Dam’s suggestion that to name names promotes quackery any more than I believe exposing anti-vaxxers promotes the anti-vaccine cause.

Is Van Dam only engaged in a polite discussion with fellow psychologists that needs to be strictly tone-policed to avoid offense or is he trying to reach, educate, and protect consumers as citizen scientists looking after their health and well-being? Maybe that is where we parted ways.

Power pose: I. Demonstrating that replication initiatives won’t salvage the trustworthiness of psychology

An ambitious multisite initiative showcases how inefficient and ineffective replication is in correcting bad science.

 


Bad publication practices keep good scientists unnecessarily busy, as in replicability projects. - Bjoern Brembs

An ambitious multisite initiative showcases how inefficient and ineffective replication is in correcting bad science. Psychologists need to reconsider the pitfalls of an exclusive reliance on this strategy to improve lay persons’ trust in their field.

Despite the consistency of null findings across seven attempted replications of the original power pose study, editorial commentaries in Comprehensive Results in Social Psychology left some claims intact and called for further research.

Editorial commentaries on the seven null studies set the stage for continued marketing of self-help products, mainly to women, grounded in junk psychological pseudoscience.

Watch for repackaging and rebranding in next year’s new and improved model. Marketing campaigns will undoubtedly include direct quotes from the commentaries as endorsements.

We need to re-examine basic assumptions behind replication initiatives. Currently, these efforts suffer from prioritizing the reputations and egos of those misusing psychological science to market junk and quack claims over protecting the consumers whom these gurus target.

In the absence of a critical response from within the profession to these persons prominently identifying themselves as psychologists, it is inevitable that the void will be filled by those outside the field who have no investment in preserving the image of psychology research.

In the case of power posing, watchdog critics might be recruited from:

Consumer advocates concerned about just another effort to defraud consumers.

Science-based skeptics who see in the marketing of power posing familiar quackery, in the same category as hawkers using pseudoscience to promote homeopathy, acupuncture, and detox supplements.

Feminists who decry the message that women need to get some balls (testosterone) if they want to compete with men and overcome gender disparities in pay. Feminists should be further outraged by the marketing of junk science to vulnerable women with an ugly message of self-blame: It is so easy to meet and overcome social inequalities that they have only themselves to blame if they do not do so by power posing.

As reported in Comprehensive Results in Social Psychology,  a coordinated effort to examine the replicability of results reported in Psychological Science concerning power posing left the phenomenon a candidate for future research.

I will be blogging more about that later, but for now let’s look at a commentary from three of the over 20 authors that reveals an inherent limitation of such ambitious initiatives in tackling the untrustworthiness of psychology.

Cesario J, Jonas KJ, Carney DR. CRSP special issue on power poses: what was the point and what did we learn? Comprehensive Results in Social Psychology. 2017.

 

Let’s start with the wrap up:

The very costly expense (in terms of time, money, and effort) required to chip away at published effects, needed to attain a “critical mass” of evidence given current publishing and statistical standards, is a highly inefficient use of resources in psychological science. Of course, science is to advance incrementally, but it should do so efficiently if possible. One cannot help but wonder whether the field would look different today had peer-reviewed preregistration been widely implemented a decade ago.

We should consider the first sentence with some recognition of just how much untrustworthy psychological science is out there. Must we mobilize similar resources in every instance, or can we develop some criteria to decide what is worthy of replication? As I have argued previously, there are excellent reasons for deciding that the original power pose study could not contribute a credible effect size to the literature. There is no there to replicate.

The authors assume preregistration of the power pose study would have solved problems. In clinical and health psychology, long-standing recommendations to preregister trials are acquiring new urgency. But the record is that motivated researchers routinely ignore requirements to preregister and deviate from the primary outcomes and analytic plans to which they have committed themselves. Editors and journals let them get away with it.

What measures do the replicationados have to ensure the same things are not being said about bad psychological science a decade from now? Rather than urging uniform adoption and enforcement of preregistration, the replicationados urge the gentle nudge of badges for studies that are preregistered.

Just prior to the last passage:

Moreover, it is obvious that the researchers contributing to this special issue framed their research as a productive and generative enterprise, not one designed to destroy or undermine past research. We are compelled to make this point given the tendency for researchers to react to failed replications by maligning the intentions or integrity of those researchers who fail to support past research, as though the desires of the researchers are fully responsible for the outcome of the research.

There are multiple reasons not to give the authors of the power pose paper such a break. There is abundant evidence of undeclared conflicts of interest in the huge financial rewards for publishing false and outrageous claims. Psychological Science allowed the abstract of the original paper to leave out any embarrassing details of the study design and results and to end with a marketing slogan:

That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.

 Then the Association for Psychological Science gave a boost to the marketing of this junk science with a Rising Star Award to two of the authors of this paper for having “already made great advancements in science.”

As seen in this special issue of Comprehensive Results in Social Psychology, the replicationados share responsibility with Psychological Science and APS for keeping this system of perverse incentives intact. At least they are guaranteeing plenty of junk science in the pipeline to replicate.

But in the next installment on power posing I will raise the question of whether early career researchers are hurting their prospects for advancement by getting involved in such efforts.

How many replicationados does it take to change a lightbulb? Who knows, but a multisite initiative can be combined with a Bayesian meta-analysis to give a tentative and unsatisfying answer.

Coyne JC. Replication initiatives will not salvage the trustworthiness of psychology. BMC Psychology. 2016 May 31;4(1):28.

The following can be interpreted as a declaration of financial interests or a sales pitch:

I will soon be offering e-books providing skeptical looks at positive psychology and mindfulness, as well as scientific writing courses on the web as I have been doing face-to-face for almost a decade.

Sign up at my website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. Lots to see at CoyneoftheRealm.com.

 

“ACT: The best thing [for pain] since sliced bread or the Emperor’s new clothes?”

Reflections on the debate with David Gillanders about Acceptance and Commitment Therapy at the British Pain Society, Glasgow, September 15, 2017


David Gillanders and I held our debate “ACT: best thing since sliced bread or the Emperor’s new clothes?” at the British Pain Society meeting on Thursday, September 15, 2017 in Glasgow. We will eventually make our slides and a digital recording of the debate available.

I enjoyed hanging out with David Gillanders. He is a great guy who talks the talk, but also walks the walk. He lives ACT as a life philosophy. He was an ACT trainer speaking before a sympathetic audience, many of whom had been trained by him.

Some reflections from a few days later.

I was surprised how much Acceptance and Commitment Therapy (along with #mindfulness) has taken over UK pain services. A pre-debate poll showed most of the audience came convinced that indeed, ACT was the best thing since sliced bread.

I was confident that my skepticism was firmly rooted in the evidence. I don’t think there is debate about that. David Gillanders agreed that higher quality studies were needed.

But in the end, even if I did not convert many, I came away quite pleased with the debate.

Standards for evaluating the evidence for ACT for pain

 I recently wrote that ACT may have moved into a post-evidence phase, with its chief proponents switching from citing evidence to making claims about love, suffering, and the meaning of life. Seriously.

Steve Hayes prompted me on Twitter to take a closer look at the most recent evidence for ACT. As reported in an earlier blog, I took a close look. I was not impressed that proponents of ACT are making much progress in developing evidence anywhere near as strong as their claims. We need a lot less ACT research that adds no quality evidence despite being promoted enthusiastically as if it does. We need more sobriety from the promoters of ACT, particularly those in academia, like Steve Hayes and Kelly Wilson, who know something about how to evaluate evidence. They should not patronize workshop goers with fanciful claims.

David Gillanders talked a lot about the philosophy and values that are expressed in ACT, but he also made claims about its research base, echoing the claims made by Steve Hayes and other prominent ACT promoters.

Standards for evaluating research exist independent of any discussion of ACT

There are standards for interpreting clinical trials and integration of their results in meta analysis that exist independent of the ACT literature. It is not a good idea to challenge these standards in the context of defending ACT against unfavorable evaluations, although that is exactly how Hayes and his colleagues often respond. I will get around to blogging about the most recent example of this.

Atkins PW, Ciarrochi J, Gaudiano BA, Bricker JB, Donald J, Rovner G, Smout M, Livheim F, Lundgren T, Hayes SC. Departing from the essential features of a high quality systematic review of psychotherapy: A response to Öst (2014) and recommendations for improvement. Behaviour Research and Therapy. 2017 May 29.

Within-group (pre-post) differences in outcome. David Gillanders echoed Hayes in using within-group effect sizes to describe the effectiveness of ACT. Results presented in this way look better and may seem impressive, but they are exaggerated when compared to results obtained between groups. I am not making that up. Changes within the group of patients who received ACT reflect the specific effects of ACT plus whatever nonspecific factors were operating. That is why we need an appropriate comparison-control group to examine between-group differences, which are always more modest than the within-group effects alone.
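To make the arithmetic concrete, here is a minimal simulation. The numbers are assumptions chosen for illustration, not figures from any ACT trial: both arms improve by 0.3 standard deviations from nonspecific factors, and the treatment adds another 0.3. The within-group effect size for the treated arm comes out around double the between-group effect size, because it absorbs the nonspecific improvement that the control arm shows too:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # large n so the illustration is stable, not an argument about power

# Hypothetical trial: both arms improve (nonspecific effects),
# the treatment arm improves somewhat more (specific effect).
pre_t = rng.normal(0, 1, n)
post_t = pre_t + 0.6 + rng.normal(0, 1, n)   # treatment: 0.3 nonspecific + 0.3 specific
pre_c = rng.normal(0, 1, n)
post_c = pre_c + 0.3 + rng.normal(0, 1, n)   # control: 0.3 nonspecific only

# Within-group effect size: mean change relative to SD of change scores
change_t = post_t - pre_t
within_d = change_t.mean() / change_t.std(ddof=1)

# Between-group effect size: difference in change between arms, pooled SD
change_c = post_c - pre_c
pooled_sd = np.sqrt((change_t.var(ddof=1) + change_c.var(ddof=1)) / 2)
between_d = (change_t.mean() - change_c.mean()) / pooled_sd

print(round(within_d, 2), round(between_d, 2))
```

Under these assumptions the within-group d lands near 0.6 while the between-group d lands near 0.3, which is why reporting only pre-post change flatters a treatment.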

Compared to what? Most randomized trials of ACT involve a wait list, no-treatment, or ill-described standard care (which often represents no treatment). Such comparisons are methodologically weak, especially when patients and providers know what is going on (an unblinded trial) and when outcomes are subjective self-report measures.

A clever study in the New England Journal of Medicine showed that with such subjective self-report measures, one cannot distinguish between a proven effective inhaled medication for asthma, an inert substance simply inhaled, and sham acupuncture. In contrast, objective measures of breathing clearly distinguish the medication from the comparison-control conditions.

So, it is not an exaggeration to say that most evaluations of ACT are conducted under circumstances in which even sham acupuncture or homeopathy would look effective.

Not superior to other treatments. There are no trials comparing ACT to a credible active treatment in which ACT proves superior, either for pain or other clinical problems. So, we are left saying ACT is better than doing nothing, at least in trials where any nonspecific effects are concentrated among the patients receiving ACT.

Rampant investigator bias. A lot of trials of ACT are conducted by researchers having an investment in showing that ACT is effective. That is a conflict of interest. Sometimes it is called investigator allegiance, or a promoter or originator bias.

Regardless, when drugs are being evaluated in a clinical trial, it is recognized that there will be a bias toward the drug favored by the manufacturer conducting the trial. It is increasingly recognized that meta-analyses conducted by promoters should also be viewed with extra skepticism, and that trials conducted by researchers having such conflicts of interest should be considered separately to see if they produced exaggerated results.

ACT desperately needs randomized trials conducted by researchers who don’t have a dog in the fight, who lack the motivation to torture findings into positive results when they are simply not present. There is a strong confirmation bias in current ACT trials, with promoter/researchers embarrassing themselves in their maneuvers to show strong, positive effects when only weak or null findings are available. I have documented [1, 2] how this trend started with Steve Hayes dropping two patients from his study with Patricia Bach of the effects of brief ACT on re-hospitalization of inpatients. One patient had died by suicide and another was in jail, so they could not be rehospitalized, and they were dropped from the analyses. The deed could only be noticed by comparing the published paper with Patricia Bach’s dissertation. It allowed an otherwise nonsignificant finding in a small trial to become significant.

Trials that are too small to matter. A lot of ACT trials have too few patients to produce a reliable, generalizable effect size. Lots of us in situations far removed from ACT trials have shown justification for the rule of thumb that we should distrust effect sizes from trials having fewer than 35 patients per treatment or comparison cell. Even this standard is quite liberal. Even if a moderate effect would be significant in a larger trial, there is less than a 50% probability it would be detected in a trial this small. To be significant with such a small sample size, differences between treatments have to be large, and they are probably either due to chance or something dodgy that the investigators did.
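The "less than 50%" point is easy to check by simulation. This sketch is illustrative, not an analysis of any actual ACT trial: it assumes a genuinely moderate effect (d = 0.5) and 20 patients per arm, well under the 35-per-cell rule of thumb, and estimates how often a standard two-sample t-test would detect the effect at p < .05:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_arm = 20    # smaller than the 35-per-cell rule of thumb
true_d = 0.5      # a genuinely moderate effect is assumed to exist
sims = 4000

hits = 0
for _ in range(sims):
    treatment = rng.normal(true_d, 1, n_per_arm)  # arm with the true effect
    control = rng.normal(0, 1, n_per_arm)
    if stats.ttest_ind(treatment, control).pvalue < 0.05:
        hits += 1

power = hits / sims
print(round(power, 2))  # well under 50%
```

With samples this small, power sits around a third: most truly effective treatments would be "missed," and the significant results that do get published tend to overestimate the effect.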

Many claims for the effectiveness of ACT for particular clinical problems come from trials too small to generate a reliable effect size. I invite readers to undertake the simple exercise of looking at the sample sizes in any study cited as support for the effectiveness of ACT. If you exclude such small studies, there is not much research left to talk about.

Too much flexibility in what researchers report in publications. Many trials of ACT involve researchers administering a whole battery of outcome measures and then emphasizing those that make ACT look best, either downplaying or not mentioning the rest. Similarly, many trials of ACT deemphasize whether the time × treatment interaction is significant, simply ignore it if it is not, and focus on the within-group differences. I know, we’re getting a bit technical here. But another way of saying this is that many trials of ACT give researchers too much latitude in choosing what variables to report and what statistics are used to evaluate them.

Under similar circumstances, researchers showed that listening to the Beatles song When I’m 64 left undergraduates 18 months younger than when they listened to the song Kalimba. Of course, the researchers knew damn well that the Beatles song didn’t have this effect, but they indicated they were doing what lots of investigators do to get significant results, what they call p-hacking.

Many randomized trials of ACT are conducted with the same researcher flexibility that would allow a demonstration that listening to a Beatles song drops the age of undergraduates 18 months.
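The Beatles-song demonstration rests on exactly this kind of flexibility. As a minimal sketch (my own simulation, not the published one), suppose a trial with no true effect at all measures two correlated outcomes and reports whichever one "worked." The nominal 5% false-positive rate climbs noticeably:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sims, n = 4000, 30
false_positives = 0

for _ in range(sims):
    # Two arms with NO true difference on anything
    base_a, base_b = rng.normal(0, 1, n), rng.normal(0, 1, n)
    # Two outcome measures per arm, correlated (~r = 0.5) via the shared base
    a1, a2 = base_a + rng.normal(0, 1, n), base_a + rng.normal(0, 1, n)
    b1, b2 = base_b + rng.normal(0, 1, n), base_b + rng.normal(0, 1, n)
    p1 = stats.ttest_ind(a1, b1).pvalue
    p2 = stats.ttest_ind(a2, b2).pvalue
    if min(p1, p2) < 0.05:  # report whichever outcome "worked"
        false_positives += 1

print(round(false_positives / sims, 3))
```

Add more outcomes, optional covariates, and flexible stopping rules, and the false-positive rate climbs much higher, which is how a song can appear to change participants’ ages.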

Many of the problems with ACT research could be avoided if researchers were required to publish ahead of time their primary outcome variables and plans for analyzing them. Such preregistration is increasingly recognized as best research practice, including by NIMH. There is no excuse not to do it.

My take away message?

ACT gurus have been able to dodge the need to develop quality data to support their claims that their treatment is effective (and their sometime claim that it is more effective than other approaches). A number of them are university-based academics and have ample resources to develop better quality evidence.

Workshop and weekend retreat attendees are convinced that ACT works on the strength of experiential learning and a lot of theoretical mumbo jumbo.

But the ACT promoters also make a lot of dodgy claims that there is strong evidence that the specific ingredients of ACT, techniques and values, account for the power of ACT. Some of the ACT gurus, Steve Hayes and Kelly Wilson at least, are academics and should limit their claims of being ‘evidence-based’ to what is supported by strong, quality evidence. They don’t. I think they are being irresponsible in throwing ‘evidence-based’ in with all the rest of their promotional claims.

What should I do as an evidence-based skeptic wanting to improve the conversation about ACT?

Earlier in my career, I spent six years in live supervision with some world-renowned therapists behind the one-way mirror, including John Weakland, Paul Watzlawick, and Dick Fisch. I gave workshops worldwide on how to do brief strategic therapies with individuals, couples, and families. I chose not to continue because (1) I didn’t like the pressure for drama and exciting interventions when I interviewed patients in front of large groups; (2) even when there was a logic and appearance of effectiveness to what I did, I didn’t believe it could be manualized; and (3) my group didn’t have the resources to conduct proper outcome studies.

But I got it that workshop attendees like drama, exciting interventions, and emotional experiences. They go to trainings expecting to be entertained, as much as informed. I don’t think I can change that.

Many therapists have not had the training to evaluate claims about research, even if they accept that being backed by research findings is important. They depend on presenters to tell them about research and tend to trust what they say. Even therapists who know something about research can lose critical judgment when caught up in the emotionality provided by some training experiences. Experiential learning can be powerful, even when it is used to promote interventions that are not supported by evidence.

I can’t change the training of therapists nor the culture of workshops and training experiences. But I can reach out to therapists who want to develop skills to evaluate research for themselves. I think some of the things that I point out in this blog post are quite teachable as things to look for.

I hope I can connect with therapists who want to become citizen scientists who are skeptical about what they hear and want to become equipped to think for themselves and look for effective resources when they don’t know how to interpret claims.

This is certainly not all therapists and may only be a minority. But such opinion leaders can be champions for the others in facilitating intelligent discussions of research concerning the effectiveness of psychotherapies. And they can prepare their colleagues to appreciate that most change in psychotherapy is not as dramatic or immediate as seen in therapy workshops.


 

Embargo broken: Bristol University Professor to discuss trial of quack chronic fatigue syndrome treatment.

An alternative press briefing to compare and contrast with what is being provided by the Science Media Centre for a press conference on Wednesday September 20, 2017.


This blog post provides an alternative press briefing to compare and contrast with what was provided by the Science Media Centre for a press conference on Wednesday September 20, 2017.

The press release attached at the bottom of the post announces the publication of results of a highly controversial trial that many would argue should never have occurred. The trial exposed children to an untested treatment with a quack explanation delivered by unqualified persons. Lots of money was earned from the trial by the promoters of the quack treatment, beyond the boost in credibility for their quack treatment.

Note to journalists and the media: for further information email jcoynester@Gmail.com

This trial involved quackery delivered by unqualified practitioners who are otherwise untrained and insensitive to any harm to patients.

The UK Advertising Standards Authority had previously ruled that Lightning Process could not be advertised as a treatment. [ 1 ]

The Lightning Process is billed as mixing elements from osteopathy, life coaching, and neuro-linguistic programming. That is far from having a mechanism of action based in science or evidence. [2] Neuro-linguistic programming (NLP) has been thoroughly debunked for its pseudoscientific references to brain science and has ceased to be discussed in the scientific literature. [3]

Many experts would consider the trial unethical. It involved exposing children and adolescents to an unproven treatment with no prior evidence of effectiveness or safety nor any scientific basis for the mechanism by which it is claimed to work.

As an American who has decades of experience serving on Committees for the Protection of Human Subjects and Data Safety and Monitoring Boards, I don’t understand how this trial was approved to recruit human subjects, particularly children and adolescents.

I don’t understand why a physician who cared about her patients would seek approval to conduct such a trial.

Participation in the trial violated patients’ trust that medical settings and personnel will protect them from such risks.

Participation in the trial was time-consuming and involved loss of the opportunity to obtain less risky treatment, or simply to avoid the inconvenience and burden of a treatment for which there was no scientific basis to expect it would work.

Esther Crawley has said “If the Lightning Process is dangerous, as they say, we need to find out. They should want to find it out, not prevent research.”  I would like to see her try out that rationale in some of the patient safety and human subjects committee meetings I have attended. The response would not likely be very polite.

Patients and their parents should have been informed of an undisclosed conflict of interest.

This trial served as the basis for advertising the Lightning Process on the Web as being offered in NHS clinics and as being evaluated in a randomized controlled trial. [4]

Promoters of the Lightning Process received substantial payments from this trial. Although a promoter of the treatment was listed on the application for the project, she was not among the paper’s authors, so there will probably be no conflict of interest declared.

The providers were not qualified medical personnel, but were working for an organization that would financially benefit from positive findings.

It is expected that children who received the treatment as part of the trial would continue to receive it from providers who were trained and certified by promoters of the Lightning Process.

By analogy, think of a pharmaceutical trial in which the drug company’s influence, and the fact that it would profit from positive results, was not disclosed in patient consent forms. There would be a public outcry and likely legal action.

Why might the SMILE trial create the illusion that the Lightning Process is effective for chronic fatigue syndrome?

There were multiple weaknesses in the trial design that would likely generate a false impression that the Lightning Process works. Under similar conditions, homeopathy and sham acupuncture appear effective [5]. Experts know to reject such results because (a) more rigorous designs are required to evaluate the efficacy of a treatment in order to rule out placebo effects; and (b) there must be a scientific basis for the mechanism of change claimed for how the treatment works.

Indoctrination of parents and patients with pseudoscientific information. Advertisements for the Lightning Process on the Internet, including YouTube videos, have created a demand for this treatment among patients, but its cost (£620) is prohibitive for many.

Selection Bias. Participation in the trial involved a 50% probability that the treatment would be received for free. (Promoters of the Lightning Process received £567 for each patient who received the treatment in the trial.) Parents who believed in the power of the Lightning Process would be motivated to enroll their children in the trial in order to obtain the treatment for free.

The trial was unblinded. Patients and treatment providers knew to which group patients were assigned. Not only would patients getting the Lightning Process be exposed to the providers’ positive expectations and encouragement; those assigned to the control group could register their disappointment when completing outcome measures.

The self-report subjective outcomes of this trial are susceptible to nonspecific factors (placebo effects). These include positive expectations, increased contact and support, and a rationale for what was being done, even if scientifically unsound. These nonspecific factors were concentrated in the group receiving the Lightning Process intervention. This serves to stack the deck in any evaluation of the Lightning Process and to inflate differences from the patients who did not get into this group.

There were no objective measures of outcome. The one measure with a semblance of objectivity, school attendance, was eliminated in a pilot study. Objective measures would have provided a check on the likely exaggerated effects obtained with subjective self-report measures.

The providers were not qualified medical personnel, but were working for an organization that would financially benefit from positive findings. The providers were highly motivated to obtain positive results.

During treatment, the Lightning Process further indoctrinates child and adolescent patients with pseudoscience [6] and involves coercion to fake that they are getting well [7]. Such coercion can interfere with patients getting appropriate help when they need it, with their establishing appropriate expectations with parental and school authorities, and even with their responding honestly to outcome assessments.

It’s not just patient and family-member activists who object to the trial. As professionals have become more informed, there has been increasing international concern about the ethics and safety of this trial.

The Science Media Centre has consistently portrayed critics of Esther Crawley’s work as being a disturbed minority of patients and patients’ family members. Smearing and vilification of patients and parents who object to the trial is unprecedented.

Particularly with the international controversy over the PACE trial of cognitive behavior therapy and graded exercise therapy for chronic fatigue syndrome, patients have been joined in their concerns by non-patient scientists and clinicians.

Really, if you were a fully informed parent of a child who was being pressured to participate in the trial with false claims of the potential benefits, wouldn’t you object?


Notes

[1] “To date, neither the ASA nor CAP [Committee of Advertising Practice] has seen robust evidence for the health benefits of LP. Advertisers should take care not to make implied claims about the health benefits of the three-day course and must not refer to conditions for which medical supervision should be sought.”

[2] The respected Skeptics Dictionary offers a scathing critique of Phil Parker’s Lightning Process. The critique specifically cites concerns that Crawley’s SMILE trial switched outcomes to increase the likelihood of obtaining evidence of effectiveness.

[3] The entry for Neuro-linguistic programming (NLP) in Wikipedia states:

There is no scientific evidence supporting the claims made by NLP advocates and it has been discredited as a pseudoscience by experts.[1][12] Scientific reviews state that NLP is based on outdated metaphors of how the brain works that are inconsistent with current neurological theory and contain numerous factual errors.[13][14]

[4] NHS and LP: Phil Parker’s webpage announces the collaboration with Bristol University and provides a link to the official SMILE trial website.

[5] A provocative New England Journal of Medicine article, “Active Albuterol or Placebo, Sham Acupuncture, or No Intervention in Asthma,” showed that sham acupuncture was as effective as an established medical treatment – an albuterol inhaler – for asthma when judged with subjective measures, but there was a large superiority for the established medical treatment obtained with objective measures.

[6] Instructional materials that patients are required to read during treatment include:

LP trains individuals to recognize when they are stimulating or triggering unhelpful physiological responses and to avoid these, using a set of standardized questions, new language patterns and physical movements with the aim of improving a more appropriate response to situations.

* Learn about the detailed science and research behind the Lightning Process and how it can help you resolve your issues.

* Start your training in recognising when you’re using your body, nervous system and specific language patterns in a damaging way

What if you could learn to reset your body’s health systems back to normal by using the well researched connection that exists between the brain and body?

The Lightning Process does this by teaching you how to spot when the PER is happening and how you can calm this response down, allowing your body to re-balance itself.

The Lightning Process will teach you how to use Neuroplasticity to break out of any destructive unconscious patterns that are keeping you stuck, and learn to use new, life and health enhancing ones instead.

The Lightning Process is a training programme which has had huge success with people who want to improve their health and wellbeing.

[7] Responsibility of patients:

Believe that Lightning Process will heal you. Tell everyone that you have been healed. Perform magic rituals like standing in circles drawn on paper with positive Keywords stated on them. Learn to render short rhyme when you feel symptoms, no matter where you are, as many times as required for the symptoms to disappear. Speak only in positive terms and think only positive thoughts. If symptoms or negative thoughts come, you must stretch forth your arms with palms facing outward and shout “Stop!” You are solely responsible for ME. You can choose to have ME. But you are free to choose a life without ME if you wish. If the method does not work, it is you who are doing something wrong.

Special thanks to the Skeptical Cat who provided me with an advance copy of the press release from the Science Media Centre.

Creating illusions of wondrous effects of yoga and meditation on health: A skeptic exposes tricks

The tour of the sausage factory is starting; here’s your brochure telling you what you’ll see.

A recent review has received a lot of attention and is being used to claim that mind-body interventions have distinct molecular signatures that point to potentially dramatic health benefits for those who take up these practices.

What Is the Molecular Signature of Mind–Body Interventions? A Systematic Review of Gene Expression Changes Induced by Meditation and Related Practices.  Frontiers in Immunology. 2017;8.

Few who are tweeting about this review or its press coverage are likely to have read it, or to understand it if they read it. Most of the new-agey coverage in social media does nothing more than echo or amplify the message of the review’s press release. Lazy journalists and bloggers can simply pass on direct quotes from the lead author or even just the press release’s title, ‘Meditation and yoga can ‘reverse’ DNA reactions which cause stress, new study suggests’:

“These activities are leaving what we call a molecular signature in our cells, which reverses the effect that stress or anxiety would have on the body by changing how our genes are expressed.”

And

“Millions of people around the world already enjoy the health benefits of mind-body interventions like yoga or meditation, but what they perhaps don’t realise is that these benefits begin at a molecular level and can change the way our genetic code goes about its business.”

[The authors of this review actually identified some serious shortcomings to the studies they reviewed. I’ll be getting to some excellent points at the end of this post that run quite counter to the hype. But the lead author’s press release emphasized unwarranted positive conclusions about the health benefits of these practices. That is what is most popular in media coverage, especially from those who have stuff to sell.]

Interpretation of the press release and review authors’ claims requires going back to the original studies, which most enthusiasts are unlikely to do. If readers do go back, they will have trouble interpreting some of the deceptive claims that are made.

Yet, a lot is at stake. This review is being used to recommend mind-body interventions for people having or who are at risk of serious health problems. In particular, unfounded claims that yoga and mindfulness can increase the survival of cancer patients are sometimes hinted at, but occasionally made outright.

This blog post is written with the intent of protecting consumers from such false claims and providing tools so they can spot pseudoscience for themselves.

Discussion of the review in the media speaks broadly of alternative and complementary interventions. The coverage is aimed at inspiring confidence in this broad range of treatments and at encouraging people who are facing health crises to invest time and money in outright quackery. Seemingly benign recommendations for yoga, tai chi, and mindfulness (after all, what’s the harm?) often become the entry point to more dubious and expensive treatments that substitute for established treatments. Once they are drawn to centers for integrative health care for classes, cancer patients are likely to spend hundreds or even thousands of dollars on other products and services that are unlikely to benefit them. One study reported:

More than 72 oral or topical, nutritional, botanical, fungal and bacterial-based medicines were prescribed to the cohort during their first year of IO care…Costs ranged from $1594/year for early-stage breast cancer to $6200/year for stage 4 breast cancer patients. Of the total amount billed for IO care for 1 year for breast cancer patients, 21% was out-of-pocket.

Coming up, I will take a skeptical look at the six randomized trials that were highlighted by this review.  But in this post, I will provide you with some tools and insights so that you do not have to make such an effort in order to make an informed decision.

Like many of the other studies cited in the review, these randomized trials were quite small and underpowered. But I will focus on the six because they are as good as it gets. Randomized trials are considered a higher form of evidence than simple observational studies or case reports. [It is too bad the authors of the review don’t even highlight which studies are randomized trials. They are lumped with others as “longitudinal studies.”]

As a group, the six studies do not actually add any credibility to the claims that mind-body interventions – specifically yoga, tai chi, and mindfulness training or retreats – improve health by altering DNA. We can be no more confident with what the trials provide than we would be without them ever having been done.

I found the task of probing and interpreting the studies quite labor-intensive and ultimately unrewarding.

I had to get past poor reporting of what was actually done in the trials, to which patients, and with what results. My task often involved seeing through cover-ups, with authors exercising considerable flexibility in reporting which measures they actually collected and which analyses they attempted, before arriving at the best possible tale of the wondrous effects of these interventions.

Interpreting clinical trials should not be so hard, because they should be honestly and transparently reported, with a registered protocol that is adhered to. These reports of trials were sorely lacking. The full extent of the problems took some digging to uncover, but some things emerged before I even got to the methods and results.

The introductions of these studies consistently exaggerated the strength of existing evidence for the effects of these interventions on health, even while somehow coming to the conclusion that this particular study was urgently needed and it might even be the “first ever”. The introductions to the six papers typically cross-referenced each other, without giving any indication of how poor quality the evidence was from the other papers. What a mutual admiration society these authors are.

One giveaway is how the introductions referred to the biggest, most badass, comprehensive, and well-done review, that of Goyal and colleagues.

That review clearly states that the evidence for the effects of mindfulness is poor quality because of the lack of comparisons with credible active treatments. The typical randomized trial of mindfulness involves a comparison with no treatment, a waiting list, or patients remaining in routine care where the target problem is likely to be ignored. If we depend on the bulk of the existing literature, we cannot rule out the likelihood that any apparent benefits of mindfulness are due to having more positive expectations, attention, and support rather than simply getting nothing. Only a handful of the hundreds of trials of mindfulness include appropriate, active treatment comparison/control groups. The results of those studies are not encouraging.

One of the first things I do in probing the introduction of a study claiming health benefits for mindfulness is see how they deal with the Goyal et al review. Did the study cite it, and if so, how accurately? How did the authors deal with its message, which undermines claims of the uniqueness or specificity of any benefits to practicing mindfulness?

For yoga, we cannot yet rule out that it is no better than regular exercise – in groups or alone – combined with relaxing routines. The literature concerning tai chi is even smaller and of poorer quality, but there is the same need to show that practicing tai chi has any benefits over exercising in groups with comparable positive expectations and support.

Even more than mindfulness, yoga and tai chi attract a lot of pseudoscientific mumbo jumbo about integrating Eastern wisdom and Western science. We need to look past that and insist on evidence.

Like their introductions, the discussion sections of these articles are quite prone to exaggerating how strong and consistent the evidence is from existing studies. The discussion sections cherry pick positive findings in the existing literature, sometimes recklessly distorting them. The authors then discuss how their own positively spun findings fit with what is already known, while minimizing or outright neglecting discussion of any of their negative findings. I was not surprised to see one trial of mindfulness for cancer patients obtain no effects on depressive symptoms or perceived stress, but then go on to claim that mindfulness might powerfully affect the expression of DNA.

If you want to dig into the details of these studies, the going can get rough and the yield for doing a lot of mental labor is low. For instance, these studies involved drawing blood and analyzing gene expression. Readers will inevitably encounter passages like:

In response to KKM treatment, 68 genes were found to be differentially expressed (19 up-regulated, 49 down-regulated) after adjusting for potentially confounded differences in sex, illness burden, and BMI. Up-regulated genes included immunoglobulin-related transcripts. Down-regulated transcripts included pro-inflammatory cytokines and activation-related immediate-early genes. Transcript origin analyses identified plasmacytoid dendritic cells and B lymphocytes as the primary cellular context of these transcriptional alterations (both p < .001). Promoter-based bioinformatic analysis implicated reduced NF-κB signaling and increased activity of IRF1 in structuring those effects (both p < .05).

Intimidated? Before you defer to the “experts” doing these studies, I will show you some things I noticed in the six studies and how you can debunk the relevance of these studies for promoting health and dealing with illness. Actually, I will show that even if these six studies got the results the authors claimed (and they did not), at best the effects would be trivial and lost among the other things going on in patients’ lives.

Fortunately, there are lots of signs that you can dismiss such studies and go on to something more useful, if you know what to look for.

Some general rules:

  1. Don’t accept claims of efficacy/effectiveness based on underpowered randomized trials. Dismiss them. A reliable rule of thumb is to dismiss trials that have fewer than 35 patients in the smallest group. Over half the time, such studies will miss true moderate-sized effects, even when the effects are actually there.

Due to publication bias, most of the positive effects that are published from trials of this size will be false positives and won’t hold up in well-designed, larger trials.

When significant positive effects from such trials are reported in published papers, they have to be large to have reached significance. If not outright false, these effect sizes won’t be matched in larger trials. So, significant positive effect sizes from small trials are likely to be false positives, exaggerated, and unlikely to replicate. For that reason, we can treat small studies as pilot or feasibility studies, but not as providing estimates of how large an effect size we should expect from a larger study. Investigators do it all the time, but they should not: they do power calculations estimating how many patients they need for a larger trial from the results of such small studies. No, no, no!

Having spent decades examining clinical trials, I am generally comfortable dismissing effect sizes that come from trials with fewer than 35 patients in the smaller group. I agree with a suggestion that if two larger trials are available in a given literature, go with those and ignore the smaller studies. If there are not at least two larger studies, keep the jury out on whether there is a significant effect.
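The arithmetic behind the Rule of 35 can be checked with a quick simulation. The sketch below is my own illustration, not anything from these trials, and it uses a simple z-test approximation with known variance rather than a full t-test: with 35 patients per group and a true moderate effect (d = 0.5), roughly half of simulated trials miss the effect, and the trials that do reach significance report effect sizes inflated well above the true value.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_small_trials(n_per_group=35, true_d=0.5, n_sims=20000):
    """Simulate two-arm trials with a true standardized effect of true_d.

    Returns (power, mean |effect size| among 'significant' trials).
    Uses a z-test on the difference in means with known SD = 1.
    """
    se = np.sqrt(2.0 / n_per_group)          # SE of the standardized difference
    diffs = rng.normal(true_d, se, n_sims)   # observed effect size per trial
    significant = np.abs(diffs) > 1.96 * se  # two-sided test at alpha = .05
    return significant.mean(), np.abs(diffs[significant]).mean()

power, mean_sig_effect = simulate_small_trials()
print(f"Power to detect d = 0.5 with 35 per group: {power:.2f}")
print(f"Mean effect size among significant trials: {mean_sig_effect:.2f}")
```

The inflation among the significant trials is the winner’s curse: only estimates large enough to clear the significance threshold get reported, so published effect sizes from small trials systematically overshoot the truth.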

Applying the Rule of 35, five of the six trials can be dismissed, and the sixth is ambiguous because of the loss of patients to follow-up. If promoters of mind-body interventions want to convince us that these interventions have beneficial effects on physical health by conducting trials like these, they have to do better. None of the individual trials should increase our confidence in their claims. Collectively, the trials collapse in a mess without providing a single credible estimate of effect size. This attests to the poor quality of evidence and disrespect for methodology that characterize this literature.

  1. Don’t be taken in by titles to peer-reviewed articles that are themselves an announcement that these interventions work. Titles may not be telling the truth.

What I found extraordinary is that five of the six randomized trials had a title indicating that a positive effect was found. I suspect that most people encountering the title will not actually go on to read the study. So, they will be left with the false impression that positive results were indeed obtained. It’s quite a clever trick to make the title of an article, by which most people will remember it, into a false advertisement for what was actually found.

For a start, we can simply remind ourselves that with these underpowered studies, investigators should not even be making claims about efficacy/effectiveness. So, one trick of the developing skeptic is to check whether the claims being made in the title fit with the size of the study. Actually going to the results section, one can find further evidence of discrepancies between what was found and what is being claimed.

I think it’s a general rule of thumb that we should be wary of titles for reports of randomized trials that declare results. Even when what is claimed in the title fits with the actual results, it often creates the illusion of a greater consistency with what already exists in the literature. Furthermore, even when future studies inevitably fail to replicate what is claimed in the title, the false claim lives on, because failing to replicate key findings is almost never a condition for retracting a paper.

  1. Check the institutional affiliations of the authors. These 6 trials serve as a depressing reminder that we can’t go on researchers’ institutional affiliation or having federal grants to reassure us of the validity of their claims. These authors are not from Quack-Quack University and they get funding for their research.

In all cases, the investigators had excellent university affiliations, mostly in California. Most studies were conducted with some form of funding, often federal grants. A quick check of Google would reveal that at least one of the authors on a study, usually more, had federal funding.

  1. Check the conflicts of interest, but don’t expect the declarations to be informative; be skeptical of what you find. It is also disappointing that a check of the conflict of interest statements for these articles would be unlikely to arouse suspicion that the claimed results might have been influenced by financial interests. One cannot readily see that the studies were generally done in settings promoting alternative, unproven treatments that would benefit from the publicity generated by the studies. One cannot see that some of the authors have lucrative book contracts and speaking tours that require making claims for dramatic effects of mind-body treatments that could not possibly be supported by transparent reporting of the results of these studies. As we will see, one of the studies was actually conducted in collaboration with Deepak Chopra and with money from his institution. That would definitely raise flags in the skeptic community. But the dubious tie might be missed by patients and their families vulnerable to unwarranted claims and unrealistic expectations of what can be obtained outside of conventional medicine, like chemotherapy, surgery, and pharmaceuticals.

Based on what I found probing these six trials, I can suggest some further rules of thumb. (1) Don’t assume for articles about health effects of alternative treatments that all relevant conflicts of interest are disclosed. Check the setting in which the study was conducted and whether an integrative [complementary and alternative, meaning mostly unproven] care setting was used for recruiting or running the trial. Not only would this represent potential bias on the part of the authors, it would represent selection bias in the recruitment of patients and in their responsiveness to placebo effects consistent with the marketing themes of these settings. (2) Google the authors and see if they have lucrative pop psychology book contracts, TED talks, or speaking gigs at positive psychology or complementary and alternative medicine gatherings. None of these lucrative activities are typically expected to be disclosed as conflicts of interest, but all require making strong claims that are not supported by available data. Such rewards are perverse incentives for authors to distort and exaggerate positive findings and to suppress negative findings in peer-reviewed reports of clinical trials. (3) Check and see if known quacks have prepared recruitment videos for the study, informing patients what will be found. (Seriously, I was tipped off to look, and I found exactly that.)

  1. Look for the usual suspects. A surprisingly small, tight, interconnected group is generating this research. You could look the authors up on Google or Google Scholar, or browse through my previous blog posts and see what I have said about them. As I will point out in my next blog post, one got withering criticism for her claim that drinking carbonated sodas, but not sweetened fruit drinks, shortened your telomeres, so that drinking soda was worse than smoking. My colleagues and I re-analyzed the data of another of the authors. We found that, contrary to what he claimed, pursuing meaning rather than pleasure in your life did not affect gene expression related to immune function. We also showed that substituting randomly generated data worked as well as what he got from blood samples in replicating his original results. I don’t think it is ad hominem to point out that both of these authors have a history of making implausible claims. It speaks to source credibility.
  1. Check and see if there is a trial registration for a study, but don’t stop there. You can quickly check with PubMed whether a report of a randomized trial is registered. Trial registration is intended to ensure that investigators commit themselves in advance to a primary outcome, or maybe two, so that readers can check whether that is what they emphasized in their paper. You can then check to see if what is said in the report of the trial fits with what was promised in the protocol. Unfortunately, I could find that only one of these trials was registered. That trial registration was vague on what outcome variables would be assessed and did not mention the outcome emphasized in the published paper (!). The registration also said the sample would be larger than what was reported in the published study. When researchers have difficulty in recruitment, their study is often compromised in other ways. I’ll show how this study was compromised.

Well, it looks like applying these generally useful rules of thumb is not always so easy with these studies. I think the small sample size across all of the studies would be enough to decide this research has yet to yield meaningful results and certainly does not support the claims that are being made.

But readers who are motivated to put in the time to probe deeper will turn up strong signs of p-hacking and questionable research practices.

  1. Check the report of the randomized trial and see if you can find any declaration of one or two primary outcomes and a limited number of secondary outcomes. What you will find instead is that the studies always have more outcome variables than patients receiving these interventions. The opportunities for cherry picking positive findings and discarding the rest are huge, especially because it is so hard to assess what data were collected but not reported.
  1. Check and see if you can find tables of unadjusted primary and secondary outcomes. Honest and transparent reporting involves giving readers a look at simple statistics so they can decide whether the results are meaningful. For instance, if effects on stress and depressive symptoms are claimed, are the results impressive and clinically relevant? In almost all cases, there is no peeking allowed. Instead, the authors provide analyses and statistics with lots of adjustments made. They break lots of rules in doing so, especially with such small samples. These authors are virtually assured of getting results to crow about.
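The scale of the cherry-picking problem is easy to quantify. As a rough illustration of my own (assuming independent outcomes; real outcomes are correlated, and analytic flexibility only makes matters worse), the chance that a completely null trial produces at least one nominally significant result grows quickly with the number of outcomes measured:

```python
# Chance that a trial with NO true effects yields at least one "significant"
# result at alpha = .05, when k independent outcomes are measured and the
# authors are free to report whichever looks best.
alpha = 0.05
for k in (1, 10, 20, 40):
    p_any = 1 - (1 - alpha) ** k
    print(f"{k:2d} outcomes -> {p_any:.0%} chance of at least one false positive")
```

With a few dozen outcome variables, something nominally positive is nearly guaranteed, which is why studies with more outcomes than patients per arm can always find something to report.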

Famously, Joe Simmons and Leif Nelson hilariously published claims that briefly listening to the Beatles’ “When I’m 64” left students a year and a half younger than if they had been assigned to listen to “Kalimba.” Simmons and Nelson knew this was nonsense, but their intent was to show what researchers can do if they have free rein in how they analyze their data and what they report. They revealed the tricks they used, but those tricks were minor league and amateurish compared to what the authors of these trials consistently did in claiming that yoga, tai chi, and mindfulness modified the expression of DNA.

Stay tuned for my next blog post, where I will go through the six studies. But consider this if you or a loved one has to make an immediate decision about whether to plunge into the world of woo-woo unproven medicine in hopes of altering DNA expression: I will show that the authors of these studies did not get the results they claimed. But who should care if they did? The effects were laughably trivial. As the authors of the review about which I have been complaining noted:

One other problem to consider are the various environmental and lifestyle factors that may change gene expression in similar ways to MBIs [Mind-Body Interventions]. For example, similar differences can be observed when analyzing gene expression from peripheral blood mononuclear cells (PBMCs) after exercise. Although at first there is an increase in the expression of pro-inflammatory genes due to regeneration of muscles after exercise, the long-term effects show a decrease in the expression of pro-inflammatory genes (55). In fact, 44% of interventions in this systematic review included a physical component, thus making it very difficult, if not impossible, to discern between the effects of MBIs from the effects of exercise. Similarly, food can contribute to inflammation. Diets rich in saturated fats are associated with pro-inflammatory gene expression profile, which is commonly observed in obese people (56). On the other hand, consuming some foods might reduce inflammatory gene expression, e.g., drinking 1 l of blueberry and grape juice daily for 4 weeks changes the expression of the genes related to apoptosis, immune response, cell adhesion, and lipid metabolism (57). Similarly, a diet rich in vegetables, fruits, fish, and unsaturated fats is associated with anti-inflammatory gene profile, while the opposite has been found for Western diet consisting of saturated fats, sugars, and refined food products (58). Similar changes have been observed in older adults after just one Mediterranean diet meal (59) or in healthy adults after consuming 250 ml of red wine (60) or 50 ml of olive oil (61). However, in spite of this literature, only two of the studies we reviewed tested if the MBIs had any influence on lifestyle (e.g., sleep, diet, and exercise) that may have explained gene expression changes.

How about taking tango lessons instead? You would at least learn dance steps, get exercise, and decrease any social isolation. And who is to say there would be any more benefit from these practices than from taking up such activities?

Power Poseur: The lure of lucrative pseudoscience and the crisis of untrustworthiness of psychology

This is the second of two segments of Mind the Brain aimed at redirecting the conversation concerning power posing to the importance of conflicts of interest in promoting and protecting its scientific status. 

The market value of many lines of products offered to consumers depends on their claims of being “science-based”. Products from psychologists that invoke wondrous mind-body or brain-behavior connections are particularly attractive. My colleagues and I have repeatedly scrutinized such claims, sometimes reanalyzing the original data, and consistently find the claims false or premature and exaggerated.

There is so little risk and so much money and fame to be gained in promoting questionable and even junk psychological science to lay audiences. Professional organizations confer celebrity status on psychologists who succeed, provide them with forums and free publicity that enhance their credibility, and protect their claims of being “science-based” from critics.

How much money academics make from popular books, corporate talks, and workshops and how much media attention they garner serve as alternative criteria for a successful career, sometimes seeming to be valued more than the traditional ones of quality and quantity of publications and the amount of grant funding obtained.

Efforts to improve the trustworthiness of what psychologists publish in peer-reviewed journals have no parallel in any efforts to improve the accuracy of what psychologists say to the public outside of the scientific literature.

By the following reasoning, there may be limits to how much the former efforts at reform can succeed without the latter. In the hypercompetitive marketplace, only the most dramatic claims gain attention. Seldom are the results of rigorously done, transparently reported scientific work strong and unambiguous enough to back up the claims with the broadest appeal, especially in psychology. Psychologists who remain in academic settings but want to market their merchandise to consumers face a dilemma: How much do they have to hype and distort their findings in peer-reviewed journals to fit with what they say to the public?

It is important for readers of scientific articles to know that authors are engaged in these outside activities and face pressure to obtain particular results. The temptation of being able to make bold claims clashes with the requirements to conduct solid science and report results transparently and completely. Let readers decide whether this matters for their receptivity to what authors say in peer-reviewed articles by making the information available to them. But almost never is a conflict of interest declared. Just search articles in Psychological Science and see if you can find a single declaration of a COI, even when the authors have booking agents and give high-priced corporate talks and seminars.

The discussion of the quality of science backing power posing should have been shorter.

Up until now, much attention to power posing in academic circles has been devoted to the quality of the science behind it, whether results can be independently replicated, and whether critics have behaved badly. The last segment of Mind the Brain examined the faulty science of the original power posing paper in Psychological Science and showed why it could not contribute a credible effect size to the literature.

The discussion of the science behind power posing should have been much shorter and should have reached a definitive conclusion: the original power posing paper should never have been published in Psychological Science. Once the paper had been published, a succession of editors failed in their expanded Pottery-Barn responsibility to publish critiques by Steven J. Stanton  and by Marcus Crede and Leigh A. Phillips that were quite reasonable in their substance and tone. As is almost always the case, bad science was accorded an incumbent advantage once it was published. Any disparagement or criticism of this paper would be held by editors to strict and even impossibly high standards if it were to be published. Let’s review the bad science uncovered in the last blog. Readers who are familiar with that post can skip to the next section.

A brief unvarnished summary of the bad science of the original power posing paper as a biobehavioral intervention study

Reviewers of the original paper should have balked at the uninformative and inaccurate abstract. Minimally, readers need to know at the outset that there were only 42 participants (26 females and 16 males) in the study comparing high-power versus low-power poses. Studies with so few participants cannot be expected to provide reproducible effect sizes. Furthermore, there is no basis for claiming that results held for both men and women, because that claim depended on analyses with even smaller numbers. Note that the 16 males were distributed in some unknown way across the two conditions. Because power is constrained by the smaller cell size, even an optimal split of 8 males per cell is well below what is needed to estimate an effect size. Any apparent significant effects in this study are likely to be meaning imposed on noise.
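How little a sample of this size can detect is easy to check. Here is a minimal back-of-envelope sketch (my own illustration, not a calculation from the paper), using the standard normal approximation for a two-sample comparison:

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized mean difference (Cohen's d) a two-sample
    comparison can reliably detect, via the usual normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)

# 42 participants split into two conditions of roughly 21 each
print(f"minimum detectable effect, n=21/cell: d = {min_detectable_d(21):.2f}")
print(f"minimum detectable effect, n=8/cell:  d = {min_detectable_d(8):.2f}")
```

With about 21 participants per cell, the smallest effect detectable with conventional 80% power works out to roughly d = 0.86, already a large effect by any standard; with the optimal 8 males per cell, it climbs to about d = 1.4. Anything smaller is, statistically speaking, invisible to a study of this size.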

The final sentence of the abstract is an outrageously untrue statement of results. Yet, as we will see, it served as the basis of a product launch, already taking shape, worth seven figures:

That a person can, by assuming two simple 1-minute poses, embody power and instantly become more powerful has real-world, actionable implications.

Aside from the small sample size, as an author, editor, and critic in clinical and health psychology for over 40 years, I greet a claim of ‘real-world, actionable implications’ from two one-minute manipulations of participants’ posture with extreme skepticism. My skepticism grows as we delve into the details of the study.

Investigators’ collecting a single pair of pre-post assessments of salivary cortisol is at best a meaningless ritual, and can contribute nothing to understanding what is going on in the study at a hormonal level.

Men in the age range of the participants in this study have six times more testosterone than women. Statistical “control” of testosterone by controlling for gender is a meaningless gesture producing uninterpretable results. Controlling for baseline testosterone in analyses of cortisol, and vice versa, eliminates any faint signal in the loud noise of the hormonal data.

Although it was intended as a manipulation check (and subsequently was claimed as evidence of the effect of power posing on feelings), the crude subjective self-report ratings of feeling “powerful” and “in charge” on a 1-4 scale could simply communicate the experimenters’ expectancies to participants. Endorsing feeling more powerful indicated only that participants were smart enough to grasp, and willing to go along with, the purpose of the study. Inferences beyond that uninteresting finding require external validation.

In clinical and health psychology trials, we are quite wary of simple subjective self-report analogue scales, particularly when there is poor control of the unblinded experimenters’ behavior and what they communicate to participants.

The gambling task lacks external validation, and its low stakes could simply reduce it to another communication of the experimenters’ expectancies. Note that the saliva samples were obtained after completion of the task; if any confidence remains in the hormone assessments, this is an important confound.

The unblinded experimenters’ physically placing participants in either 2 1-minute high-power or 2 1-minute low-power poses is a weird, unvalidated experimental manipulation that could not plausibly have the anticipated effects on hormonal levels. Neither the high- nor the low-power pose is credible, but the hypothesis that the low-power pose would actually raise cortisol particularly strains credibility, if the cortisol assessments in the study had any meaning at all.

Analyses were not accurately described, and statistical controls of any kind with such a small sample are likely to produce spurious findings. The statistical controls in this study were particularly inappropriate, and there is evidence that the investigators chose which analyses to present after the results were known.

There is no there there: The original power pose paper did not introduce a credible effect size into the literature.

The published paper cannot introduce a credible effect size into the scientific literature. Power posing may be an interesting and important idea that deserves careful scientific study, but any future study of the idea would be a “first ever,” not a replication of the Psychological Science article. The two commentaries that were blocked from publication in Psychological Science but published elsewhere amplify any dismissal of the paper, but we are already well over the top. And then there is the extraordinary repudiation of the paper by the first author and her exposure of the exploitation of investigator degrees of freedom and outright p-hacking. How many stakes do you have to plunge into the heart of a vampire idea?

Product launch

Even before the power posing article appeared in Psychological Science, Amy Cuddy was promoting it at Harvard, first in Power Posing: Fake It Until You Make It in Harvard Business School’s Working Knowledge: Business Research for Business Leaders. Shortly afterward came a redundant but elaborated article in Harvard Magazine, subtitled Amy Cuddy probes snap judgments, warm feelings, and how to become an “alpha dog.”

Amy Cuddy is the middle author on the actual Psychological Science paper, between first author Dana Carney and third author Andy J. Yap, Carney’s graduate student. Yet the Harvard Magazine article lists Cuddy first. The article is also noteworthy in unveiling what would grow into Cuddy’s redemptive self-narrative, although Susan Fiske’s role as the “attachment figure” who nurtured Cuddy’s realization of her inner potential was only hinted at.

QUITE LITERALLY BY ACCIDENT, Cuddy became a psychologist. In high school and in college at the University of Colorado at Boulder, she was a serious ballet dancer who worked as a roller-skating waitress at the celebrated L.A. Diner. But one night, she was riding in a car whose driver fell asleep at 4:00 A.M. while doing 90 miles per hour in Wyoming; the accident landed Cuddy in the hospital with severe head trauma and “diffuse axonal injury,” she says. “It’s hard to predict the outcome after that type of injury, and there’s not much they can do for you.”

Cuddy had to take years off from school and “relearn how to learn,” she explains. “I knew I was gifted–I knew my IQ, and didn’t think it could change. But it went down by two standard deviations after the injury. I worked hard to recover those abilities and studied circles around everyone. I listened to Mozart–I was willing to try anything!” Two years later her IQ was back. And she could dance again.

Yup, all leading up to promoting the idea that overcoming circumstances and getting what you want is as simple as adopting two minutes of behavioral manipulation.

The last line of the Psychological Science abstract was easily fashioned into the pseudoscientific basis for this ease of changing behavior and outcomes, which now include the success of venture-capital pitches:

“Tiny changes that people can make can lead to some pretty dramatic outcomes,” Cuddy reports. This is true because changing one’s own mindset sets up a positive feedback loop with the neuroendocrine secretions, and also changes the mindset of others. The success of venture-capital pitches to investors apparently turns, in fact, on nonverbal factors like “how comfortable and charismatic you are.”

Soon, The New York Times columnist David Brooks placed power posing solidly within the positive-thinking product line of positive psychology, even if Cuddy had no need to go out on that circuit: “If you act powerfully, you will begin to think powerfully.”

In 2011, both first author Dana Carney and Amy Cuddy received the Rising Star Award from the Association for Psychological Science (APS) for having “already made great advancements in science.” Carney cited her power posing paper as one that she liked. Cuddy didn’t nominate the paper, but reported that her recent work examined “how brief nonverbal expressions of competence/power and warmth/connection actually alter the neuroendocrine levels, expressions, and behaviors of the people making the expressions, even when the expressions are ‘posed.’”

The same year, Cuddy also appeared at PopTech, a “global community of innovators, working together to expand the edge of change,” with tickets selling for $2,000. According to an article in The Chronicle of Higher Education:

When her turn came, Cuddy stood on stage in front of a jumbo screen showing Lynda Carter as Wonder Woman while that TV show’s triumphant theme song announced the professor’s arrival (“All the world is waiting for you! And the power you possess!”). After the music stopped, Cuddy proceeded to explain the science of power poses to a room filled with would-be innovators eager to expand the edge of change.

But that performance was just a warm-up for Cuddy’s TedGlobal talk, which has now received almost 42 million views.

A Ted Global talk that can serve as a model for all Ted talks: Your body language may shape who you are  

This link takes you not only to Amy Cuddy’s Ted Global talk but also to a transcript in 49 different languages.

Amy Cuddy’s TedGlobal talk is brilliantly crafted and masterfully delivered. It has two key threads. The first thread is what Dan McAdams has described as an obligatory personal narrative of a redeemed self. McAdams summarizes the basic structure:

As I move forward in life, many bad things come my way—sin, sickness, abuse, addiction, injustice, poverty, stagnation. But bad things often lead to good outcomes—my suffering is redeemed. Redemption comes to me in the form of atonement, recovery, emancipation, enlightenment, upward social mobility, and/or the actualization of my good inner self. As the plot unfolds, I continue to grow and progress. I bear fruit; I give back; I offer a unique contribution.

This is interwoven with a second thread: the claim of strong science behind the power pose, derived from the Psychological Science article. Without the science thread, the talk is reduced to a motivational talk in the genre of Oprah Winfrey or Navy SEAL Admiral William McRaven sharing reasons you should make your bed every day.

It is not clear that we should hold the redeemed self of a Ted Talk to the criteria of historical truth. Does it really matter whether Amy Cuddy’s IQ temporarily fell two standard deviations after an auto accident (13:22)? Whether Cuddy’s “angel adviser” Susan Fiske saved her from feeling like an imposter with the pep talk that inspired the “fake it until you make it” theme of power posing (17:03)? Whether Cuddy similarly transformed the life of her graduate student (18:47) with:

So I was like, “Yes, you are! You are supposed to be here! And tomorrow you’re going to fake it, you’re going to make yourself powerful, and, you know –

This last segment of the Ted talk is best viewed, rather than read in the transcript. It brings Cuddy to tears and the cheering, clapping audience to their feet. And Cuddy wraps up with her takeaway message:

The last thing I’m going to leave you with is this. Tiny tweaks can lead to big changes. So, this is two minutes. Two minutes, two minutes, two minutes. Before you go into the next stressful evaluative situation, for two minutes, try doing this, in the elevator, in a bathroom stall, at your desk behind closed doors. That’s what you want to do. Configure your brain to cope the best in that situation. Get your testosterone up. Get your cortisol down. Don’t leave that situation feeling like, oh, I didn’t show them who I am. Leave that situation feeling like, I really feel like I got to say who I am and show who I am.

So I want to ask you first, you know, both to try power posing, and also I want to ask you to share the science, because this is simple. I don’t have ego involved in this. (Laughter) Give it away. Share it with people, because the people who can use it the most are the ones with no resources and no technology and no status and no power. Give it to them because they can do it in private. They need their bodies, privacy and two minutes, and it can significantly change the outcomes of their life.

Who cares if the story is literal historical truth? Maybe we should not. But psychologists should care about the misrepresentation of the study. So should anyone concerned with truth in advertising to consumers, anyone who believes that consumers have the right to a fair and accurate portrayal of science when being offered products, whether anti-aging cream, acupuncture, or self-help merchandise:

Here’s what we find on testosterone. From their baseline when they come in, high-power people experience about a 20-percent increase, and low-power people experience about a 10-percent decrease. So again, two minutes, and you get these changes. Here’s what you get on cortisol. High-power people experience about a 25-percent decrease, and the low-power people experience about a 15-percent increase. So two minutes lead to these hormonal changes that configure your brain to basically be either assertive, confident and comfortable, or really stress-reactive, and feeling sort of shut down. And we’ve all had the feeling, right? So it seems that our nonverbals do govern how we think and feel about ourselves, so it’s not just others, but it’s also ourselves. Also, our bodies change our minds.

Why should we care? Buying into such simple solutions prepares consumers to accept other outrageous claims. It can be a gateway drug for other quack treatments like Harvard psychologist Ellen Langer’s claims that changing mindset can overcome advanced cancer.

Unwarranted claims break down the barrier between evidence-based recommendations and nonsense. Such claims discourage consumers from accepting the more modest, deliverable promise that evidence-based interventions like psychotherapy can indeed make a difference, but that they take work and effort, and their effects can be modest. Who would invest time and money in cognitive behavior therapy when two one-minute self-manipulations can transform lives? Like all unrealistic promises of redemption, such advice may ultimately lead people to blame themselves when they don’t overcome adversity: after all, it is so simple, just a matter of taking charge of your life. Their predicament indicates that they did not take charge, or that they are simply losers.

Other consumers may simply turn cynical about psychology: here is a Harvard professor trying to sell them crap advice, so psychology must be crap.

Conflict of interest: Nothing to declare?

In an interview with The New York Times, Amy Cuddy said: “I don’t care if some people view this research as stupid,” she said. “I feel like it’s my duty to share it.”

Amy Cuddy may have been giving her power pose advice away for free in her Ted Talk, but she had already sold it at the $2,000-a-ticket PopTech talk. The book contract for Presence: Bringing Your Boldest Self to Your Biggest Challenges was reportedly worth around a million dollars. And of course, like many academics who leave psychology for schools of management, Cuddy had a booking agency soliciting corporate talks and workshops. After the Ted talk, she could command $40,000 to $100,000 per appearance.

Does this discredit the science of power posing? Not necessarily, but readers should be informed and free to decide for themselves. Certainly, all this money in play might make Cuddy more likely to respond defensively to criticism of her work. If she repudiated this work the way that first author Dana Carney did, would there be a halt to her speaking gigs, a product recall, or refunds issued by Amazon for Presence?

I think it is fair to suggest that there is too much money in play for Cuddy to engage in ordinary academic debate. With stakes this high, the discussion may have moved outside that realm altogether.

The replicationados attempt replications: Was it counterproductive?

Faced with overwhelming evidence of the untrustworthiness of the psychological literature, some psychologists have organized replication initiatives and accumulated considerable resources for multisite replications. But replication initiatives are insufficient to remedy the untrustworthiness of many areas of psychology, particularly clinical and health psychology intervention studies, and they may inadvertently dampen more direct attacks on bad science. Many of those who promote replication initiatives are silent when investigators refuse to share data for studies with important clinical and public health implications. They are also silent when journals like Psychological Science fail to publish criticism of papers with blatantly faulty science.

Replication initiatives take time, and their results are often, but not always, ultimately published outside the journals where the flawed original work appeared. An important unintended consequence is that they lend credibility to effect sizes that had no validity whatsoever in the original papers. In debates attempting to resolve discrepancies between original studies and large-scale replications, the original underpowered studies are often granted an even more entrenched incumbent advantage.

It should be no surprise that in a large-scale attempted replication, Ranehill, Dreber, Johannesson, Leiberg, Sul, and Weber failed to replicate the key, nontrivial findings of the original power pose study.

Consistent with the findings of Carney et  al., our results showed a significant effect of power posing on self-reported feelings of power. However, we found no significant effect of power posing on hormonal levels or in any of the three behavioral tasks.

It is also not surprising that Cuddy invoked her I-said-it-first-and-I-was-peer-reviewed incumbent advantage, reasserting her original claim along with a review of 33 studies, including the attempted replication:

The work of Ranehill et al. joins a body of research that includes 33 independent experiments published with a total of 2,521 research participants. Together, these results may help specify when nonverbal expansiveness will and will not cause embodied psychological changes.

Cuddy asserted that methodological differences between the original study and the attempted Ranehill replication may have moderated the effects of posing. But no study has shown that putting participants into a power pose affects hormones.

Joe Simmons and Uri Simonsohn performed a meta-analysis of the studies nominated by Cuddy, ultimately published in Psychological Science. Their blog Data Colada succinctly summarized the results:

Consistent with the replication motivating this post, p-curve indicates that either power-posing overall has no effect, or the effect is too small for the existing samples to have meaningfully studied it. Note that there are perfectly benign explanations for this: e.g., labs that run studies that worked wrote them up, labs that run studies that didn’t, didn’t. [5]

While the simplest explanation is that all studied effects are zero, it may be that one or two of them are real (any more and we would see a right-skewed p-curve). However, at this point the evidence for the basic effect seems too fragile to search for moderators or to advocate for people to engage in power posing to better their lives.

Come on, guys, there was never a there there. Don’t invent one by continuing to try to explain it.

It is interesting that none of these three follow-up articles in Psychological Science has an abstract, especially in contrast to the original power pose paper, which effectively delivered its misleading message in the abstract.

Just as this blog post was being polished, a special issue of Comprehensive Results in Social Psychology (CRSP) on Power Poses was released.

  1. No preregistered tests showed positive effects of expansive poses on any behavioral or hormonal measures. This includes direct replications and extensions.
  2. Surprise: A Bayesian meta-analysis across the studies reveals a credible effect of expansive poses on felt power. (Note that this is described as a ‘manipulation check’ by Cuddy in 2015.) Whether this is anything beyond a demand characteristic and whether it has any positive downstream behavioral effects is unknown.

No, not a surprise, just an uninteresting artifact. But stay tuned for the next model of power pose, dropping the tainted name and focusing on “felt power.” Like rust, commercialization of bad psychological science never really sleeps; it only takes power naps.

Meantime, professional psychological organizations, with their flagship journals and publicity machines need to:

  • Lose their fascination with psychologists whose celebrity status depends on Ted talks and the marketing of dubious advice products grounded in pseudoscience.
  • Embrace and adhere to an expanded Pottery Barn rule that covers not only direct replications, but corrections to bad science that has been published.
  • Make the protection of consumers from false and exaggerated claims a priority equivalent to protecting the vulnerable reputations of academic psychologists in efforts to improve the trustworthiness of psychology.
  • Require detailed conflicts of interest statements for talks and articles.

All opinions expressed here are solely those of Coyne of the Realm and not necessarily of PLOS blogs, PLOS One or his other affiliations.

Disclosure:

I receive money for writing these blog posts, less than $200 per post. I am also marketing a series of e-books,  including Coyne of the Realm Takes a Skeptical Look at Mindfulness and Coyne of the Realm Takes a Skeptical Look at Positive Psychology.

Maybe I am just making a fuss to attract attention to these enterprises. Maybe I am just monetizing what I have been doing for years virtually for free. Regardless, be skeptical. But to get more information and get on a mailing list for my other blogging, go to coyneoftherealm.com and sign up.

Calling out pseudoscience, radically changing the conversation about Amy Cuddy’s power posing paper

Part 1: Reviewed as the clinical trial that it is, the power posing paper should never have been published.

Has too much already been written about Amy Cuddy’s power pose paper? The conversation should not be stopped until its focus shifts and we change our ways of talking about psychological science.

The dominant narrative is now that a junior scientist published an influential paper on power posing and was subject to harassment and shaming by critics, pointing to the need for greater civility in scientific discourse.

Attention has shifted away from the scientific quality of the paper and the dubious products the paper has been used to promote, and toward the behavior of its critics.

Amy Cuddy and powerful allies are given forums to attack and vilify critics, accusing them of damaging the environment in which science is done and discouraging prospective early career investigators from entering the field.

Meanwhile, Amy Cuddy commands large speaking fees and has a top-selling book claiming the original paper provides strong science for simple behavioral manipulations altering mind-body relations and producing socially significant behavior.

This misrepresentation of psychological science does potential harm to consumers and the reputation of psychology among lay persons.

This blog post is intended to restart the conversation with a reconsideration of the original paper as a clinical and health psychology randomized trial (RCT) and, on that basis, identifying the kinds of inferences that are warranted from it.

In the first of a two post series, I argue that:

The original power pose article in Psychological Science should never have been published.

-Basically, we have a therapeutic analog intervention delivered in 2 1-minute manipulations by unblinded experimenters who had flexibility in what they did,  what they communicated to participants, and which data they chose to analyze and how.

-It’s unrealistic to expect that 2 1-minute behavioral manipulations would have robust and reliable effects on salivary cortisol or testosterone 17 minutes later.

-It’s absurd to assume that the hormones mediated changes in behavior in this context.

-If Amy Cuddy retreats to the idea that she is simply manipulating “felt power,” we are solidly in the realm of trivial nonspecific and placebo effects.

The original power posing paper

Carney DR, Cuddy AJ, Yap AJ. Power posing: Brief nonverbal displays affect neuroendocrine levels and risk tolerance. Psychological Science. 2010 Oct 1;21(10):1363-8.

The Psychological Science article can be construed as reporting a brief mind-body intervention consisting of 2 1-minute behavioral manipulations. Central to the attention that the paper attracted is the argument that this manipulation affected psychological state and social performance via its effects on the neuroendocrine system.

The original study is, in effect, a disguised randomized clinical trial (RCT) of a biobehavioral intervention. Once this is recognized, a host of standards come into play for reporting this study and interpreting its results.

CONSORT

All major journals and publishers, including the Association for Psychological Science, have adopted the Consolidated Standards of Reporting Trials (CONSORT). Any submission of a manuscript reporting a clinical trial must be accompanied by a checklist indicating where the article reports particular details of how the trial was conducted. Item 1 on the checklist specifies that both the title and abstract identify the study as a randomized trial. This is important, intended both to aid readers in evaluating the study and to ensure the study is picked up in systematic searches for reviews, which depend on screening of titles and abstracts.

I can find no evidence that Psychological Science adheres to CONSORT. For instance, my colleagues and I provided a detailed critique of a widely promoted study of loving-kindness meditation published in Psychological Science the same year as Cuddy’s power pose study. We noted that it was actually a poorly reported null trial with switched outcomes. With that recognition, we went on to identify serious conceptual, methodological, and statistical problems. After overcoming considerable resistance, we were able to publish a muted version of our critique. Apparently the reviewers of the original paper had failed to evaluate it as an RCT.

The submission of the completed CONSORT checklist has become routine in most journals considering manuscripts for studies of clinical and health psychology interventions. Yet the additional CONSORT requirements, developed later, concerning what should be included in abstracts are largely ignored.

It would be unfair to single out Psychological Science and the Cuddy article for noncompliance to CONSORT for abstracts. However, the checklist can be a useful frame of reference for noting just how woefully inadequate the abstract was as a report of a scientific study.

CONSORT for abstracts

Hopewell S, Clarke M, Moher D, Wager E, Middleton P, Altman DG, Schulz KF, CONSORT Group. CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration. PLOS Medicine. 2008 Jan 22;5(1):e20.

Journal and conference abstracts should contain sufficient information about the trial to serve as an accurate record of its conduct and findings, providing optimal information about the trial within the space constraints of the abstract format. A properly constructed and well-written abstract should also help individuals to assess quickly the validity and applicability of the findings and, in the case of abstracts of journal articles, aid the retrieval of reports from electronic databases.

Even if CONSORT for abstracts did not exist, we could argue that readers, starting with the editor and reviewers, were faced with an abstract making extraordinary claims that required better substantiation. A lack of basic details disarmed them from evaluating these claims.

In effect, the abstract reduces the study to an experimercial for products about to be marketed in corporate talks and workshops, but let’s persist in evaluating it as the abstract of a scientific study.

Humans and other animals express power through open, expansive postures, and they express powerlessness through closed, contractive postures. But can these postures actually cause power? The results of this study confirmed our prediction that posing in high-power nonverbal displays (as opposed to low-power nonverbal displays) would cause neuroendocrine and behavioral changes for both male and female participants: High-power posers experienced elevations in testosterone, decreases in cortisol, and increased feelings of power and tolerance for risk; low-power posers exhibited the opposite pattern. In short, posing in displays of power caused advantaged and adaptive psychological, physiological, and behavioral changes, and these findings suggest that embodiment extends beyond mere thinking and feeling, to physiology and subsequent behavioral choices. That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.

I don't believe I have ever encountered an abstract that closes with such extravagant claims. Yet readers are not provided any basis for evaluating those claims until the Methods section. Undoubtedly, many holding opinions about the paper did not read that far.

Namely:

Forty-two participants (26 females and 16 males) were randomly assigned to the high-power-pose or low-power-pose condition.

Testosterone levels were in the normal range at both Time 1 (M = 60.30 pg/ml, SD = 49.58) and Time 2 (M = 57.40 pg/ml, SD = 43.25). As would be suggested by appropriately taken and assayed samples (Schultheiss & Stanton, 2009), men were higher than women on testosterone at both Time 1, F(1, 41) = 17.40, p < .001, r = .55, and Time 2, F(1, 41) = 22.55, p < .001, r = .60. To control for sex differences in testosterone, we used participant’s sex as a covariate in all analyses. All hormone analyses examined changes in hormones observed at Time 2, controlling for Time 1. Analyses with cortisol controlled for testosterone, and vice versa.2

Too small a study to provide an effect size

Hold on! First, only 42 participants (26 females and 16 males) should readily be recognized as insufficient for an RCT, particularly in an area of research without past RCTs.

After decades of witnessing strong effect sizes accumulate from underpowered studies, many of us have reacted by requiring 35 participants per group as the minimum acceptable for a generalizable effect size. Actually, even that criterion may be overly liberal. Why?

Many RCTs are underpowered, yet lax enforcement of preregistration allows investigators to claim positive results by redefining the primary outcomes after results are known. A psychotherapy trial with 30 or fewer patients in the smallest cell has less than a 50% probability of detecting a moderate-sized significant effect, even if one is present (Coyne, Thombs, & Hagedoorn, 2010). Yet an examination of the studies mustered for treatments designated as evidence supported by APA Division 12 ( http://www.div12.org/empirically-supported-treatments/ ) indicates that many were too underpowered to count reliably as evidence of efficacy, but were included without comment on this problem. Taking an overview, it is striking how much the literature continues to depend on small, methodologically flawed RCTs conducted by investigators with strong allegiances to one of the treatments being evaluated. Indeed, which treatment the investigators prefer is a better predictor of a trial's outcome than the specific treatment being evaluated (Luborsky et al., 2006).
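The arithmetic behind that "less than 50%" claim is easy to check. A minimal sketch using the power routines in statsmodels, assuming a two-sided, two-sample t-test at α = .05 and a "moderate" effect of d = 0.5 (the specific numbers here are my illustration, not from the cited papers):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power to detect a moderate effect (d = 0.5) with 30 participants per arm
power_n30 = analysis.power(effect_size=0.5, nobs1=30, ratio=1.0, alpha=0.05)
print(f"n = 30 per group: power = {power_n30:.2f}")  # comes out below 0.50

# Per-arm sample size needed to reach the conventional 80% power
n_needed = analysis.solve_power(effect_size=0.5, power=0.80, ratio=1.0, alpha=0.05)
print(f"n per group for 80% power: {n_needed:.0f}")  # roughly double the 30 per arm
```

In other words, a trial like the power pose study, with about 21 participants per arm, is even further below the sample size needed for its significance tests to be informative.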

Earlier, my colleagues and I had argued for the non-accumulative nature of evidence from small RCTs:

Kraemer, Gardner, Brooks, and Yesavage (1998) propose excluding small, underpowered studies from meta-analyses. The risk of including studies with inadequate sample size is not limited to clinical and pragmatic decisions being made on the basis of trials that cannot demonstrate effectiveness when it is indeed present. Rather, Kraemer et al. demonstrate that inclusion of small, underpowered trials in meta-analyses produces gross overestimates of effect size due to substantial, but unquantifiable confirmatory publication bias from non-representative small trials. Without being able to estimate the size or extent of such biases, it is impossible to control for them. Other authorities voice support for including small trials, but generally limit their argument to trials that are otherwise methodologically adequate (Sackett & Cook, 1993; Schulz & Grimes, 2005). Small trials are particularly susceptible to common methodological problems…such as lack of baseline equivalence of groups; undue influence of outliers on results; selective attrition and lack of intent-to-treat analyses; investigators being unblinded to patient allotment; and not having a pre-determined stopping point so investigators are able to stop a trial when a significant effect is present.

In the power posing paper, sex was controlled in all analyses because a peek at the data revealed baseline sex differences in testosterone that dwarfed any other differences. What do we make of investigators who conduct a study premised on testosterone mediating a behavioral manipulation, yet who did not anticipate large baseline sex differences in testosterone?

In a PubPeer comment leading up to this post, I noted:

We are then told “men were higher than women on testosterone at both Time 1, F(1, 41) = 17.40, p < .001, r = .55, and Time 2, F(1, 41) = 22.55, p < .001, r = .60. To control for sex differences in testosterone, we used participant’s sex as a covariate in all analyses. All hormone analyses examined changes in hormones observed at Time 2, controlling for Time 1. Analyses with cortisol controlled for testosterone, and vice versa.”

The findings alluded to in the abstract should be recognized as weird and uninterpretable. Most basically, how could the 16 males be distributed across the two groups in a way that lets the authors confidently say the differences held for both males and females, especially when all analyses control for sex? Sex is highly correlated with testosterone, so an analysis that controlled for both variables would probably not generalize to testosterone without such controls.

We are never given the basic statistics needed to independently assess what the authors are doing, not even the correlation between cortisol and testosterone, only differences in Time 2 cortisol controlling for Time 1 cortisol, Time 1 testosterone, and sex. Such multivariate statistics are not very generalizable in a sample of 42 participants distributed across two groups, and certainly not for the 26 females and 16 males taken separately.

The behavioral manipulation

The original paper reports:

Participants’ bodies were posed by an experimenter into high-power or low-power poses. Each participant held two poses for 1 min each. Participants’ risk taking was measured with a gambling task; feelings of power were measured with self-reports. Saliva samples, which were used to test cortisol and testosterone levels, were taken before and approximately 17 min after the power-pose manipulation.

And then elaborates:

To configure the test participants into the poses, the experimenter placed an electrocardiography lead on the back of each participant’s calf and underbelly of the left arm and explained, “To test accuracy of physiological responses as a function of sensor placement relative to your heart, you are being put into a certain physical position.” The experimenter then manually configured participants’ bodies by lightly touching their arms and legs. As needed, the experimenter provided verbal instructions (e.g., “Keep your feet above heart level by putting them on the desk in front of you”). After manually configuring participants’ bodies into the two poses, the experimenter left the room. Participants were videotaped; all participants correctly made and held either two high-power or two low-power poses for 1 min each. While making and holding the poses, participants completed a filler task that consisted of viewing and forming impressions of nine faces.

The behavioral task and subjective self-report assessment

Measure of risk taking and powerful feelings. After they finished posing, participants were presented with the gambling task. They were endowed with $2 and told they could keep the money—the safe bet—or roll a die and risk losing the $2 for a payoff of $4 (a risky but rational bet; odds of winning were 50/50). Participants indicated how “powerful” and “in charge” they felt on a scale from 1 (not at all) to 4 (a lot).

An imagined bewildered review from someone accustomed to evaluating clinical trials

Although the authors don't seem to know what they're doing, we have an underpowered therapy analogue study making extraordinary claims. It is unconvincing that two 1-minute behavioral manipulations would change subsequent psychological states and behavior in any way with implications outside the laboratory.

The manipulation poses a puzzle to research participants, challenging them to figure out what is being asked of them. The $2 gambling task is presumably meant to simulate effects on real-world behavior, but the low stakes could mean that participants believed the task evaluated whether they "got" the purpose of the intervention and behaved accordingly. From that perspective, the unvalidated subjective self-report rating scale would serve as a clue to the intentions of the experimenter and an opportunity for participants to show they were smart. The manipulation of putting participants into a low-power pose is even less convincing as a contrasting active intervention or a control condition. Claims that this manipulation did anything but communicate experimenter expectancies are even less credible.

This is a very weak form of evidence: a therapy analogue study with a brief, low-intensity behavioral manipulation followed by assessments of outcomes that might simply inform participants of what they needed to do to look smart (i.e., demand characteristics). Add in that the experimenters were unblinded and undoubtedly had flexibility in how they delivered the intervention and what they said to participants. As a grossly underpowered trial, the study cannot contribute to the literature, and certainly not a meaningful effect size.

Furthermore, if the authors had even a basic understanding of gender differences in social status or sex differences in testosterone, they would have stratified the study by participant sex rather than attempting to obtain control through post hoc statistical adjustment.

I could comment on signs of p-hacking and on widespread inappropriate naming, use, and interpretation of statistics, but why bother? There are no vital signs of a publishable paper here.

Is power posing salvaged by fashionable hormonal measures?

Perhaps the skepticism of the editor and reviewers was overcome by the introduction of mind-body explanations of what some salivary measures supposedly showed. Otherwise, we would be left with a single subjective self-report measure and a behavioral task susceptible to demand characteristics and nonspecific effects.

We recognize that the free availability of powerful statistical packages risks their being used without any idea of whether their use or interpretation is appropriate. The same can be said of the ready availability of kits for collecting spit samples from research participants to be sent off to outside laboratories for biochemical analysis.

The clinical health psychology literature is increasingly filled with studies incorporating easily collected saliva samples intended to establish that psychological interventions influence mind-body relations. Such measures have become particularly common in attempts to demonstrate that mindfulness meditation and even tai chi can have beneficial effects on physical health and even cancer outcomes.

Often inaccurately described as "biomarkers" rather than merely as biological measurements, such measures seldom add anything that is generalizable within participants or across studies.

Let’s start with salivary-based cortisol measures.

A comprehensive review suggests that:

  • A single measurement on a participant, or a pre-post pair of assessments, is not informative.
  • Single measurements are unreliable, and large intra- and inter-individual differences not attributable to the intervention can be in play.
  • Minor variations in experimental procedures can have large, unwanted effects.
  • The current standard is the cortisol awakening response and the diurnal slope assessed over more than one day, which would make no sense for the effects of two 1-minute behavioral manipulations.
  • Even with sophisticated measurement strategies, there is low agreement across and even within studies, and low agreement with behavioral and self-report data.
  • The idea that collecting saliva samples would serve the function the investigators intended is an unscientific but attractive illusion.

Another relevant comprehensive theoretical review and synthesis of cortisol reactivity was available at the time the power pose study was planned. The article identifies no basis for anticipating that experimenters putting participants into a 1-minute expansive pose would lower cortisol, and certainly no basis for assuming that putting participants into a 1-minute slumped position would raise cortisol, or for saying what such findings could possibly mean.

But we are clutching at straws. The authors' interpretations of their hormonal data depend on bizarre post hoc decisions about how to analyze the data in a small sample in which participant sex is treated in incomprehensible fashion. The process of trying to explain spurious results risks lending them a credibility the authors have not earned. And don't even try to claim we are getting signals of hormonal mediation from this study.

Another system failure: The incumbent advantage given to a paper that should not have been published.

Even when publication is based on inadequate editorial oversight and review, any likelihood of correction is diminished once the published results have been blessed as "peer reviewed" and accorded an incumbent advantage over whatever follows.

A succession of editors has protected the power pose paper from post-publication peer review, which has been relegated to other journals and to social media, including PubPeer and blogs.

Soon after publication of the power pose paper, a critique was submitted to Psychological Science, but it was desk rejected. The editor informally communicated to the author that the critique read like a review and that the original article had already been peer reviewed.

The critique by Steven J. Stanton nonetheless eventually appeared in Frontiers in Behavioral Neuroscience and is worth a read.

Stanton took seriously the science being invoked in the claims of the power pose paper.

A sampling:

Carney et al. (2010) collapsed over gender in all testosterone analyses. Testosterone conforms to a bimodal distribution when including both genders (see Figure 13; Sapienza et al., 2009). Raw testosterone cannot be considered a normally distributed dependent or independent variable when including both genders. Thus, Carney et al. (2010) violated a basic assumption of the statistical analyses that they reported, because they used raw testosterone from pre- and post-power posing as independent and dependent variables, respectively, with all subjects (male and female) included.

And

Mean cortisol levels for all participants were reported as 0.16 ng/mL pre-posing and 0.12 ng/mL post-posing, thus showing that for all participants there was an average decrease of 0.04 ng/mL from pre- to post-posing, regardless of condition. Yet, Figure 4 of Carney et al. (2010) shows that low-power posers had mean cortisol increases of roughly 0.025 ng/mL and high-power posers had mean cortisol decreases of roughly 0.03 ng/mL. It is unclear given the data in Figure 4 how the overall cortisol change for all participants could have been a decrease of 0.04 ng/mL.
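Stanton's arithmetic point is simple enough to verify. A back-of-the-envelope check, assuming the two conditions were roughly equal in size (the paper reports 42 participants overall):

```python
# Reported overall means (ng/mL) from Carney et al. (2010), as quoted by Stanton
pre_mean, post_mean = 0.16, 0.12
overall_change = post_mean - pre_mean  # about -0.04, a decrease of 0.04 ng/mL

# Approximate per-condition changes read off Figure 4
low_power_change = +0.025
high_power_change = -0.03

# With roughly equal group sizes, the overall change should be close to the
# simple average of the two conditions' changes
implied_overall = (low_power_change + high_power_change) / 2  # about -0.0025

# The reported overall decrease is an order of magnitude larger than what
# the per-condition figures imply, which is Stanton's inconsistency
print(round(overall_change, 4), round(implied_overall, 4))
```

Nothing in the published paper reconciles these two sets of numbers.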

Another editor of Psychological Science received a critical comment from Marcus Crede and Leigh A. Phillips. After the first round of reviews, Crede and Phillips removed references to changes between the published power pose paper and earlier drafts that they had received from the first author, Dana Carney. However, they withdrew their critique when asked, in a second resubmission, to respond to a review by Amy Cuddy.

The critique is now forthcoming in Social Psychological and Personality Science:

Revisiting the Power Pose Effect: How Robust Are the Results Reported by Carney, Cuddy and Yap (2010) to Data Analytic Decisions

The article investigates how robust the original findings are to the analytic choices that enabled p-hacking in the original paper. An excerpt from the abstract:

In this paper we use multiverse analysis to examine whether the findings reported in the original paper by Carney, Cuddy, and Yap (2010) are robust to plausible alternative data analytic specifications: outlier identification strategy; the specification of the dependent variable; and the use of control variables. Our findings indicate that the inferences regarding the presence and size of an effect on testosterone and cortisol are highly sensitive to data analytic specifications. We encourage researchers to routinely explore the influence of data analytic choices on statistical inferences and also encourage editors and reviewers to require explicit examinations of the influence of alternative data analytic specifications on the inferences that are drawn from data.
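For readers unfamiliar with the technique, a multiverse analysis simply re-runs the analysis under every defensible combination of data analytic choices and inspects how the conclusions scatter. A toy sketch on simulated data with no true effect (the outlier rules and transformation choices here are my illustrations, not the specifications examined by Crede and Phillips):

```python
from itertools import product

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated two-arm study: 21 participants per condition, no true effect
n = 21
outcome = {"high": rng.normal(0.0, 1.0, n), "low": rng.normal(0.0, 1.0, n)}

def drop_outliers(x, rule):
    """Apply one of several defensible outlier-exclusion rules."""
    if rule == "none":
        return x
    cutoff = {"2sd": 2.0, "3sd": 3.0}[rule]
    return x[np.abs(x - x.mean()) < cutoff * x.std()]

def transform(x, spec):
    """Analyze the raw scores or a shifted-log version of them."""
    return np.log(x - x.min() + 1.0) if spec == "log" else x

# One significance test per combination of choices: the "multiverse"
results = {}
for rule, spec in product(["none", "2sd", "3sd"], ["raw", "log"]):
    hi = transform(drop_outliers(outcome["high"], rule), spec)
    lo = transform(drop_outliers(outcome["low"], rule), spec)
    t, p = stats.ttest_ind(hi, lo)
    results[(rule, spec)] = p

for choice, p in results.items():
    print(choice, f"p = {p:.3f}")

# If inferences are robust, the p-values should agree across specifications;
# wide scatter signals sensitivity to arbitrary analytic choices.
```

With only a handful of choices the multiverse already contains six analyses; reporting just the one that crosses p < .05 is exactly the researcher degree of freedom the critique targets.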

Dana Carney, the first author of the paper, has now posted an explanation of why she no longer believes the originally reported findings are genuine and why "the evidence against the existence of power poses is undeniable." She discloses a number of important confounds and important "researcher degrees of freedom" in the analyses reported in the published paper.

Coming Up Next

A different view of Amy Cuddy's TED talk in terms of its selling of pseudoscience to consumers and its acknowledgment of a strong debt to Cuddy's adviser, Susan Fiske.

A disclosure of some of the financial interests that distort discussion of the scientific flaws of the power pose.

How the reflexive response of the replicationados inadvertently reinforced the illusion that the original power pose study provided meaningful effect sizes.

How Amy Cuddy and her allies marshalled the resources of the Association for Psychological Science to vilify and intimidate critics of bad science and of the exploitation of consumers by psychological pseudoscience.

How journalists played into this vilification.

What needs to be done to avoid a future fiasco for psychology like the power pose phenomenon and to protect those working to reform the dissemination of science.

Note: Time to reiterate that all opinions expressed here are solely those of Coyne of the Realm and not necessarily of PLOS blogs, PLOS One or his other affiliations.