No author left behind:  Getting authors published who cannot afford article processing charges

Efforts to promote open access publishing ignore the many scholars who cannot afford the article processing charges of quality open access journals. Their situation may be about to get worse.


Open access has turned out to be a misnomer. Of course, free access to research findings is good for science and society. However, open access is clearly not freely open to the scholars who are required to pay exorbitant fees to publish their results, often out of their own pockets.

Andrew V. Suarez and Terry McGlynn 

  • Current proposals for accelerating a transition to full open access for all scholarly articles focus primarily on readers who cannot obtain paywalled articles that require a subscription or privileges at a library with subscriptions.
  • Much less attention has been paid to the many prospective authors who cannot pay article processing charges (APCs), but who fall outside a narrow range of eligibility for APC waivers and discounts.
  • This bias perpetuates global and local social inequalities in who gets to publish in quality open access journals and who does not.
  • Many open access journals provide explicit guidelines for authors from particular countries obtaining waivers and discounts, but are deliberately vague about policies and procedures for other classes of authors.
  • Many prospective authors lack resources for publishing in an open access journal without paying out of their own pockets. They also lack awareness of how to obtain waivers. If they apply at all, they may be disappointed.
  • As an immediate solution, I encourage authors to query journals about waiver policies and to share their experiences of whether and how they obtained waivers with others in their social networks.
  • For a short while, it is also possible to provide feedback concerning implementation of an ambitious Plan S to encourage and require publication in open access journals. Read on and provide feedback while you can, but hurry.
  • In the absence of corrective action, a group of funding agencies is about to strengthen a model of open access publishing in which the costs of publishing are shifted to authors, most of whom are not receiving or applying for grants. Yet they will effectively be excluded from publishing in quality open access journals unless some compensatory mechanism is introduced.

Open access improves health care, especially in less resourced environments.

Open Access involves providing unrestricted free online access to scholarly publications. Among many benefits, open access makes it possible for clinicians, policymakers, and patients and their caretakers to obtain information for decision-making when they lack a subscription to paywalled journals or privileges at a library that subscribes.

The transition from the originally paywalled electronic bibliographic resource Medline to the freely searchable PubMed and Google Scholar meant that such stakeholders could at least obtain titles and abstracts, but making decisions on this information alone can prove risky.

A PLoS Medicine article noted:

Arthur Amman, President of Global Strategies for HIV Prevention, tells this story: “I recently met a physician from southern Africa, engaged in perinatal HIV prevention, whose primary access to information was abstracts posted on the Internet. Based on a single abstract, they had altered their perinatal HIV prevention program from an effective therapy to one with lesser efficacy. Had they read the full text article they would have undoubtedly realized that the study results were based on short-term follow-up, a small pivotal group, incomplete data, and unlikely to be applicable to their country situation. Their decision to alter treatment based solely on the abstract’s conclusions may have resulted in increased perinatal HIV transmission.”

Advancing open access for readers, but not for authors

There are currently initiatives underway to accelerate the transition to full and immediate open access to scientific and biomedical publications:

“After 1 January 2020 scientific publications on the results from research funded by public grants provided by national and European research councils and funding bodies, must be published in compliant Open Access Journals or on compliant Open Access Platforms.”

Among the proposed guiding principles are:

“Where applicable, Open Access publication fees are covered by the Funders or universities, not by individual researchers; it is acknowledged that all scientists should be able to publish their work Open Access even if their institutions have limited means.”

And

“The journal/platform must provide automatic APC waivers for authors from low-income countries and discounts for authors in middle-income countries.”

Stop and think: what about authors who do not and cannot compete for external funding? The first 15 funders [there are currently 16] to back Plan S accounted for only 3.5% of global research articles in 2017, but their initiative is about to be implemented, mandating open access publishing much more broadly.

Enforcing author‐pay models will strengthen the hand of those who have resources and weaken the hand of those who do not have, magnifying the north‐south academic divide, creating another structural bias, and further narrowing the knowledge‐production system (Medie & Kang 2018; Nagendra et al. 2018). People with limited access to resources will find it increasingly difficult to publish in the best journals. The European mandate will amplify the advantages of some scientists working in developed countries over their less affluent counterparts.

The author‐pays inequality may also affect equity of access within countries, including those considered developed, where there can be major differences between different research groups in their ability to pay (Openjuru et al. 2015). It is harder for disadvantaged groups from these jurisdictions to appeal for waivers (Lawson 2015), deepening the divide between those who can pay and those who cannot.

What exists now for authors who cannot afford article processing charges

What happens to authors who do not have such coverage of APCs – clinicians in community settings, public health professionals, independent scholars, patients and their advocates, or other persons without the necessary affiliations or credentials who are nonetheless capable of contributing to bettering science and health care? That is a huge group. If they can’t pay, they won’t be able to play the publishing game, or will do so in obscurity.

Too much confidence is being placed in solutions that are too narrow in focus or that simply do not work for this large and diverse group.

Solutions that are assumed to work, but that are inadequate

  1. Find a high quality open access journal using the DOAJ (Directory of Open Access Journals). Many of the journals that are indexed in this directory have free or low APCs.

The Directory of Open Access Journals is a service that indexes high quality, peer reviewed Open Access research journals, periodicals and their articles’ metadata. The Directory aims to be comprehensive and cover all open access academic journals that use an appropriate quality control system (see below for definitions) and is not limited to particular languages, geographical region, or subject areas. The Directory aims to increase the visibility and ease of use of open access academic journals—regardless of size and country of origin—thereby promoting their visibility, usage and impact.

DOAJ currently lists over 12,000 journals from 129 countries. It is growing rapidly, with 2018 being its best year to date: over 1,700 journals were added. Reflecting its level of quality control, DOAJ in the same period rejected without review over 2,000 poorly completed applications for journals to be included, removing them from the system so that they never reached the editorial teams.

Impressive? Sadly, a considerable proportion of DOAJ-listed journals are obscure, narrow in specialization, and often not even listed in PubMed or Web of Knowledge/Web of Science. This is particularly true of the DOAJ journals without fees. Eigenfactor.com analyzed over 400 open access journals without APCs and found that only the top 31 had a journal impact factor (JIF) greater than 1.00. Only the top 104 had an impact factor above 0.500. The bottom quarter of journals had JIFs of less than 0.16.

A low impact journal can still be valuable in some contexts, especially if it is in a highly specialized field or contains information relevant to stakeholders who do not read English. However, even in modestly resourced settings that do not cover authors’ APCs, there are commonly pressures to publish in journals with JIFs above 1.0, and stigma and even penalties for publishing in lower impact journals.

  2. Apply for waivers or reduction in APCs through a Global Initiative Program. Current proposals are for all journals to establish such programs. Most current programs are for countries on the United Nations Least Developed Country List or countries with the lowest Healthy Life Expectancy (HALE). The PLOS website’s description of its program is particularly clear.

PLOS GLOBAL PARTICIPATION INITIATIVE

The PLOS Global Participation Initiative (GPI) aims to lower barriers to publication based on cost for researchers around the world who may be unable, or have limited ability, to publish in Open Access journals.

Authors whose research is funded primarily (50% or more of the work contained within the article) by an institution or organization from eligible low- and middle-income countries are automatically eligible for assistance. If the author’s research funder is based in a Group 1 country, PLOS will cover the entire publication fee and there will be no charge. For authors whose research funder is part of Group 2, PLOS will cover part of the publication fee — the remaining publication fee will be $500 USD.

Stop and think: For scholars in Group 2 countries [Click and see which countries these are and which countries are excluded from any such relief. You may be surprised.], how many can come up with $500 per paper? To get concrete, consider a recent PhD in a Group 2 country who is forced to work in the service sector for lack of academic opportunities who needs two quality publications to improve her chances of receiving a postdoctoral opportunity in a better-resourced setting.

  3. Apply for a waiver based on demonstration of individual need and inability to pay. Some journals only provide waivers and discounts to authors in Group 1 or Group 2 countries. Other journals are more flexible. Authors have to ask, and sometimes this must occur before they begin uploading their manuscript. Here too, PLOS is more explicit than most websites and seemingly more generous in granting waivers or discounts.

PLOS PUBLICATION FEE ASSISTANCE PROGRAM

The PLOS Publication Fee Assistance (PFA) program was created for authors unable to pay all or part of their publication fees and who can demonstrate financial need.

An author can apply for PFA when submitting an article for publication. A decision is usually sent to the author within 10 business days. PLOS considers applications on a case-by-case basis.

PLOS publication decisions are based solely on editorial criteria. Information about applications for fee assistance is not disclosed to journal editors or reviewers.

  • Authors should exhaust all alternative funding sources before applying for PFA. The application form includes questions on the availability of alternative funding sources such as the authors’ or co-authors’ institution, institutional library, government agencies and research funders. Funding disclosure information provided by authors will be used as part of the PFA application review.

  • Assistance must be formally applied for at submission. Requests made during the review process or after acceptance will not be considered. Authors cannot apply for the fee assistance by email or through direct request to journal editors.

The PLOS website states:

In 2017 PLOS provided $2.1 million in individual fee support to its authors, through the PLOS Global Participation Initiative (GPI) and Publication Fee Assistance Program.

That sounds like a generous sum of money. However, the figure does not distinguish between payments made through the PLOS Global Participation Initiative (GPI) and the fee assistance program requiring individual application. Consider some math.

APCs for PLOS One are currently $1,595 USD; for PLOS Biology and PLOS Medicine, $3,000 USD.

In 2017, PLOS published ~23,000 articles, maybe 80% in PLOS One.

So, a lower estimate would be that PLOS took in $35,000,000 in APCs in 2017.
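For readers who want to check the arithmetic, here is a minimal back-of-envelope sketch in Python, using only figures quoted above. Treating every article as if it paid the PLOS One rate is my own simplifying floor, not PLOS’s accounting.

```python
# Back-of-envelope check of the ~$35M figure, using only numbers quoted above.
# Assumption (mine): treat every article as if it paid the PLOS One rate,
# the cheapest APC, which gives a floor on gross APC revenue.
articles_2017 = 23_000        # approximate PLOS output in 2017 (from the post)
plos_one_apc = 1_595          # USD, PLOS One APC at the time
fee_assistance = 2_100_000    # USD, PLOS-reported 2017 GPI + PFA support

gross_floor = articles_2017 * plos_one_apc   # = $36,685,000
net_floor = gross_floor - fee_assistance     # = $34,585,000

print(f"Gross APC floor: ${gross_floor:,}")
print(f"Net of fee assistance: ${net_floor:,}")
```

Even this floor exceeds $35 million gross; net of the reported $2.1 million in fee support, it lands almost exactly on that figure.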

The Scholarly Kitchen reports that 2017 was not a good financial year for the Public Library of Science (PLOS). Largely as a result of a continued decline in submissions to PLOS One, which peaked at over 32,000 in 2013, revenue was down by $2 million. The Scholarly Kitchen quotes the PLOS’ 2017 Financial Overview:

“All our decisions in 2017 (and 2018) have been driven by the need to be fiscally responsible and remain a sustainable non-profit organization.”

In response, PLOS is increasing APCs by US$100 for 2019.

PLOS is a non-profit, not a charitable organization. It should be no surprise that PLOS did not respond to my request that they publicize more widely the details of their program to waive or discount APCs for authors outside of the Global Participation Initiative. Presumably, at least some authors who cannot pay full APCs find ways of getting reimbursed. A procedure for too easily getting waivers and discounts from PLOS would encourage gaming, and would discourage authors from utilizing resources in their own settings that involve more effort, take more time, or are more uncertain about whether they will provide reimbursement.

PLOS provides insufficient details of the criteria for receiving a waiver. There is no readily available information about what proportion of requested waivers are granted or the average size of discounts.

My modest efforts to promote publishing in quality open access journals by authors who are less likely to do so

I work with a range of authors who sometimes need assistance getting published in the open access journals that will best reach the readership they want to influence. For instance, much probing of published papers for errors and for bad science is done by people on the fringe of academia who do not currently have affiliations. We downloaded and reanalyzed data from a PNAS article, and the authors responded by altering the data without acknowledging they had done so, reanalyzing the data, and ridiculing us in a PLOS One article. We had to request a waiver of APCs formally before it was granted. I had to provide evidence of my retirement. Open access publishers, like PLOS or Springer Nature, do not grant waivers automatically for substantive criticism of published articles, even when serious problems are being identified.

As another example, patient citizen scientists have had a crucial role in reanalyzing data from the PACE trial of cognitive behavior therapy and graded exercise therapy for chronic fatigue syndrome. These activists have faced strong resistance from the PACE investigators and their supporters when they attempt to publish. It is nonetheless important for these activists to reach clinicians and policymakers outside of their own community. The Journal of Health Psychology organized a special issue around an article by patient scientist activist Keith Geraghty, ‘PACE-Gate’: When clinical trial evidence meets open data access. A last minute decision by the editorial board (which included me) was crucial in the issue’s rapid distribution within the patient community, but also among policymakers.

A large group of authors who are disadvantaged by current open access publishing policies are early career academics (ECAs) in Eastern European and Latin American countries, whom I reach in face-to-face and web-based writing workshops. Their countries do not typically fall into Group 1 or Group 2, although they share some of the same disadvantages in terms of resources. These ECAs often lack mentorship because the older generation of academics and administrators did not have to publish anything of quality, if they had to publish at all. This older cohort nonetheless holds the ECAs responsible for improving their institutions’ reputation and visibility, with expectations that would be much more appropriate for properly mentored ECAs in well-resourced settings. I have heard these unrealistic expectations referred to as the “field of dreams” administrative philosophy.

It is important for these ECAs to publish in open access journals in their own language, which uniformly have low JIFs and often are not listed in international electronic bibliographic sources. Yet they also must publish in English-language journals with at least a minimal JIF. When I discussed these ECAs with colleagues in better-resourced settings, I was criticized for falling into the common logical fallacy of “affirming the consequent” by assuming 1) that JIF is a true measure of “goodness” and 2) that publishing in smaller, non-English journals is a penalty. My reply is ‘please don’t shoot the messenger’ or blame the victims of irrational and unrealistic expectations.

In brief trainings, I can provide an overview of the process of getting published in a quality journal in a rapidly changing time of digitalization and quick obsolescence of the old ways of doing things. Often these ECAs are struggling without a map. I can show them how to use resources like JANE (Journal/Author Name Estimator) to select a range of possible journals; how to avoid the trap of predatory journals, which are increasingly sophisticated and appealing to naïve authors; creative ways of utilizing Google Scholar to be strategic about titles and abstracts; and the more general use of publisher and journal websites to access the resources that are increasingly available there. But ultimately, it is important for ECAs to gain and curate their own experiences and share them as a substitute for the mentorship and accumulated knowledge about publishing in the most appropriate journals that they do not have.

In many of these settings, there is a crucial transition ongoing, with retirements opening new opportunities. Just as these ECAs struggle to gain the achievements and credentials that success in their careers requires, it could be becoming more difficult for them to publish in the most appropriate open access journals. Implementation of Plan S as currently envisioned may mean that some major funding agencies and well-resourced institutions will assume more of the burden of absorbing the costs of publishing open access.

Scholars with access to international funding and coverage of the APCs required by the dominant model of open access publishing have a huge advantage over the many scholars without such resources: scholars outing and correcting bad science; patient citizen scientists; and the large group of scholars disadvantaged by being in the Global South or simply in many other settings incapable of providing relief from APCs. It may not be possible to fill gaps in the opportunity to publish in quality open access journals if the dominant business model continues to be author-focused APCs or subsidies by publishers and journals. The gap may widen with implementation of Plan S.


A closing window in which to attempt to influence implementation of Plan S…

If you are concerned about inequalities in the opportunities to publish in quality open access journals, there is a small window in which you can express your concerns and potentially influence the implementation of a broad plan to transform publishing in open access journals, Plan S of cOALition S.


cOALition S is a group of national research funding organizations that, with the support of the European Commission and the European Research Council (ERC), is launching an initiative to make full and immediate Open Access to research publications a reality. It is built around Plan S, which consists of one target and 10 principles. Other funders from across the world are signing on, with China announcing support in December 2018. Nonetheless, Plan S is decidedly focused on issues arising in Western Europe, where well-resourced universities have access to supportive funding organizations.

The 10 principles are no longer up for debate, but there is an opportunity to influence how they will be implemented. Until February 1, 2019, feedback can be left concerning two key questions:

  1. Is there anything unclear or are there any issues that have not been addressed by the guidance document?
  2. Are there other mechanisms or requirements funders should consider to foster full and immediate Open Access of research outputs?

Please click and provide feedback now, before it is too late.

Wisdom of the Ego: Childhood Adverse Experiences Are Not Destiny

Today’s readers probably can’t appreciate how radical George Vaillant’s work was in its day.

George Vaillant drew upon a longitudinal study of adult development to challenge the Freudian idea of childhood adverse experiences as destiny.


Free download of George Vaillant’s The Wisdom of the Ego


You can learn more about the study Vaillant headed:

Harvard Study of Adult Development

And

Summary of the Harvard Grant Study: Triumphs of Experience

I know, in his last book, George Vaillant turned into a positive psychology guru of sorts, using results of the study to espouse views about how to lead a happy and meaningful life. I’ll just have to live with that, and maybe with some of the liberties he took in interpreting his data.

But now the important thing is that his classic book, The Wisdom of the Ego, is available for free download. Get it here.

The website is perfectly safe. I’ve made one of my own books available there. After having made lots of money from publishing mainly psychoanalytically and psychodynamically oriented psychotherapy books, the publisher Jason Aronson is on a mission to give a lot of books away free.

As of October 1, 2018 readers just like you from 200 countries and territories around the world have saved $55,685,206.30 on 1,149,012 FREE downloads of classic psychotherapy books.

From the original blurb for the book:

Freud tells us that the first five years of life constitute destiny. If this were so, Vaillant asks, then how could so many deeply troubled youths become well-adjusted, productive adults? Drawing on the Study of Adult Development, based at Harvard University, this book takes us into the lives of such individuals—thriving men and women who suffered grievous disadvantages and abuses during childhood—to show us that the mind’s remarkable defenses develop well into adulthood, that the maladjustments of adolescence can evolve into the virtues of maturity. In one fascinating case after another, he introduces us to middle-aged men and women learning how to love, to make meaning, to reorder chaos.

Because creativity is so intrinsic to this alchemy of the ego, Vaillant mingles these life studies with psychobiographies of famous artists and others. We meet Florence Nightingale, the intractable hypochondriac and hopeless dreamer who, at the age of thirty-one, wrote in her diary, “I see nothing desirable but death,” and we watch as she transforms her anguish into altruism, her hapless fantasies into fantastic success. In the tormented life of Sylvia Plath, we see psychosis as not only a defect but also an effort at repair, her poetry as an extraordinary illustration of the adaptive process. We witness the mature working of the mind’s defenses in the career of Anna Freud, their greatest elucidator. And we see the wisdom of the ego at work as Eugene O’Neill evolves from self-destructive youth to creator of great art.

In these compelling portraits of obscure and famous lives, Vaillant charts the evolution of the ego’s defenses, from the psychopathic to the sublime, and from the mundane to the most ingenious. An account of the boundless psychological resilience of adult development, The Wisdom of the Ego is a brilliant summation of the mind’s amazing power to fashion creative victories out of life’s would-be defeats (1041 pgs).

From a couple of reviews at the time:

“A richly textured, elegantly written, and humane book by the person who is becoming the Anna Freud of his day. Vaillant’s sympathetic treatment of the defenses is itself wise and creative.” —Robert Kegan, Harvard University and Massachusetts School of Professional Psychology

“Vaillant tells us that ego defenses are not pathological formations or symptoms of mental illness. They are ingenious self-deceptions that serve adaptation… He is to be commended for bringing certain unconscious processes into focus and for illuminating the various ways in which ego defenses contribute to a person’s adaptation to life.”—Louise J. Kaplan, The Boston Sunday Globe

You may also be interested in two of my controversial, but most heavily accessed blog posts:

Stop using the Adverse Childhood Experiences Checklist to make claims about trauma causing physical and mental health problems

And

In a classic study of early childhood abuse and neglect, effects on later mental health nearly disappeared when….


How to get a flawed systematic review and meta-analysis withdrawn from publication: a detailed example

Cochrane normally requires authors to agree to withdraw completed reviews that have been published. This withdrawal in the face of resistance from the authors is extraordinary.

There is a lot to be learned from this letter and the accompanying documents in terms of Courtney calmly and methodically laying out a compelling case for withdrawal of a review with important clinical practice and policy implications.


Robert Courtney’s wonderfully detailed cover letter probably proved decisive in getting the Cochrane review withdrawn, along with the work of another citizen scientist/patient advocate, Tom Kindlon.


Especially take a look at the exchanges with the author Lillebeth Larun that are included in the letter.

Excerpt from the cover letter below:

It is my opinion that the published Cochrane review unfortunately fails to meet the standards expected by the public of Cochrane in terms of publishing rigorous, unbiased, transparent and independent analysis; So I would very much appreciate it if you could investigate all of the problems I raised in my submitted comments and ensure that corrections are made or, at the very least, that responses are provided which allow readers to understand exactly why Cochrane believe that no corrections are required, with reference to Cochrane guidelines.

On this occasion, in certain respects, I consider the review to lack rigour, to lack clarity, to be misleading, and to be flawed. I also consider the review (including the discussions, some of the analyses, and unplanned changes to the protocol) to indicate bias in favour of the treatments which it investigates.

Another key excerpt summarized Courtney’s four comments on the Cochrane review, comments that had not yet succeeded in getting the review withdrawn:

In summary, my four submissions focus on, but are not restricted to the following issues:

  • The review authors switched their primary outcomes in the review, and used unplanned analyses, which has had the effect of substantially transforming some of the interpretation and reporting of the primary outcomes of the review;

  • The review fails to prominently explain and describe the primary outcome switching and to provide a prominent sensitivity analysis. In my opinion, the review also fails to justify the primary outcome switching;

  • The review fails to clearly report that there were no significant treatment effects at follow-up for any pooled outcomes in any measures of health (except for sleep, a secondary outcome), but instead the review gives the impression that most follow-up outcomes indicated significant improvements, and that the treatments were largely successful at follow-up;

  • The review uses some unpublished and post-hoc data from external studies, despite the review-authors claiming that they have included only formally published data and pre-specified outcome data. Using post-hoc and unpublished data, which contradicts the review’s protocol and stated methodology, may have had a significant effect on the review outcomes, possibly even changing the review outcomes from non-significant to significant;

  • The main discussion sections in the review include incorrect and misleading reports of the review’s own outcomes, giving a false overall impression of the efficacy of the reviewed therapies;

  • The review includes an inaccurate assessment of bias (according to the Cochrane guidelines for reporting bias) with respect to some of the studies included in the review’s analyses.

These are all serious issues that I believe we should not be seeing in a Cochrane review.

Digression: My Correspondence with Tom Kindlon regarding this blog post

James Coyne <jcoynester@gmail.com>

Oct 18, 2018, 12:45 PM

to Tom

I’m going to be doing a couple of blog posts about Bob, one of them about the details of the lost year of his life (2017), which he shared with me in February 2018, shortly before he died. But the other blog post is going to be basically this long email posted with commentary. I am concerned that you get your proper recognition as fully sharing the honors with him for ultimately forcing the withdrawal of the exercise review. Can you give me some suggestion of how that might be assured? References? Blogs?

Do you know the details of Bob ending his life? I know it was a deliberate decision, but was it an accompanied suicide? More people need to know about his involuntary hospitalization and stupid diagnosis of anorexia.

Kind regards


Tom Kindlon’s reply to me

Tom Kindlon

Oct 18, 2018, 1:01 PM

Hi James/Jim,

It is great you’re going to write on this.

I submitted two long comments on the Cochrane review of exercise therapy for CFS, which can be read here:

<https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD003200.pub7/detailed-comment/en?messageId=157054020>

<https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD003200.pub7/detailed-comment/en?messageId=157052118>

Robert Courtney then also wrote comments. When he was not satisfied with the responses, he made a complaint.

All the comments can be read on the review here:

<https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD003200.pub7/read-comments>

but as I recall the comments by people other than Robert and myself were not substantial.

I will ask what information can be given out about Bob’s death.

Thanks again for your work on this,

Tom

The Cover Letter: Did it break the impasse about withdrawing the review?

from: Bob <brightonbobbob@yahoo.co.uk>

to: James Coyne <jcoynester@gmail.com>

date: Feb 18, 2018, 5:06 PM

subject: Fw: Formal complaint – Cochrane review CD003200

THIS IS A COPY OF A FORMAL COMPLAINT SENT TO DR DAVID TOVEY.

Formal Complaint

12th February 2018

From:

Robert Courtney.

UK

To:

Dr David Tovey

Editor in Chief of the Cochrane Library

Cochrane Editorial Unit

020 7183 7503

dtovey@cochrane.org

Complaint with regards to:

Cochrane Database of Systematic Reviews.

Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. Exercise therapy for chronic fatigue syndrome. Cochrane Database Syst Rev. 2017; CD003200. DOI: 10.1002/14651858.CD003200.pub7

Dear Dr David Tovey,

This is a formal complaint with respect to the current version of “Exercise therapy for chronic fatigue syndrome” by L. Larun et al. (Cochrane Database Syst Rev. 2017; CD003200.)

First of all, I would like to apologise for the length of my submissions relating to this complaint. The issues are technical and complex and I hope that I have made them easy to read and understand despite the length of the text.

I have attached four PDF files to this email which outline the details of my complaint. In 2016, I submitted each of these documents as part of the Cochrane comments facility. They have now been published in the updated version of the review. (For your convenience, the details of these submissions are listed at the end of this email with a weblink to an online copy of each document.)

I have found the responses to my comments, by L. Larun, the lead author of the review, to be inadequate, especially considering the seriousness of some of the issues raised.

It is my opinion that the published Cochrane review unfortunately fails to meet the standards expected by the public of Cochrane in terms of publishing rigorous, unbiased, transparent and independent analysis; So I would very much appreciate it if you could investigate all of the problems I raised in my submitted comments and ensure that corrections are made or, at the very least, that responses are provided which allow readers to understand exactly why Cochrane believe that no corrections are required, with reference to Cochrane guidelines.

On this occasion, in certain respects, I consider the review to lack rigour, to lack clarity, to be misleading, and to be flawed. I also consider the review (including the discussions, some of the analyses, and unplanned changes to the protocol) to indicate bias in favour of the treatments which it investigates.

Exercise as a therapy for chronic fatigue syndrome is a highly controversial subject, and so there may be more of a need for independent oversight and scrutiny of this Cochrane review than might usually be the case.

In addition to the technical/methodological issues raised in my four submitted comments, I would also like you to consider whether there may be a potential lack of independence on the part of the authors of this review.

All of the review authors, bar Price, are currently working in collaboration on another Cochrane project with some of the authors of the studies included in this review. (The project involves co-authoring a protocol for a future Cochrane review) [2]. One of the meetings held to develop the protocol for this new review was funded by Peter White’s academic fund [1]. White is the Primary Investigator for the PACE trial (a study included in this Cochrane review).

It is important that Cochrane is seen to uphold high standards of independence, transparency and rigour.

Please refer to my four separate submissions (attached) for the details of my complaint regarding the contents of the review. By way of an introduction only, I will also briefly discuss, below, some of the points I have raised in my four documents.

In summary, my four submissions focus on, but are not restricted to the following issues:

  • The review authors switched their primary outcomes in the review, and used unplanned analyses, which has had the effect of substantially transforming some of the interpretation and reporting of the primary outcomes of the review;
  • The review fails to prominently explain and describe the primary outcome switching and to provide a prominent sensitivity analysis. In my opinion, the review also fails to justify the primary outcome switching;
  • The review fails to clearly report that there were no significant treatment effects at follow-up for any pooled outcomes in any measures of health (except for sleep, a secondary outcome), but instead the review gives the impression that most follow-up outcomes indicated significant improvements, and that the treatments were largely successful at follow-up;
  • The review uses some unpublished and post-hoc data from external studies, despite the review-authors claiming that they have included only formally published data and pre-specified outcome data. Using post-hoc and unpublished data, which contradicts the review’s protocol and stated methodology, may have had a significant effect on the review outcomes, possibly even changing the review outcomes from non-significant to significant;
  • The main discussion sections in the review include incorrect and misleading reports of the review’s own outcomes, giving a false overall impression of the efficacy of the reviewed therapies;
  • The review includes an inaccurate assessment of bias (according to the Cochrane guidelines for reporting bias) with respect to some of the studies included in the review’s analyses.

These are all serious issues that I believe we should not be seeing in a Cochrane review.

These issues have already caused misunderstanding and misreporting of the review in academic discourse and publishing. (See an example of this below.)

All of the issues listed above are explained in full detail in the four PDF files attached to this email. They should be considered to be the basis of this complaint.

For the purposes of this correspondence, I will illustrate some specific issues in more detail.

In the review, the following health indicators were used as outcomes to assess treatment effects: fatigue, physical function, overall health, pain, quality of life, depression, anxiety, and sleep. All of these health indicators, except uniquely for sleep (a secondary outcome), demonstrated a non-significant outcome for pooled treatment effects at follow-up for exercise therapy versus passive control. But a reader would not be aware of this from reading any of the discussion in the review. I undertook a lengthy and detailed analysis of the data in the review before I could comprehend this. I would like these results to be placed in a prominent position in the review, and reported correctly and with clarity, so that a casual reader can quickly understand these important outcomes. These outcomes cannot be understood from reading the discussion, and some outcomes have been reported incorrectly in the discussion. In my opinion, Cochrane is not maintaining its expected standards.

Unfortunately, there is a prominent and important error in the review, which I believe helps to give the mis-impression that the investigated therapies were broadly effective. Physical function and overall-health (both at follow-up) have been mis-reported in the main discussion as being positive outcomes at follow-up, when in fact they were non-significant outcomes. This seems to be an important failing of the review that I would like to be investigated and corrected.

Regarding one of the points listed above, copied here:

“The review fails to clearly report that there were no significant treatment effects at follow-up for any pooled outcomes in any measures of health (except for sleep, a secondary outcome), but instead the review gives the impression that most follow-up outcomes indicated significant improvements, and that the treatments were largely successful at follow-up”

This is one of the most substantial issues that I have highlighted. This issue is related to the primary outcome switching in the review.

(This relates to assessing fatigue at long-term follow-up for exercise therapy vs passive control.)

An ordinary (i.e. casual) reader of the review may easily be left with the impression that the review demonstrates that the investigated treatment has almost universal beneficial health effects. However, there were no significant treatment effects for pooled outcome analyses at follow-up for any health outcomes except for sleep (a secondary outcome). The lack of universal treatment efficacy at follow-up is not at all clear from a casual read of the review, or even from a thorough read. Instead, a careful analysis of the data is necessary to understand the outcomes. I believe that the review is unhelpful in the way it has presented the outcomes, and lacks clarity.

These follow-up outcomes are a very important issue for medical, patient and research communities, but I believe that they have been presented in a misleading and unhelpful way in the discussions of the review. This issue is discussed mainly in my submission no.4 (see my list of PDF documents at the bottom of this correspondence), and also a little in submission no.3.

I will briefly explain some of the specific details, by way of an introduction, but please refer to my attached documents for the full details.

The pre-specified primary outcomes were pooled treatment effects (i.e. using pooled data from all eligible studies) immediately after treatment and at follow-up.

However, for fatigue, this pre-specified primary outcome (i.e. pooled treatment effects for the combination of data from all eligible studies) was abandoned/switched (for what I consider to be questionable reasons) and replaced with a non-pooled analysis. The new unplanned analysis did not pool the data from all eligible studies but analysed data from studies grouped together by the specific measure used to assess fatigue (i.e. grouped by the various different fatigue questionnaire assessments).

Looking at these post-hoc grouped outcomes for fatigue at follow-up, two out of the three grouped outcomes had significant treatment effects, and the other outcome was a non-significant effect. This post-hoc analysis indicates that the majority of outcomes (i.e. two out of three) demonstrated a significant treatment effect; however, this does not mean that the pre-specified pooled analysis of all eligible studies would have demonstrated a positive treatment effect. Therefore switching outcomes, and using a post-hoc analysis, allows for the potential introduction of bias to the review. Indeed, on careful inspection of the minutiae of the review, the pre-specified analysis of pooled outcomes demonstrates a non-significant treatment effect for fatigue at follow-up (exercise therapy versus passive control).

The (non-significant) outcome of this pre-specified pooled analysis of fatigue at follow-up is somewhat buried within the data tables of the review, and is very difficult to find; It is not discussed prominently or highlighted. Furthermore, the explanation that the primary outcome was switched is only briefly mentioned and can easily be missed. Uniquely for the main outcomes, there is no table outlining the details of the pre-specified pooled analysis of fatigue at follow-up. In contrast, the post-hoc analysis, which has mainly positive outcomes, has been given high prominence throughout the review with little explanation that it is a post-hoc outcome.

So, to reiterate, the (two out of three significant, and one non-significant) post-hoc outcomes for fatigue at follow-up were reported as primary outcomes instead of the (non-significant) pre-specified pooled treatment effect for all eligible studies. Two out of three post-hoc outcomes were significant in effect; however, the pre-specified pooled treatment effect for the same measures was not significant (for fatigue at follow-up – exercise therapy versus passive control). Thus, the outcome switching transformed one of the main outcomes of the review from a non-significant effect to a mainly significant effect.

Furthermore, for exercise therapy versus passive control at follow-up, all the other health outcomes were non-significant (except sleep – a secondary outcome), but I believe the casual reader would be unaware of this because it is not explained clearly or prominently in the discussion, and some outcomes have been reported erroneously in the discussion as indicating a significant effect.

All of the above is outlined in my four PDF submissions, with detailed reference to specific sections of the review and specific tables etc.

I believe that the actual treatment effects at follow-up are different to the impression gained from a casual read of the review, or even a careful read of the review. It’s only by an in-depth analysis of the entire review that these issues would be noticed.

In what I believe to be a reasonable request in my submissions, I asked the reviewers to: “Clearly and unambiguously explain that all but one health indicator (i.e. fatigue, physical function, overall health, pain, quality of life, depression, and anxiety, but not sleep) demonstrated a non-significant outcome for pooled treatment effects at follow-up for exercise therapy versus passive control”. My request was not acted upon.

The Cochrane reviewers did provide a reason for the change to the protocol, from a pooled analysis to analyses of groups of mean difference values: “We realise that the standardised mean difference (SMD) is much more difficult to conceptualise and interpret than the normal mean difference (MD) […]”.

However, this is a questionable and unsubstantiated claim, and in my opinion isn’t an adequate explanation or justification for changing the primary outcomes; personally, I find it easier to interpret a single pooled analysis than a group of different analyses with each analysis using a different non-standardised scale to measure fatigue.

Using a SMD is standard practice for Cochrane reviews; Cochrane’s guidance recommends using pooled analyses when the outcomes use different measures, which was the case in this review; Thus I struggle to understand why (in an unplanned change to methodology) using a SMD was considered unhelpful by the reviewers in this case. My PDF document no.4 challenges the reviewers’ reason, with reference to the official Cochrane reviewers’ guidelines.
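To make concrete why a pooled analysis can disagree with subgrouped analyses, here is a minimal sketch of fixed-effect, inverse-variance pooling of standardised mean differences in Python. The three (SMD, SE) pairs are invented for illustration and are not values from the review; real Cochrane analyses may use random-effects models, but the arithmetic point is the same.

```python
import math

# Hypothetical (SMD, SE) pairs for three subgroups -- invented numbers,
# not values from the review. Two of the three are individually significant.
subgroups = [(-0.60, 0.28), (-0.50, 0.24), (0.10, 0.12)]

for smd, se in subgroups:
    lo, hi = smd - 1.96 * se, smd + 1.96 * se
    print(f"SMD {smd:+.2f}, 95% CI [{lo:+.2f}, {hi:+.2f}]")

# Fixed-effect, inverse-variance pooling across all three:
weights = [1 / se**2 for _, se in subgroups]
pooled = sum(w * smd for (smd, _), w in zip(subgroups, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled SMD {pooled:+.2f}, 95% CI [{lo:+.2f}, {hi:+.2f}]")
# The pooled CI crosses zero: non-significant overall, even though two of
# the three subgrouped results looked significant on their own.
```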

This review has already led to an academic misunderstanding and mis-reporting of its outcomes, which is demonstrated in the following published letter from one of the co-authors of the IPD protocol…

CMAJ (Canada) recommends exercise for CFS [http://www.cmaj.ca/content/188/7/510/tab-e-letters ]

The letter claims: “We based the recommendations on the Cochrane systematic review which looked at 8 randomised trials of exercise for chronic fatigue, and together showed a consistent modest benefit of exercise across the different patient groups included. The clear and consistent benefit suggests indication rather than contraindication of exercise.”

However, there was not a “consistent modest benefit of exercise” and there was not a “clear and consistent benefit” considering that there were no significant treatment effects for any pre-specified (pooled) health outcomes at follow-up, except for sleep. The actual outcomes of the review seem to contradict the interpretation expressed in the letter.

Even if we include the unplanned analyses in our considerations, then it would still be the case that most outcomes did not indicate a beneficial treatment effect at follow-up for exercise therapy versus passive control. Furthermore, one of the most important outcomes, physical function, did not indicate a significant improvement at follow up (despite the discussion erroneously stating that it was a significant effect).

Two of my submissions discuss other issues, which I will outline below.

My first submission is in relation to the following…

The review states that all the analysed data had previously been formally published and was pre-specified in the relevant published studies. However, the review includes an analysis of external data that had not been formally published and is post-hoc in nature, despite alternative data being available that has been formally published and had been pre-specified in the relevant study. The post-hoc data relates to the FINE trial (Wearden 2010). The use of this data was not in accordance with the Cochrane review’s protocol and also contradicts the review’s stated methodology and the discussion of the review.

Specifically, the fatigue data taken from the FINE trial was not pre-specified for the trial and was not included in the original FINE trial literature. Instead, the data had been informally posted in a BMJ rapid response by the FINE trial investigators [3].

The review analyses post-hoc fatigue data from the FINE trial which is based on the Likert scoring system for the Chalder fatigue questionnaire, whereas the formally published FINE trial literature uses the same Chalder fatigue questionnaires but uses the bimodal scoring system, giving different outcomes for the same patient questionnaires. The FINE trial’s post-hoc Likert fatigue data (used in the review) was initially published by the FINE authors only in a BMJ rapid response post [3], apparently as an after-thought.
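For readers unfamiliar with the two scoring systems, here is a minimal sketch. It assumes the usual 11-item Chalder fatigue questionnaire with four response options per item (which matches the 33-point scale mentioned below); the answers are invented for illustration.

```python
# Minimal sketch of the two scoring systems for the same Chalder fatigue
# questionnaire. Assumes the usual 11-item version, each item answered on a
# four-point response (0-3): Likert scoring sums the raw 0-3 responses
# (range 0-33), while bimodal scoring collapses 0/1 -> 0 and 2/3 -> 1
# (range 0-11). The responses below are invented for illustration.
responses = [2, 1, 3, 2, 0, 1, 2, 3, 1, 2, 2]  # hypothetical answers

likert = sum(responses)                        # 0-33 scale
bimodal = sum(1 for r in responses if r >= 2)  # 0-11 scale

print(f"Likert score: {likert}/33, bimodal score: {bimodal}/11")
# Same patient, same answers -- two different totals, so treatment effects
# (and their significance) can differ depending on which scoring is analysed.
```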

This is the response to my first letter…

Larun said she was “extremely concerned and disappointed” with the Cochrane editors’ actions. “I disagree with the decision and consider it to be disproportionate and poorly justified,” she said.

———————-

Larun said:

Dear Robert Courtney

Thank you for your detailed comments on the Cochrane review ‘Exercise Therapy for Chronic Fatigue Syndrome’. We have the greatest respect for your right to comment on and disagree with our work. We take our work as researchers extremely seriously and publish reports that have been subject to rigorous internal and external peer review. In the spirit of openness, transparency and mutual respect we must politely agree to disagree.

The Chalder Fatigue Scale was used to measure fatigue. The results from the Wearden 2010 trial show a statistically significant difference in favour of pragmatic rehabilitation at 20 weeks, regardless whether the results were scored bi-modally or on a scale from 0-3. The effect estimate for the 70 week comparison with the scale scored bi-modally was -1.00 (CI-2.10 to +0.11; p =.076) and -2.55 (-4.99 to -0.11; p=.040) for 0123 scoring. The FINE data measured on the 33-point scale was published in an online rapid response after a reader requested it. We therefore knew that the data existed, and requested clarifying details from the authors to be able to use the estimates in our meta-analysis. In our unadjusted analysis the results were similar for the scale scored bi-modally and the scale scored from 0 to 3, i.e. a statistically significant difference in favour of rehabilitation at 20 weeks and a trend that does not reach statistical significance in favour of pragmatic rehabilitation at 70 weeks. The decision to use the 0123 scoring did does not affect the conclusion of the review.

Regards,

Lillebeth Larun

——————

In her response, above, Larun discusses the FINE trial and quotes an effect size for post-hoc outcome data (fatigue at follow-up) from the FINE trial that is included in the review. Her quoted figures accurately reflect the data quoted by the FINE authors in their BMJ rapid-response comment [3] but, confusingly, these are slightly different from the data in the Cochrane review. In her response, Larun states that the FINE trial effect size for fatigue at 70 weeks using Likert data is -2.55 (-4.99 to -0.11; p=.040), whereas the Cochrane Review states that it is -2.12 [-4.49, 0.25].

This inconsistency makes this discussion confusing. Unfortunately there is no authoritative source for the data because it had not been formally published when the Cochrane review was published.

It seems that, in her response, Larun has quoted the BMJ rapid response data by Wearden et al.[3], rather than her own review’s data. Referring to her review’s data, Larun says that in “our unadjusted analysis the results were similar for the scale scored bi-modally and the scale scored from 0 to 3, i.e. a statistically significant difference in favour of rehabilitation at 20 weeks and a trend that does not reach statistical significance in favour of pragmatic rehabilitation at 70 weeks”.

It is not clear exactly why there are now two different Likert effect sizes, for fatigue at 70 weeks, but we can be sure that the use of this data undermines the review’s claim that “for this updated review, we have not collected unpublished data for our outcomes…”

This confusion, perhaps, demonstrates one of the pitfalls of using unpublished data. The difference between the data published in the review and the data quoted by Larun in her response (which are both supposedly the same unpublished data from the FINE trial) raises the question of exactly what data has been analysed in the review, and what exactly is its source. If it is unpublished data, and seemingly variable in nature, how are readers expected to scrutinise or trust the Cochrane analysis?

With respect to the FINE trial outcomes (fatigue at 70 week follow-up), Larun has provided the mean differences (effect sizes) for the (pre-specified) bimodal data and for the (post-hoc) Likert data. These two different scoring methods (bimodal and Likert) are used for identical patient Chalder fatigue questionnaires, and provide different effect sizes, so switching the fatigue scoring methods may possibly have had an impact on the review’s primary outcomes for fatigue.

Larun hasn’t provided the effect estimates for fatigue at end-of-treatment, but these would also demonstrate variance between bimodal and Likert scoring, so switching the outcomes might have had a significant impact on the primary outcome of the Cochrane review at end-of-treatment, as well as at follow-up.

Note that the effect estimates outlined in this correspondence, for the FINE trial, are mean differences (this is the data taken from the FINE trial), rather than standardised mean differences (which are sometimes used in the meta-analyses in the Cochrane review); It is important not to get confused between the two different statistical analyses.

Larun said: “The decision to use the 0123 [i.e. Likert] scoring did does [sic] not affect the conclusion of the review.”

But it is not possible for a reader to verify that because Larun has not provided any evidence to demonstrate that switching outcomes has had no effect on the conclusion of the review. i.e. There is no sensitivity analysis, despite the review switching outcomes and using unpublished post-hoc data instead of published pre-specified data. This change in methodology means that the review does not conform to its own protocol and stated methodology. This seems like a significant issue.

Are we supposed to accept the word of the author, rather than review the evidence for ourselves? This is a Cochrane review – renowned for rigour and impartiality.

Note that Larun has acknowledged that I am correct with respect to the FINE trial data used in the review (i.e. that the data was unpublished and not part of the formally published FINE trial study, but was simply posted informally in a BMJ rapid response). Larun confirms that: “…the 33-point scale was published in an online rapid response after a reader requested it. We therefore knew that the data existed, and requested clarifying details from the authors…” But then Larun confusingly (for me) says we must “agree to disagree”.

Larun has not amended her literature to resolve the situation; Larun has not changed her unplanned analysis back to her planned analyses (i.e. to use published pre-specified data as per the review protocol, rather than unpublished post-hoc data); nor has she amended the text of the review so that it clearly and prominently indicates that the primary outcomes were switched. Neither has a sensitivity analysis been published using the FINE trial’s published pre-specified data.

Note the difference in the effect estimates at 70 weeks for bimodal scoring [-1.00 (CI -2.10 to +0.11; p =.076)] vs Likert scoring [-2.55 (-4.99 to -0.11; p=.040)] (as per the Cochrane analysis) or -2.12 [-4.49, 0.25] (also Likert scoring) as per Larun’s response and the BMJ rapid response where the data was initially presented to the public.

Confusingly, there are two different effect sizes for the same (Likert) data; one shows a significant treatment effect and the other shows a non-significant treatment effect. This seems like a rather chaotic situation for a Cochrane review. The data is neither consistent nor transparent. The unplanned Cochrane analysis uses data which has not been published and cannot be scrutinised.
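Readers can verify the significance claims directly from the reported confidence intervals. This sketch assumes a normal approximation for the effect estimates and uses the two Likert effect sizes quoted above; the helper function is my own, not part of any Cochrane tooling.

```python
import math

# Recover an approximate two-sided p-value from a reported 95% CI,
# assuming a normal approximation (my own helper, not Cochrane tooling).
def p_from_ci(estimate, lo, hi):
    se = (hi - lo) / (2 * 1.96)   # CI half-width divided by 1.96
    z = abs(estimate / se)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# The two Likert effect sizes quoted above for fatigue at 70 weeks:
print(f"{p_from_ci(-2.55, -4.99, -0.11):.3f}")  # ~0.041: significant
print(f"{p_from_ci(-2.12, -4.49, 0.25):.3f}")   # ~0.080: non-significant
```

The first interval excludes zero and reproduces the quoted p=.040; the second crosses zero, so the same nominal data, as reported in the review, is non-significant.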

Furthermore, we now have three sets of data for the same outcomes. Because an unplanned analysis was used in the review, it is nearly impossible to work out what is what.

In her response, above, Larun says that both fatigue outcomes (i.e. bimodal and Likert scoring) at 70 weeks are non-significant. This is true of the data published in the Cochrane review but, confusingly, it is not true of the data that Larun has provided in her response: the bimodal and Likert data (fatigue at 70 weeks) presented in the review both show a non-significant effect, whereas the Likert data quoted in Larun’s correspondence (which reflects the data in the FINE trial authors’ BMJ rapid response) show a significant outcome. This may reflect the use of adjusted vs unadjusted data, but it isn’t clear.

Using post-hoc data may allow bias to creep into the review. For example, the Cochrane reviewers might have seen the post-hoc data for the FINE trial, because it was posted in an open-access BMJ rapid response [3] prior to the Cochrane review publication date. I am not accusing the authors of conscious bias, but Cochrane guidelines are put in place to avoid doubt and to maintain rigour and transparency. Hypothetically, a biased author may have seen that a post-hoc Likert analysis allowed better outcomes to be reported for the FINE trial. The Cochrane guidelines are established in order to avoid such potential pitfalls and bias, and to avoid the confusion that is inherent in this review.

Note that the review still incorrectly says that all the data are previously published data, even though Larun admits in the letter that they are not (i.e. the data were not formally published in a peer-reviewed journal; I assume that the review wasn’t referring to data that might be informally published in blogs or magazines, because the review purports to analyse formally published data only).

The authors have effectively dismissed my concerns and have not amended anything in the review, despite admitting in the response that they’ve used post-hoc data.

The fact that this is all highly confusing, even after I have studied it in detail, demonstrates that these issues need to be straightened out and fixed.

It surely shouldn’t be the case, in a Cochrane review, that we have three sets of results (for the same outcomes) being bandied about, with the data used in a post-hoc analysis seeming to vary over time and to change from a non-significant treatment effect to a significant treatment effect depending on where it is quoted. Because the data are unpublished, independent scrutiny is made more difficult.

For your information, the BMJ rapid response (Wearden et al.) includes the following data: “Effect estimates [95% confidence intervals] for 20 week comparisons are: PR versus GPTAU -3.84 [-6.17, -1.52], SE 1.18, P=0.001; SL versus GPTAU +0.30 [-1.73, +2.33], SE 1.03, P=0.772. Effect estimates [95% confidence intervals] for 70 week comparisons are: PR versus GPTAU -2.55 [-4.99,-0.11], SE 1.24, P=0.040; SL versus GPTAU +0.36 [-1.90, 2.63], SE 1.15, P=0.752.”

My second submission was in relation to the following…

I believe that properly applying the official Cochrane guidelines would require the review to categorise the PACE trial (White 2011) data as ‘unplanned’ rather than ‘pre-specified’, and would require the risk of bias in relation to ‘selective reporting’ to be categorised accordingly. The Cochrane review currently categorises the risk of ‘selective reporting’ bias for the PACE trial as “low”, whereas the official Cochrane guidelines indicate (unambiguously) that the risk of bias for the PACE data should be “high”. I believe that my argument is robust and watertight.

This is the response to my second letter…

———————–

Larun said:

Dear Robert Courtney

Thank you for your detailed comments on the Cochrane review ‘Exercise Therapy for Chronic Fatigue Syndrome’. We have the greatest respect for your right to comment on and disagree with our work. We take our work as researchers extremely seriously and publish reports that have been subject to rigorous internal and external peer review. In the spirit of openness, transparency and mutual respect we must politely agree to disagree.

Cochrane reviews aim to report the review process in a transparent way, for example, are reasons for the risk of bias stated. We do not agree that Risk of Bias for the Pace trial (White 2011) should be changed, but have presented it in a way so it is possible to see our reasoning. We find that we have been quite careful in stating the effect estimates and the certainty of the documentation. We note that you read this differently.

Regards,

Lillebeth

————————-

I do not understand what is meant by: “We do not agree that Risk of Bias for the Pace trial (White 2011) should be changed, but have presented it in a way so it is possible to see our reasoning.” …

The review does not discuss the issue of the PACE data being unplanned and I, for one, do not understand the reasoning for not correcting the category for the risk of selective reporting bias. The response to my submission fails to engage with the substantive and serious issues that I raised.

To date, nearly all the issues raised in my letters have been entirely dismissed by Larun. I find this surprising, especially considering that some of the points that I have made were factual (i.e. not particularly open to interpretation) and difficult to dispute. Indeed, Larun’s response even accepts the factual point that I made, in relation to the FINE data, but then confusingly dismisses my request for the issue to be remedied.

There is more detail in the four PDF submissions which are attached to this email, and which have now been published in the latest version of the Cochrane review. I will stop this email now so as not to overwhelm you, and so I don’t repeat myself.

Again, I apologise for the complexity. My four submissions, attached to this email as PDF files, form the central basis of my complaint, so I ask you to consider them accordingly. I hope that they will be sufficiently clear.

I trust that you will wish to investigate these issues, with a view to upholding the high standards expected from a Cochrane review.

I look forward to hearing from you in due course. Please feel free to email me at any time with any questions, or if you believe it would be helpful to discuss any of the issues raised.

Regards,

Robert Courtney.

My ‘comments’ (submitted to the Cochrane review authors):

Please note that the four attached PDF documents form the basis of this complaint.

For your convenience, I have included a weblink to a downloadable online copy of each document and attached copies to this email as PDF files; the comments have now been published in the latest updated version of the review.

The dates refer to the date the comments were submitted to Cochrane.

  1. Query re use of post-hoc unpublished outcome data: Scoring system for the Chalder fatigue scale, Wearden 2010.

Robert Courtney

16th April 2016

https://sites.google.com/site/mecfsnotes/submissions-to-the-cochrane-review-of-exercise-therapy-for-chronic-fatigue-syndrome/fine-trial-unpublished-data

  2. Assessment of Selective Reporting Bias in White 2011.

Robert Courtney

1st May 2016

https://sites.google.com/site/mecfsnotes/submissions-to-the-cochrane-review-of-exercise-therapy-for-chronic-fatigue-syndrome/pace-trial-selective-reporting-bias

  3. A query regarding the way outcomes for physical function and overall health have been described in the abstract, conclusion and discussions of the review.

Robert Courtney

12th May 2016

https://sites.google.com/site/mecfsnotes/submissions-to-the-cochrane-review-of-exercise-therapy-for-chronic-fatigue-syndrome/misreporting-of-outcomes-for-physical-function

  4. Concerns regarding the use of unplanned primary outcomes in the Cochrane review.

Robert Courtney

3rd June 2016

https://sites.google.com/site/mecfsnotes/submissions-to-the-cochrane-review-of-exercise-therapy-for-chronic-fatigue-syndrome/primary-outcome-switching

References:

  1. Quote from Cochrane reference CD011040:

“Acknowledgements[…]The author team held three meetings in 2011, 2012 and 2013 which were funded as follows: […]2013 via Peter D White’s academic fund (Professor of Psychological Medicine, Centre for Psychiatry, Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Queen Mary University of London).”

  2. Larun L, Odgaard-Jensen J, Brurberg KG, Chalder T, Dybwad M, Moss-Morris RE, Sharpe M, Wallman K, Wearden A, White PD, Glasziou PP. Exercise therapy for chronic fatigue syndrome (individual patient data) (Protocol). Cochrane Database of Systematic Reviews 2014, Issue 4. Art. No.: CD011040.

http://onlinelibrary.wiley.com/doi/10.1002/14651858.CD011040/abstract

http://www.cochrane.org/CD011040/DEPRESSN_exercise-therapy-for-chronic-fatigue-syndrome-individual-patient-data


  3. Wearden AJ, Dowrick C, Chew-Graham C, et al. Fatigue scale. BMJ Rapid Response. 2010.

http://www.bmj.com/rapid-response/2011/11/02/fatigue-scale-0 (accessed Feb 21, 2016).

End.

Cochrane complaints procedure:

http://www.cochranelibrary.com/help/the-cochrane-library-complaints-procedure.html

The lost last year of one of the two key people in getting the Cochrane review of exercise withdrawn

Did the struggle to get the Cochrane review withdrawn kill Robert Courtney? Or was it the denial of his basic human rights by the medical system?

mind the brain logo

An incomplete story that urgently needs to be told. We need to get some conversations going.

Did the struggle to get the Cochrane review withdrawn kill Robert Courtney? Or was it the denial of his basic human rights by the medical system?

LONDON, Oct 17 (Reuters) – A respected science journal is to withdraw a much-cited review of evidence on an illness known as chronic fatigue syndrome (CFS) amid fierce criticism and pressure from activists and patients.

robert courtney
Robert Courtney from https://www.meaction.net/2018/03/19/a-tribute-to-robert-courtney/

Citizen scientists and patient advocates Tom Kindlon and Robert Courtney played a decisive role in getting the Cochrane review withdrawn.

In the next few days, I will provide the cover letter email sent by Robert Courtney to Senior Cochrane Editor David Tovey that accompanied his last decisive contribution.  Robert is now deceased.

I will also provide links to Tom Kindlon’s contributions that are just as important.

Readers will be able to see from what David Tuller calls their cogent, persuasive and unassailable submissions that the designation of these two as citizen scientists is well-deserved.

Background

Since 2015, I have kept in touch with an advisory group of about a dozen patients with myalgic encephalomyelitis/chronic fatigue syndrome (ME/cfs). I send emails to myself with this group blind copied. The rationale was that any one of them could respond to me and not have the response revealed to anyone else. A number of patients requested that kind of confidentiality, given the divisions within the patient community.

Robert Courtney was a valued, active member of that group, but then he mysteriously disappeared in January 2017. Patients have their own reasons for entering and withdrawing from social engagement. Sometimes they announce taking leave, sometimes not. I’ve learned to respect absences without challenge, but I sometimes ask around. In the case of Robert, I could learn nothing from the community except that he was not well.

Then in February 2018, Robert reemerged with the email message below. I had assumed his recovery would continue and he would participate in telling his story. Obviously there were a lot more details to tell, but he died by suicide a few weeks later.

Long, unbroken periods of being housebound and often bedridden are among the curses of having severe ME/cfs. Able-bodied persons need to understand the reluctance of patients to invite them into their homes, even able-bodied persons who believe that they have forged strong bonds with patients on social media.

I nonetheless occasionally make such offers to meet as I travel through Europe. I’m typically told things like “sorry, I only leave my house for medical appointments and a twice-a-year holiday with my family.”

We have to learn not to be offended.

Consequently, few  people who were touched by Robert Courtney and his efforts have ever met him. Most know little about him beyond his strong presence in social media.

From MEpedia, a crowd-sourced encyclopedia of ME and CFS science and history:

Robert Courtney (d. March 7, 2018) was a patient advocate for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and an outspoken critic of the PACE trial and the biopsychosocial model of chronic fatigue syndrome. He authored numerous published letters in medical journals regarding the PACE trial and, also, filed freedom of information requests in an attempt to get the authors of the PACE trial to release the full trial data to the public for scrutiny.

The day after I received the email below, Robert Courtney sent off to David Tovey, the Senior Editor of Cochrane, his final comments.

The email describes the horrible conditions of his last year and his mistreatment and denial of basic human rights by the medical system. I think airing his story as a wake-up call can become another of his contributions to the struggle for the dignity and rights of the patient community.

An excerpt from the email, repeated below.

It seems that this type of mistreatment is all too typical for ME patients. Since I’ve been out of hospital, many patients have told me that they have similar nutritional difficulties, and that they are too scared to seek medical assistance, and that quite a lot of them have been threatened with detention or indeed have been detained under the mental health act. It is a much worse situation than I ever realised. – Robert “Bob” Courtney

We can never know whether Bob’s determined effort to get the review withdrawn led to his medical collapse. The speculation is not just a mindless invoking of “stress kills.” One of the cardinal, defining symptoms of myalgic encephalomyelitis is post-exertional malaise.

We usually think of the “exertion” as being physical, but patients with the severe form of the illness learn to anticipate that sustained emotional arousal can, within 48 hours or so, put them in their beds for weeks. That applies to positive emotion, like a birthday party, and certainly to negative emotion. Aside from the stress, frustration, and uncertainty of trying to get bad science out of the literature, Bob and other members of the patient community had to contend with enormous vilification and gaslighting, which still continues today.

After the anorexia diagnosis, they rediagnosed my ME symptoms as being part of a somatoform disorder, and placed me on an eating disorders unit. – Robert “Bob” Courtney

On Sat, Feb 17, 2018 at 2:44 PM, Bob <brightonbobbob@yahoo.co.uk> wrote:

Hi James,

I don’t know if you’ll remember me. I am an ME patient who was in regular contact with you in 2016. Unfortunately I had a health crisis in early 2017 and I was hospitalised for most of the year. I had developed severe food intolerances and associated difficulties with eating and nutrition. When I admitted myself to hospital they quickly decided there was nothing medically wrong with me and then diagnosed me with anorexia (to my shock and bewilderment), and subsequently detained me under the mental health act. I’m not anorexic. The level of ignorance, mistreatment, neglect, abuse, and miscommunication was staggering. After the anorexia diagnosis, they rediagnosed my ME symptoms as being part of a somatoform disorder, and placed me on an eating disorders unit. Then they force-fed me. It is a very long and troubling story and I’ll spare you the details. I’d quite like a journalist to write up my story but that will have to wait while I address my ongoing health issues.

Unfortunately, it seems that this type of mistreatment is all too typical for ME patients. Since I’ve been out of hospital, many patients have told me that they have similar nutritional difficulties, and that they are too scared to seek medical assistance, and that quite a lot of them have been threatened with detention or indeed have been detained under the mental health act. It is a much worse situation than I ever realised. It is only by sharing my story that people have approached me and been able to tell me what had happened to them. It is such an embarrassing situation both to have eating difficulties and to be detained. The detention is humiliating and the eating difficulties are also excruciatingly embarrassing. Having difficulties with food makes one feel subhuman. So I have discovered that many patients keep their stories to themselves.

You might remember that in 2016 I submitted four lengthy comments to Cochrane with respect to the exercise therapy for chronic fatigue syndrome review. Before hospital, I had also written an incomplete draft complaint to follow up my submitted comments, but my health crisis interrupted the process and so I haven’t yet sent it.

I am out of hospital now and have finished editing the complaint and I am about to send it. I am going to blind copy you into the complaint so this email is just to let you know to expect it. I’ll probably send it within the next 24 hours. The complaint isn’t as concise or carefully formatted as it could be because I’m still unwell and I have limited capacity.

Anyway this is just to give you some advance notice. I hope this email finds you in good spirits. I haven’t been keeping up to date with the news and activities, while I’ve been away, but I see there’s been a lot of activity. Thanks so much your ongoing efforts.

Best wishes,

Bob (Robert Courtney)

My replies

James Coyne <jcoynester@gmail.com>

Feb 17, 2018, 2:50 PM

to Bob

Bob, I remember you well as one of the heroes of the patient movement, and a particularly exemplary hero because you so captured my idea of the citizen scientist, gathering the data and the sense of methodology to understand the illness and battle the PACE people. I’m so excited to see your reemergence. I look forward to what you send.

Warmest regards

Jim

James Coyne <jcoynester@gmail.com>

Feb 17, 2018, 3:11 PM

to Bob

Your first goal must be to look after yourself and keep yourself as active and well as possible. You know, the patient conception of pacing. You are an important model and resource for lots of people

But when you are ready, I look forward to your telling your story and how it fits with others.

Warmest of regards

Jim

Lessons we need to learn from a Lancet Psychiatry study of the association between exercise and mental health

The closer we look at a heavily promoted study of exercise and mental health, the more its flaws become obvious. There is little support for the most basic claims being made – despite the authors marshaling enormous attention to the study.

The closer we look at a heavily promoted study of exercise and mental health, the more its flaws become obvious. There is little support for the most basic claims being made – despite the authors marshaling enormous attention to the study.

Apparently, the editor of Lancet Psychiatry and reviewers did not give the study a close look before it was accepted.

The article was used to raise funds for a startup company in which one of the authors was heavily invested. This was disclosed, but doesn’t let the authors off the hook for promoting a seriously flawed study. Nor should the editor of Lancet Psychiatry or reviewers escape criticism, nor the large number of people on Twitter who thoughtlessly retweeted and “liked” a series of tweets from the last author of the study.

This blog post is intended to raise consciousness about bad science appearing in prestigious journals and to allow citizen scientists to evaluate their own critical thinking skills in terms of their ability to detect misleading and exaggerated claims.

1. Sometimes a disclosure of extensive conflicts of interest alerts us not to pay serious attention to a study. Instead, we should question why the study got published in a prestigious peer-reviewed journal when it had such an obvious risk of bias.

2. We need citizen scientists with critical thinking skills to identify such promotional efforts and alert others in their social network that hype and hokum are being delivered.

3. We need to stand up to authors who use scientific papers for commercial purposes, especially when they troll critics.

Read on and you will see what a skeptical look at the paper and its promotion revealed.

  • The study failed to capitalize on the potential of multiple years of data for developing and evaluating statistical models. Bigger is not necessarily better: combining multiple years of data was wasteful and served only to provide the authors bragging rights and the impressive but meaningless p-values that come from overly large samples (see the simulation sketch just after this list).
  • The study relied on an unvalidated and inadequate measure of mental health that confounded recurring stressful environmental conditions at work or home with mental health problems, even where validated measures of mental health would reveal no effects.
  • The study used an odd measure of history of mental health problems that undoubtedly exaggerated past history.
  • The study confused physical activity with (planned) exercise. The authors amplified this confusion by relying on an exceedingly odd strategy for estimating how much participants exercised: the time spent on a single activity was used in analyses of total time spent exercising, and all other physical activity was ignored.
  • The study made a passing acknowledgment of the problems in interpreting simple associations as causal, but then selectively sampled the existing literature to make the case that interventions to increase exercise improve mental health.
  • Taken together, a skeptical assessment of this article provides another demonstration that disclosure of substantial financial conflicts of interest should alert readers to a high likelihood of a hyped, inaccurately reported study.
  • The article was paywalled, so anyone interested in evaluating the authors’ claims for themselves had to write to the author or have access through a university library. I am waiting for the authors to reply to my requests for the supplementary tables needed to make full sense of their claims. In the meantime, I’ll just complain about authors with significant conflicts of interest heavily promoting studies that they hide behind paywalls.
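
On the sample-size point in the first bullet above, a minimal simulation sketch (my own, not the authors’ code) of how an overly large sample turns a trivially small correlation into a spectacular p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_200_000  # roughly the headline sample size

# Simulate a trivially weak association (true r is about 0.01).
x = rng.standard_normal(n)
y = 0.01 * x + rng.standard_normal(n)

r, p = stats.pearsonr(x, y)
print(f"r = {r:.4f}, p = {p:.1e}")  # r ~ 0.01, p on the order of 1e-27
```

An effect this small is of no practical importance, yet the p-value looks spectacular; splitting the data into development and validation sets would have been a far better use of that sample.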

I welcome you to examine the author’s thread of tweets. Request the actual article from the author if you want to evaluate my claims independently. This can be great material for a master’s or honors class on critical appraisal, whether in psychology or journalism.

title of article

Let me know if you think that I’ve been too hard on this study.

A thread of tweets from the last author celebrated the success of a well-orchestrated publicity campaign for a new article concerning exercise and mental health in Lancet Psychiatry.

The thread started:

Our new @TheLancetPsych paper was the biggest ever study of exercise and mental health. it caused quite a stir! here’s my guided tour of the paper, highlighting some of our excitements and apprehensions along the way [thread] 1/n

And ended with a pitch for the author’s do-good startup company:

Where do we go from here? Over @spring_health – our mental health startup in New York City – we’re using these findings to develop personalized exercise plans. We want to help every individual feel better—faster, and understand exactly what each patient needs the most.

I wasn’t long into the thread before my skepticism was stimulated. The fourth tweet in the thread had a figure that didn’t get any comments about how bizarre it was.

The tweet:

It looks like those differences mattered. for example, people who exercised for about 45 minutes seemed to have better mental health than people who exercised for less than 30, or more than 60 minutes. — a sweet spot for mental health, perhaps?

graphs from paper

Apparently the author does not comment on an anomaly either. Housework appears to be better for mental health than a summary score of all exercise and looks equal to or better than cycling or jogging. But how did housework slip into the category “exercise”?

I began wondering what the authors meant by “exercise” and whether they’d given the definition serious consideration when constructing their key variable from the survey data.

But then that tweet was followed by another that generated more confusion, with a graph that seemingly contradicted the figures in the previous one:

the type of exercise people did seems important too! People doing team sports or cycling had much better mental health than other sports. But even just walking or doing household chores was better than nothing!

Then a self-congratulatory tweet for a promotional job well done.

for sure — these findings are exciting, and it has been overwhelming to see the whole world talking openly and optimistically about mental health, and how we can help people feel better. It isn’t all plain sailing though…

The author’s next tweet revealed, in a screenshot, a serious limitation of the measure of mental health used in the study.

screenshot of tweet with mental health variable

The author acknowledged the potential problem, sort of:

(1b- this might not be the end of the world. In general, most peple have a reasonable understanding of their feelings, and in depressed or anxious patients self-report evaluations are highly correlated with clinician-rated evaluations. But we could be more precise in the future)

“Not the end of the world”? Since when does the author of a paper in the Lancet family of journals so casually brush off a serious methodological issue? A lot of us who have examined the validity of mental health measures would be skeptical of this dismissal of a potentially fatal limitation.

No validation is provided for this measure. On the face of it, respondents could endorse it on the basis of facing recurring stressful situations that had no consequences for their mental health. This reflects the ambiguity of the term stress for both laypersons and scientists: “stress” could variously refer to an environmental situation, a subjective experience of stress, or an adaptational outcome. Waitstaff could consider Thursdays, when the chef is off, a recurrent weekly stress. Persons with diagnosable persistent depressive disorder would presumably endorse more days than not as being a mental health challenge. But they would mean something entirely different.

The author acknowledged that the association between exercise and mental health might be bidirectional in terms of causality.

adam on lots of reasons to believe relationship goes both ways.PNG

But then made a strong claim for increased exercise leading to better mental health.

exercise increases mental health.PNG

[Actually, as we will see, the evidence from randomized trials of exercise to improve mental health is modest, and entirely disappears once one limits oneself to the high-quality studies.]

The author then ran off the rails with the claim that the benefits of exercise exceed the benefits of having a greater-than-poverty-level income.

why are we so excited.PNG

I could not resist responding.

Stop comparing adjusted correlations obtained under different circumstances as if they demonstrated what would be obtained in RCT. Don’t claim exercising would have more effect than poor people getting more money.

But I didn’t get a reply from the author.

Eventually, the author got around to plugging his startup company.

I didn’t get it. Just how did this heavily promoted study advance the science of such “personalized recommendations”?

Important things I learned from others’ tweets about the study

I follow @BrendonStubbs on Twitter and you should too. Brendon often makes wise critical observations of studies that most everyone else is uncritically praising. But he also identifies some studies that I otherwise would miss and says very positive things about them.

He started his own thread of tweets about the study on a positive note, but then he identified a couple of critical issues.

First, he took issue with the author’s weak claim of having identified a tipping point, below which exercise is beneficial and above which exercise could prove detrimental to mental health.

4/some interpretations are troublesome. Most confusing, are the assumptions that higher PA is associated/worsens your MH. Would we say based on cross sect data that those taking most medication/using CBT most were making their MH worse?

A postdoctoral fellow, @joefirth7, seconded that concern:

I agree @BrendonStubbs: idea of high PA worsening mental health limited to observation studies. Except in rare cases of athletes overtraining, there’s no exp evidence of ‘tipping point’ effect. Cross-sect assocs of poor MH <–> higher PA likely due to multiple other factors…

Ouch! But then Brendon follows up with concerns that the measure of physical activity has not been adequately validated, noting that such self-report measures often prove invalid.

5/ one consideration not well discussed, is self report measures of PA are hopeless (particularly in ppl w mental illness). Even those designed for population level monitoring of PA https://journals.humankinetics.com/doi/abs/10.1123/jpah.6.s1.s5 … it is also not clear if this self report PA measure has been validated?

As we will soon see, the measure used in this study is quite flawed in its conceptualization and in its odd methodology of requiring participants to estimate time spent on only one activity, chosen from 75 options.

Next, Brendon points to a particular problem with using self-reported physical activity in persons with mental disorder and gives an apt reference:

6/ related to this, self report measures of PA shown to massively overestimate PA in people with mental ill health/illness – so findings of greater PA linked with mental illness likely bi-product of over-reporting of PA in people with mental illness e.g Validity and Value of Self-reported Physical Activity and Accelerometry in People With Schizophrenia: A Population-Scale Study of the UK Biobank [ https://academic.oup.com/schizophreniabulletin/advance-article/doi/10.1093/schbul/sbx149/4563831 ]

7/ An additional point he makes: anyone working in field of PA will immediately realise there is confusion & misinterpretation about the concepts of exercise & PA in the paper, which is distracting. People have been trying to prevent this happening over 30 years

Again, Brendon provides a spot-on citation clarifying the distinction between physical activity and exercise: Physical activity, exercise, and physical fitness: definitions and distinctions for health-related research.

The mysterious pseudonymous Zad Chow @dailyzad called attention to a blog post they had just uploaded; let’s take a look at some of its key points.

Lessons from a blog post: Exercise, Mental Health, and Big Data

Zad Chow is quite balanced in dispensing praise and criticism of the Lancet Psychiatry paper. They noted the ambiguity of any causal claim based on a cross-sectional correlation, and they investigated the literature on their own.

So what does that evidence say? Meta-analyses of randomized trials seem to find that exercise has large and positive treatment effects on mental health outcomes such as depression.

Study                 Randomized trials   Effect, SMD (95% CI)
Schuch et al. 2016    25                  1.11 (0.79 to 1.43)
Gordon et al. 2018    33                  0.66 (0.48 to 0.83)
Krogh et al. 2017     35                  −0.66 (−0.86 to −0.46)

But, when you only pool high-quality studies, the effects become tiny.

“Restricting this analysis to the four trials that seemed less affected of bias, the effect vanished into −0.11 SMD (−0.41 to 0.18; p=0.45; GRADE: low quality).” – Krogh et al. 2017

Hmm, would you have guessed this from the Lancet Psychiatry author’s thread of tweets?

Zad Chow showed the hype and untrustworthiness of the press coverage in prestigious media with a sampling of screenshots.

Zad Chow’s screenshots of press coverage

I personally checked and don’t see that Zad Chow’s selection of press coverage was skewed; coverage in the media all seemed to be saying the same thing. I found that the distortion continued with uncritical parroting – a.k.a. churnalism – of the Lancet Psychiatry authors’ claims in the Wall Street Journal.

The WSJ repeated a number of the author’s claims that I’ve already thrown into question and added a curiosity:

In a secondary analysis, the researchers found that yoga and tai chi—grouped into a category called recreational sports in the original analysis—had a 22.9% reduction in poor mental-health days. (Recreational sports included everything from yoga to golf to horseback riding.)

And NHS England totally got it wrong:

NHS getting it wrong.PNG

So, we learned that the broad category “recreational sports” covers yoga and tai chi, as well as golf and horseback riding. This raises serious questions about the lumping and splitting of categories of physical activity in the analyses being reported.

I needed to access the article in order to uncover some important things 

I’m grateful for the clues that I got from Twitter, especially from Zad Chow, which I used in examining the article itself.

I got hung up on the title proclaiming that the study involved 1·2 million individuals. When I checked the article, I saw that the authors had combined three waves of publicly available data to get that number. Having that many participants gave them no real advantage except bragging rights and the likelihood that modest associations could be expressed in spectacular p-values, like p < 2·2 × 10^−16. I don’t understand why the authors didn’t develop their analyses in one wave and cross-validate the results in another.
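
What I have in mind is nothing fancy: estimate the model in one survey wave and check whether it replicates in another. A hypothetical Python sketch, with invented file, wave, and variable names (this is not the authors’ code or data):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stacked survey file with a 'wave' column taking three values.
df = pd.read_csv("survey_waves.csv")  # invented file name

dev = df[df["wave"].isin(["wave1", "wave2"])]  # development waves
val = df[df["wave"] == "wave3"]                # held-out validation wave

# Invented variable names, standing in for the survey's measures.
formula = "poor_mental_health_days ~ exercise_minutes"
fit_dev = smf.ols(formula, data=dev).fit()
fit_val = smf.ols(formula, data=val).fit()

# If the association is real and stable, the development-wave coefficient
# should reappear, within sampling error, in the held-out wave.
print(fit_dev.params["exercise_minutes"], fit_val.params["exercise_minutes"])
```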

The obligatory Research in Context box made it sound as if a systematic search of the literature had been undertaken. Maybe, but the authors were highly selective in what they chose to comment upon, as seen in its contradiction by Zad Chow’s brief review. The authors would have us believe that the existing literature is quite limited and inconclusive, supporting the need for a study like theirs.

research in context

Caveat lector: a strong confirmation bias likely lies ahead in this article.

Questions accumulated quickly as to the appropriateness of items taken from a national survey undoubtedly constructed for other purposes. These items would certainly not have been selected had the original investigators been interested in the research question at the center of this article.

Participants self-reported a previous diagnosis of depression or depressive episode on the basis of the following question: “Has a doctor, nurse, or other health professional EVER told you that you have a depressive disorder, including depression, major depression, dysthymia, or minor depression?”

Our own work has cast serious doubt on the correspondence between reports of a history of depression, in response to a brief question embedded in a larger survey, and the results of a structured interview in which respondents’ answers can be probed. We found that answers to such questions were more related to current distress than to actual past diagnoses and treatment of depression. The survey question used in the Lancet Psychiatry study added further ambiguity and invalidity with the phrase “or minor depression.” I am not sure under what circumstances a health care professional would disclose a diagnosis of “minor depression” to a patient, but I doubt it would be in a context in which the professional felt treatment was needed.

Despite the skepticism that I was developing about the usefulness of the survey data, I was unprepared for the assessment of “exercise.”

“Other than your regular job, did you participate in any physical activities or exercises such as running, calisthenics, golf, gardening, or walking for exercise?” Participants who answered yes to this question were then asked: “What type of physical activity or exercise did you spend the most time doing during the past month?” A total of 75 types of exercise were represented in the sample, which were grouped manually into eight exercise categories to balance a diverse representation of exercises with the need for meaningful cell sizes (appendix).

Participants indicated the number of times per week or month that they did this exercise and the number of minutes or hours that they usually spend exercising in this way each time.

I had already been tipped off by the discussion on Twitter that there would be a thorough confusion of planned exercise and mere physical activity. But now that was compounded. Why was physical activity during employment excluded? What if participants were engaged in a number of different physical activities, like both jogging and bicycling? If so, the survey obtained data for only one of those activities, and the choice of which one the participant identified as the one to be counted could have been quite arbitrary.

Anyone who has ever constructed surveys would be alert to the problems posed by participants’ awareness that saying “yes” to exercising would require contemplating 75 different options and arbitrarily choosing one of them for a further question about how much time they spent on that activity. Unless participants were strongly motivated, there was an incentive simply to say no, they didn’t exercise.

I suppose I could go on, but it was my judgment that any validity to what the authors were claiming had been ruled out. As someone once said on an NIH grant review panel: there are no vital signs left, let’s move on to the next item.

But let’s refocus just a bit on the overall intention of these authors. They want to use a large data set to make statements about the association between physical activity and a measure of mental health. They have used matching and statistical controls to equate participants. But that strategy effectively eliminates consideration of crucial contextual variables. Persons’ preferences and opportunities to exercise are powerfully shaped by their personal and social circumstances, including finances and competing demands on their time. Said differently, people are embedded in contexts that a lot of statistical maneuvering has sought to eliminate.

To suggest a small number of the many complexities: how much physical activity participants get in their employment may be an important determinant of choices for additional activity, as well as of how much time is left outside of work. If work typically involves a lot of physical exertion, people may simply be left too tired for additional planned physical activity, a.k.a. exercise, and their physical health may require it less. Environments differ greatly in the opportunities they offer for various kinds of physical activity, and in how safe those activities are. Team sports require other people being available. Etc., etc.

What I learned from the editorial accompanying the Lancet Psychiatry article

The brief editorial accompanying the article aroused my curiosity as to whether someone assigned to reading and commenting on this article would catch things that apparently the editor and reviewer missed.

Editorial commentators are chosen to praise, not to bury articles. There are strong social pressures to say nice things. However, this editorial leaked a number of serious concerns.

First:

In presenting mental health as a workable, unified concept, there is a presupposition that it is possible and appropriate to combine all the various mental disorders as a single entity in pursuing this research. It is difficult to see the justification for this approach when these conditions differ greatly in their underlying causes, clinical presentation, and treatment. Dementia, substance misuse, and personality disorder, for example, are considered as distinct entities for research and clinical purposes; capturing them for study under the combined banner of mental health might not add a great deal to our understanding.

The problem here of categorisation is somewhat compounded by the repeated uncomfortable interchangeability between mental health and depression, as if these concepts were functionally equivalent, or as if other mental disorders were somewhat peripheral.

Then:

A final caution pertains to how studies approach a definition of exercise. In the current study, we see the inclusion of activities such as childcare, housework, lawn-mowing, carpentry, fishing, and yoga as forms of exercise. In other studies, these activities would be excluded for not fulfilling the definition of exercise as offered by the American College of Sports Medicine: “planned, structured and repetitive bodily movement done to improve or maintain one or more components of physical fitness.” 11 The study by Chekroud and colleagues, in its all-encompassing approach, might more accurately be considered a study in physical activity rather than exercise.

The authors were listening for a theme song with which they could promote their startup company in a very noisy data set. They thought they had a hit. I think they had noise.

The authors’ extraordinary disclosure of interests (see below this blog post) should have precluded publication of this seriously flawed piece of work, either simply by reason of its high likelihood of bias or by prompting the editor and reviewers to look more carefully at the serious flaws hiding in plain sight.

Postscript: Send in the trolls.

On Twitter, Adam Chekroud announced he felt no need to respond to critics. Instead, he retweeted and “liked” trolling comments directed at critics from the Twitter accounts of his brother, his mother, and even the official account of a local fried chicken joint, @chickenlodge, which offered free food for retweets and suggested including Adam Chekroud’s Twitter handle if you wanted to be noticed.

chicken lodge

Really, Adam, if you can’t stand the heat, don’t go near where they are frying chicken.

The Declaration of Interests from the article.

declaration of interest 1

declaration of interest 2


Headspace mindfulness training app no better than a fake mindfulness procedure for improving critical thinking, open-mindedness, and well-being.

The Headspace app increased users’ critical thinking and open-mindedness. So did practicing a sham mindfulness procedure: participants simply sat with their eyes closed, but thought they were meditating.

mind the brain logo

The Headspace app increased users’ critical thinking and open-mindedness. So did practicing a sham mindfulness procedure. Participants simply sat with their eyes closed, but thought they were meditating.

Results call into question claims about Headspace coming from other studies that did not have such a credible, active control group comparison.

Results also call into question the widespread use of standardized self-report measures of mindfulness to establish whether someone is in the state of mindfulness. These measures don’t distinguish between the practice of standard versus fake mindfulness.

Results can be seen as further evidence that practicing mindfulness depends on nonspecific factors (AKA placebo), rather than any active, distinctive ingredient.

Hopefully this study will prompt better studies evaluating the Headspace App, as well as evaluations of mindfulness training more generally, using credible active treatments, rather than no treatment or waitlist controls.

Maybe it is time for a moratorium on trials of mindfulness without such an active control, or at least a tempering of claims based on poorly controlled trials.

This study points to the need for development of more psychometrically sophisticated measures of mindfulness that are not so vulnerable to experiment expectations and demand characteristics.

Until the accumulation of better studies with better measures, claims about the effects of practicing mindfulness ought to be recognized as based on relatively weak evidence.

The study

Noone C, Hogan M. Randomised active-controlled trial of effects of online mindfulness intervention on executive control, critical thinking and key thinking dispositions. BMC Psychology. 2018.

Trial registration

The study was initially registered in the AEA Social Science Registry before the recruitment was initiated (RCT ID: AEARCTR-0000756; 14/11/2015) and retrospectively registered in the ISRCTN registry (RCT ID: ISRCTN16588423) in line with requirements for publishing the study protocol.

Excerpts from the Abstract

The aim of this study was…investigating the effects of an online mindfulness intervention on executive function, critical thinking skills, and associated thinking dispositions.

Method

Participants recruited from a university were randomly allocated, following screening, to either a mindfulness meditation group or a sham meditation group. Both the researchers and the participants were blind to group allocation. The intervention content for both groups was delivered through the Headspace online application, an application which provides guided meditations to users.

And

Primary outcome measures assessed mindfulness, executive functioning, critical thinking, actively open-minded thinking, and need for cognition. Secondary outcome measures assessed wellbeing, positive and negative affect, and real-world outcomes.

Results

Significant increases in mindfulness dispositions and critical thinking scores were observed in both the mindfulness meditation and sham meditation groups. However, no significant effects of group allocation were observed for either primary or secondary measures. Furthermore, mediation analyses testing the indirect effect of group allocation through executive functioning performance did not reveal a significant result and moderation analyses showed that the effect of the intervention did not depend on baseline levels of the key thinking dispositions, actively open-minded thinking, and need for cognition.

The authors conclude

While further research is warranted, claims regarding the benefits of mindfulness practice for critical thinking should be tempered in the meantime.

Headspace being used on an iPhone

The active control condition

The sham treatment control condition was embarrassingly straightforward and simple. But as we will see, participants found it credible.

This condition presented the participants with guided breathing exercises. Each session began by inviting the participants to sit with their eyes closed. These exercises were referred to as meditation but participants were not given guidance on how to control their awareness of their body or breath. This approach was designed to control for the effects of expectations surrounding mindfulness and physiological relaxation to ensure that the effect size could be attributed to mindfulness practice specifically. This content was also delivered by Andy Puddicombe and was developed based on previous work by Zeidan and colleagues [55, 57, 58].

What can we conclude about the standard self-report measures of the state of mindfulness?

The study used the Five Facet Mindfulness Questionnaire, which is widely used to assess whether people are in a state of mindfulness. It has been cited almost 4000 times.

Participants assigned to the mindfulness condition had significant changes in all five facets from baseline to follow-up: observing, non-reactivity, non-judgment, acting with awareness, and describing. In the absence of a comparison with change in the sham mindfulness group, these pre-post results would seem to suggest that the measure was sensitive to whether participants had practiced mindfulness. However, the changes did not differ between participants assigned to mindfulness and those simply asked to sit with their eyes closed.

I asked Chris Noone about the questionnaires his group used to assess mindfulness:

The participants genuinely thought they were meditating in the sham condition so I think both non-specific and demand characteristics were roughly equivalent across both groups. I’m also skeptical regarding the ability of the Five-Facet Mindfulness Questionnaire (or any mindfulness questionnaire for that matter) to capture anything other than “perceived mindfulness”. The items used in these questionnaires feature similar content to the scripts used by the people delivering the mindfulness (and sham) guided meditations. The improvement in critical thinking across both groups is just a mix of learning across a semester and habituation to the task (as the same problems were posed at both measurements).

What I like about this trial

The trial provides a critical test of a key claim for mindfulness:

Mindfulness should facilitate critical thinking in higher-education, based on early Buddhist conceptualizations of mindfulness as clarity of thought.

The trial was registered before recruitment and departures from protocol were noted.

Sample size was determined by power analysis.

The study had a closely matched, active control condition, a sham mindfulness treatment.

The credibility and equivalence of this sham condition versus the active treatment under study was repeatedly assessed.

“Manipulation checks were carried out to assess intervention acceptability, technology acceptance and meditation quality 2 weeks after baseline and 4 weeks after baseline.”

The study tested some a priori hypotheses about mediators and moderation.

Analyses were intention to treat.

How the study conflicts with past studies

Previous studies claimed to show positive effects of mindfulness on aspects of executive functioning [25, 26].

How the contradiction of past studies by these results is resolved

 “There are many studies using guided meditations similar to those in our mindfulness meditation condition, delivered through smartphone applications [49, 50, 52, 90, 91], websites [92, 93, 94, 95, 96, 97] and CDs [98, 99], which show effects on measures of outcomes reliably associated with increases in mindfulness such as depression, anxiety, stress, wellbeing and compassion. There are two things to note about these studies – they tend not to include a measure of dispositional mindfulness (e.g. only 4% of all mindfulness intervention studies reviewed in a recent meta-analysis include such measures at baseline and follow-up; [54]) and they usually employ a weak form of control group such as a no-treatment control or waitlist control [54]. Therefore, even when change in mindfulness is assessed in mindfulness meditation intervention studies, it is usually overestimated and this must be borne in mind when comparing the results of this study with those of previous studies. This combined with generally only moderate correlations with behavioural outcomes [54] suggests that when mindfulness interventions are effective, dispositional measures do not fully capture what has changed.”

The broader take away messages

“Our results show that, for most outcomes, there were significant changes from baseline to follow-up but none which can be specifically attributed to the practice of mindfulness.”

This creative use of a sham mindfulness control condition is a breakthrough that should be widely followed. First, it allowed a fair test of whether mindfulness is any better than another active, credible treatment. Second, because the active treatment was a sham, results provide a challenge to the notion that apparent effects of mindfulness on critical thinking are anything more than a placebo effect.

The Headspace App is enormously popular and successful, based on claims about what benefits its use will provide. Some of these claims may need to be tempered, not only in terms of critical thinking, but effects on well-being.

The Headspace App platform lends itself to such critical evaluations with respect to a sham treatment with a degree of standardization that is not readily possible with face-to-face mindfulness training. This opportunity should be exploited further with other active control groups constructed on the basis of specific hypotheses.

There is far too much research on the practice of mindfulness being done that does not advance understanding of what works or how it works. We need a lot fewer studies, and more with adequate control/comparison groups.

Perhaps we should have a moratorium on evaluations of mindfulness without adequate control groups.

Perhaps articles making enthusiastic claims for the benefits of mindfulness, aimed at broad audiences, should routinely note whether these claims are based on adequately controlled studies. Most are not.

Creating TED talks from peer-reviewed growth mindset research papers with colored brain pictures

The TED talk fallacy – When you confuse what presenters say about a peer-reviewed article – the breathtaking, ‘breakthrough’ strength of findings demanded for a TED talk – with what a transparent, straightforward analysis and reporting of relevant findings would reveal. 

mind the brain logo

The TED talk fallacy – When you confuse what presenters say about a peer-reviewed article – the breathtaking, ‘breakthrough’ strength of findings demanded for a TED talk – with what a transparent, straightforward analysis and reporting of relevant findings would reveal. 

A reminder that consumers, policymakers, and other stakeholders should not rely on TED talks for their views of what constitutes solid “science” or “best evidence,” even when presenters are established scientists.

The authors of this modest but overhyped paper do not give TED talks. But the article became the basis for a number of TED and TED-related talks by a psychologist who integrated a story of its findings with stories about her own publications. She has a booking agent for expensive talks and a line of self-help products. This raises the question: should such information routinely be reported as a conflict of interest in publications?

We will contrast the message of the paper under discussion in this post, along with the TED talk, with a new pair of comprehensive meta-analyses. The meta-analyses show that the association between growth mindset and academic achievement is weak and that interventions to improve mindset are ineffectual.

The study

 Moser JS, Schroder HS, Heeter C, Moran TP, Lee YH. Mind your errors: Evidence for a neural mechanism linking growth mind-set to adaptive posterror adjustments. Psychological Science. 2011 Dec;22(12):1484-9.

Key issues with the study.

The abstract is uninformative as a guide to what was done and what was found in this study. It ends with a rousing promotion of growth mindset as a way of understanding and improving academic achievement.

A study with N = 25 is grossly underpowered for most purposes and should not be used to generate estimates of associations.

Key details of methods and results needed for independent evaluation are not available in article.

The colored brain graphics in the article were labeled “for illustrative purposes only.”

Where would you find such images of the brain, untied to the data, in a credible neuroscience journal? Articles in real neuroscience journals are increasingly retracted because of the discovery of suspected pasted-in or altered brain graphics.

The discussion has a strong confirmation bias, ignoring relevant literature and overselling the use of event-related potentials for monitoring and evaluating the determinants of academic achievement.

The press release issued by the Association for Psychological Science.

How Your Brain Reacts To Mistakes Depends On Your Mindset

Concludes:

The research shows that these people are different on a fundamental level, Moser says. “This might help us understand why exactly the two types of individuals show different behaviors after mistakes.” People who think they can learn from their mistakes have brains that are tuned to pay more attention to mistakes, he says. This research could help in training people to believe that they can work harder and learn more, by showing how their brain is reacting to mistakes.

The abstract.

The abstract does not report basic details of methods and results, except what is consistent with the authors’ intended message. The crucial final sentence is quote-worthy and headed for clickbait. When we look at what was done and what was found in this study, this conclusion is grossly overstated.

How well people bounce back from mistakes depends on their beliefs about learning and intelligence. For individuals with a growth mind-set, who believe intelligence develops through effort, mistakes are seen as opportunities to learn and improve. For individuals with a fixed mind-set, who believe intelligence is a stable characteristic, mistakes indicate lack of ability. We examined performance-monitoring event-related potentials (ERPs) to probe the neural mechanisms underlying these different reactions to mistakes. Findings revealed that a growth mind-set was associated with enhancement of the error positivity component (Pe), which reflects awareness of and allocation of attention to mistakes. More growth-minded individuals also showed superior accuracy after mistakes compared with individuals endorsing a more fixed mind-set. It is critical to note that Pe amplitude mediated the relationship between mind-set and posterror accuracy. These results suggest that neural mechanisms indexing on-line awareness of and attention to mistakes are intimately involved in growth-minded individuals’ ability to rebound from mistakes.

The introduction.

The introduction opens with:

Decades of research by Dweck and her colleagues indicate that academic and occupational success depend not only on cognitive ability, but also on beliefs about learning and intelligence (e.g., Dweck, 2006).

This sentence echoes the Amazon blurb for the pop psychology book that is being cited:

After decades of research, world-renowned Stanford University psychologist Carol S. Dweck, Ph.D., discovered a simple but groundbreaking idea: the power of mindset. In this brilliant book, she shows how success in school, work, sports, the arts, and almost every area of human endeavor can be dramatically influenced by how we think about our talents and abilities.

Nowhere in the introduction are there balancing references to studies investigating Carol Dweck’s theory independently, from outside her group, nor any citation of inconsistent findings. This is a selective, strongly confirmation-driven review of the relevant literature. (Contrast this view with an independent assessment from a recent comprehensive meta-analysis at the end of this post.)

The method.

Twenty-five native-English-speaking undergraduates (20 female, 5 male; mean age = 20.25 years) participated for course credit.

There is no discussion of why a sample of only 25 participants was chosen or any mention of a power analysis.

If we stick to simple bivariate correlations with the full sample of N = 25, the smallest correlation that reaches significance at p < .05 is r = .40 (exact p = .0475), and the smallest that reaches p < .01 is r = .51 (exact p = .0092).

N = 25 does not allow reliable detection of a small-to-moderate relationship where one exists. Any significant finding will of necessity be large: r > .40 for p < .05 and r > .51 for p < .01.

As has been noted elsewhere:

In systematic studies of psychological and biomedical effect sizes (e.g., Meyer et al., 2001)  one rarely encounters correlations greater than .4.
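As a check on those thresholds, here is a minimal sketch (my own illustration, not anything from the paper; it assumes Python with numpy and scipy) that computes the smallest significant correlation for N = 25 and the approximate power to detect a moderate true effect:

```python
# A minimal sketch (my illustration, not from the paper): the smallest r
# reaching two-tailed significance with N = 25, and approximate power to
# detect a true r of .30.
import numpy as np
from scipy import stats

n = 25
df = n - 2

def critical_r(alpha):
    """Smallest |r| reaching two-tailed significance at alpha with df = n - 2."""
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return t_crit / np.sqrt(df + t_crit**2)

print(round(critical_r(0.05), 3))  # ~0.396 -- the r = .40 threshold above
print(round(critical_r(0.01), 3))  # ~0.505 -- the r = .51 threshold above

# Approximate power for a true r = .30, using Fisher's z transformation
true_r = 0.30
z_effect = np.arctanh(true_r) * np.sqrt(n - 3)
z_crit = stats.norm.ppf(1 - 0.05 / 2)
power = stats.norm.cdf(z_effect - z_crit) + stats.norm.cdf(-z_effect - z_crit)
print(round(power, 2))  # ~0.31 -- a roughly 70% chance of missing the effect
```

In other words, even if growth mindset had a moderate true correlation with these ERP measures, a study this size would miss it about seven times out of ten.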

How growth mindset scores were calculated is crucially important, but the information that is presented about the measure is inadequate. There is no reference to an established scale with psychometric data and cross validation. Rather:

Following the flanker task [a noise letter version of the Eriksen flanker task (Eriksen & Eriksen, 1974)], participants completed a TOI scale that asked respondents to rate the extent to which they agreed with four fixed-mind-set statements on a 6-point Likert-type scale (1 = strongly disagree, 6 = strongly agree). These statements (e.g., “You have a certain amount of intelligence and you really cannot do much to change it”) were drawn from previous studies measuring TOI (e.g., Hong, Chiu, Dweck, Lin, & Wan, 1999). TOI items were reverse-scored so that higher scores indicated more endorsement of a growth mind-set, and lower scores indicated more of a fixed mind-set.

Details in the referenced Hong et al (1999) study are difficult to follow, but the paper lays out the following requirement:

Those participants who believe that intelligence is fixed (entity theorists) should consistently endorse responses at the lower (agree) end of the scale (yielding a mean score of 3.0 or lower), whereas participants who believe that intelligence is malleable (incremental theorists) should consistently endorse responses at the upper (disagree) end of the scale (yielding a mean score of 4.0 or above).

If this distribution occurred naturally, it would be an extraordinary set of questions. In the Hong et al (1999) study, this distribution was achieved by throwing away data in the middle of the distribution that didn’t fit the investigators’ preconceived notion.

Excluding the middle third of a distribution of scores with only N = 25 compounds the errors associated with this practice in a larger sample. With the number of scores reduced to about N = 17, the influence of a single outlier participant is increased, and any generalization to the larger population becomes even more problematic. We cannot readily evaluate whether scores in the present sample were neatly and naturally bimodal: we are not provided the basic data, not even means and standard deviations, in text or a table. However, as we will see, one graphic representation leaves some doubts.
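A small simulation makes the problem concrete. The sketch below (my own illustration, not the authors’ analysis; plain numpy) generates samples of N = 25 with a modest true correlation, then drops the middle third of the predictor, as in Hong et al. (1999):

```python
# A minimal simulation (an illustration only): excluding the middle third
# of a predictor inflates the observed correlation on average and, with
# only ~17 cases left, makes it far less stable.
import numpy as np

rng = np.random.default_rng(0)
true_r, n, n_sims = 0.20, 25, 10_000
full_rs, trimmed_rs = [], []

for _ in range(n_sims):
    x = rng.standard_normal(n)
    y = true_r * x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)
    full_rs.append(np.corrcoef(x, y)[0, 1])
    # mimic the cut: keep only the outer thirds of the x distribution
    lo, hi = np.percentile(x, [100 / 3, 200 / 3])
    keep = (x <= lo) | (x >= hi)
    trimmed_rs.append(np.corrcoef(x[keep], y[keep])[0, 1])

print(np.mean(full_rs), np.std(full_rs))        # centered near the true .20
print(np.mean(trimmed_rs), np.std(trimmed_rs))  # inflated mean, wider spread
```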

Overview of data analyses.

Repeated measures analyses of variance (ANOVAs) were first conducted on behavioral and ERP measures without regard to individual differences in TOIs in order to establish baseline experimental effects. ANOVAs conducted on behavioral measures and the ERN included one 2-level factor: accuracy (error vs. correct response). The Pe [error positivity component ]was analyzed using a 2 (accuracy: error vs. correct response) × 2 (time window: 150–350 ms vs. 350–550 ms) ANOVA. Subsequently, TOI scores were entered into ANOVAs as covariates to assess the main and interactive effects of mind-set on behavioral and ERP measures. When significant effects of TOI score were detected, we conducted follow-up correlational analyses to aid in the interpretation of results.

Thus, multiple post hoc analyses examined the effects of growth mindset (TOI), contingent on whether significant main and interaction effects were obtained in other analyses, and significant effects were in turn followed up with correlational analyses.

Highlights of the results.

Only a few of the numerous analyses produced significant results for TOI. Given the sample size and the multiple tests without correction, we probably should not attach substantive interpretations to them.
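To make the multiple-testing problem concrete (this is my own count from the results quoted below): TOI was tested against at least six outcomes (overall accuracy, response speed, posterror adjustments, posterror accuracy, the ERN, and the Pe). A Bonferroni correction for six tests would put the significance threshold at .05/6 ≈ .008, a bar the key posterror-accuracy interaction (p < .05) would not clear.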

Behavioral data.

Overall accuracy was not correlated with TOI (r = .06, p > .79).

[Speed on error vs. correct trials] When TOI was entered into the ANOVA as a covariate, there were no significant effects (Fs < 1.78, ps > .19, ηp²s < .08) [where “ps” and “no significant effects” refer to either main or interaction effects].

[Posterror adjustments] When TOI was entered into the ANOVA as a covariate, there were no significant effects (Fs < 1.15, ps > .29, ηp²s < .05).

When entered into the ANOVA as a covariate, however, TOI scores interacted with postresponse accuracy, F(1, 23) = 5.22, p < .05, ηp² = .19. Correlational analysis showed that as TOI scores increased, indicating a growth mind-set, so did accuracy on trials immediately following errors relative to accuracy on trials immediately following correct responses (i.e., posterror accuracy – postcorrect-response accuracy; r = .43, p < .05).

ERPs (event-related potentials).

As expected, the ANOVA confirmed greater ERP negativity on error trials (M = –3.43 μV, SD = 4.76 μV) relative to correct trials (M = –0.23 μV, SD = 4.20 μV), F(1, 24) = 24.05, p < .001, ηp² = .50, in the 0- to 100-ms postresponse time window. This result is consistent with the presence of an ERN. There were no significant effects involving TOI (Fs < 1.24, ps > .27, ηp²s < .06).

When entered as a covariate, TOI showed a significant interaction with accuracy, F(1, 23) = 8.64, p < .01, ηp² = .27. Correlational analysis demonstrated that as TOI scores increased so did positivity on error trials relative to correct trials averaged across both time windows (i.e., error activity – correct-response activity; r = .52, p < .01).

Mediation analysis.

As Figure 2 illustrates, controlling for Pe amplitude significantly attenuated the relationship between TOI scores and posterror accuracy. The 95% confidence intervals derived from the bootstrapping test did not include zero (.01–.04), and thus indicated significant mediation.

So, the a priori conditions for testing a significant mediation were met because a statistical test barely excluded zero (.01–.04), with no correction for the many tests of TOI in the study. But what are we doing exploring mediation with N = 25?
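For a sense of how fragile such a test is, here is a minimal sketch with simulated data (my own illustration, not the authors’ code or data; plain numpy) of a percentile-bootstrap confidence interval for an indirect effect a × b at n = 25:

```python
# A minimal sketch with simulated data (an illustration only, not the
# authors' analysis): percentile-bootstrap CI for an indirect effect
# a*b (TOI -> Pe -> posterror accuracy) at n = 25.
import numpy as np

rng = np.random.default_rng(1)
n, n_boot = 25, 5000

# hypothetical true paths: TOI -> Pe (a = 0.5), Pe -> accuracy (b = 0.4)
toi = rng.standard_normal(n)
pe = 0.5 * toi + rng.standard_normal(n)
acc = 0.4 * pe + rng.standard_normal(n)

def indirect_effect(toi, pe, acc):
    """a*b: slope of Pe on TOI, times slope of accuracy on Pe controlling TOI."""
    a = np.polyfit(toi, pe, 1)[0]
    X = np.column_stack([pe, toi, np.ones(len(toi))])
    b = np.linalg.lstsq(X, acc, rcond=None)[0][0]
    return a * b

boots = []
for _ in range(n_boot):
    idx = rng.integers(0, n, n)
    boots.append(indirect_effect(toi[idx], pe[idx], acc[idx]))

# The interval is wide relative to the true indirect effect of 0.2, and
# whether it excludes zero can turn on a handful of influential cases.
print(np.percentile(boots, [2.5, 97.5]))
```

With 25 cases, resampling the same few participants over and over cannot conjure up information that is not there; an interval that “barely excludes zero” is exactly what chance plus flexible analysis can deliver.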

Distribution of TOI [growth mindset] scores.

Let’s look at the distribution of TOI scores in a graph available as the x-axis in Figure 1.

graph with outlier

Any dichotomization of these continuous scores would be arbitrary. Close scores clustered on different sides of the median would be considered different, but diverging scores on the same side of the median would be treated as the same. Any association between TOI and ERPs (event-related potentials) could be due to one or a few interindividual differences in brains or to intraindividual variability of ERPs over occasions. These are not the kind of data from which generalizable estimates of effects can be obtained.

The depiction of brains with fixed versus growth mindsets.

The one picture of brains in the main body of this article supposedly contrasts fixed versus growth mindsets. The differences appear dramatic, in sharply contrasting colors. But in the article itself, no such dichotomization is discussed. Nor should it be. Furthermore, the simulation is based on isolating one of the few significant effects of TOI. Readers are cautioned that the picture is “for illustrative purposes only.”

fixed vs growth mind set

The discussion.

Similar to the introduction, the discussion cites the literature selectively, with a strong confirmation bias. There is no reference to weak or null findings or to any controversy concerning growth mindset that might have accumulated over a decade of research. There is no acknowledgment of the folly of making substantive interpretations of significant findings from such a small, underpowered study. Results of the mediation analysis are confidently presented, with no indication of doubts about whether they should even have been conducted, or that, even under the best of circumstances, such mediational analyses remain correlational and provide only weak evidence of causal mechanisms. Event-related potentials are proposed as biomarkers and as surrogate outcomes in implementations of growth mindset interventions. A lot of misunderstanding and neurononsense is crammed into a few sentences. There is no mention of any limitations of the study.

The APS Observer press release revisited.

Why was this article recognized with a special press release by the APS? The press release is tied much more to the authors’ claims about their study than to their actual methods and results. It provides an opportunity to publicize the study with further exaggeration of what it accomplished.

This is an unfortunate message to authors about what they need to do to be promoted by APS. Your intended message can override your actual results if you strategically emphasize the message and downplay any discrepancy with the results. Don’t mention any limitations of your study.

The TED talks.

A number of TED and TED-related talks incorporate a discussion of the study, with its picture of fixed versus growth mindset brains. There is remarkable overlap among these talks. I have chosen the TEDxNorrkoping talk, The power of believing that you can improve, because it had a handy transcript available.

same screenshot in TED talk

On the left, you see the fixed-mindset students. There’s hardly any activity. They run from the error. They don’t engage with it. But on the right, you have the students with the growth mindset, the idea that abilities can be developed. They engage deeply. Their brain is on fire with yet. They engage deeply. They process the error. They learn from it and they correct it.

“On fire”? The presenter exploits the arbitrary red color chosen for the for-illustrative-purposes-only picture.

The brain graphic is reduced to a cartoon in a comic book-level account of action heroes engaging their errors deeply, learning from them, and correcting their next response, while ordinary mortals run away like cowards.

The presenter soon introduces another cartoon for her comic book depiction of the effects of growth mindset on the brain. But first, here is an overview of how this talk fits the predictable structure of a TED talk.

The TED talk begins with a personal testimony concerning “a critical event early in my career, a real turning point.” It is recognizable to TED talk devotees as an epiphany (an “epiphimony,” if you like) through which the speaker shares a personal journey of insight and realisation, its triumphs and tribulations. In telling the story, the presenter introduces an epic struggle between the children of the darkness (the “now” of a fixed mindset) versus the children of the light (the “yet” or “not yet” of a growth mindset).

There is much more the sense of a televangelist than of an academic presenting an accurate summary of her research to a lay audience. Sure, the live audience and the millions of viewers of this and related talks were not seeking a colloquium or even a Café Scientifique. The audience came to be entertained with a good story. But how much license can be taken with the background science? After all, the information being discussed is relevant to their personal decisions as parents, and as citizens of communities making important choices about how to improve academic performance. The issue becomes more serious when the presenter gets to claims of dramatic transformations of impoverished students in economically deprived school settings.

The presenter cites one of her studies for an account of what students “gripped with the tyranny of now” did in difficult learning experiences:

So what do they do next? I’ll tell you what they do next. In one study, they told us they would probably cheat the next time instead of studying more if they failed a test. In another study, after a failure, they looked for someone who did worse than they did so they could feel really good about themselves.

cheat vs study

We are encouraged to think, ‘Students with a fixed mindset cheat instead of studying more. How horrible!’ But I looked up the study:

Blackwell LS, Trzesniewski KH, Dweck CS. Implicit Theories of Intelligence Predict Achievement Across an Adolescent Transition: A Longitudinal Study and an Intervention. Child Development. 2007 Jan 1;78(1):246-63.

I searched for “cheat” and found one mention:

Students rated how likely they would be to engage in positive, effort-based strategies (e.g., ‘‘I would work harder in this class from now on’’ ‘‘I would spend more time studying for tests’’) or negative, effort-avoidant strategies (e.g., ‘‘I would try not to take this subject ever again’’ ‘‘I would spend less time on this subject from now on’’ ‘‘I would try to cheat on the next test’’). Positive and negative items were combined to form a mean Positive Strategies score.

All subsequent reporting of results was in terms of this composite Positive Strategies score. So, I was unable to evaluate how commonly “I would try to cheat…” was endorsed.

Three minutes into the talk, the speaker introduces an element of moral panic about a threat to Western civilization as we know it:

How are we raising our children? Are we raising them for now instead of yet? Are we raising kids who are obsessed with getting As? Are we raising kids who don’t know how to dream big dreams? Their biggest goal is getting the next A, or the next test score? And are they carrying this need for constant validation with them into their future lives? Maybe, because employers are coming to me and saying, “We have already raised a generation of young workers who can’t get through the day without an award.”

Less than a minute later, the presenter gets ready to roll out her solution.

So what can we do? How can we build that bridge to yet?

Praising performance in terms of fixed characteristics like IQ or ability is ridiculed. However, great promises are made for praising process, regardless of outcome.

Here are some things we can do. First of all, we can praise wisely, not praising intelligence or talent. That has failed. Don’t do that anymore. But praising the process that kids engage in, their effort, their strategies, their focus, their perseverance, their improvement. This process praise creates kids who are hardy and resilient.

“Yet” or “not yet” becomes a magical incantation. The presenter builds on her comic book science of the effects of growth mindset by introducing a cartoon of a synapse (mislabeled as a neuron), linked to her own research only by some wild speculation.

build stronger connections synapse

Just the words “yet” or “not yet,” we’re finding, give kids greater confidence, give them a path into the future that creates greater persistence. And we can actually change students’ mindsets. In one study, we taught them that every time they push out of their comfort zone to learn something new and difficult, the neurons in their brain can form new, stronger connections, and over time, they can get smarter.

I found no relevant measurements of brain activity in Dweck’s studies, but let’s not ruin a good story.

Look what happened: In this study, students who were not taught this growth mindset continued to show declining grades over this difficult school transition, but those who were taught this lesson showed a sharp rebound in their grades. We have shown this now, this kind of improvement, with thousands and thousands of kids, especially struggling students.

Up until now, we have had disappointingly hyped and inaccurate accounts of how to foster academic achievement. But the talk soon turns into a cruel hoax when claims are made about improving the performance of underprivileged children in under-resourced settings.

So let’s talk about equality. In our country, there are groups of students who chronically underperform, for example, children in inner cities, or children on Native American reservations. And they’ve done so poorly for so long that many people think it’s inevitable. But when educators create growth mindset classrooms steeped in yet, equality happens. And here are just a few examples. In one year, a kindergarten class in Harlem, New York scored in the 95th percentile on the national achievement test. Many of those kids could not hold a pencil when they arrived at school. In one year, fourth-grade students in the South Bronx, way behind, became the number one fourth-grade class in the state of New York on the state math test. In a year, to a year and a half, Native American students in a school on a reservation went from the bottom of their district to the top, and that district included affluent sections of Seattle. So the Native kids outdid the Microsoft kids.

This happened because the meaning of effort and difficulty were transformed. Before, effort and difficulty made them feel dumb, made them feel like giving up, but now, effort and difficulty, that’s when their neurons are making new connections, stronger connections. That’s when they’re getting smarter.

“So the Native kids outdid the Microsoft kids.” There is some kind of poetic license being taken here in describing the results of an intervention. The message is that subjective mindset can trump entrenched structural inequalities and accumulated deficits in skills and knowledge, as well as limits on ability. All school staff and parents need to do is wave the magic wand and recite the incantation “not yet.” How reassuring to politicians who control resources but do not want to fund schools adequately: they need only exhort anyone who wants to improve outcomes to recite the magic.

And what do we say when we don’t witness dramatic improvements? Who is to blame when such failures need to be explained? The cruel irony is that school boards will blame principals, who will blame teachers, and parents will blame schools and their children. All will be held to unrealistic expectations.

But it gets worse. The presenter ends with a call to action arguing that not buying into her program would violate the human rights of vulnerable children.

Let’s not waste any more lives, because once we know that abilities are capable of such growth, it becomes a basic human right for children, all children, to live in places that create that growth, to live in places filled with “yet”.

Paradox: Do poor kids with a growth mindset suffer negative consequences?

Maybe so, suggests some recent research concerning the longer-term outcomes of disadvantaged African American children.

A newly published study in the peer-reviewed journal Child Development …finds traditionally marginalized youth who grew up believing in the American ideal that hard work and perseverance naturally lead to success show a decline in self-esteem and an increase in risky behaviors during their middle-school years. The research is considered the first evidence linking preteens’ emotional and behavioral outcomes to their belief in meritocracy, the widely held assertion that individual merit is always rewarded.

“If you’re in an advantaged position in society, believing the system is fair and that everyone could just get ahead if they just tried hard enough doesn’t create any conflict for you … [you] can feel good about how [you] made it,” said Erin Godfrey, the study’s lead author and an assistant professor of applied psychology at New York University’s Steinhardt School. But for those marginalized by the system—economically, racially, and ethnically—believing the system is fair puts them in conflict with themselves and can have negative consequences.

We know surprisingly little about the adverse events associated with growth mindset interventions or their negative unintended consequences for children and school systems. Cost-benefit analyses of mindset interventions should compare them with academic interventions known to be effective when delivered with equivalent resources, not with no treatment.

Overall associations of growth mindset with academic achievement are weak and interventions are not effective.

Sisk VF, Burgoyne AP, Sun J, Butler JL, Macnamara BN. To What Extent and Under Which Circumstances Are Growth Mind-Sets Important to Academic Achievement? Two Meta-Analyses. Psychological Science. 2018 Mar 1:0956797617739704.

This newly published article in Psychological Science starts by noting the influence of growth mindset:

These ideas have led to the establishment of nonprofit organizations (e.g., Project for Education Research that Scales [PERTS]), for-profit entities (e.g., Mindset Works, Inc.), schools purchasing mind-set intervention programs (e.g., Brainology), and millions of dollars in funding to individual researchers, nonprofit organizations, and for-profit companies (e.g., Bill and Melinda Gates Foundation,1 Department of Education,2 Institute of Educational Sciences3).

In our first meta-analysis (k = 273, N = 365,915), we examined the strength of the relationship between mind-set and academic achievement and potential moderating factors. In our second meta-analysis (k = 43, N = 57,155), we examined the effectiveness of mind-set interventions on academic achievement and potential moderating factors. Overall effects were weak for both meta-analyses.

The first meta-analysis integrated 273 effect sizes. The overall effect was very weak by conventional standards, hardly consistent with the TED talks.

The meta-analytic average correlation (i.e., the average of various population effects) between growth mind-set and academic achievement is r̄ = .10, 95% confidence interval (CI) = [.08, .13], p < .001.
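To put that number in perspective: r = .10 corresponds to r² = .01, so mindset accounts for about 1% of the variance in academic achievement, leaving roughly 99% to everything else.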

The data set of effects of growth mindset interventions integrated 43 effect sizes; 37 of the 43 (86%) were not significantly different from zero.

The authors conclude:

Some researchers have claimed that mind-set interventions can “lead to large gains in student achievement” and have “striking effects on educational achievement” (Yeager & Walton, 2011, pp. 267 and 268, respectively). Overall, our results do not support these claims. Mind-set interventions on academic achievement were nonsignificant for adolescents, typical students, and students facing situational challenges (transitioning to a new school, experiencing stereotype threat). However, our results support claims that academically high-risk students and economically disadvantaged students may benefit from growth-mind-set interventions (see Paunesku et al., 2015; Raizada & Kishiyama, 2010), although these results should be interpreted with caution because (a) few effect sizes contributed to these results, (b) high-risk students did not differ significantly from non-high-risk students, and (c) relatively small sample sizes contributed to the low-SES group.

Part of the reshaping effort has been to make funding mind-set research a “national education priority” (Rattan et al., 2015, p. 723) because mind-sets have “profound effects” on school achievement (Dweck, 2008, para. 2). Our meta-analyses do not support this claim.

And

From a practical perspective, resources might be better allocated elsewhere than mind-set interventions. Across a range of treatment types, Hattie, Biggs, and Purdie (1996) [https://www.teachertoolkit.co.uk/wp-content/uploads/2014/04/effect-of-learning-skills.pdf ] found that the meta-analytic average effect size for a typical educational intervention on academic performance is 0.57. All meta-analytic effects of mind-set interventions on academic performance were < 0.35, and most were null. The evidence suggests that the “mindset revolution” might not be the best avenue to reshape our education system.

The presenter’s speaker fees.

Presenters of TED talks are not paid, but a successful talk can lead to lucrative speaking engagements. It is informative to Google the speaking fees of the presenters of highly accessed TED talks. In the case of Carol Dweck, I found the booking agency All American Speakers.

carol dweck speaking

fee range

Mindsetonline provides products for sale as well as success stories about people and organizations adopting a growth mindset.

buy the book

buy the software

business and leadership

There is even a 4-item measure of mindset you can complete online. Each of the items is some paraphrase of “you can’t change your intelligence very much,” either stated straightforwardly or reversed (“you can”).

Consumers beware! TED talks are not reliable sources of best evidence.

TED talks are to best evidence what historical fiction is to history.

Even TED talks by eminent psychologists are often little more than infomercials for self-help products, lucrative speaking engagements, and workshops.

Academics are under increasing pressure to demonstrate that the impact of their work goes beyond citations of publications in prestigious journals. Social impact is being used to balance journal impact factors.

It is also being recognized that outreach requires equipping lay audiences to grasp what are initially difficult or confusing concepts.

But pictures of colored brains can be used to dumb down consumers and to disarm their intuitive skepticism about behavioral science working magic and miracles. Even PhD psychologists are inclined to be overly impressed when references to neuroscience and pictures of colored brains are introduced into a discussion. The vulnerability of lay audiences to neurononsense or neurobollocks is even greater.

False and exaggerated claims about academic interventions harm school systems, teachers, and, ultimately, students. In communicating with lay audiences, psychologists need to be sensitive to the misunderstandings they may be reinforcing. They have an ethical responsibility to do their best to strengthen the critical thinking skills of their audiences, not damage them.

TED talks and declarations of potential conflicts of interest.

Personally, I have found that calling out the pseudoscience behind claims for unproven medicine like acupuncture or homeopathy does not produce much blowback, except from proponents of these treatments. Similarly, campaigning for better disclosure of potential conflicts of interest does not meet much resistance when the focus is on pharmaceutical companies.

However, it is a whole different matter to call out the pseudoscience behind self-help and exaggerated or outright false claims about behavioral science being able to work miracles and magic. There seems to be a double standard in psychology by which it is inappropriate to exaggerate the strength of findings when communicating with other professionals, but in communicating with lay audiences, it is perfectly okay.

We need to think about TED talks more like we think about talks by opinion leaders with ties to the pharmaceutical industry. Presenters should start with a standard slide disclosing financial interests that may influence the opinions offered about specific products mentioned in the talk. Given the pressure to get findings that will fit into the next TED talk, presenters should routinely disclose in their peer-reviewed articles that they give TED talks or have a booking agent.