Science Media Centre concedes negative reaction from scientific community to coverage of Esther Crawley’s SMILE trial.

“It was the criticism from within the scientific community that we had not anticipated.”

Editorial from the Science Media Centre

SEPTEMBER 28, 2017

Inconvenient truths

http://www.sciencemediacentre.org/inconvenient-truths/

 

“It was the criticism from within the scientific community that we had not anticipated.”

“This time the SMC also came under fire from our friends in science…Quack buster extraordinaire David Colquhoun tweeted, ‘More reasons to be concerned about @SMC_London?’

Other friends wrote to us expressing concern about the unintended consequences of SMC briefings – with one saying that policy makers were furious at having to deal with the fallout from our climate briefing and others worried that the briefing on the CFS/ME trial would allow the only private company offering the treatment to profit by over-egging preliminary findings.”

Those of us who are accustomed to the highly slanted coverage of select topics by the Science Media Centre UK (SMC) can detect a familiar defensive, yet self-congratulatory, tone in an editorial put out by the SMC in reaction to its broad coverage of Esther Crawley’s SMILE trial of the quack treatment, Phil Parker’s Lightning Process. Once again, critics of ineffectual treatments being offered for chronic fatigue syndrome/myalgic encephalomyelitis, both patients and professionals, are lumped with climate change deniers. Ho-hum, this comparison is getting so clichéd.

Perhaps even better, the SMC editorial’s concessions of poor coverage of the SMILE trial drew sharp amplifications from commentators that SMC had botched the job.

Here are some comments below, with emphases added. But let’s not be lulled by SMC into assuming that these intelligent, highly articulate comments are necessarily from the professional community. I wouldn’t be surprised if hiding behind the pseudonyms are some of the excellent citizen scientists that the patient community has had to grow in the face of vilification and stigmatization led by SMC.

I actually think I recognize a spokesperson from the patient community writing under the pseudonym ‘Scary vocal critic.’

Scary vocal critic says:

September 29, 2017 at 5:59 am

The way that this blog glosses over important details in order to promote a simplistic narrative is just another illustration of why so many are concerned by Fiona Fox’s work, and the impact [of] the Science Media Centre.

Let’s look in a bit more detail at the SMILE trial, from Esther Crawley at Bristol University. This trial was intended to assess the efficacy of Phil Parker’s Lightning Process©. Phil Parker has a history of outlandish medical claims about his ability to heal others, selling training in “the use of divination medicine cards and tarot as a way of making predictions” and providing a biography which claimed: “Phil Parker is already known to many as an inspirational teacher, therapist, healer and author. His personal healing journey began when, whilst working with his patients as an osteopath. He discovered that their bodies would suddenly tell him important bits of information about them and their past, which to his surprise turned out to be factually correct! He further developed this ability to step into other people’s bodies over the years to assist them in their healing with amazing results. After working as a healer for 20 years, Phil Parker has developed a powerful and magical program to help you unlock your natural healing abilities. If you feel drawn to these courses then you are probably ready to join.” https://web.archive.org/web/20070615014926/http://www.healinghawk.com/prospectushealing.htm

While much of the teaching materials for the Lightning Process are not available for public scrutiny (LP being copyrighted and controlled by Phil Parker), it sells itself as being founded on neurolinguistic programming and osteopathy, which are themselves forms of quackery. Those who have been on the course have described a combination of strange rituals, intensive positive affirmations, and pseudoscientific neuro-babble; all adding up to promote the view that an individual’s ill-health can be controlled if only they are sufficiently committed to the Lightning Programme. Bristol University appears to have embraced the neurobabble, and in their press release about the SMILE results they describe LP thus: “It is a three-day training programme run by registered practitioners and designed to teach individuals a new set of techniques for improving life and health, through consciously switching on health promoting neurological pathways.”

https://www.bristol.ac.uk/news/2017/september/lightning-process.html

Unsurprisingly, many patients have complained about paying for LP and receiving manipulative quackery. This can have unpredictable consequences. This article reports a child attempting to kill themselves after going on the Lightning Process:  Before conducting a trial, the researchers involved had a responsibility to examine the course and training materials and remove all pseudo-science, yet this was not done. Instead, those patient groups raising concerns about the trial were smeared, and presented as being opposed to science.

The SMILE trial was always an unethical use of research funding, but if it had followed its original protocol, it would have been less likely to generate misleading results and headlines. The Skeptics Dictionary’s page on the Lightning Process features a contribution which explains that: “the Lightning Process RCT being carried out by Esther Crawley changed its primary outcome measure from school attendance to scores on a self-report questionnaire. Given that LP involves making claims to patients about their own ability to control symptoms in exactly the sort of way likely to lead to response bias, it seems very likely that this trial will now find LP to be ‘effective’. One of the problems with EBM is that it is often difficult to reliably measure the outcomes that are important to patients and account for the biases that occur in non-blinded trials, allowing for exaggerated claims of efficacy to be made to patients.”

The SMILE trial was a nonblinded, A vs A+B design, testing a ‘treatment’ which included positive affirmations, and then used subjective self-report questionnaires as a primary outcome. This is not a sensible way of conducting a trial, as anyone who has looked at how junk-science can be used to promote quackery will be aware.

You can see the original protocol for the SMILE trial here (although this protocol refers to merely a feasibility study, this is the same research, with the same ethical review code, the feasibility study having seemingly been converted to a full trial a year into the research):

The protocol states that: “The primary outcome measure for the interventions will be school attendance/home tuition at 6 months.” It is worth noting that the new SMILE paper reported that there was no significant difference between groups for what was the trial’s primary outcome. There was a significant difference at 12 months, but by this point data on school attendance was missing for one third of the participants of the LP arm. The SMC failed to inform journalists of this outcome switching, instead presenting Prof Crawley as a critic converted by a rigorous examination of the evidence, despite her having told the ethics review board in 2010 that “she has worked before with the Bath [LP] practitioner who is good”. https://meagenda.wordpress.com/2011/01/06/letter-issued-by-nres-following-scrutiny-of-complaints-in-relation-to-smile-lighting-process-pilot-study/

Also, while the original protocol, and a later analysis plan, refer to verifying self-reported school attendance with school records, I could see no mention of this in the final paper, so it may be that even this more objective outcome measure has been rendered less useful and more prone to problems with response bias.

Back to Fiona Fox’s blog: “If you had only read the headlines for the CFS/ME story you may conclude that the treatment tested at Bristol might be worth a try if you are blighted by the illness, when in truth the author said repeatedly that the findings would first have to be replicated in a bigger trial.”

How terrible of sloppy headline writers to misrepresent research findings. This is from the abstract of Esther Crawley’s paper: “Conclusion The LP is effective and is probably cost-effective when provided in addition to SMC for mild/moderately affected adolescents with CFS/ME.” http://adc.bmj.com/content/early/2017/09/20/archdischild-2017-313375

Fox complains of “vocal critics of research” in the CFS and climate change fields. There has been a prolonged campaign from the SMC to smear those patients and academics who have been pointing out the problems with poor quality UK research into CFS, attempting to lump them with climate change deniers, anti-vaccinationists and animal rights extremists. The SMC used this campaign as an example of when they had “engineered the coverage” by “seizing the agenda”:

http://www.sciencemediacentre.org/wp-content/uploads/2013/03/Review-of-the-first-three-years-of-the-mental-health-research-function-at-the-Science-Media-Centre.pdf

Despite dramatic claims of a fearsome group of dangerous extremists (“It’s safer to insult the Prophet Mohammed than to contradict the armed wing of the ME brigade”), a Freedom of Information request helped us gain some valuable information about exactly what behaviour most concerned victimised researchers such as Esther Crawley:

“Minutes from a 2013 meeting held at the Science Media Centre, an organisation that played an important role in promoting misleading claims about the PACE trial to the UK media, show these CFS researchers deciding that “harassment is most damaging in the form of vexatious FOIs [Freedom of Information requests]”.[13,16, 27-31] The other two examples of harassment provided were “complaints” and “House of Lords debates”.[13] It is questionable whether such acts should be considered forms of harassment.

http://www.centreforwelfarereform.org/news/major-breaktn-pace-trial/00296.html

[A full copy of the minutes is included at the above address.]

Since then, a seriously ill patient managed to win a legal battle against researchers over the release of key trial data, picking apart the prejudices that were promoted and leading the Judge to state that “assessment of activist behaviour was, in our view, grossly exaggerated and the only actual evidence was that an individual at a seminar had heckled Professor Chalder.” http://www.informationtribunal.gov.uk/DBFiles/Decision/i1854/Queen%20Mary%20University%20of%20London%20EA-2015-0269%20(12-8-16).PDF

So why would there be an attempt to present requests for information, complaints, and mere debate, as forms of harassment? Rather embarrassingly for Fiona and the SMC, it has since become clear. Following the release of (still only some of) the data from the £5 million PACE trial it is now increasingly recognised within the academic community that patients were right to be concerned about the quality of these researchers’ work, and the way in which people had been misled about the trial’s results. The New York Times reported on calls for the retraction of a key PACE paper (Robin Murray, the journal’s editor and a close friend of Simon Wessely’s, does not seem keen to discuss and debate the problems with this work): https://www.nytimes.com/2017/03/18/opinion/sunday/getting-it-wrong-on-chronic-fatigue-syndrome.html The Journal of Health Psychology has published a special issue devoted to the PACE trial debacle: http://journals.sagepub.com/doi/full/10.1177/1359105317722370 The CDC has dropped promotion of CBT and GET: https://www.statnews.com/2017/09/25/chronic-fatigue-syndrome-cdc/ And NICE has decided that a full review of its guidelines for CFS is necessary, citing concerns about research such as PACE as one of the key reasons for this: https://www.nice.org.uk/guidance/cg53/resources/surveillance-report-2017-chronic-fatigue-syndromemyalgic-encephalomyelitis-or-encephalopathy-diagnosis-and-management-2007-nice-guideline-cg53-4602203537/chapter/how-we-made-the-decision https://www.thetimes.co.uk/edition/news/mutiny-by-me-sufferers-forces-a-climbdown-on-exercise-treatment-npj0spq0w

The SMC’s response to this has not been impressive.

Fox writes: “Both briefings fitted the usual mould: top quality scientists explaining their work to smart science journalists and making technical and complex studies accessible to readers.”

I’d be interested to know how it was Fox decided that Crawley was a top quality scientist. Also, it is worrying that the culture of UK science journalism seems to assume that making technical and complex studies (like SMILE?!) accessible for readers is their highest goal. It is not a surprise that it is foreign journalists who have produced more careful and accurate coverage of the PACE trial scandal.

Unlike the SMC and some CFS researchers, I do not consider complaints or debate to be a form of harassment, and would be quite happy to respond to anyone who disagrees with the concerns I have laid out here. I have had to simplify things, but believe that I have not done so in a way which favours my case. It seems that there are few people willing to try to publicly defend the PACE trial anymore, and I have never seen anyone from the SMC attempt to respond to anything other than a straw-man representation of their critics. Let’s see what response these inconvenient truths receive.

Michael Emmans-Dean says:

October 2, 2017 at 8:22 am

The only point I would add to this excellent post is to ask why on earth the SMC decided to feature such a small, poorly-designed trial as SMILE. The most likely explanation is that it was intended as a smokescreen for an inconvenient truth. NICE’s retrieval of their CFS guideline from the long grass (the “static list”) is a far bigger story and it was announced in the same week that SMILE was published.

Fiona Roberts says:

September 29, 2017 at 9:03 am

Hear hear!

Jane Brody promoting the pseudoscience of Barbara Fredrickson in the New York Times

Journalists’ coverage of positive psychology and health is often shabby, even in prestigious outlets like The New York Times.

Jane Brody’s latest installment of the benefits of being positive on health relied heavily on the work of Barbara Fredrickson that my colleagues and I have thoroughly debunked.

All of us need to recognize that studies of the effects of positive psychology interventions are often disguised randomized controlled trials.

With that insight, we need to evaluate this research in terms of reporting standards like CONSORT and declarations of conflict of interests.

We need to be more skeptical about the ability of small changes in behavior to profoundly improve health.

When in doubt, assume that much of what we read in the media about positivity and health is false or at least exaggerated.

Jane Brody starts her article in The New York Times by describing how most mornings she is “grinning from ear to ear, uplifted not just by my own workout but even more so” by her interaction with toddlers on the way home from where she swims. When I read Brody’s “Turning Negative Thinkers Into Positive Ones,” I was not left grinning ear to ear. I was left profoundly bummed.

I thought real hard about what was so unsettling about Brody’s article. I now have some clarity.

I don’t mind suffering even pathologically cheerful people in the morning. But I do get bothered when they serve up pseudoscience as the real thing.

I had expected to be served up Brody’s usual recipe of positive psychology pseudoscience concocted to coerce readers into heeding her Barnum advice about how they should lead their lives. “Smile or die!” Apologies to my friend Barbara Ehrenreich for my putting the retitling of her book outside of North America to use here. I invoke the phrase because Jane Brody makes the case that unless we do what she says, we risk hurting our health and shortening our lives. So we better listen up.

What bummed me most this time was that Brody was drawing on the pseudoscience of Barbara Fredrickson that my colleagues and I have worked so hard to debunk. We took the trouble of obtaining data sets for two of her key papers for reanalysis. We were dismayed by the quality of the data. To start with, we uncovered carelessness at the level of data entry that undermined her claims. But her basic analyses and interpretations did not hold up either.

Fredrickson publishes exaggerated claims about dramatic benefits of simple positive psychology exercises. Fredrickson is very effective in blocking or muting the publication of criticism and getting on with hawking her wares. My colleagues and I have talked to others who similarly met considerable resistance from editors in getting detailed critiques and re-analyses published. Fredrickson is also aided by uncritical people like Jane Brody who promote her weak and inconsistent evidence as strong stuff. It sells a lot of positive psychology merchandise to needy and vulnerable people, like self-help books and workshops.

If it is taken seriously, Fredrickson’s research concerns health effects of behavioral intervention. Yet, her findings are presented in a way that does not readily allow their integration with the rest of health psychology literature. It would be difficult, for instance, to integrate Fredrickson’s randomized trials of loving-kindness meditation with other research because she makes it almost impossible to isolate effect sizes in a way that they could be integrated with other studies in a meta-analysis. Moreover, Fredrickson has multiply published contradictory claims from the same data set without acknowledging the duplicate publication. [Please read on. I will document all of these claims before the post ends.]

The need of self-help gurus to generate support for their dramatic claims in lucrative positive psychology self-help products is never acknowledged as a conflict of interest.  It should be.

Just imagine if someone had a contract based on a book prospectus promising that the claims of their last pop psychology book would be surpassed. Such books inevitably paint life too simply, with simple changes in behavior having profound and lasting effects unlike anything obtained in the randomized trials of clinical and health psychology. Readers ought to be informed that these pressures to meet demands of a lucrative book contract could generate a strong confirmation bias. Caveat emptor auditor, but how about at least informing readers and letting them decide whether following the money influences their interpretation of what they read?

Psychology journals almost never require disclosures of conflicts of interest of this nature. I am campaigning to make that practice routine, with nondisclosure of such financial benefits treated as tantamount to scientific misconduct. I am calling for readers to take to social media when these disclosures do not appear in scientific journals where they should be featured prominently, and to hold editors responsible for non-enforcement. I can cite Fredrickson’s work as a case in point, but there are many other examples, inside and outside of positive psychology.

Back to Jane Brody’s exaggerated claims for Fredrickson’s work.

I lived for half a century with a man who suffered from periodic bouts of depression, so I understand how challenging negativism can be. I wish I had known years ago about the work Barbara Fredrickson, a psychologist at the University of North Carolina, has done on fostering positive emotions, in particular her theory that accumulating “micro-moments of positivity,” like my daily interaction with children, can, over time, result in greater overall well-being.

The research that Dr. Fredrickson and others have done demonstrates that the extent to which we can generate positive emotions from even everyday activities can determine who flourishes and who doesn’t. More than a sudden bonanza of good fortune, repeated brief moments of positive feelings can provide a buffer against stress and depression and foster both physical and mental health, their studies show.

“Research…demonstrates” (?). Brody is feeding stupid-making pablum to readers. Fredrickson’s kind of research may produce evidence one way or the other, but it is too strong a claim, an outright illusion, to even begin suggesting that it “demonstrates” (proves) what follows in this passage.

Where, outside of tabloids and self-help products, do we find the immodest claim that one or a few poor-quality studies “demonstrate” anything?

Negative feelings activate a region of the brain called the amygdala, which is involved in processing fear and anxiety and other emotions. Dr. Richard J. Davidson, a neuroscientist and founder of the Center for Healthy Minds at the University of Wisconsin — Madison, has shown that people in whom the amygdala recovers slowly from a threat are at greater risk for a variety of health problems than those in whom it recovers quickly.

Both he and Dr. Fredrickson and their colleagues have demonstrated that the brain is “plastic,” or capable of generating new cells and pathways, and it is possible to train the circuitry in the brain to promote more positive responses. That is, a person can learn to be more positive by practicing certain skills that foster positivity.

We are knee deep in neuro-nonsense. Try asking a serious neuroscientist about the claims that this duo have “demonstrated that the brain is ‘plastic,’” or that practicing certain positivity skills changes the brain with the health benefits that they claim via Brody. Or that they are studying ‘amygdala recovery’ associated with reduced health risk.

For example, Dr. Fredrickson’s team found that six weeks of training in a form of meditation focused on compassion and kindness resulted in an increase in positive emotions and social connectedness and improved function of one of the main nerves that helps to control heart rate. The result is a more variable heart rate that, she said in an interview, is associated with objective health benefits like better control of blood glucose, less inflammation and faster recovery from a heart attack.

I will dissect this key claim about loving-kindness meditation and vagal tone/heart rate variability shortly.

Dr. Davidson’s team showed that as little as two weeks’ training in compassion and kindness meditation generated changes in brain circuitry linked to an increase in positive social behaviors like generosity.

We will save discussing Richard Davidson for another time. But really, Jane, just two weeks to better health? Where is the generosity center in brain circuitry? I dare you to ask a serious neuroscientist and embarrass yourself.

“The results suggest that taking time to learn the skills to self-generate positive emotions can help us become healthier, more social, more resilient versions of ourselves,” Dr. Fredrickson reported in the National Institutes of Health monthly newsletter in 2015.

In other words, Dr. Davidson said, “well-being can be considered a life skill. If you practice, you can actually get better at it.” By learning and regularly practicing skills that promote positive emotions, you can become a happier and healthier person. Thus, there is hope for people like my friend’s parents should they choose to take steps to develop and reinforce positivity.

In her newest book, “Love 2.0,” Dr. Fredrickson reports that “shared positivity — having two people caught up in the same emotion — may have even a greater impact on health than something positive experienced by oneself.” Consider watching a funny play or movie or TV show with a friend of similar tastes, or sharing good news, a joke or amusing incidents with others. Dr. Fredrickson also teaches “loving-kindness meditation” focused on directing good-hearted wishes to others. This can result in people “feeling more in tune with other people at the end of the day,” she said.

Brody ends with 8 things Fredrickson and others endorse to foster positive emotions. (Why only 8 recommendations, why not come up with 10 and make them commandments?) These include “Do good things for other people” and “Appreciate the world around you.” Okay, but do Fredrickson and Davidson really show that engaging in these activities has immediate and dramatic effects on our health? I have examined their research and I doubt it. I think the larger problem, though, is the suggestion that physically ill people facing shortened lives risk being blamed for being bad people. They obviously did not do these 8 things or else they would be healthy.

If Brody were selling herbal supplements or coffee enemas, we would readily label the quackery. We should do the same for advice about psychological practices that are promised to transform lives.

Brody’s sloppy links to support her claims: Love 2.0

Journalists who talk of “science”  and respect their readers will provide links to their actual sources in the peer-reviewed scientific literature. That way, readers who are motivated can independently review the evidence. Especially in an outlet as prestigious as The New York Times.

Jane Brody is outright promiscuous in the links that she provides, often secondary or tertiary sources. The first link provided for her discussion of Fredrickson’s Love 2.0 is actually to a somewhat negative review of the book. https://www.scientificamerican.com/article/mind-reviews-love-how-emotion-afftects-everything-we-feel/

Fredrickson builds her case by expanding on research that shows how sharing a strong bond with another person alters our brain chemistry. She describes a study in which best friends’ brains nearly synchronize when exchanging stories, even to the point where the listener can anticipate what the storyteller will say next. Fredrickson takes the findings a step further, concluding that having positive feelings toward someone, even a stranger, can elicit similar neural bonding.

This leap, however, is not supported by the study and fails to bolster her argument. In fact, most of the evidence she uses to support her theory of love falls flat. She leans heavily on subjective reports of people who feel more connected with others after engaging in mental exercises such as meditation, rather than on more objective studies that measure brain activity associated with love.

I would go even further than the reviewer. Fredrickson builds her case by very selectively drawing on the literature, choosing only a few studies that fit. Even then, the studies fit only with considerable exaggeration and distortion of their findings. She exaggerates the relevance and strength of her own findings. In other cases, she says things that have no basis in anyone’s research.

I came across Love 2.0: How Our Supreme Emotion Affects Everything We Feel, Think, Do, and Become (Unabridged) that sells for $17.95. The product description reads:

We all know love matters, but in this groundbreaking book positive emotions expert Barbara Fredrickson shows us how much. Even more than happiness and optimism, love holds the key to improving our mental and physical health as well as lengthening our lives. Using research from her own lab, Fredrickson redefines love not as a stable behemoth, but as micro-moments of connection between people – even strangers. She demonstrates that our capacity for experiencing love can be measured and strengthened in ways that improve our health and longevity. Finally, she introduces us to informal and formal practices to unlock love in our lives, generate compassion, and even self-soothe. Rare in its scope and ambitious in its message, Love 2.0 will reinvent how you look at and experience our most powerful emotion.

There is a mishmash of language games going on here. Fredrickson’s redefinition of love is not based on her research. Her claim that love is ‘really’ micro-moments of connection between people, even strangers, is a weird re-definition. Attempt to read her book, if you have time to waste.

You will quickly see that much of what she says makes no sense for long-term relationships that are solid but beyond the honeymoon stage. Ask partners in long-term relationships and they will undoubtedly report lacking lots of such “micro-moments of connection”. I doubt that it is adaptive for people seeking to build long-term relationships to have the yardstick that if lots of such micro-moments don’t keep coming all the time, the relationship is in trouble. But it is Fredrickson who is selling the strong claims and the burden is on her to produce the evidence.

If you try to take Fredrickson’s work seriously, you wind up seeing she has a rather superficial view of close relationships and can’t seem to distinguish them from what goes on between strangers in drunken one-night stands. But that is supposed to be revolutionary science.

We should not confuse much of what Fredrickson emphatically states with testable hypotheses. Many statements sound more like marketing slogans – what Joachim Kruger and his student Thomas Mairunteregger identify as the McDonaldization of positive psychology. Like a Big Mac, Fredrickson’s Love 2.0 requires a lot of imagination to live up to its advertisement.

Fredrickson’s love the supreme emotion vs ‘Trane’s Love Supreme

Where Fredrickson’s selling of love as the supreme emotion is not simply an advertising slogan, it is a bad summary of the research on love and health. John Coltrane makes no empirical claim about love being supreme. But listening to him is an effective self-soothing after taking Love 2.0 seriously and trying to figure it out. Simply enjoy and don’t worry about what it does for your positivity ratio or micro-moments, shared or alone.

Fredrickson’s study of loving-kindness meditation

Jane Brody, like Fredrickson herself, depends heavily on a study of loving-kindness meditation in proclaiming the wondrous, transformative health benefits of being loving and kind. After obtaining Fredrickson’s data set and reanalyzing it, my colleagues – James Heathers, Nick Brown, and Harris Friedman – and I arrived at a very different interpretation of her study. As we first encountered it, the study was:

Kok, B. E., Coffey, K. A., Cohn, M. A., Catalino, L. I., Vacharkulksemsuk, T., Algoe, S. B., . . . Fredrickson, B. L. (2013). How positive emotions build physical health: Perceived positive social connections account for the upward spiral between positive emotions and vagal tone. Psychological Science, 24, 1123-1132.

The Consolidated Standards of Reporting Trials (CONSORT) are widely accepted for at least two reasons. First, clinical trials should be clearly identified as such in order to ensure that the results are recognized and available in systematic searches to be integrated with other studies. CONSORT requires that RCTs be clearly identified in the titles and abstracts. Once RCTs are labeled as such, the CONSORT checklist becomes a handy tallying of what needs to be reported.

It is only in supplementary material that the Kok and Fredrickson paper is identified as a clinical trial. Only in that supplement is the primary outcome identified, even in passing. No means are reported anywhere in the paper or supplement. Results are presented in terms of what Kok and Fredrickson term “a variant of a mediational, parallel process, latent-curve model.” Basic statistics needed for its evaluation are left to readers’ imagination. Figure 1 in the article depicts the awe-inspiring parallel-process mediational model that guided the analyses. We showed the figure to a number of statistical experts including Andrew Gelman. While some elements were readily recognizable, the overall figure was not, especially the mysterious large dot (a causal pathway roundabout?) near the top.

So, not only might the study not be detected as an RCT, there isn’t relevant information that could be used for calculating effect sizes.

Furthermore, if studies are labeled as RCTs, we immediately seek protocols published ahead of time that specify the basic elements of design and analyses and primary outcomes. At Psychological Science, studies with protocols are unusual enough to get the authors awarded a badge. In the clinical and health psychology literature, protocols are increasingly common, like flushing a toilet after using a public restroom. No one runs up and thanks you, “Thank you for flushing/publishing your protocol.”

If Fredrickson and her colleagues are going to be using the study to make claims about the health benefits of loving kindness meditation, they have a responsibility to adhere to CONSORT and to publish their protocol. This is particularly the case because this research was federally funded and results need to be transparently reported for use by a full range of stakeholders who paid for the research.

We identified a number of other problems and submitted a manuscript based on a reanalysis of the data. Our manuscript was promptly rejected by Psychological Science. The associate editor, Batja Mesquita, noted that two of my co-authors, Nick Brown and Harris Friedman, had co-authored a paper resulting in a partial retraction of Fredrickson’s positivity ratio paper.

Brown NJ, Sokal AD, Friedman HL. The Complex Dynamics of Wishful Thinking: The Critical Positivity Ratio. American Psychologist. 2013 Jul 15.

I won’t go into the details, except to say that Nick and Harris along with Alan Sokal unambiguously established that Fredrickson’s positivity ratio of 2.9013 positive to negative experiences was a fake fact. Fredrickson had been promoting the number  as an “evidence-based guideline” of a ratio acting as a “tipping point beyond which the full impact of positive emotions becomes unleashed.” Once Brown and his co-authors overcame strong resistance to getting their critique published, their paper garnered a lot of attention in social and conventional media. There is a hilariously funny account available at Nick Brown Smelled Bull.

Batja Mesquita argued that the previously published critique discouraged her from accepting our manuscript. To do so, she would be participating in “a witch hunt” and

 The combatant tone of the letter of appeal does not re-assure me that a revised commentary would be useful.

Welcome to one-sided tone policing. We appealed her decision, but Editor Eric Eich indicated that there was no appeal process at Psychological Science, contrary to the requirements of the Committee on Publication Ethics (COPE).

Eich relented after I shared an email to my coauthors in which I threatened to take the whole issue into social media where there would be no peer-review in the traditional outdated sense of the term. Numerous revisions of the manuscript were submitted, some of them in response to reviews by Fredrickson and Kok, who did not want a paper published. A year passed before our paper was accepted and appeared on the website of the journal. You can read our paper here. I think you can see that fatal problems are obvious.

Heathers JA, Brown NJ, Coyne JC, Friedman HL. The elusory upward spiral: A reanalysis of Kok et al. (2013). Psychological Science. 2015 May 29:0956797615572908.

In addition to the original paper not adhering to CONSORT, we noted

  1. There was no effect of whether participants were assigned to the loving kindness meditation vs. no-treatment control group on the key physiological variable, cardiac vagal tone. This is a thoroughly disguised null trial.
  2. Kok and Fredrickson claimed that there was an effect of meditation on cardiac vagal tone, but any appearance of an effect was due to reduced vagal tone in the control group, which cannot readily be explained.
  3. Kok and Fredrickson essentially interpreted changes in cardiac vagal tone as a surrogate outcome for more general changes in physical health. However, other researchers have noted that observed changes in cardiac vagal tone are not consistently related to changes in other health variables and are susceptible to variations in experimental conditions that have nothing to do with health.
  4. No attention was given to whether participants assigned to the loving kindness meditation actually practiced it with any frequency or fidelity. The article nonetheless reported that such data had been collected.

Point 2 is worth elaborating. Participants in the control condition received no intervention. Their assessment of cardiac vagal tone/heart rate variability was essentially a test/retest reliability test of what should have been a stable physiological characteristic. Yet, participants assigned to this no-treatment condition showed as much change as the participants who were assigned to meditation, but in the opposite direction. Kok and Fredrickson ignored this and attributed all differences to meditation. Houston, we have a problem, a big one, with unreliability of measurement in this study.

We could not squeeze all of our critique into our word limit, but James Heathers, who is an expert on cardiac vagal tone/heart rate variability, elaborated elsewhere.

  • The study was underpowered from the outset, and the sample size then decreased from 65 to 52 due to missing data.
  • Cardiac vagal tone is unreliable except in the context of careful control of the conditions in which measurements are obtained, multiple measurements on each participant, and a much larger sample size. None of these conditions were met.
  • There were numerous anomalies in the data, including some participants included without baseline data, improbable baseline or follow-up scores, and improbable changes. These alone would invalidate the results.
  • Despite not reporting basic statistics, the article was full of graphs, impressive to the uninformed, but useless to readers attempting to make sense of what was done and with what results.

We later learned that the same data had been used for another published paper. There was no cross-citation and the duplicate publication was difficult to detect.

Kok, B. E., & Fredrickson, B. L. (2010). Upward spirals of the heart: Autonomic flexibility, as indexed by vagal tone, reciprocally and prospectively predicts positive emotions and social connectedness. Biological Psychology, 85, 432–436. doi:10.1016/j.biopsycho.2010.09.005

Pity the poor systematic reviewer and meta analyst trying to make sense of this RCT and integrate it with the rest of the literature concerning loving-kindness meditation.

This was not our only experience of obtaining data for a paper crucial to Fredrickson’s claims and having difficulty publishing our findings. We obtained data for claims that she and her colleagues had solved the classical philosophical problem of whether we should pursue pleasure or meaning in our lives. Pursuing pleasure, they argue, will adversely affect genomic transcription.

We found we could redo extremely complicated analyses and replicate original findings, but there were errors in the original data entry that entirely shifted the results when corrected. Furthermore, we could replicate the original findings when we substituted data from a random number generator for the data collected from study participants. After struggles similar to what we experienced with Psychological Science, we succeeded in getting our critique published.

The original paper

Fredrickson BL, Grewen KM, Coffey KA, Algoe SB, Firestine AM, Arevalo JM, Ma J, Cole SW. A functional genomic perspective on human well-being. Proceedings of the National Academy of Sciences. 2013 Aug 13;110(33):13684-9.

Our critique

Brown NJ, MacDonald DA, Samanta MP, Friedman HL, Coyne JC. A critical reanalysis of the relationship between genomics and well-being. Proceedings of the National Academy of Sciences. 2014 Sep 2;111(35):12705-9.

See also:

Nickerson CA. No Evidence for Differential Relations of Hedonic Well-Being and Eudaimonic Well-Being to Gene Expression: A Comment on Statistical Problems in Fredrickson et al.(2013). Collabra: Psychology. 2017 Apr 11;3(1).

A partial account of the reanalysis is available in:

Reanalysis: No health benefits found for pursuing meaning in life versus pleasure. PLOS Blogs Mind the Brain

Wrapping it up

Strong claims about health effects require strong evidence.

  • Evidence produced in randomized trials needs to be reported according to established conventions like CONSORT, with clear labeling of duplicate publications.
  • When research is conducted with public funds, these responsibilities are increased.

I have often identified health claims in high profile media like The New York Times and The Guardian. My MO has been to trace the claims back to the original sources in peer reviewed publications, and evaluate both the media reports and the quality of the primary sources.

I hope that I am arming citizen scientists for engaging in these activities independent of me and even to arrive at contradictory appraisals to what I offer.

  • I don’t think I can expect to get many people to ask for data and perform independent analyses and certainly not to overcome the barriers my colleagues and I have met in trying to publish our results. I share my account of some of those frustrations as a warning.
  • I still think I can offer some take away messages to citizen scientists interested in getting better quality, evidence-based information on the internet.
  • Assume most of the claims readers encounter about psychological states and behavior being simply changed and profoundly influencing physical health are false or exaggerated. When in doubt, disregard the claims and certainly don’t retweet or “like” them.
  • Ignore journalists who do not provide adequate links for their claims.
  • Learn to identify generally reliable sources and take journalists off the list when they have made extravagant or undocumented claims.
  • Appreciate the financial gains to be made by scientists who feed journalists false or exaggerated claims.

Advice to citizen scientists who are cultivating more advanced skills:

Some key studies that Brody invokes in support of her claims being science-based are poorly conducted and reported clinical trials that are not labeled as such. This is quite common in positive psychology, but you need to cultivate skills to even detect that this is what is going on. Even prestigious psychology journals are often lax in labeling studies as RCTs and in enforcing reporting standards. Authors’ conflicts of interest are ignored.

It is up to you to

  • Identify when the claims you are being fed should have been evaluated in a clinical trial.
  • Be skeptical when the original research is not clearly identified as a clinical trial but nonetheless compares participants who received the intervention and those who did not.
  • Be skeptical when CONSORT is not followed and there is no published protocol.
  • Be skeptical of papers published in journals that do not enforce these requirements.

Disclaimer

I think I have provided enough details for readers to decide for themselves whether I am unduly influenced by my experiences with Barbara Fredrickson and her data. She and her colleagues have differing accounts of her research and of the events I have described in this blog.

As a disclosure, I receive money for writing these blog posts, less than $200 per post. I am also marketing a series of e-books,  including Coyne of the Realm Takes a Skeptical Look at Mindfulness and Coyne of the Realm Takes a Skeptical Look at Positive Psychology.

Maybe I am just making a fuss to attract attention to these enterprises. Maybe I am just monetizing what I have been doing for years virtually for free. Regardless, be skeptical. But to get more information and get on a mailing list for my other blogging, go to coyneoftherealm.com and sign up.

Sex and the single amygdala: A tale almost saved by a peek at the data

So sexy! Was bringing up ‘risky sex’ merely a strategy to publish questionable and uninformative science?

My continuing question: Can skeptics who are not specialists, but who are science-minded and have some basic skills, learn to quickly screen and detect questionable science in the journals and media coverage?

“You don’t need a weatherman to know which way the wind blows.” – Bob Dylan

I hope so. One goal of my blogging is to arouse readers’ skepticism and provide them some tools so that they can decide for themselves what to believe, what to reject, and what needs a closer look or a check against trusted sources.

Skepticism is always warranted in science, but it is particularly handy when confronting the superficial application of neuroscience to every aspect of human behavior. Neuroscience is increasingly being brought into conversations to sell ideas and products when it is neither necessary nor relevant. Many claims about how the brain is involved are false or exaggerated not only in the media, but in the peer-reviewed journals themselves.

A while ago I showed how a neuroscientist and a workshop guru teamed up to try to persuade clinicians with functional magnetic resonance imaging (fMRI) data that a couples therapy was more sciencey than the rest. Although I took a look at some complicated neuroscience, a lot of my reasoning [1, 2, 3] merely involved applying basic knowledge of statistics and experimental design. I raised sufficient skepticism to dismiss the neuroscientist and psychotherapy guru’s claims, even putting aside the excellent specialist insights provided by Neurocritic and his friend Magneto.

In this issue of Mind the Brain, I’m pursuing another tip from Neurocritic about some faulty neuroscience in need of debunking.

The paper

Victor, E. C., Sansosti, A. A., Bowman, H. C., & Hariri, A. R. (2015). Differential Patterns of Amygdala and Ventral Striatum Activation Predict Gender-Specific Changes in Sexual Risk Behavior. The Journal of Neuroscience, 35(23), 8896-8900.

Unfortunately, the paper is behind a pay wall. If you can’t get it through a university library portal, you can send a request for a PDF to the corresponding author, elizabeth.victor@duke.edu.

The abstract

Although the initiation of sexual behavior is common among adolescents and young adults, some individuals express this behavior in a manner that significantly increases their risk for negative outcomes including sexually transmitted infections. Based on accumulating evidence, we have hypothesized that increased sexual risk behavior reflects, in part, an imbalance between neural circuits mediating approach and avoidance in particular as manifest by relatively increased ventral striatum (VS) activity and relatively decreased amygdala activity. Here, we test our hypothesis using data from seventy 18- to 22-year-old university students participating in the Duke Neurogenetics Study. We found a significant three-way interaction between amygdala activation, VS activation, and gender predicting changes in the number of sexual partners over time. Although relatively increased VS activation predicted greater increases in sexual partners for both men and women, the effect in men was contingent on the presence of relatively decreased amygdala activation and the effect in women was contingent on the presence of relatively increased amygdala activation. These findings suggest unique gender differences in how complex interactions between neural circuit function contributing to approach and avoidance may be expressed as sexual risk behavior in young adults. As such, our findings have the potential to inform the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.

My thought processes

Hmm, sexual risk behavior, meaning number of partners? How many new partners during a follow-up period constitutes “risky” and does it matter whether safe sex was practiced? Well, ignoring these issues and calling it “sexual risk behavior” allows the authors to claim relevance to hot topics like HIV prevention….

But let’s cut to the chase: I’m always skeptical about a storyline depending on a three-way statistical interaction. These effects are highly unreliable, particularly in a sample of only N = 70. I’m suspicious of why investigators would stake their claims ahead of time on a three-way interaction rather than something simpler. I will be looking for evidence that they started with this hypothesis in mind, rather than cooking it up after peeking at the data.

Three-way interactions involve dividing a sample up into at least eight boxes, in this case, 2 x (2) x (2). Such interactions can be mind-boggling to interpret, and this one is no exception:

Although relatively increased VS activation predicted greater increases in sexual partners for both men and women, the effect in men was contingent on the presence of relatively decreased amygdala activation and the effect in women was contingent on the presence of relatively increased amygdala activation.

And then the “simple” interpretation?

These findings suggest unique gender differences in how complex interactions between neural circuit function contributing to approach and avoidance may be expressed as sexual risk behavior in young adults.

And the public health implications?

As such, our findings have the potential to inform the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.

Just how should these data inform public health strategies beyond what we knew before we stumbled upon this article? Really, should we stick people’s heads in a machine and gather fMRI data before offering them condoms? Should we encourage computer dating services to post along with a recent headshot, recent fMRI images showing that prospective dates do not have their risky behavior center in the amygdala activated? Or encourage young people to get their heads examined with an fMRI before deciding whether it’s wise to sleep with somebody new?

So it’s difficult to see the practical relevance of these findings, but let’s stick around and consider the paragraph that Neurocritic singled out.

The paragraph

The majority of the sample reported engaging in vaginal sex at least once in their lifetime (n = 42, 60%). The mean number of vaginal sexual partners at baseline was 1.28 (SD =0.68). The mean increase in vaginal sexual partners at the last follow-up was 0.71 (SD = 1.51). There were no significant differences between men and women in self-reported baseline or change in self-reported number of sexual partners (t=0.05, p=0.96; t=1.02, p= 0.31, respectively). Although there was not a significant association between age and self-reported number of partners at baseline (r = 0.17, p= 0.16), younger participants were more likely to report a greater increase in partners over time (r =0.24, p =0.04). Notably, distribution analyses revealed two individuals with outlying values (3 SD from M; both subjects reported an increase in 8 partners between baseline and follow up). Given the low rate of sexual risk behavior reported in the sample, these outliers were not excluded, as they likely best represent young adults engaging in sexual risk behavior.

What triggers skepticism?

This paragraph is quite revealing if we just ponder it a bit.

First, notice there is only a single significant correlation (p=.04) in a subgroup analysis. Differences between men and women were examined, with no significant findings for either baseline or change in number of sexual partners over the length of the observation. However, disregarding that finding, the authors went on to explore changes in number of partners over time among the younger participants and, bingo, there was their p =0.04.

Whoa! Age was never mentioned in the abstract. We are now beyond the 2 x 2 x 2 interaction mentioned in the abstract and rooting through another dimension, younger versus older.

But, worse, getting that significance required retaining two participants with eight new sexual partners each during the follow-up period. The decision to retain these participants was made after the pattern of results was examined with and without inclusion of these outliers. The authors say so and essentially say they decided because it made a better story.

The only group means and standard deviations reported include these two participants. Even with them included, the average number of new sexual partners during follow-up was less than one. We have no idea whether that one was risky or not. It’s a safer assumption that having eight new partners is risky, but even that we don’t know for sure.
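
The reported summary statistics are enough to back out, roughly, what the change scores look like without the two outliers. The sketch below assumes that the published mean (0.71) and SD (1.51) describe all 70 participants, that the SD is the usual sample SD (n - 1 denominator), and that both outliers had a change score of exactly 8. The helper name `drop_values` is hypothetical, and because the published values are rounded, the output is only approximate.

```python
import math

def drop_values(n, mean, sd, dropped):
    """Recover the mean and sample SD after removing known values.

    Assumes `sd` was computed with an n - 1 denominator.
    """
    total = n * mean
    sum_sq = (n - 1) * sd ** 2 + n * mean ** 2   # sum of squared scores
    m = n - len(dropped)
    new_total = total - sum(dropped)
    new_sum_sq = sum_sq - sum(v ** 2 for v in dropped)
    new_mean = new_total / m
    new_var = (new_sum_sq - m * new_mean ** 2) / (m - 1)
    return new_mean, math.sqrt(new_var)

# Values reported in the quoted paragraph
N, MEAN, SD = 70, 0.71, 1.51
cutoff = MEAN + 3 * SD                     # the "3 SD from M" threshold
new_mean, new_sd = drop_values(N, MEAN, SD, [8, 8])

print(f"3 SD cutoff: {cutoff:.2f}")
print(f"without outliers: mean = {new_mean:.2f}, SD = {new_sd:.2f}")
```

Under these assumptions the mean falls to about 0.50 and the SD to about 0.85, so the two outliers would account for 16 of the roughly 50 new partners reported by the whole sample.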

Keep in mind for future reference: Investigators are supposed to make decisions about outliers without reference to the fate of the hypothesis being studied. And knowing nothing about this particular study, most authorities would say if two people out of 70 are way out there on a particular variable that otherwise has little variance, you should exclude them.

It is considered a Questionable Research Practice to make decisions about inclusion/exclusion based on what story the outcome of this decision allows the authors to tell. It is p-hacking, and significance chasing.
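
A back-of-the-envelope calculation shows why a lone p = 0.04 carries little weight. The quoted paragraph reports four tests (two t tests and two correlations). Assuming, for illustration only, that the tests are independent and each uses the conventional 0.05 threshold, the chance of at least one "significant" result when nothing is going on can be sketched as follows; the function name is hypothetical.

```python
def familywise_error(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 4, 8):
    print(f"{k} tests: {familywise_error(k):.3f}")
```

With four tests the chance is already close to one in five, and the real number of looks at the data may well be larger than four.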

And note the distribution of numbers of vaginal sex partners. Twenty-eight participants had none at the end of the study. Most accumulated less than one during the follow-up, and even that mean number was distorted by two having eight partners. Hmm, it is going to be hard to get multivariate statistics to work appropriately when we get to the fancy neuroscience data. We could go off on discussions of multivariate normal or Poisson distributions or just think a bit.

We can do a little detective work and determine that one outlier was a male, another a female. (*1) Let’s go back to our eight little boxes of participants that are involved in the interpretation of the three-way interaction. It’s going to make a great difference exactly where the deviant male and female are dropped into one of the boxes or whether they are left out.

And think about sampling issues. What if, for reasons having nothing to do with the study, neither of these outliers had shown up? Or if only one of them had shown up, it would skew the results in a particular direction, depending on whether the participant was the male or female.

Okay, if we were wasting our time continuing to read the article after finding what we did in the abstract, we are certainly wasting more of our time by continuing after reading this paragraph. But let’s keep poking around as an educational exercise.

The rest of the methods and results sections

We learn from the methods section that there was an ethnically diverse sample with a highly variable follow-up, from zero days to 3.19 years (M = 188.72 d, SD = 257.15; range = 0 d–3.19 years). And there were only 24 men among the 70 participants in the sample for the paper.

We don’t know whether these two outliers had eight sexual partners within a week of the first assessment or they were the ones captured in extending the study to more than 3 years. That matters somewhat, but we also have to worry whether this was an appropriate sample and length of follow-up for such a study, with so few participants in the first place and even fewer who had sex by the end of it. The mean follow-up of about six months and huge standard deviation suggest there is not a lot of evidence of risky behavior, at least in terms of casual vaginal sex.
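
To put numbers on the "eight little boxes", here is a rough count of how thinly 70 participants spread out. It assumes, purely for illustration, that each brain measure is split into high and low so that men and women are each divided evenly across four cells. The paper does not report cell sizes, and the figure of 46 women is simply 70 minus the 24 men.

```python
TOTAL = 70
MEN = 24
WOMEN = TOTAL - MEN          # 46, by subtraction
CELLS_PER_GENDER = 2 * 2     # high/low VS  x  high/low amygdala

def average_cell_size(group_size, cells=CELLS_PER_GENDER):
    """Average participants per cell if the group were split evenly."""
    return group_size / cells

print(f"men per cell:   {average_cell_size(MEN):.1f}")
print(f"women per cell: {average_cell_size(WOMEN):.1f}")
```

That works out to about 6 men or 11 to 12 women per cell, so a single participant with eight new partners can dominate whichever cell he or she lands in.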

This is all getting very funky.

So I wondered about the larger context of the study, with increasing doubts that the authors had gone to all this trouble just to test an a priori hypothesis about risky sex.

We are told that the larger context is the ongoing “Duke Neurogenetics Study (DNS), which assesses a wide range of behavioral and biological traits.” The extensive list of inclusions and exclusions suggests a much more ambitious study. If we had more time, we could go look up the Duke Neurogenetics Study and see if that’s the case. But I have a strong suspicion that the study was not organized around the specific research questions of this paper (*2). I really can’t tell without any preregistration of this particular paper but I certainly have questions about how much Hypothesizing after the Results Are Known (HARKing) is going on here in the refining of hypotheses and measures, and decisions about which data to report.

Further explorations of the results section

I remind readers that I know little about fMRI data. Put it aside and we can discover some interesting things reading through the brief results section.

Main effects of task

As expected, our fMRI paradigms elicited robust affect-related amygdala and reward-related VS activity across the entire parent sample of 917 participants (Fig. 1). In our substudy sample of 70 participants, there were no significant effects of gender (t(70) values < 0.88, p values >0.17) or age (r values < 0.22; p values > 0.07) on VS or amygdala activity in either hemisphere.

Hmm, let’s focus on the second sentence first. The authors tell us absolutely nothing is going on in terms of differences in amygdala and reward-related VS activity in relation to age and gender in the sample of 70 participants in the current study. In fact, we don’t even need to know what “amygdala and reward-related VS activity” is to wonder why the first sentence of this paragraph directs us to a graph not of the 70 participants, but a larger sample of 917 participants. And when we go to figure 1, we see some wild wowie zowie, hit-the-reader-between-the-eyes differences (in technical terms, intraocular trauma) for women. And claims of p < 0.000001 twice. But wait! One might think significance of that magnitude would have to come from the 917 participants, except the labeling of the X-axis must come from the substudy of the 70 participants for whom data concerning number of sex partners was collected. Maybe the significance comes from the anchoring of one of the graph lines by the one way-out outlier.

Note that the outlier woman with eight partners anchors the blue line for High Left Amygdala. Without inclusion of that single woman, the nonsignificant trends between women with High Left Amygdala versus women with Low Left Amygdala would be reversed.

The authors make much of the differences between Figure 1 showing Results for Women and Figure 2 showing Results for Men. The comparison seems dramatic except that, once again, the one outlier sends the red line for Low Left Amygdala off from the blue line for High Left Amygdala. Otherwise, there is no story to tell. Mind-boggling, but I think we can safely conclude that something is amiss in these Frankenstein graphs.
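
The way a single point can anchor a fitted line is easy to demonstrate. The numbers below are invented for illustration and are not the study's data: five hypothetical participants with a slightly negative trend, plus one more with eight new partners. The helper `ols_slope` is a hypothetical name for a plain least-squares slope.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Made-up values: x = brain activation (arbitrary units),
# y = change in number of partners.
activation = [-2, -1, 0, 1, 2]
partners = [1, 1, 0, 0, 0]

slope_without = ols_slope(activation, partners)
slope_with = ols_slope(activation + [3], partners + [8])  # add one outlier

print(f"slope without outlier: {slope_without:+.2f}")
print(f"slope with outlier:    {slope_with:+.2f}")
```

In this toy example the slope is negative without the extra point and clearly positive with it, which is the kind of reversal described above.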

Okay, we should stop beating a corpse of an article. There are no vital signs left.

Alternatively, we could probe the section on Poisson regressions and minimally note some details. There is the flash of some strings of zeros in the P values, but it seems complicated and then we are warned off with “no factors survive Bonferroni correction.” And then in the next paragraph, we get to exploring dubious interactions. And there is the final insult of the authors bringing in a two-way interaction trending toward significance among men, p =.051.
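
For readers unfamiliar with the phrase, a Bonferroni correction divides the significance threshold by the number of tests performed. The sketch below applies it to two p values mentioned in this post (0.04 and 0.051). The numbers of tests are hypothetical, since the count the authors actually used is not reproduced here, and the function name is likewise an illustration.

```python
ALPHA = 0.05

def survives_bonferroni(p_value, num_tests, alpha=ALPHA):
    """True if p_value is below the Bonferroni-adjusted threshold."""
    return p_value < alpha / num_tests

for p in (0.04, 0.051):
    for m in (1, 2, 6):   # hypothetical numbers of tests
        verdict = "survives" if survives_bonferroni(p, m) else "fails"
        print(f"p = {p}, tests = {m}, threshold = {ALPHA / m:.4f}: {verdict}")
```

A p of 0.04 clears the bar only if it was the single test run, and 0.051 never does.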

But we were never told how all this would lead as we were promised in the end of the abstract, “to the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.”

Rushing through the discussion section, we note the disclosure that

The nature of these unexpected gender differences is unclear and warrants further consideration.

So, the authors confess that they did not start with expectations of finding a gender difference. They had nothing to report from a subset of data from an ambitious project put together for other purposes with an ill-suited follow-up for the research question (and even an ill-suited experimental task). They made a decision to include two outliers, salvaged some otherwise weak and inconsistent differences, and then constructed a story that depended on their inclusion. Bingo, they can survive confirmation bias and get published.

Readers might have been left with just their skepticism about the three-way interaction described in the abstract. However, the authors implicated themselves by disclosing in the article their examination of a distribution and reasons for including the outliers. Then they further disclosed they did not start with a hypothesis about gender differences.

Why didn’t the editor and reviewers at Journal of Neuroscience (impact factor 6.344) do their job and cry foul? Questionable research practices (QRPs) are brought to us courtesy of questionable publication practices (QPPs).

And then we end with the confident

These limitations notwithstanding, our current results suggest the importance of considering gender-specific patterns of interactions between functional neural circuits supporting approach and avoidance in the expression of sexual risk behavior in young adults.

Yet despite this vague claim, the authors still haven’t explained how this research could be translated to practice.

Takeaway points for the future.

Without a tip from NeuroCritic, I might not have otherwise zeroed in on the dubious complex statistical interaction on which the storyline in the abstract depended. I also benefited from the authors for whatever reason telling us that they had peeked at the data and telling us further in the discussion that they had not anticipated the gender difference. With current standards for transparency and no preregistration of such studies, it would’ve been easy for us to miss what was done because the authors did not need to alert us. Until there are more and better standards enforced, we just need to be extra skeptical of claims of the application of neuroscience to everyday life.

Trust your skepticism.

Apply whatever you know about statistics and experimental methods. You probably know more than you think you do.

Beware of modest sized neuroscience studies for which authors develop storylines from the patterning authors can discover in their data, not from a priori hypotheses suggested by a theory. If you keep looking around in the scientific literature and media coverage of it, I think you will find a lot of this QRP and QPP.

Don’t go into a default believe-it mode just because an article is peer-reviewed.

Notes

  1. If both the outliers were of the same gender, it would have been enough for that gender to have had significantly more sex partners than the other.
  2. Later we are told in the Discussion section that the particular stimuli for which fMRI data were available were not chosen for relevance to the research question claimed for this paper.

We did not measure VS and amygdala activity in response to sexually provocative stimuli but rather to more general representations of reward and affective arousal. It is possible that variability in VS and amygdala activity to such explicit stimuli may have different or nonexistent gender-specific patterns that may or may not map onto sexual risk behaviors.

Special thanks to Neurocritic for suggesting this blog post and for feedback, as well as to Neuroskeptic, Jessie Sun, and Hayley Jach for helpful feedback. However, @CoyneoftheRealm bears sole responsibility for any excesses or errors in this post.