Jane Brody promoting the pseudoscience of Barbara Fredrickson in the New York Times

Journalists’ coverage of positive psychology and health is often shabby, even in prestigious outlets like The New York Times.

Jane Brody’s latest installment on the benefits of positivity for health relied heavily on the work of Barbara Fredrickson, which my colleagues and I have thoroughly debunked.

All of us need to recognize that research concerning the effects of positive psychology interventions often consists of disguised randomized controlled trials.

With that insight, we need to evaluate this research in terms of reporting standards like CONSORT and declarations of conflicts of interest.

We need to be more skeptical of claims that small changes in behavior can profoundly improve health.

When in doubt, assume that much of what we read in the media about positivity and health is false or at least exaggerated.

Jane Brody starts her article in The New York Times by describing how most mornings she is “grinning from ear to ear, uplifted not just by my own workout but even more so” by her interaction with toddlers on the way home from where she swims. When I read Brody’s “Turning Negative Thinkers Into Positive Ones,” I was not left grinning from ear to ear. I was left profoundly bummed.

I thought real hard about what was so unsettling about Brody’s article. I now have some clarity.

I don’t mind suffering even pathologically cheerful people in the morning. But I do get bothered when they serve up pseudoscience as the real thing.

I had expected to be served up Brody’s usual recipe of positive psychology pseudoscience concocted to coerce readers into heeding her Barnum advice about how they should lead their lives. “Smile or die!” Apologies to my friend Barbara Ehrenreich for putting the retitling of her book outside of North America to use here. I invoke the phrase because Jane Brody makes the case that unless we do what she says, we risk hurting our health and shortening our lives. So we better listen up.

What bummed me most this time was that Brody was drawing on the pseudoscience of Barbara Fredrickson that my colleagues and I have worked so hard to debunk. We took the trouble of obtaining data sets for two of her key papers for reanalysis. We were dismayed by the quality of the data. To start with, we uncovered carelessness at the level of data entry that undermined her claims. But her basic analyses and interpretations did not hold up either.

Fredrickson publishes exaggerated claims about dramatic benefits of simple positive psychology exercises. She is very effective in blocking or muting the publication of criticism and getting on with hawking her wares. My colleagues and I have talked to others who similarly met considerable resistance from editors in getting detailed critiques and re-analyses published. Fredrickson is also aided by uncritical journalists like Jane Brody, who promote her weak and inconsistent evidence as strong stuff. It sells a lot of positive psychology merchandise to needy and vulnerable people, like self-help books and workshops.

If it is taken seriously, Fredrickson’s research concerns the health effects of behavioral intervention. Yet her findings are presented in a way that does not readily allow their integration with the rest of the health psychology literature. It would be difficult, for instance, to integrate Fredrickson’s randomized trials of loving-kindness meditation with other research, because she makes it almost impossible to isolate effect sizes in a form that could be combined with other studies in a meta-analysis. Moreover, Fredrickson has published contradictory claims from the same data set multiple times without acknowledging the duplicate publication. [Please read on. I will document all of these claims before the post ends.]

The need of self-help gurus to generate support for their dramatic claims in lucrative positive psychology self-help products is never acknowledged as a conflict of interest.  It should be.

Just imagine if someone had a contract based on a book prospectus promising that the claims of their last pop psychology book would be surpassed. Such books inevitably paint life too simply, with simple changes in behavior having profound and lasting effects unlike anything obtained in the randomized trials of clinical and health psychology. Readers ought to be informed that the pressure to meet the demands of a lucrative book contract could generate a strong confirmation bias. Caveat emptor, but how about at least informing readers and letting them decide whether following the money influences their interpretation of what they read?

Psychology journals almost never require disclosures of conflicts of interest of this nature. I am campaigning to make such disclosure routine, with nondisclosure of financial benefits of this kind treated as tantamount to scientific misconduct. I am calling for readers to take to social media when these disclosures do not appear in scientific journals, where they should be featured prominently, and to hold editors responsible for non-enforcement. I can cite Fredrickson’s work as a case in point, but there are many other examples, inside and outside of positive psychology.

Back to Jane Brody’s exaggerated claims for Fredrickson’s work.

I lived for half a century with a man who suffered from periodic bouts of depression, so I understand how challenging negativism can be. I wish I had known years ago about the work Barbara Fredrickson, a psychologist at the University of North Carolina, has done on fostering positive emotions, in particular her theory that accumulating “micro-moments of positivity,” like my daily interaction with children, can, over time, result in greater overall well-being.

The research that Dr. Fredrickson and others have done demonstrates that the extent to which we can generate positive emotions from even everyday activities can determine who flourishes and who doesn’t. More than a sudden bonanza of good fortune, repeated brief moments of positive feelings can provide a buffer against stress and depression and foster both physical and mental health, their studies show.

“Research…demonstrates” (?). Brody is feeding stupid-making pablum to readers. Fredrickson’s kind of research may produce evidence one way or the other, but it is too strong a claim, an outright illusion, to even begin suggesting that it “demonstrates” (proves) what follows in this passage.

Where, outside of tabloids and self-help products, do we find such immodest claims that one or a few poor-quality studies “demonstrate” anything?

Negative feelings activate a region of the brain called the amygdala, which is involved in processing fear and anxiety and other emotions. Dr. Richard J. Davidson, a neuroscientist and founder of the Center for Healthy Minds at the University of Wisconsin — Madison, has shown that people in whom the amygdala recovers slowly from a threat are at greater risk for a variety of health problems than those in whom it recovers quickly.

Both he and Dr. Fredrickson and their colleagues have demonstrated that the brain is “plastic,” or capable of generating new cells and pathways, and it is possible to train the circuitry in the brain to promote more positive responses. That is, a person can learn to be more positive by practicing certain skills that foster positivity.

We are knee deep in neuro-nonsense. Try asking a serious neuroscientist about the claims that this duo have “demonstrated that the brain is ‘plastic,’” or that practicing certain positivity skills changes the brain with the health benefits that they claim via Brody. Or that they are studying “amygdala recovery” associated with reduced health risk.

For example, Dr. Fredrickson’s team found that six weeks of training in a form of meditation focused on compassion and kindness resulted in an increase in positive emotions and social connectedness and improved function of one of the main nerves that helps to control heart rate. The result is a more variable heart rate that, she said in an interview, is associated with objective health benefits like better control of blood glucose, less inflammation and faster recovery from a heart attack.

I will dissect this key claim about loving-kindness meditation and vagal tone/heart rate variability shortly.

Dr. Davidson’s team showed that as little as two weeks’ training in compassion and kindness meditation generated changes in brain circuitry linked to an increase in positive social behaviors like generosity.

We will save discussing Richard Davidson for another time. But really, Jane, just two weeks to better health? Where is the generosity center in brain circuitry? I dare you to ask a serious neuroscientist and embarrass yourself.

“The results suggest that taking time to learn the skills to self-generate positive emotions can help us become healthier, more social, more resilient versions of ourselves,” Dr. Fredrickson reported in the National Institutes of Health monthly newsletter in 2015.

In other words, Dr. Davidson said, “well-being can be considered a life skill. If you practice, you can actually get better at it.” By learning and regularly practicing skills that promote positive emotions, you can become a happier and healthier person. Thus, there is hope for people like my friend’s parents should they choose to take steps to develop and reinforce positivity.

In her newest book, “Love 2.0,” Dr. Fredrickson reports that “shared positivity — having two people caught up in the same emotion — may have even a greater impact on health than something positive experienced by oneself.” Consider watching a funny play or movie or TV show with a friend of similar tastes, or sharing good news, a joke or amusing incidents with others. Dr. Fredrickson also teaches “loving-kindness meditation” focused on directing good-hearted wishes to others. This can result in people “feeling more in tune with other people at the end of the day,” she said.

Brody ends with 8 things Fredrickson and others endorse to foster positive emotions. (Why only 8 recommendations? Why not come up with 10 and make them commandments?) These include “Do good things for other people” and “Appreciate the world around you.” Okay, but do Fredrickson and Davidson really show that engaging in these activities has immediate and dramatic effects on our health? I have examined their research and I doubt it. I think the larger problem, though, is the suggestion that physically ill people facing shortened lives risk being blamed for being bad people. They obviously did not do these 8 things or else they would be healthy.

If Brody were selling herbal supplements or coffee enemas, we would readily label the quackery. We should do the same for advice about psychological practices that are promised to transform lives.

Brody’s sloppy links to support her claims: Love 2.0

Journalists who talk of “science”  and respect their readers will provide links to their actual sources in the peer-reviewed scientific literature. That way, readers who are motivated can independently review the evidence. Especially in an outlet as prestigious as The New York Times.

Jane Brody is outright promiscuous in the links that she provides, which are often to secondary or tertiary sources. The first link provided for her discussion of Fredrickson’s Love 2.0 is actually to a somewhat negative review of the book. https://www.scientificamerican.com/article/mind-reviews-love-how-emotion-afftects-everything-we-feel/

Fredrickson builds her case by expanding on research that shows how sharing a strong bond with another person alters our brain chemistry. She describes a study in which best friends’ brains nearly synchronize when exchanging stories, even to the point where the listener can anticipate what the storyteller will say next. Fredrickson takes the findings a step further, concluding that having positive feelings toward someone, even a stranger, can elicit similar neural bonding.

This leap, however, is not supported by the study and fails to bolster her argument. In fact, most of the evidence she uses to support her theory of love falls flat. She leans heavily on subjective reports of people who feel more connected with others after engaging in mental exercises such as meditation, rather than on more objective studies that measure brain activity associated with love.

I would go even further than the reviewer. Fredrickson builds her case by very selectively drawing on the literature, choosing only a few studies that fit.  Even then, the studies fit only with considerable exaggeration and distortion of their findings. She exaggerates the relevance and strength of her own findings. In other cases, she says things that have no basis in anyone’s research.

I came across Love 2.0: How Our Supreme Emotion Affects Everything We Feel, Think, Do, and Become (Unabridged) that sells for $17.95. The product description reads:

We all know love matters, but in this groundbreaking book positive emotions expert Barbara Fredrickson shows us how much. Even more than happiness and optimism, love holds the key to improving our mental and physical health as well as lengthening our lives. Using research from her own lab, Fredrickson redefines love not as a stable behemoth, but as micro-moments of connection between people – even strangers. She demonstrates that our capacity for experiencing love can be measured and strengthened in ways that improve our health and longevity. Finally, she introduces us to informal and formal practices to unlock love in our lives, generate compassion, and even self-soothe. Rare in its scope and ambitious in its message, Love 2.0 will reinvent how you look at and experience our most powerful emotion.

There is a mishmash of language games going on here. Fredrickson’s redefinition of love is not based on her research. Her claim that love is ‘really’ micro-moments of connection between people – even strangers – is a weird re-definition. Attempt to read her book, if you have time to waste.

You will quickly see that much of what she says makes no sense for long-term relationships that are solid but beyond the honeymoon stage. Ask partners in long-term relationships and they will undoubtedly report lacking lots of such “micro-moments of connection.” I doubt it is adaptive for people seeking to build long-term relationships to adopt the yardstick that if lots of such micro-moments don’t keep coming all the time, the relationship is in trouble. But it is Fredrickson who is selling the strong claims, and the burden is on her to produce the evidence.

If you try to take Fredrickson’s work seriously, you wind up seeing she has a rather superficial view of close relationships and can’t seem to distinguish them from what goes on between strangers in drunken one-night stands. But that is supposed to be revolutionary science.

We should not confuse much of what Fredrickson emphatically states with testable hypotheses. Many statements sound more like marketing slogans – what Joachim Kruger and his student Thomas Mairunteregger identify as the McDonaldization of positive psychology. Like a Big Mac, Fredrickson’s Love 2.0 requires a lot of imagination to live up to its advertisement.

Fredrickson’s love as the supreme emotion vs. ’Trane’s A Love Supreme

Where Fredrickson’s selling of love as the supreme emotion is not simply an advertising slogan, it is a bad summary of the research on love and health. John Coltrane makes no empirical claim about love being supreme. But listening to him is effective self-soothing after taking Love 2.0 seriously and trying to figure it out. Simply enjoy, and don’t worry about what it does for your positivity ratio or micro-moments, shared or alone.

Fredrickson’s study of loving-kindness meditation

Jane Brody, like Fredrickson herself, depends heavily on a study of loving-kindness meditation in proclaiming the wondrous, transformative health benefits of being loving and kind. After obtaining Fredrickson’s data set and reanalyzing it, my colleagues – James Heathers, Nick Brown, and Harris Friedman – and I arrived at a very different interpretation of her study. As we first encountered it, the study was:

Kok, B. E., Coffey, K. A., Cohn, M. A., Catalino, L. I., Vacharkulksemsuk, T., Algoe, S. B., . . . Fredrickson, B. L. (2013). How positive emotions build physical health: Perceived positive social connections account for the upward spiral between positive emotions and vagal tone. Psychological Science, 24, 1123-1132.

The Consolidated Standards of Reporting Trials (CONSORT) are widely accepted for at least two reasons. First, clinical trials should be clearly identified as such in order to ensure that the results are recognized and available in systematic searches to be integrated with other studies. CONSORT requires that RCTs be clearly identified in the titles and abstracts. Once RCTs are labeled as such, the CONSORT checklist becomes a handy tallying of what needs to be reported.

It is only in supplementary material that the Kok and Fredrickson paper is identified as a clinical trial. Only in that supplement is the primary outcome identified, even in passing. No means are reported anywhere in the paper or supplement. Results are presented in terms of what Kok and Fredrickson term “a variant of a mediational, parallel process, latent-curve model.” Basic statistics needed for its evaluation are left to readers’ imagination. Figure 1 in the article depicts the awe-inspiring parallel-process mediational model that guided the analyses. We showed the figure to a number of statistical experts, including Andrew Gelman. While some elements were readily recognizable, the overall figure was not, especially the mysterious large dot (a causal pathway roundabout?) near the top.

So, not only might the study not be detected as an RCT, but there is no relevant information that could be used for calculating effect sizes.

Furthermore, if studies are labeled as RCTs, we immediately seek protocols published ahead of time that specify the basic elements of design and analyses and primary outcomes. At Psychological Science, studies with protocols are unusual enough to get the authors awarded a badge. In the clinical and health psychology literature, protocols are increasingly common, like flushing a toilet after using a public restroom. No one runs up and thanks you, “Thank you for flushing/publishing your protocol.”

If Fredrickson and her colleagues are going to be using the study to make claims about the health benefits of loving kindness meditation, they have a responsibility to adhere to CONSORT and to publish their protocol. This is particularly the case because this research was federally funded and results need to be transparently reported for use by a full range of stakeholders who paid for the research.

We identified a number of other problems and submitted a manuscript based on a reanalysis of the data. Our manuscript was promptly rejected by Psychological Science. The associate editor, Batja Mesquita, noted that two of my co-authors, Nick Brown and Harris Friedman, had co-authored a paper resulting in a partial retraction of Fredrickson’s positivity ratio paper.

Brown NJ, Sokal AD, Friedman HL. The Complex Dynamics of Wishful Thinking: The Critical Positivity Ratio. American Psychologist. 2013 Jul 15.

I won’t go into the details, except to say that Nick and Harris, along with Alan Sokal, unambiguously established that Fredrickson’s positivity ratio of 2.9013 positive to negative experiences was a fake fact. Fredrickson had been promoting the number as an “evidence-based guideline” of a ratio acting as a “tipping point beyond which the full impact of positive emotions becomes unleashed.” Once Brown and his co-authors overcame strong resistance to getting their critique published, their paper garnered a lot of attention in social and conventional media. There is a hilariously funny account available at Nick Brown Smelled Bull.

Batja Mesquita argued that the previously published critique discouraged her from accepting our manuscript. To do so, she would be participating in “a witch hunt,” and

 The combatant tone of the letter of appeal does not re-assure me that a revised commentary would be useful.

Welcome to one-sided tone policing. We appealed her decision, but Editor Eric Eich indicated that there was no appeal process at Psychological Science, contrary to the requirements of the Committee on Publication Ethics (COPE).

Eich relented after I shared an email to my coauthors in which I threatened to take the whole issue to social media, where there would be no peer review in the traditional, outdated sense of the term. Numerous revisions of the manuscript were submitted, some of them in response to reviews by Fredrickson and Kok, who did not want the paper published. A year passed before our paper was accepted and appeared on the website of the journal. You can read our paper here. I think you can see that the fatal problems are obvious.

Heathers JA, Brown NJ, Coyne JC, Friedman HL. The elusory upward spiral: A reanalysis of Kok et al. (2013). Psychological Science. 2015 May 29:0956797615572908.

In addition to the original paper not adhering to CONSORT, we noted

  1. There was no effect of whether participants were assigned to the loving-kindness meditation vs. no-treatment control group on the key physiological variable, cardiac vagal tone. This is a thoroughly disguised null trial.
  2. Kok and Frederickson claimed that there was an effect of meditation on cardiac vagal tone, but any appearance of an effect was due to reduced vagal tone in the control group, which cannot readily be explained.
  3. Kok and Frederickson essentially interpreted changes in cardiac vagal tone as a surrogate outcome for more general changes in physical health. However, other researchers have noted that observed changes in cardiac vagal tone are not consistently related to changes in other health variables and are susceptible to variations in experimental conditions that have nothing to do with health.
  4. No attention was given to whether participants assigned to the loving kindness meditation actually practiced it with any frequency or fidelity. The article nonetheless reported that such data had been collected.

Point 2 is worth elaborating. Participants in the control condition received no intervention. Their assessment of cardiac vagal tone/heart rate variability was essentially a test/retest reliability test of what should have been a stable physiological characteristic. Yet, participants assigned to this no-treatment condition showed as much change as the participants who were assigned to meditation, but in the opposite direction. Kok and Fredrickson ignored this and attributed all differences to meditation. Houston, we have a problem, a big one, with unreliability of measurement in this study.
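To make the measurement problem concrete, here is a minimal simulation sketch of a two-arm null trial with a noisy, test/retest-style outcome. All parameters are assumed for illustration; this is not Kok and Fredrickson’s data or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_group = 26      # assumed per-arm size after attrition
true_sd = 1.0         # stable individual differences in vagal tone (arbitrary units)
noise_sd = 1.0        # measurement noise at each assessment, i.e., low reliability (assumed)

def simulate_null_trial():
    """Two-arm trial with NO true treatment effect and a noisy outcome measured twice."""
    stable = rng.normal(0, true_sd, 2 * n_per_group)           # each person's 'true' vagal tone
    pre = stable + rng.normal(0, noise_sd, 2 * n_per_group)    # baseline measurement
    post = stable + rng.normal(0, noise_sd, 2 * n_per_group)   # follow-up measurement
    change = post - pre
    return change[:n_per_group].mean(), change[n_per_group:].mean()  # (arm A, arm B) mean change

results = np.array([simulate_null_trial() for _ in range(5000)])
gap = np.abs(results[:, 0] - results[:, 1])
print(f"Typical spurious between-arm difference in mean change: {gap.mean():.2f}")
print(f"'Control' arm shows a negative mean change in {100 * (results[:, 1] < 0).mean():.0f}% of null trials")
```

With a noisy measure and a couple dozen people per arm, between-group differences in change scores of this size, including apparent deterioration in an untreated control group, arise routinely when nothing at all is going on.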

We could not squeeze all of our critique into our word limit, but James Heathers, who is an expert on cardiac vagal tone/heart rate variability, elaborated elsewhere:

  • The study was underpowered from the outset, and the sample size decreased further from 65 to 52 due to missing data.
  • Cardiac vagal tone is unreliable except in the context of careful control of the conditions in which measurements are obtained, multiple measurements on each participant, and a much larger sample size. None of these conditions were met.
  • There were numerous anomalies in the data, including some participants included without baseline data, improbable baseline or follow-up scores, and improbable changes. These alone would invalidate the results.
  • Despite not reporting basic statistics, the article was full of graphs, impressive to the uninformed, but useless to readers attempting to make sense of what was done and with what results.

We later learned that the same data had been used for another published paper. There was no cross-citation and the duplicate publication was difficult to detect.

Kok, B. E., & Fredrickson, B. L. (2010). Upward spirals of the heart: Autonomic flexibility, as indexed by vagal tone, reciprocally and prospectively predicts positive emotions and social connectedness. Biological Psychology, 85, 432–436. doi:10.1016/j.biopsycho.2010.09.005

Pity the poor systematic reviewer and meta-analyst trying to make sense of this RCT and integrate it with the rest of the literature concerning loving-kindness meditation.

This was not our only experience of obtaining data for a paper crucial to Fredrickson’s claims and having difficulty publishing our findings. We obtained the data behind claims that she and her colleagues had solved the classical philosophical problem of whether we should pursue pleasure or meaning in our lives. Pursuing pleasure, they argue, will adversely affect genomic transcription.

We found we could redo the extremely complicated analyses and replicate the original findings, but there were errors in the original data entry that, when corrected, entirely shifted the results. Furthermore, we could replicate the original findings when we substituted data from a random number generator for the data collected from study participants. After struggles similar to those we experienced with Psychological Science, we succeeded in getting our critique published.

The original paper

Fredrickson BL, Grewen KM, Coffey KA, Algoe SB, Firestine AM, Arevalo JM, Ma J, Cole SW. A functional genomic perspective on human well-being. Proceedings of the National Academy of Sciences. 2013 Aug 13;110(33):13684-9.

Our critique

Brown NJ, MacDonald DA, Samanta MP, Friedman HL, Coyne JC. A critical reanalysis of the relationship between genomics and well-being. Proceedings of the National Academy of Sciences. 2014 Sep 2;111(35):12705-9.

See also:

Nickerson CA. No Evidence for Differential Relations of Hedonic Well-Being and Eudaimonic Well-Being to Gene Expression: A Comment on Statistical Problems in Fredrickson et al. (2013). Collabra: Psychology. 2017 Apr 11;3(1).

A partial account of the reanalysis is available in:

Reanalysis: No health benefits found for pursuing meaning in life versus pleasure. PLOS Blogs Mind the Brain

Wrapping it up

Strong claims about health effects require strong evidence.

  • Evidence produced in randomized trials needs to be reported according to established conventions like CONSORT, with clear labeling of duplicate publications.
  • When research is conducted with public funds, these responsibilities are increased.

I have often identified health claims in high profile media like The New York Times and The Guardian. My MO has been to trace the claims back to the original sources in peer reviewed publications, and evaluate both the media reports and the quality of the primary sources.

I hope that I am arming citizen scientists to engage in these activities independently of me, even if they arrive at appraisals that contradict mine.

  • I don’t think I can expect to get many people to ask for data and perform independent analyses and certainly not to overcome the barriers my colleagues and I have met in trying to publish our results. I share my account of some of those frustrations as a warning.
  • I still think I can offer some take away messages to citizen scientists interested in getting better quality, evidence-based information on the internet.
  • Assume that most of the claims you encounter about psychological states and behavior being simply changed and profoundly influencing physical health are false or exaggerated. When in doubt, disregard the claims, and certainly don’t retweet or “like” them.
  • Ignore journalists who do not provide adequate links for their claims.
  • Learn to identify generally reliable sources and take journalists off the list when they have made extravagant or undocumented claims.
  • Appreciate the financial gains to be made by scientists who feed journalists false or exaggerated claims.

Advice to citizen scientists who are cultivating more advanced skills:

Some key studies that Brody invokes in support of her claims being science-based are poorly conducted and reported clinical trials that are not labeled as such. This is quite common in positive psychology, but you need to cultivate skills to even detect that this is what is going on. Even prestigious psychology journals are often lax in labeling studies as RCTs and in enforcing reporting standards. Authors’ conflicts of interest are ignored.

It is up to you to

  • Identify when the claims you are being fed should have been evaluated in a clinical trial.
  • Be skeptical when the original research is not clearly identified as a clinical trial but nonetheless compares participants who received the intervention and those who did not.
  • Be skeptical when CONSORT is not followed and there is no published protocol.
  • Be skeptical of papers published in journals that do not enforce these requirements.

Disclaimer

I think I have provided enough details for readers to decide for themselves whether I am unduly influenced by my experiences with Barbara Fredrickson and her data. She and her colleagues have differing accounts of her research and of the events I have described in this blog.

As a disclosure, I receive money for writing these blog posts, less than $200 per post. I am also marketing a series of e-books,  including Coyne of the Realm Takes a Skeptical Look at Mindfulness and Coyne of the Realm Takes a Skeptical Look at Positive Psychology.

Maybe I am just making a fuss to attract attention to these enterprises. Maybe I am just monetizing what I have been doing for years virtually for free. Regardless, be skeptical. But to get more information and get on a mailing list for my other blogging, go to coyneoftherealm.com and sign up.

Will following positive psychology advice make you happier and healthier?

Smile or Die, the European retitling of Barbara Ehrenreich’s realist, anti-positive-psychology book Bright-Sided: How Positive Thinking Is Undermining America, captures the threat of some positive psychology marketers’ advice: if you do not buy what we sell, you will face serious consequences to your health.

Barbara Fredrickson, along with co-authors including Steven Cole, makes the threat that if we simply pursue pleasure in our lives rather than meaning, there will be dire consequences for our immune system by way of effects on genomic expression.

People who are happy but have little-to-no sense of meaning in their lives have the same gene expression patterns as people who are enduring chronic adversity.

A group consisting of Nick Brown, Doug McDonald, Manoj Samanta, Harris Friedman and myself obtained and reanalyzed the data on which Fredrickson et al based their claim. We concluded:

Not only is Fredrickson et al.’s article conceptually deficient, but more crucially statistical analyses are fatally flawed, to the point that their claimed results are in fact essentially meaningless.

In workshops, books, and lucrative talks to corporate gatherings, Fredrickson promises that practicing the loving-kindness meditation that she markets will send you on an upward spiral of physical and mental health that ends who knows where.

My co-authors – this time, Nick Brown, Harris Friedman, and James Heathers – and I examined her paper and obtained her data. Re-analyses found no evidence that loving-kindness meditation improved physical health. The proxy measure for physical health in this study – cardiac vagal tone – is not actually reliably related to objective measures of physical health and probably wouldn’t be accepted in other contexts. And it was not affected by loving-kindness meditation anyway.

Image: Katrinaloverimages, http://www.fanpop.com/clubs/katerinalover/images/30154750/title/dont-worry-happy-photo

The simplest interpretation of Fredrickson’s interrelated and perhaps overlapping studies of loving-kindness meditation is that lots of people drop out from follow-up and any apparent effect of the meditation is actually due to unexplained deterioration in the control group. And though data concerning the participants’ practice of meditation were collected, none were presented concerning whether participants assigned to meditation actually practiced it or how it affected physical and mental health outcomes. Why were the data collected if they were not going to be reported? They could be used to address the crucial question of whether actually practicing meditation affects health and well-being.

Another queen of positive psychology advice, Sonja Lyubomirsky, proclaims in a highly cited paper:

The field of positive psychology is young, yet much has already been accomplished that practitioners can effectively integrate into their daily practices. As our metaanalysis confirms, positive psychology interventions can materially improve the wellbeing of many.

I showed these claims are based on a faulty meta-analysis of methodologically-poor studies. In addition to Lyubomirsky’s highly-cited meta-analysis, I examined a more recent and better meta-analysis by Bolier and colleagues. It showed that the smaller and poorer-quality a study of positive psychology interventions is, the stronger the effect size. With the more recent studies included in Bolier’s meta-analysis, I concluded:

The existing literature does not provide robust support for the efficacy of positive psychology interventions for depressive symptoms. The absence of evidence is not necessarily evidence of an absence of an effect. However, more definitive conclusions await better quality studies with adequate sample sizes and suitable control of possible risk of bias. Widespread dissemination of positive psychology interventions, particularly with glowing endorsements and strong claims of changing lives, is premature in the absence of evidence they are effective.

I’m quite confident that this conclusion holds for effects on positive affect and general well-being as well.
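The inverse relationship Bolier and colleagues found between study size and effect size is, incidentally, exactly what selective publication of underpowered trials produces even when the true effect is zero. A minimal simulation sketch, with illustrative parameters rather than their data:

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect = 0.0                               # assume the intervention does nothing
sample_sizes = rng.integers(10, 200, 2000)      # many hypothetical trials of varying size (per arm)

published_n, published_d = [], []
for n in sample_sizes:
    treated = rng.normal(true_effect, 1, n)
    control = rng.normal(0.0, 1, n)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    d = (treated.mean() - control.mean()) / pooled_sd
    se = np.sqrt(2 / n)                          # approximate standard error of d under the null
    if abs(d / se) > 1.96:                       # crude filter: only nominally significant trials appear
        published_n.append(n)
        published_d.append(abs(d))

published_n, published_d = np.array(published_n), np.array(published_d)
print(f"Mean published |d| in small trials (n < 50 per arm):  {published_d[published_n < 50].mean():.2f}")
print(f"Mean published |d| in large trials (n >= 100 per arm): {published_d[published_n >= 100].mean():.2f}")
```

When only “significant” results see print, the small trials that survive the filter necessarily carry the largest effect sizes, which is why a literature dominated by small, poor-quality studies can look impressive while demonstrating nothing.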

Actually, when Lyubomirsky attempted to demonstrate the efficacy she claims for positive psychology interventions, she obtained null results but relegated her findings to a book chapter that was not peer reviewed. Yet, her marketing of the claim that positive psychology interventions improve well-being continues undaunted and gets echoed in the most recent papers coming out of the positive psychology community, such as:

Robust evidence exists that positive psychology interventions are effective in enhancing well-being and ameliorating depression.

Advice gurus claim that practicing positive psychology interventions will lead to health and well-being without a good scientific basis. But another literature attempts to identify small changes in everyday and laboratory behavior that can have lasting benefits. These studies are not explicitly evaluating interventions, but the claim is that they identify small behaviors with potentially big implications for well-being and happiness.

Let’s start with an example from the Wall Street Journal (WSJ):

Walk this way: Acting happy can make it so

Research shows people can improve their mood with small changes in behavior

Elizabeth Dunn, Associate Professor of Psychology at the University of British Columbia, provides an orientation:

“There are these little doses of social interactions that are available in our day” that can brighten our mood and create a sense of belonging. “I don’t think people recognize this.”

The article starts with a discussion of work by Johannes Michalak of the Department of Psychology and Psychotherapy at Witten/Herdecke University, Germany. In one study, 30 depressed psychiatric inpatients were randomized to instructions to sit in either a slumped (n = 15) or an upright (n = 15) position and then completed a memory test. The idea is that an emotion like depression is embodied. Adopting a slumped posture should increase a depressive negative bias in recall. The abstract of the original article reports:

Upright-sitting patients showing unbiased recall of positive and negative words but slumped patients showing recall biased towards more negative words.

Michalak conducted another study in which the gait of 39 college students was manipulated with biofeedback so as to simulate either being depressed or nondepressed as they walked on a treadmill. During the period on the treadmill, the experimenter read 40 words to them and they were tested for recall. The abstract of the original study reports:

The difference between recalled positive and recalled negative words was much lower in participants who adopted a depressed walking style as compared to participants who walked as if they were happy.

As would be expected with such small sample sizes, results were weak. Analyses were unnecessarily complicated. It’s not clear that effects would persist if more basic statistics were presented. For instance, did patients assigned to the “depressive” slump condition in the first study recall fewer positive words, more negative words, both, or neither? Certainly in the second study with college students, there were no differences in recall of positive words and only small differences in recall of negative words. Claims in the abstract were based on the construction of a more complicated composite positive variable.

Michalak is following a familiar strategy in the positive psychology literature – indeed, one that is more widely followed in psychology: if you cannot obtain positive findings in straightforward, simple analyses, then (1) adopt flexible rules of analysis, such as selective introduction of covariates and making up new composite variables; (2) don’t report the simple statistics and analyses in tables where readers could check them; and (3) spin your results in the abstract, because that is what most readers will rely on in deciding what your study found.

Michalak claims that these studies point to manipulation of the embodiment of depression as a means of treating depression:

There is a mutual influence between mood and body and movement…There might be specific types of movements that are specific characteristics of depression and this feeds the lower mood. So it’s a vicious cycle.

Presumably, with this as a premise, depressed patients could obtain a clinically significant improvement in mood if they sat up straight and walked faster. Maybe, but this is at best speculative and premature. Michalak does not directly test the take-away message the author of the WSJ article wants to give: even if you are not depressed, you can improve your mood by sitting up straight and adopting what Michalak calls a “happy walking style.”

To be fair to Michalak, he may be pumping up the strength and significance of his findings and promoting himself a bit. But unlike the rest of the authors discussed in the WSJ article, he is not yet prematurely turning some scientific papers of modest significance and strength of findings into press releases, TED talks and positive psychology products like books and workshops.

But let’s turn to the work of Nicholas Epley that is next described in the article. Epley is a Professor of Behavioral Science, University of Chicago Booth School of Business and author of Mindwise: How We Understand What Others Think, Believe, Feel, and Want. Epley does not have a TED talk, but got mentioned in the Business Blog of the Financial Times as not needing one. And he’s available through the Washington Speakers Bureau, whose website proclaims it is “connecting you with the world’s greatest minds”.

According to the WSJ article:

“I used to sit in quiet solitude on the train,” Dr. Epley said. “I don’t anymore. I know now from our data that learning something interesting about the person sitting next to me would be more fun than pretty much anything else I’d be doing then,” he said.

This is a reference to his article

Epley, N., & Schroeder, J. (2014). Mistakenly seeking solitude. Journal of Experimental Psychology: General, 143(5), 1980.

The study actually involves trains, buses, and taxis and giving different participants instructions to either connect with strangers, remain disconnected, or commute as normal.

In the train experiment, a composite measure was substituted for a simpler measure of whether these various strategies made participants happier:

To obtain an overall measure of positivity, we first calculated positive mood (happy minus sad), then standardized positive mood and pleasantness, and then averaged those two measures into a single index.
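For concreteness, the composite described in that passage can be reconstructed as follows. The ratings below are made up and the variable names are mine, not Epley and Schroeder’s; this is only a sketch of the scoring procedure they describe.

```python
import numpy as np

# Hypothetical ratings from a handful of participants (illustrative only)
happy        = np.array([6, 4, 5, 7, 3], dtype=float)
sad          = np.array([2, 3, 2, 1, 4], dtype=float)
pleasantness = np.array([5, 4, 6, 7, 3], dtype=float)

def zscore(x):
    """Standardize a variable to mean 0, SD 1."""
    return (x - x.mean()) / x.std(ddof=1)

positive_mood = happy - sad                                      # step 1: happy minus sad
positivity = (zscore(positive_mood) + zscore(pleasantness)) / 2  # steps 2-3: standardize, then average
print(positivity.round(2))
```

The further the reported outcome drifts from the simple question of whether people felt happier, the harder it is for readers to judge what actually changed.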

[Figure 1 excerpt: Epley Study 1. Error bars represent the standard error around the mean of each condition.]

A one-way analysis of variance indicated a significant difference among the three conditions (p < .05), explaining a modest 6% of participants’ variation in the composite measure of mood. However, consulting Figure 1 in the article suggests the effect was in the difference between instructions to remain disconnected versus the other two conditions, not between connecting with strangers versus commuting as normal. See the excerpt of Figure 1 above.

Results were somewhat stronger when the experiment was replicated on buses (p = .02), with 10% of the variance in participant mood explained by the condition to which participants were assigned. [Figure: Epley bus experiment 2.]

When the experiment involved talking to a taxi driver, significant results were obtained (p <.01). But this time, pairwise differences between conditions were tested and there was no significant difference between the connecting and the control condition, only between the control condition and the condition in which participants were instructed not to talk to the taxi driver.

This may not be rocket science, but it is apparently worthy of press releases, media coverage, and positive psychology products. The results are overall weak and may even disappear in straightforward analyses with simple measures of happiness. The most robust interpretation I could construct was that if someone asks you to refrain from talking to others on the train or bus, or even to a taxi driver, you probably should ignore them. I offer this advice for free, and have no intention of presenting it in a TED talk with unattributed anecdotes.
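The statistical pattern running through these three experiments – a significant omnibus test whose signal comes from the “remain disconnected” arm rather than from connecting – is easy to see in a sketch with made-up group means (these are not Epley and Schroeder’s data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 35                                   # assumed participants per condition
connect  = rng.normal(0.15, 1, n)        # instructed to talk to a stranger
control  = rng.normal(0.10, 1, n)        # commute as normal
solitude = rng.normal(-0.45, 1, n)       # instructed to remain disconnected

_, p_omnibus = stats.f_oneway(connect, control, solitude)
_, p_connect_vs_control  = stats.ttest_ind(connect, control)
_, p_solitude_vs_control = stats.ttest_ind(solitude, control)

print(f"Omnibus one-way ANOVA:              p = {p_omnibus:.3f}")              # typically 'significant'
print(f"Connect vs. commute-as-normal:      p = {p_connect_vs_control:.3f}")   # typically not
print(f"Disconnected vs. commute-as-normal: p = {p_solitude_vs_control:.3f}")  # drives the omnibus effect
```

A significant omnibus p value tells you the three conditions differ somewhere; it does not show that striking up conversations improved mood relative to commuting as usual.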

Re-enter Elizabeth Dunn, Associate Professor of Psychology at the University of British Columbia. Dr. Dunn is the author of Happy Money: The Science of Smarter Spending and has presented a TED talk. She is available as a speaker through the Lavin Agency, which, according to its website, is “making the world a smarter place.” The WSJ article reports on:

Sandstrom, G. M., & Dunn, E. W. (2013). Is efficiency overrated? Minimal social interactions lead to belonging and positive affect. Social Psychological and Personality Science.

Participants were instructed either to avoid any unnecessary conversation with a barista at Starbucks and simply be efficient in getting their coffee, or to:

“have a genuine interaction with the cashier—smile, make eye contact to establish a connection, and have a brief conversation.”

The journal article reports that participants instructed to make a “genuine connection” had more positive affect and less negative affect than those instructed to avoid unnecessary conversation. Unfortunately, unlike the Epley experiment, we are not given any comparison with a control condition, which would’ve clarified whether the effect was primarily due to instructing participants to have a “genuine connection” or to avoid conversation.

Then there are Professor Dunn’s student Jordi Quoidbach’s chocolate experiments, which have been promoted not only in this WSJ article, but in Dunn’s op-ed in the New York Times, “Don’t indulge. Be happy.”

The first of the studies was:

Quoidbach, J., Dunn, E. W., Petrides, K. V., & Mikolajczak, M. (2010). Money Giveth, Money Taketh Away: The Dual Effect of Wealth on Happiness. Psychological Science.

The study involved priming participants with a reminder of wealth – a photo of a large stack of Euro bills – or a similar photo that was blurred beyond recognition in the control condition. The 40 participants were then instructed to eat a piece of chocolate and complete a follow-up questionnaire.

As seen in the other studies, simple analyses were suppressed in favor of a more complex analysis. Namely, preliminary examination of the data revealed that female participants savored chocolate more than males. So, rather than a simple t-test, analyses of covariance were conducted with gender and prior attitude towards chocolate as control variables. Note that there were only 20 participants per group to begin with, so the results of these multivariate analyses are quite dubious. Participants primed with the money photo spent less time eating the piece of chocolate and were rated by observers as enjoying it less.

Studies involving priming participants with seemingly irrelevant but suggestive stimuli, such as this one, are now held in low regard, and some feel they have contributed to the crisis of confidence in social psychology. Nobel Prize winner Daniel Kahneman suggests the lack of replicability of social priming research is “a train wreck looming” for social psychology. Results are often too good to be true and cannot be replicated. Jordi Quoidbach and his colleagues cite studies by Vohs (1, 2) as supporting the validity of the manipulation; however, the two primary studies by Vohs could not be independently replicated.

Overall, this is an underpowered study with results that probably depended on flexible analyses rather than simple ones. We would probably ignore it, except that it appeared in the prestigious journal Psychological Science and has been hyped in the media and positive psychology products.

The second chocolate study was:

Quoidbach, J., & Dunn, E. W. (2013). Give It Up: A Strategy for Combating Hedonic Adaptation. Social Psychological and Personality Science, 4(5), 563-568.

The study involved asking 64 participants to eat two pieces of chocolate in two lab sessions separated by a week. Analyses were based on the 55 who showed up for the second session. Participants had been randomized to one of three conditions: restricted access (n = 16), in which participants were told not to buy any chocolate until the next lab session; abundant access (n = 18), in which participants were given two pounds of chocolate and told to eat as much as they comfortably could before the next lab session; and a control condition (n = 21), in which no explicit instructions were given. In the second lab session, all participants were given a second piece of chocolate.

Once again we have a small study in which the authors deny readers an opportunity to examine simple statistical tests of what should be a simple hypothesis: that restricted versus free access has an effect on enjoyment of a piece of chocolate. Instead of a simple one-way analysis of variance, the authors looked at their data and decided to do an (unnecessarily) more complex analysis of covariance. Nonetheless, we can still see that in pairwise comparisons between groups, there are differences between the restricted-access group and both the abundant-access and control groups. Yet there were no differences between the abundant-access instructions and having no instructions for what to do in the week between sessions.

The authors did not provide readers with appropriate analyses of group differences in changes in overall positive affect between the two sessions. Nonetheless, within-group t-tests revealed a decline in overall positive affect only for the abundance condition.

So, another small study in which positive results probably depended on tricky flexible analyses. We would not be discussing this if it were not in a relatively prestigious journal, discussed in the WSJ, and written about by one of the authors in an op-ed piece in the New York Times. I invite your comparison of my analysis to the hyped presentation and exaggerated significance for the study claimed in the op-ed.

This blog post ends quite differently than I originally intended. I wanted to take some highly promoted findings in the positive psychology literature about the effects of small things on overall well-being. I looked to a WSJ article reporting findings from basically prestigious journals with recognizable big-name promoters of positive psychology.

I had expected that positive psychology people out promoting their work and selling their products could surely come up with some unambiguous findings. I could then discuss how we could decide whether to attempt to translate those findings into strategies in our everyday lives and whether we could expect them to be sustained with any lasting impact on our well-being. Unfortunately, I didn’t get that far. Findings turned out to be not particularly positive despite being presented as such. That became an interesting story in itself, even if I will still have to search for robust findings from the positive psychology literature in order to discuss the likelihood that following “scientific” positive psychology advice will make us happier overall.

Despite heavily-marketed claims to the contrary, positive psychology interventions do not consistently improve mental or physical health and well-being. The myth that these interventions are efficacious is perpetuated by a mutually-admiring, self-promotional collective that protects its claims from independent peer review and scrutiny.

As with the positive psychology intervention literature, it is a quick leap from the authors submitting a manuscript to a peer-reviewed journal to making claims in the media, including op-ed pieces in the New York Times, and then releasing products like workshops and books that are lavishly praised by other members of the positive psychology community.

It is apparently too much to expect that positive psychology advice givers will take time out from their self-promotion to replicate what are essentially pilot studies before hitting the road and writing op-eds again. And too much to expect that the Association for Psychological Science journals Psychological Science and Social Psychological and Personality Science will insist on transparent reporting of adequately powered studies as a condition for publication.

The incentives for scientifically sound positive psychology advice just aren’t there.

Special thanks to the Skeptical Cat, who is smarter, more independent, and less easily led than the Skeptical Dog.

 

Reanalysis: No health benefits found for pursuing meaning in life versus pleasure

NOTE: After I wrote this blog post, I received via PNAS the reply from Steve Cole and Barbara Fredrickson to our article. I did not have time to thoroughly digest it, but will address it in a future blog post. My preliminary impression is that their reply is, ah… a piece of work. For a start, they attack our mechanical bitmapping of their data as an unvalidated statistical procedure. But calling it a statistical procedure is like Sarah Palin calling Africa a country. And they again assert the validity of their scoring of a self-report questionnaire without documentation. As seen below, I had already offered to donate $100 to charity if they can produce the unpublished analyses that justified this idiosyncratic scoring. The offer stands. They claim that our factor analyses were inappropriate because the sample size was too small, but we used their data, which they claimed to have factor analyzed. Geesh. But more on their reply later.

Our new PNAS article questions the reliability of results and interpretations in a high profile previous PNAS article.

Fredrickson, Barbara L., Karen M. Grewen, Kimberly A. Coffey, Sara B. Algoe, Ann M. Firestine, Jesusa MG Arevalo, Jeffrey Ma, and Steven W. Cole. “A functional genomic perspective on human well-being.” Proceedings of the National Academy of Sciences 110, no. 33 (2013):   13684-13689.

 

Image: Oakland Journal, http://theoaklandjournal.com/oaklandnj/health-happiness-vs-meaning/ (http://tinyurl.com/lpbqqn6)

Was the original article a matter of “science” made for press release? Our article poses issues concerning the gullibility of the scientific community and journalists regarding claims of breakthrough discoveries from small studies with provocative, but fuzzy theorizing and complicated methodologies and statistical analyses that apparently even the authors themselves do not understand.

  •  Multiple analyses of original data do not find separate factors indicating striving for pleasure versus purpose
  • Random number generators yield best predictors of gene expression from the original data

[Warning, numbers ahead. This blog post contains some excerpts from the results section that contain lots of numbers and require some sophistication to interpret. I encourage readers to at least skim these sections, to allow independent evaluation of some of the things that I will say in the rest of the blog.]

A well-orchestrated media blitz for the PNAS article had triggered my skepticism. The Economist, CNN, The Atlantic Monthly and countless newspapers seemingly sang praise in unison for the significance of the article.

Maybe the research reported in PNAS was, as one of the authors, Barbara Fredrickson, claimed, a major breakthrough in behavioral genomics, a science-based solution to an age-old philosophical problem of how to lead one’s life. Or, as she later claimed in a July 2014 talk in Amsterdam, the PNAS article provided an objective basis for moral philosophy.

Maybe it showed

People who are happy but have little to no sense of meaning in their lives—proverbially, simply here for the party—have the same gene expression patterns as people who are responding to and enduring chronic adversity.

Skeptical? Maybe you are paying too much attention to your conscious mind. What does it know? According to author Steve Cole

What this study tells us is that doing good and feeling good have very different effects on the human genome, even though they generate similar levels of positive emotion… “Apparently, the human genome is much more sensitive to different ways of achieving happiness than are conscious minds.”

Or maybe this PNAS article was an exceptional example of the kind of nonsense, pure bunk, you can find in a prestigious journal.

Assembling a Team.

I blogged about the PNAS article. People whom I have yet to meet expressed concerns similar to mine. We began collaborating, overcoming considerable differences in personal style but taking advantage of complementary skills and background.

It all started with a very tentative email exchange with Nick Brown. He brought on his co-author from his American Psychologist article demolishing any credibility to a precise positivity ratio, Harris Friedman. Harris in turn brought on Doug McDonald to examine Fredrickson and Cole’s claims that factor analysis supported their clean distinction between two forms of well-being with opposite effects on health.

Manoj Samanta found us by way of my blog post and then a Google search that took him to Nick and Harris’s article with Alan Sokal. Manoj cited my post in his own blog. When Nick saw it, he contacted him. Manoj was working in genomics, attempting to map the common genomic basis for the evolution of electric organs in fish from around the world, but was a physicist in recovery. He was delighted to work with a couple of guys who had co-authored a paper with his hero from grad school, Alan Sokal. Manoj interpreted Fredrickson and Cole’s seemingly unnecessarily complicated approach to genomic analysis. Nick set off to deconstruct and reproduce Cole’s regression analyses predicting genomic expression. He discovered that Cole’s procedure generated statistically significant (but meaningless) results from over two-thirds of the thousands of ways of splitting the psychometric data. Even using random numbers produced huge numbers of junk results.
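The general phenomenon Nick uncovered is easy to illustrate: regress noisy outcomes on enough arbitrary scorings of a questionnaire and nominally significant coefficients pile up even when everything is random. Here is a minimal multiple-testing sketch, with random data throughout; it is not a reproduction of Cole’s actual regression procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_subjects, n_genes = 80, 50                               # assumed sizes for illustration
gene_expression = rng.normal(size=(n_subjects, n_genes))   # pure-noise 'outcomes'

n_scorings = 1000                                          # arbitrary ways of scoring a questionnaire
runs_with_signal = 0
for _ in range(n_scorings):
    predictor = rng.normal(size=n_subjects)                # a random 'well-being' score
    p_values = [stats.linregress(predictor, gene_expression[:, g]).pvalue
                for g in range(n_genes)]
    if sum(p < 0.05 for p in p_values) >= 5:               # 'hits' well above the ~2.5 expected by chance
        runs_with_signal += 1

print(f"{runs_with_signal} of {n_scorings} purely random scorings yielded an apparent 'signal'")
```

Without pre-specified scoring and correction for the number of looks, “significant” genomic correlates of well-being are close to guaranteed.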

The final group was Nick, Doug, Manoj, Harris, and myself. Others came and went from our email exchanges, some accepting our acknowledgment in the paper, while others asked us explicitly not to acknowledge them.

The team gave an extraordinarily careful look at the article, noting its fuzzy theorizing and conceptual deficiencies, but we did much more than that. We obtained the original data and asked the authors of the original paper about their complex analytic methods. We then reanalyzed the data, following their specific advice. We tried alternative analyses and even re-did the same analyses with randomly generated data. Overall, our hastily assembled group performed and interpreted 1000s of analyses, more than many productive labs do in a year.

The embargo on our paper in PNAS is now off.

I can report our conclusion that

Not only is Fredrickson et al.’s article conceptually deficient, but more crucially statistical analyses are fatally flawed, to the point that their claimed results are in fact essentially meaningless.

A summary of our PNAS article is available here and the final draft is here.

Fuzzy thinking creates theoretical and general methodological problems

Fredrickson et al. claimed that two types of strivings for well-being, eudaimonic and hedonic, have distinct and opposite effects on physical health, by way of “molecular signaling pathways” or genomic expression, despite an unusually high correlation between two supposedly different variables. I had challenged the authors about the validity of their analyses in my earlier blog post and then in a letter to PNAS, but got blown off. Their reply dismissed my concerns, citing analyses that they have never shown, either in the original article or the reply.

In our article, we noted a subtlety in the distinction between eudaimonia and hedonia.

Eudaimonic well-being, generally defined (including by Fredrickson et al.) in terms of tendencies to strive for meaning, appears to be trait-like, since such striving for meaning is typically an ongoing life strategy.

Hedonic well-being, in contrast is typically defined in terms of a person’s (recent) affective experiences, and is state-like; regardless of the level of meaning in one’s life, everyone experiences “good” and “bad” days.

The problem is

If well-being is a state, then a person’s level of well-being will change over time and perhaps at a very fast rate.  If we only measure well-being at one time point, as Fredrickson et al. did, then unless we obtain a genetic sample at the same time, the likelihood that the well-being score will actually accurately reflect level of genomic expression will be diminished if not eliminated.

In an interview with David Dobbs, Steven Cole seems to suggest an irreversibility to the changes that eudaimonic and hedonic strivings produce:

“Your experiences today will influence the molecular composition of your body for the next two to three months,” he tells his audience, “or, perhaps, for the rest of your life. Plan your day accordingly.”

Hmm. Really? Evidence?

Eudaimonic and hedonic well-being constructs may have a long history in philosophy, but empirically separating them is an unsolved problem. And taken together, the two constructs by no means capture the complexity of well-being.

Is a scientifically adequate taxonomy of well-being on which to do research even possible? Maybe, but doubts are raised when one considers the overcrowded field of well-being concepts available in the literature—

General well-being, subjective well-being, psychological well-being, ontological well-being, spiritual well-being, religious well-being, existential well-being, chaironic well-being, emotional well-being, and physical well-being—along with the various constructs that are treated as essentially synonymous with well-being, such as self-esteem, life-satisfaction, and, lest we forget, happiness.

No one seems to be paying attention to this confusing proliferation of similar constructs and how they are supposed to relate to each other. But in the realm of negative emotion, the problem is well known and variously referred to as the “big mush” or the “crud factor”. Actually, there is a good deal of difficulty separating positive well-being concepts from their obverse, negative well-being concepts.

Fredrickson and colleagues found that eudaimonic and especially hedonic well-being were strongly negatively related to depression. Their measure of depression qualified as a covariate or confound for their analyses, but somehow disappeared from further consideration. If it had been retained, it would have further reduced the analyses to gobbledygook. Technically speaking, the residual of hedonia controlling for (highly correlated) eudaimonia and depression does not even have a family resemblance to hedonia and is probably nonsense.
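To see why, consider a toy simulation in Python (entirely made-up numbers and our own variable names, not their data): once everything that a highly correlated eudaimonia score and a depression score can explain has been removed, what is left of “hedonia” carries only a fraction of the original variable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80

# Toy variables: eudaimonia and hedonia correlate highly; both relate negatively to depression
eud = rng.normal(size=n)
hed = 0.8 * eud + 0.6 * rng.normal(size=n)
dep = -0.6 * (eud + hed) / 2 + 0.8 * rng.normal(size=n)

# "Hedonia controlling for eudaimonia and depression" is the residual from this regression
X = np.column_stack([np.ones(n), eud, dep])
beta, *_ = np.linalg.lstsq(X, hed, rcond=None)
residual = hed - X @ beta

# Proportion of hedonia's variance that the residual still carries
print(round(np.corrcoef(residual, hed)[0, 1] ** 2, 2))
```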

Fredrickson et al. measured well-being with what they called the Short Flourishing Scale, better known in the literature as the Mental Health Continuum-Short Form (MHC-SF).

We looked, and we were not able to identify any published evidence of a two-factor solution in which distinct eudaimonic and hedonic well-being factors adequately characterized MHC-SF data.

The closest thing we could find was

Keyes et al. (10) referred to these groupings of hedonic and eudaimonic items as “clusters,” an ostensibly neutral term that seems to deliberately avoid the word “factor.”

However, his split of the MHC-SF items into hedonic and eudaimonic categories appears to have been made mainly to allow arbitrary classification of persons as “languishing” versus “flourishing.” Yup, positive psychology is now replacing the stigma of conventional psychology’s deficiency model of depressed versus not depressed with a strength model of languishing versus flourishing.

In contrast to the rest of the MHC-SF literature, Fredrickson et al. referred to a factor analysis – implicitly in their original PNAS paper, and then explicitly in their reply to my PNAS letter – as yielding two distinct factors (“Hedonic” and “Eudaimonic”) corresponding to Keyes’ languishing versus flourishing diagnoses (i.e., items SF1–SF3 for Hedonic and SF4–SF14 for Eudaimonic).

The data from Fredrickson et al. were mostly in the public domain. After getting further psychometric data from Fredrickson’s lab, we set off on a thorough reanalysis that should have revealed whatever basis for their claims there might be.

In exploratory factor analyses, which we ran using different extraction (e.g., principal axis, maximum likelihood) and rotation (orthogonal, oblique) methods, we found two factors with eigenvalues greater than 1, with all items producing a loading of at least .50 on at least one factor.
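For readers who want to see what this kind of exploratory factor analysis looks like in code, here is a minimal Python sketch assuming the factor_analyzer package is available, with structured placeholder data standing in for the 80 × 14 MHC-SF item matrix (this is not our actual analysis script):

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(0)
# Structured placeholder data standing in for the 80 x 14 MHC-SF item matrix
latent = rng.normal(size=(80, 2))
load = rng.uniform(0.3, 0.8, size=(2, 14))
items = pd.DataFrame(latent @ load + rng.normal(scale=0.7, size=(80, 14)),
                     columns=[f"SF{i}" for i in range(1, 15)])

for method in ("principal", "ml"):           # extraction methods
    for rotation in ("varimax", "oblimin"):  # orthogonal and oblique rotations
        fa = FactorAnalyzer(n_factors=2, method=method, rotation=rotation)
        fa.fit(items)
        eigenvalues, _ = fa.get_eigenvalues()
        print(method, rotation, "| eigenvalues > 1:", int((eigenvalues > 1).sum()))
        print(pd.DataFrame(fa.loadings_, index=items.columns,
                           columns=["Factor 1", "Factor 2"]).round(2))
```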

That’s a lot of analyses, but the results were consistent:

Examination of factor loading coefficients consistently showed that the first factor was comprised of elevated loadings from 11 items (SF1, SF2, SF3, SF4, SF5, SF9, SF10, SF11, SF12, SF13, and SF14), while the second factor housed high loadings from 3 items (SF6, SF7, and SF8).


If this is the factor structure Fredrickson and colleagues claim, eudaimonic well-being would have to be just those three items (SF6–SF8). But look at them in the figure and particularly look at the qualification below. The items seem to reflect living in a particular kind of environment, one that is safe and supportive of people like the respondent. Actually, these results seem to lend support to my complaint that positive psychology is mainly for rich people: to flourish, one must live in a special environment. If you languish, it is your fault.


Okay, we did not find much support for the claims of Fredrickson and colleagues, but we gave them another chance with a confirmatory factor analysis (CFA). With this analysis, we would not be looking for the best solution, only learning whether either a one- or a two-factor model is defensible.
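Here is a minimal sketch of how such a CFA might be run in Python, assuming the semopy package and again using placeholder data in place of the MHC-SF responses (not our actual analysis code):

```python
import numpy as np
import pandas as pd
from semopy import Model, calc_stats

rng = np.random.default_rng(0)
# Structured placeholder data standing in for the 80 x 14 MHC-SF item matrix
latent = rng.normal(size=(80, 2))
load = rng.uniform(0.3, 0.8, size=(2, 14))
data = pd.DataFrame(latent @ load + rng.normal(scale=0.7, size=(80, 14)),
                    columns=[f"SF{i}" for i in range(1, 15)])

# One-factor model versus Fredrickson et al.'s claimed hedonic/eudaimonic split
one_factor = "WellBeing =~ " + " + ".join(data.columns)
two_factor = ("Hedonic =~ SF1 + SF2 + SF3\n"
              "Eudaimonic =~ " + " + ".join(data.columns[3:]))

for label, desc in (("one-factor", one_factor), ("two-factor", two_factor)):
    model = Model(desc)
    model.fit(data)
    print(label)
    print(calc_stats(model).T)  # chi-square, CFI, RMSEA, and other fit indices
```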

For the one-factor model, goodness-of-fit statistics indicated grossly inadequate fit (χ² = 227.64, df = 77, GFI = .73, CFI = .83, RMSEA = .154).  Although the equivalent statistics for the correlated two-factor model were slightly better, they still came out as poor (χ² = 189.40, df = 76, GFI = .78, CFI = .87, RMSEA = .135).

Thus, even though our findings tended to support the view that well-being is best represented as at least a two-dimensional construct, we did not confirm Fredrickson et al.’s claim (6) that the MHC-SF produces two factors conforming to hedonic and eudaimonic well-being.

Hey Houston, we’ve got a problem.

As Ryff and Singer (15) put it, “Lacking evidence of scale validity and reliability, subsequent work is pointless” (p. 276).

Maybe we should have thrown in the towel. But if Fredrickson and colleagues could nonetheless proceed to multivariate analyses relating the self-report data to genomic expression, we decided that we would follow the same path.

[Cartoon: from Hilda Bastian]

Relating self-report data to genomic expression: Random can be better

Fredrickson et al.’s analytic approach to genomic expression seemed unnecessarily complicated. They repeated regression analyses 53 times (a procedure we came to call RR53), regressing each of the 53 genes of interest on eudaimonic and hedonic well-being and a full range of confounding/control variables.  Recall that they had only 80 participants. This approach left them lots of room for capitalizing on chance.
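As best we can render it, the RR53 procedure looks roughly like the following Python sketch. The placeholder data, the 19/34 split into up- and down-regulated genes, and all variable names are ours for illustration; the original analyses were run in R.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n = 80

# Placeholder data standing in for the real variables (names are ours):
expr = rng.normal(size=(n, 53))                     # per-gene CTRA expression values
signs = np.concatenate([np.ones(19), -np.ones(34)]) # +1 up-regulated, -1 down-regulated (illustrative)
hedonic, eudaimonic = rng.normal(size=n), rng.normal(size=n)
controls = rng.normal(size=(n, 10))                 # the "full range" of control variables

X = sm.add_constant(np.column_stack([hedonic, eudaimonic, controls]))

# One regression per gene; collect the two well-being coefficients ("fold differences"),
# sign-reversed for genes expected to be down-regulated, then t-test each set of 53 against zero.
hed_coefs, eud_coefs = [], []
for g in range(expr.shape[1]):
    fit = sm.OLS(expr[:, g], X).fit()
    hed_coefs.append(signs[g] * fit.params[1])
    eud_coefs.append(signs[g] * fit.params[2])

print("hedonic:   ", stats.ttest_1samp(hed_coefs, 0.0))
print("eudaimonic:", stats.ttest_1samp(eud_coefs, 0.0))
```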

So, why not simply regress

the scores for hedonic and eudaimonic well-being on the average expression of the 53 genes of interest, after changing the sign of the values of those genes that were expected to be down-regulated. [?]

After all, the authors had said

“[T]he goal of this study is to test associations between eudaimonic and hedonic well-being and average levels of expression of specific sets of genes” (p. 1)

We started with our simpler approach.

We conducted a number of such regressions, using different methods of evaluating the “average level of expression” of the 53 CTRA genes of interest (e.g., taking the mean of their raw values, or the mean of their z-scores), but in all cases the model ANOVA was not statistically significant.
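In code, the simpler approach amounts to something like this (placeholder data and our own variable names again; the split into 19 up-regulated and 34 down-regulated genes is for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 80  # the sample size reported by Fredrickson et al.

# Placeholder data standing in for the real variables (names are ours):
expr = rng.normal(size=(n, 53))                     # CTRA gene-expression values
signs = np.concatenate([np.ones(19), -np.ones(34)]) # +1 up-regulated, -1 down-regulated (illustrative)
hedonic, eudaimonic = rng.normal(size=n), rng.normal(size=n)
controls = rng.normal(size=(n, 5))                  # demographic/control covariates

# "Average level of expression" of the 53 genes, sign-reversed for the down-regulated set
avg_expr = (expr * signs).mean(axis=1)

# One regression per well-being score, as in the simpler approach described above
X = sm.add_constant(np.column_stack([avg_expr, controls]))
for label, y in (("hedonic", hedonic), ("eudaimonic", eudaimonic)):
    fit = sm.OLS(y, X).fit()
    print(label, "overall model ANOVA p =", round(fit.f_pvalue, 3))
```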

Undaunted, we next applied the RR53 regression procedure to see whether it could, in contrast to our simpler “naive” approach, yield similarly highly significant results with the factors we had derived.

You can read the more technical description of our procedures in our article and its supplementary materials, but, in brief, our results were as follows:

The t-tests for the regression coefficients corresponding to the predictor variables of interest, namely hedonic and eudaimonic well-being, were almost all non-significant (p > .05 in 104 out of 106 cases; mean p = .567, SD = 0.251), and in the two remaining cases (gene FOSL1, for both “hedonic,” p = .047, and “eudaimonic,” p = .030), the overall model ANOVA was not statistically significant (p = .146).

We felt that drawing any substantive conclusions from these coefficients would be inappropriate.

Nonetheless, we continued….

We…created two new variables, which we named PWB (corresponding to items SF1–SF5 and SF9–SF14) and EPSE (corresponding to items SF6–SF8).  When we applied Fredrickson et al.’s regression procedure using these variables as the two principal predictor variables of interest (replacing the Hedonic and Eudaimonic factor variables), we discovered that the “effects” of this factor pair were about twice as high as those for the Hedonic and Eudaimonic pair (PWB: up-regulation by 13.6%, p < .001; EPSE: down-regulation by 18.0%, p < .001; see Figures 3 and 4 in the Supporting Information).

Wow, if we accept statistical significance over all other considerations, we actually did better than Fredrickson et al.

Taken seriously, it suggests that the participants’ genes are not only expressing “molecular well-being” but also, even more vigorously, some other response that we presume Fredrickson et al. might call “molecular social evaluation.”

Or we might conclude that living in a particular kind of environment is good for your genomic expression.

But we were skeptical about whether we could give substantive interpretations of any kind and so we went wild, using the RR53 procedure with every possible way of splitting up the self-report data. Yup, that is a lot of analyses.
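How many ways are there to split 14 items into two non-empty pseudo-factors, counting each split once? (2¹⁴ − 2)/2. A quick, purely illustrative Python sketch enumerates them:

```python
from itertools import combinations

items = [f"SF{i}" for i in range(1, 15)]  # the 14 MHC-SF items

splits = []
for k in range(1, len(items) // 2 + 1):
    for combo in combinations(items, k):
        # For the 7-vs-7 splits, keep only those containing SF1
        # so that each unordered split is counted exactly once.
        if k == len(items) // 2 and items[0] not in combo:
            continue
        rest = [i for i in items if i not in combo]
        splits.append((list(combo), rest))

print(len(splits))  # 8191 == (2**14 - 2) // 2
```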

Excluding duplicates due to symmetry, there are 8,191 possible such combinations.  Of these, we found that 5,670 (69.2%) gave statistically significant results using the method described on pp. 1–2 of Fredrickson et al.’s Supporting Information (7) (i.e., the t-tests of the fold differences corresponding to the two elements of the pair of pseudo-factors were both significant at the .05 level), with 3,680 of these combinations (44.9% of the total) having both components significant at the .001 level.

Furthermore, 5,566 combinations (68.0%) generated statistically significant pairs of fold difference values that were greater in magnitude than Fredrickson et al.’s (6, figure 2A) Hedonic and Eudaimonic factors.

While one possible explanation of these results is that differential gene expression is associated with almost any factor combination of the psychometric data, with the study participants’ genes giving simultaneous “molecular expression” to several thousand factors which psychologists have not yet identified, we suspected that there might be a more parsimonious explanation.

But we did not stop there. Bring on the random number generator.

As a further test of the validity of the RR53 procedure, we replaced Fredrickson et al.’s psychometric data (6) with random numbers (i.e., every item/respondent cell was replaced by a random integer in the range 0–5) and re-ran the R program.  We did this in two different ways.

First, we replaced the psychometric data with normally-distributed random numbers, such that the item-level means and standard deviations were close to the equivalent values for the original data.  With these pseudo-data, 3,620 combinations of pseudo-factors (44.2%) gave a pair of fold difference values having t-tests significantly different from zero at the .05 level; of these, 1,478 (18.0% of the total) were both statistically significant at the .001 level.  (We note that, assuming independence of up- and down-regulation of genes, the probability of the latter result occurring by chance with random psychometric data if the RR53 regression procedure does indeed identify differential gene expression as a function of psychometric factors, ought to be—literally—one in a million, i.e. 0.001², rather than somewhere between one in five and one in six.)

Second, we used uniformly-distributed random numbers (i.e., all “responses” were equally likely to appear for any given item and respondent).  With these “white noise” data, we found that 2,874 combinations of pseudo-factors (35.1%) gave a pair of fold difference values having t-tests statistically significantly different from zero at the .05 level, of which 893 (10.9% of the total) were both significant at the .001 level.

Finally, we re-ran the program once more, using the same uniformly distributed random numbers, but this time excluding the demographic data and control genes; thus, the only non-random elements supplied to the RR53 procedure were the expression values of the 53 CTRA genes.  Despite the total lack of any information with which to correlate these gene expression values, the procedure generated 2,540 combinations of pseudo-factors (31.0%) with a pair of fold difference values having t-tests statistically significantly different from zero at the .05 level, of which 235 (2.9% of the total) were both significant at the .001 level.

Thus, in all cases, we obtained far more statistically significant results using Fredrickson et al.’s methods (6) than would be predicted by chance alone for truly independent variables (i.e., .05² × 8,191 ≈ 20), even when the psychometric data were replaced by meaningless random numbers.

To try to identify the source of these puzzling results, we ran simple bivariate correlations on the gene expression variables, which revealed moderate to strong correlations between many of them, suggesting that our significant results were mainly the product of shared variance across criterion variables.  We therefore went back to the original psychometric data, and “scrambled” the CTRA gene expression data, reassigning each cell value for a given gene to a participant selected at random, thus minimizing any within-participants correlation between these values.  When we re-ran the regressions with these data, the number of statistically significant results dropped to just 44 (.54%).
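For readers who want the flavor of these two checks, here is a compressed Python sketch of how random pseudo-data and scrambled expression values can be generated (placeholder arrays and our own names, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_items, n_genes = 80, 14, 53

# Placeholder stand-ins for the real psychometric and gene-expression matrices
psycho = rng.integers(0, 6, size=(n, n_items)).astype(float)
expr = rng.normal(size=(n, n_genes))

# (1a) Random psychometric pseudo-data matched to each item's mean and SD
matched_noise = rng.normal(loc=psycho.mean(axis=0),
                           scale=psycho.std(axis=0),
                           size=(n, n_items))

# (1b) "White noise" pseudo-data: every response 0-5 equally likely
white_noise = rng.integers(0, 6, size=(n, n_items)).astype(float)

# (2) Scrambled gene-expression data: permute each gene's values independently
# across participants, destroying within-participant correlations between genes
scrambled = np.column_stack([rng.permutation(expr[:, g]) for g in range(n_genes)])
```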

The punchline

To summarize: even when fed entirely random psychometric data, the RR53 regression procedure generates large numbers of results that appear, according to these authors’ interpretation, to establish a statistically significant relationship between self-reported well-being and gene expression.  We believe that this regression procedure is, simply put, totally lacking in validity.  It appears to be nothing more than a mechanism for producing apparently statistically significant effects from non-significant regression coefficients, driven by a high degree of correlation between many of the criterion variables.

Despite exhaustive efforts, we could not replicate the authors’ simple factor structure differentiating hedonic versus eudaimonic well-being, upon which their genomic analyses so crucially depended. Then we showed that the complicated RR53 procedure turned random nonsense into statistically significant results. Poof, there is no there there (as Gertrude Stein once said about Oakland, California) in their paper, no evidence of “molecular signaling pathways that transduce positive psychological states into somatic physiology,” just nonsense.

How, in the taxonomy of bad science, do we classify this slipup and the earlier one in American Psychologist? Poor methodological habits, run-of-the-mill scientific sloppiness, innocent probabilistic error, injudicious hype, or simply unbridled enthusiasm with an inadequate grasp of methods and statistics?

Play nice and avoid the trap of negative psychology?

Our PNAS article exposed the unreliability of the results and interpretation offered in a paper claimed to be a game-changing breakthrough in our understanding of how positive psychology affects health by way of genomic expression. Science is slow and incomplete in correcting itself. But corrections, even of outright nonsense, seldom garner the attention that the original error received. It is just not as newsworthy to find that claims of minor adjustments in everyday behavior modifying gene expression are nonsense as it is to make unsustainable claims in the first place.

Given the rewards offered by media coverage and even prestigious journals, authors can be expected to be incorrigible in giving in to the urge to orchestrate media attention for ill-understood results generated by dubious methods applied in small samples. But the rest of the scientific community and journalists need to keep in mind that most breakthrough discoveries are false, unreplicable, or at least wildly exaggerated.

The authors were offered a chance to respond to my muted and tightly constrained letter to PNAS. Cole and Fredrickson made references to analyses they had never presented and offered misinterpretations of the literature that I cited. I consider their response disingenuous and dismissive of any dialogue. I am willing to apologize for this assessment if they produce the factor analyses of the self-report data to which they pointed. I will even donate $100 to the American Cancer Society if they can produce them. I doubt they will.

Concerns about the unreliability of the scientific and biomedical literature have risen to the threshold of precipitating concern from the director of the NIH, Francis Collins. On the other hand, a backlash has called out critics for encouraging a “negative psychology” and warned us to temper our criticism. Evidence cited for the excesses of critics includes “’voodoo correlation’ claims, ‘p-hacking’ investigations, websites like Retraction Watch, Neuroskeptic, [and] a handful of other blogs devoted to exposing bad science”, along with the caution that “moral outrage has been conflated with scientific rigor.” We are told we are damaging the credibility of science with criticism and that we should engage authors in clarification rather than criticize them. But I think our experience with this PNAS article demonstrates just how much work it takes to deconstruct outrageous claims based on methods and results that authors poorly understand but nonetheless promote in social media campaigns. Certainly, there are grounds for skepticism based on prior probabilities, and to be skeptical is not cynical. But is it not cynical to construct the pseudoscience of a positivity ratio and then a faux objective basis for moral philosophy?