Can we predict suicide from Twitter language?

Can we predict county-level death by suicide from Twitter data? We tried. Our surprising results added weight to the results of our re-analyses of Twitter data attempting to predict death from heart disease. Analyzing Twitter data in bulk does not add to our understanding of geographical variations in health outcomes.


Nick Brown and I (*) recently posted a preprint:

No Evidence That Twitter Language Reliably Predicts Heart Disease: A Reanalysis of Eichstaedt et al. (2015a)

We reanalyze Eichstaedt et al.’s (2015a) claim to have shown that language patterns among Twitter users, aggregated at the level of U.S. counties, predicted county-level mortality rates from atherosclerotic heart disease (AHD), with “negative” language being associated with higher rates of death from AHD and “positive” language associated with lower rates…We conclude that there is no evidence that analyzing Twitter data in bulk in this way can add anything useful to our ability to understand geographical variation in AHD mortality rates.

You can find the original article here:

Eichstaedt JC, Schwartz HA, Kern ML, Park G, Labarthe DR, Merchant RM, Jha S, Agrawal M, Dziurzynski LA, Sap M, Weeg C. Psychological language on Twitter predicts county-level heart disease mortality. Psychological Science. 2015 Feb;26(2):159-69.


A press release from the Association for Psychological Science heaped lavish praise on the original article. It can be found here.

“Twitter seems to capture a lot of the same information that you get from health and demographic indicators,” co-author Gregory Park said, “but it also adds something extra. So predictions from Twitter can actually be more accurate than using a set of traditional variables.”

Our overarching conclusion:

… There is a very large amount of noise in the measures of the meaning of Twitter data used by Eichstaedt et al., and these authors’ complex analysis techniques (involving, for example, several steps to deal with high multicollinearity) are merely modeling this noise to produce the illusion of a psychological mechanism that acts at the level of people’s county of residence.

Our look at key assumptions and re-analyses

The choice of atherosclerotic heart disease (AHD) as the health outcome fits with lay understandings of what causes heart attacks, but it was unfortunate.

Folk beliefs about negative emotion causing heart attacks had been bolstered by some initial promising findings in small samples suggesting a link between Type A behavior pattern (TABP) and cardiac events and mortality. In our preprint, we discuss how subsequent, better controlled studies did not confirm these results.

Type A behavior pattern cannot readily be distinguished from other negative emotion variables. These negative emotion variables converge in what Paul Meehl called a “crud factor” and others have called a “big mess.” Such negative affect variables are non-informative risk markers, not true risk factors. These variables have too many correlates among background, pre-existing variables, including poor physical health, and among concurrent variables that cannot readily be separated in statistical analyses, even with prospective data. See “Negative emotions and health: why do we keep stalking bears when we only find scat” for a further discussion.

While we were finishing up our manuscript, an article came out that analyzed and succinctly summarized this issue:

“A substantial part of the distress–IHD [ischaemic heart disease] association is explained by confounding and functional limitations . . . . Emphasis should be on psychological distress as a marker of healthcare need and IHD risk, rather than a causative factor.”

AHD is actually a chronic condition, slowly developing over a lifetime. Many of the crucial determinants of whether someone later shows signs and symptoms of AHD occur in childhood or adolescence.

Americans are a highly mobile population, and when they reach middle age with its increase in heart attacks, they may have moved geographically far away from where they lived when their chronic disease developed. The counties in which participants are identified for the purposes of this Twitter study are not the counties in which they developed their condition.

Most of the people who are tweeting in a county are younger than the people likely to be dying from AHD. So, we are assessing one population to predict health events in another.

Some of our other findings that are discussed more fully in our preprint:

Coding of AHD as the cause of death in this study was highly unreliable and subject to major variability across counties.

The process for selecting counties to be included in the study was biased.

The Twitter-based dictionaries used for coding appear not to be a faithful summary of the words that were actually typed by users. There were puzzling omissions.

Arbitrary and presumably post-hoc choices were apparently made in some of the dictionary-based analyses and these choices strengthened the appearance of an association between Twitter language and death from AHD.

There were numerous problems associated with the use of counties as the unit of analysis. Counties vary greatly in size (between counties) and in heterogeneity (within counties) on sociodemographic and socioemotional factors, as well as in the proportion of county residents who were actually on Twitter.

The predictive power of the model, including the associated maps, appears to be questionable.

While we were working on the manuscript that became a preprint, another relevant paper came out:

Jensen, E. A. (2017). Putting the methodological brakes on claims to measure national happiness through Twitter: Methodological limitations in social media analytics. PLOS ONE, 12(9), e0180080.

We  endorse its conclusion:

When researchers approach a data set, they need to understand and publicly account for not only the limits of the data set, but also the limits of which questions they can ask . . . and what interpretations are appropriate (p. 6).

Using Twitter data to predict death by suicide

OK, I have already spoiled the story by giving up front my argument that trying to predict health outcomes from big Twitter data is not a good idea.

But a case can be made that if we are going to predict a health outcome from Twitter, suicide is a better candidate than AHD. This was Nick’s idea, but I wanted to emphasize it more than he did.

Although suicide can be the result of long-term mental health problems and other stressors, a person’s psychological state in the months and days leading up to the point at which they take their own life clearly has a substantial degree of relevance to their decision. Hence, we might expect any county-level psychological factors that act directly on the health and welfare of members of the local community to be more closely reflected in the mortality statistics for suicide than those for a chronic disease such as AHD.

We [collective “we” the authors, but actually Nick] also downloaded comparable mortality data for the ICD-10 categories X60–X84, collectively labeled “Intentional self-harm”—in order to test the idea that suicide might be at least as well predicted by Twitter language as AHD—as well as the data for several other causes of death (including all-cause mortality) for comparison purposes.

We therefore examined the relationship of the set of causes of death listed by the CDC as “self-harm” with Twitter language usage, using the procedures reported in the first subsections entitled “Language variables from Twitter” and “Statistical analysis” of Eichstaedt et al.’s (2015a, p. 161) Method section. Because of the limitation of the CDC Wonder database, noted earlier, whereby mortality rates are only available when at least 10 deaths per year are recorded in a given county, data for self-harm were only available for 741 counties; however, these represented 89.9% of the population of Eichstaedt et al.’s set of 1,347 counties.

Our findings

[Figure: self-harm and Twitter]


In the “Dictionaries” analysis, we found that mortality from self-harm was negatively correlated with all five “negative” language factors, with three of these correlations (for anger, negative-relationship, and negative-emotion words) being statistically significant at the .05 level (see our Table 1). That is, counties whose residents made greater use of negative language on Twitter had lower rates of suicide, or, to borrow Eichstaedt et al.’s (2015a, p. 162) words, use of negative language was “significantly protective” against self-harm; this statistical significance was unchanged when income and education were added as covariates. In a further contrast to AHD mortality, two of the three positive language factors (positive relations and positive emotions) were positively correlated with mortality from self-harm, although these correlations were not statistically significant.

Next, we analyzed the relationship between Twitter language and self-harm outcomes at the “Topics” level. Among the topics most highly correlated with increased risk of self-harm were those associated with spending time surrounded by nature (e.g., grand, creek, hike; r = .214, CI[1] = [.144, .281]), romantic love (e.g., beautiful, love, girlfriend; r = .176, CI = [.105, .245]), and positive evaluation of one’s social situation (e.g., family, friends, wonderful; r = .175, CI = [.104, .244]). There were also topics of discussion that appeared to be strongly “protective” against the risk of self-harm, such as baseball (e.g., game, Yankees, win; r = −.317, CI = [−.381, −.251]), binge drinking (e.g., drunk, sober, hungover; r = −.249, CI = [−.316, −.181]), and watching reality TV (e.g., Jersey, Shore, episode; r = −.200, CI = [−.269, −.130]). All of the correlations between these topics and self-harm outcomes, both positive and negative, were significant at the same Bonferroni-corrected significance level (i.e., .05/2,000 = .000025) used by Eichstaedt et al. (2015a), and remained significant at that level after adjusting for income and education. That is, several topics that were ostensibly associated with “positive,” “eudaimonic” approaches to life predicted higher rates of county-level self-harm mortality, whereas apparently hedonistic topics were associated with lower rates of self-harm mortality, and the magnitude of these associations was at least as great as, and in a few cases even greater than, those found by Eichstaedt et al. These topics are shown in “word cloud” form in our Figure 2 (cf. Eichstaedt et al.’s Figure 1).
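For readers who want to see what this kind of screen amounts to mechanically, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in (random numbers in place of the real counties-by-topics matrix, mortality rates, and income/education covariates); the point is only to illustrate the Bonferroni threshold and the covariate adjustment described above, not to reproduce Eichstaedt et al.’s actual pipeline.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-ins: 741 counties, 2,000 Twitter topics, county-level
# self-harm mortality, and two covariates (income, education).
rng = np.random.default_rng(0)
n_counties, n_topics = 741, 2000
topics = rng.normal(size=(n_counties, n_topics))
mortality = rng.normal(size=n_counties)
covariates = rng.normal(size=(n_counties, 2))

alpha_bonferroni = 0.05 / n_topics  # .05/2,000 = .000025, as in the original


def residualize(y, X):
    """Residuals of y after regressing out X (plus an intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta


# Partial correlation of each topic with mortality, adjusting both sides
# for income and education, screened at the Bonferroni threshold.
mortality_res = residualize(mortality, covariates)
survivors = []
for j in range(n_topics):
    r, p = stats.pearsonr(residualize(topics[:, j], covariates), mortality_res)
    if p < alpha_bonferroni:
        survivors.append((j, r))
print(f"{len(survivors)} of {n_topics} topics pass p < {alpha_bonferroni:.6f}")
```

Run on pure noise, essentially nothing survives a threshold this strict; with 741 counties, a correlation of roughly |r| ≥ .16 is needed to clear it, which is why the reported topic correlations of .175 to .317 do.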

[Figure: “time spent with nature” word cloud]


If anyone insists on giving this finding a substantive interpretation…

This discovery would seem to pose a problem for Eichstaedt et al.’s (2015a, p. 166) claim to have shown the existence of “community-level psychological factors that are important for the cardiovascular health of communities.” Apparently the “positive” versions of these factors, while acting via some unspecified mechanism to make the community as a whole less susceptible to developing hardening of the arteries, simultaneously manage to make the same people more likely to commit suicide, and vice versa. More research into the possible risks of increased levels of self-harm would seem to be needed before any program to enhance these “community-level psychological factors” is undertaken.

But actually, no, we don’t want to do that.

Of course, there is no suggestion that the study of the language used on Twitter by the inhabitants of any particular county has any real predictive value for the local suicide rate. We believe that such associations are likely to be the entirely spurious results of imperfect measurements and chance factors, and that using Twitter data to predict which areas might be about to experience higher suicide rates would likely prove extremely inaccurate (and perhaps ethically questionable as well).


*When published, this preprint will serve as one of the articles bundled in Nick Brown’s PhD thesis, submitted to the University Medical Center Groningen. As Nick’s adviser, I was pleased to have a role that justified an authorship. I want to be clear, however, that my role was more like a midwife observing a natural birth than an OB-GYN having to induce labor. Nick can’t say what I can say: there is some real brilliance to this paper. The brilliance belongs to Nick, not me. And I mean brilliance in the restricted American sense, not the promiscuous British sense, as in “that is a brilliant dessert.”

I encourage you to dig in and enjoy. There are lots of treats and curious observations. Nick not only retrieved and analyzed the data, but also did some programming to capture the color depiction of counties and AHD rates. He identified some anomalies and then developed his own depiction with some corrections to the original. Truly amazing.

[Figure: differences between the original and corrected county maps]


Creating illusions of wondrous effects of yoga and meditation on health: A skeptic exposes tricks

The tour of the sausage factory is starting; here’s your brochure telling you what you’ll see.


A recent review has received a lot of attention, with claims being made on its basis that mind-body interventions have distinct molecular signatures that point to potentially dramatic health benefits for those who take up these practices.

What Is the Molecular Signature of Mind–Body Interventions? A Systematic Review of Gene Expression Changes Induced by Meditation and Related Practices.  Frontiers in Immunology. 2017;8.

Few who are tweeting about this review or its press coverage are likely to have read it or to understand it, if they read it. Most of the new agey coverage in social media does nothing more than echo or amplify the message of the review’s press release.  Lazy journalists and bloggers can simply pass on direct quotes from the lead author or even just the press release’s title, ‘Meditation and yoga can ‘reverse’ DNA reactions which cause stress, new study suggests’:

“These activities are leaving what we call a molecular signature in our cells, which reverses the effect that stress or anxiety would have on the body by changing how our genes are expressed.”


“Millions of people around the world already enjoy the health benefits of mind-body interventions like yoga or meditation, but what they perhaps don’t realise is that these benefits begin at a molecular level and can change the way our genetic code goes about its business.”

[The authors of this review actually identified some serious shortcomings to the studies they reviewed. I’ll be getting to some excellent points at the end of this post that run quite counter to the hype. But the lead author’s press release emphasized unwarranted positive conclusions about the health benefits of these practices. That is what is most popular in media coverage, especially from those who have stuff to sell.]

Interpretation of the press release and review authors’ claims requires going back to the original studies, which most enthusiasts are unlikely to do. If readers do go back, they will have trouble interpreting some of the deceptive claims that are made.

Yet, a lot is at stake. This review is being used to recommend mind-body interventions for people having or who are at risk of serious health problems. In particular, unfounded claims that yoga and mindfulness can increase the survival of cancer patients are sometimes hinted at, but occasionally made outright.

This blog post is written with the intent of protecting consumers from such false claims and providing tools so they can spot pseudoscience for themselves.

Discussion of the review in the media speaks broadly of alternative and complementary interventions. The coverage is aimed at inspiring confidence in this broad range of treatments and at encouraging people who are facing health crises to invest time and money in outright quackery. Seemingly benign recommendations for yoga, tai chi, and mindfulness (after all, what’s the harm?) often become the entry point to more dubious and expensive treatments that substitute for established treatments. Once they are drawn to centers for integrative health care for classes, cancer patients are likely to spend hundreds or even thousands of dollars on other products and services that are unlikely to benefit them. One study reported:

More than 72 oral or topical, nutritional, botanical, fungal and bacterial-based medicines were prescribed to the cohort during their first year of IO care…Costs ranged from $1594/year for early-stage breast cancer to $6200/year for stage 4 breast cancer patients. Of the total amount billed for IO care for 1 year for breast cancer patients, 21% was out-of-pocket.

Coming up, I will take a skeptical look at the six randomized trials that were highlighted by this review.  But in this post, I will provide you with some tools and insights so that you do not have to make such an effort in order to make an informed decision.

Like many of the other studies cited in the review, these randomized trials were quite small and underpowered. But I will focus on the six because they are as good as it gets. Randomized trials are considered a higher form of evidence than simple observational studies or case reports. [It is too bad the authors of the review don’t even highlight which studies are randomized trials; they are lumped with others as “longitudinal studies.”]

As a group, the six studies do not actually add any credibility to the claims that mind-body interventions (specifically yoga, tai chi, and mindfulness training or retreats) improve health by altering DNA. We can be no more confident with what the trials provide than we would be without them ever having been done.

I found the task of probing and interpreting the studies quite labor-intensive and ultimately unrewarding.

I had to get past poor reporting of what was actually done in the trials, to which patients, and with what results. My task often involved seeing through cover-ups, with authors exercising considerable flexibility in reporting which measures they actually collected and which analyses they attempted, before arriving at the best possible tale of the wondrous effects of these interventions.

Interpreting clinical trials should not be so hard, because they should be honestly and transparently reported, with a registered protocol that is adhered to. These reports of trials were sorely lacking. The full extent of the problems took some digging to uncover, but some things emerged before I got to the methods and results.

The introductions of these studies consistently exaggerated the strength of existing evidence for the effects of these interventions on health, even while somehow coming to the conclusion that this particular study was urgently needed and it might even be the “first ever”. The introductions to the six papers typically cross-referenced each other, without giving any indication of how poor quality the evidence was from the other papers. What a mutual admiration society these authors are.

One giveaway is how the introductions  referred to the biggest, most badass, comprehensive and well-done review, that of Goyal and colleagues.

That review clearly states that the evidence for the effects of mindfulness is poor quality because of the lack of comparisons with credible active treatments. The typical randomized trial of mindfulness involves a comparison with no-treatment, a waiting list, or patients remaining in routine care where the target problem is likely to be ignored.  If we depend on the bulk of the existing literature, we cannot rule out the likelihood that any apparent benefits of mindfulness are due to having more positive expectations, attention, and support over simply getting nothing.  Only a handful  of hundreds of trials of mindfulness include appropriate, active treatment comparison/control groups. The results of those studies are not encouraging.

One of the first things I do in probing the introduction of a study claiming health benefits for mindfulness is see how they deal with the Goyal et al review. Did the study cite it, and if so, how accurately? How did the authors deal with its message, which undermines claims of the uniqueness or specificity of any benefits to practicing mindfulness?

For yoga, we cannot yet rule out that it is no better than regular exercise (in groups or alone) combined with relaxing routines. The literature concerning tai chi is even smaller and of poorer quality, but there is the same need to show that practicing tai chi has any benefits over exercising in groups with comparable positive expectations and support.

Even more than mindfulness, yoga and tai chi attract a lot of pseudoscientific mumbo jumbo about integrating Eastern wisdom and Western science. We need to look past that and insist on evidence.

Like their introductions, the discussion sections of these articles are quite prone to exaggerating how strong and consistent the evidence is from existing studies. The discussion sections cherry-pick positive findings in the existing literature, sometimes recklessly distorting them. The authors then discuss how their own positively spun findings fit with what is already known, while minimizing or outright neglecting discussion of any of their negative findings. I was not surprised to see one trial of mindfulness for cancer patients obtain no effects on depressive symptoms or perceived stress, but then go on to claim that mindfulness might powerfully affect the expression of DNA.

If you want to dig into the details of these studies, the going can get rough and the yield for doing a lot of mental labor is low. For instance, these studies involved drawing blood and analyzing gene expression. Readers will inevitably encounter passages like:

In response to KKM treatment, 68 genes were found to be differentially expressed (19 up-regulated, 49 down-regulated) after adjusting for potentially confounded differences in sex, illness burden, and BMI. Up-regulated genes included immunoglobulin-related transcripts. Down-regulated transcripts included pro-inflammatory cytokines and activation-related immediate-early genes. Transcript origin analyses identified plasmacytoid dendritic cells and B lymphocytes as the primary cellular context of these transcriptional alterations (both p < .001). Promoter-based bioinformatic analysis implicated reduced NF-κB signaling and increased activity of IRF1 in structuring those effects (both p < .05).

Intimidated? Before you defer to the “experts” doing these studies, I will show you some things I noticed in the six studies and how you can debunk the relevance of these studies for promoting health and dealing with illness. Actually, I will show that even if these six studies got the results that the authors claimed (and they did not), the effects would at best be trivial and lost among the other things going on in patients’ lives.

Fortunately, there are lots of signs that you can dismiss such studies and go on to something more useful, if you know what to look for.

Some general rules:

  1. Don’t accept claims of efficacy/effectiveness based on underpowered randomized trials. Dismiss them. A reliable rule of thumb is to dismiss trials that have fewer than 35 patients in the smallest group. Over half the time, such studies will miss true moderate-sized effects, even when those effects are actually there.

Due to publication bias, most of the positive effects that are published from trials of this size will be false positives and won’t hold up in well-designed, larger trials.

When significant positive effects from such trials are reported in published papers, they have to be large to have reached significance. If not outright false, these effect sizes won’t be matched in larger trials. So, significant, positive effect sizes from small trials are likely to be false positives and exaggerated and probably won’t replicate. For that reason, we can consider small studies to be pilot or feasibility studies, but not as providing estimates of how large an effect size we should expect from a larger study. Investigators do it all the time, but they should not: They do power calculations estimating how many patients they need for a larger trial from results of such small studies. No, no, no!

Having spent decades examining clinical trials, I am generally comfortable dismissing effect sizes that come from trials with fewer than 35 patients in the smaller group. I agree with the suggestion that if two larger trials are available in a given literature, go with those and ignore the smaller studies. If there are not at least two larger studies, keep the jury out on whether there is a significant effect.

Applying the Rule of 35, five of the six trials can be dismissed and the sixth is ambiguous because of loss of patients to follow-up. If promoters of mind-body interventions want to convince us that they have beneficial effects on physical health by conducting trials like these, they have to do better. None of the individual trials should increase our confidence in their claims. Collectively, the trials collapse in a mess without providing a single credible estimate of effect size. This attests to the poor quality of evidence and disrespect for methodology that characterizes this literature. A quick power calculation, sketched below, shows where the Rule of 35 comes from.
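Here is a minimal check in Python, taking Cohen’s d = 0.5 as the “moderate” effect and assuming a simple two-arm comparison with a two-sided alpha of .05 (using statsmodels’ standard power routine):

```python
from statsmodels.stats.power import TTestIndPower

# Power of an independent-samples t-test to detect a true moderate effect
# (d = 0.5) at two-sided alpha = .05, for several per-group sample sizes.
power_calc = TTestIndPower()
for n in (20, 30, 35, 64):
    power = power_calc.power(effect_size=0.5, nobs1=n, alpha=0.05, ratio=1.0)
    print(f"n = {n:>2} per group: power = {power:.2f}")
```

With 30 or fewer per group, power is below .50, so a genuinely moderate effect is more likely to be missed than found; even at 35 per group the odds are only a little better than a coin flip, and roughly 64 per group are needed to reach the conventional .80.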

  2. Don’t be taken in by titles to peer-reviewed articles that are themselves an announcement that these interventions work. Titles may not be telling the truth.

What I found extraordinary is that five of the six randomized trials had a title indicating that a positive effect was found. I suspect that most people encountering the title will not actually go on to read the study. So, they will be left with the false impression that positive results were indeed obtained. It’s quite a clever trick to make the title of an article, by which most people will remember it, into a false advertisement for what was actually found.

For a start, we can simply remind ourselves that with these underpowered studies, investigators should not even be making claims about efficacy/effectiveness. So, one trick of the developing skeptic is to confirm that the claims being made in the title don’t fit with the size of the study. However, actually going to the results section, one can find other evidence of discrepancies between what was found and what is being claimed.

I think it’s a general rule of thumb that we should be wary of titles for reports of randomized trials that declare results. Even when what is claimed in the title fits with the actual results, it often creates the illusion of a greater consistency with what already exists in the literature. Furthermore, even when future studies inevitably fail to replicate what is claimed in the title, the false claim lives on, because failing to replicate key findings is almost never a condition for retracting a paper.

  3. Check the institutional affiliations of the authors. These six trials serve as a depressing reminder that we can’t rely on researchers’ institutional affiliations or federal grants to reassure us of the validity of their claims. These authors are not from Quack-Quack University, and they get funding for their research.

In all cases, the investigators had excellent university affiliations, mostly in California. Most studies were conducted with some form of funding, often federal grants. A quick check of Google would reveal that at least one of the authors on a study, usually more, had federal funding.

  4. Check the conflicts of interest, but don’t expect the declarations to be informative, and be skeptical of what you find. It is disappointing that a check of the conflict of interest statements for these articles would be unlikely to arouse the suspicion that the results being claimed might have been influenced by financial interests. One cannot readily see that the studies were generally done in settings promoting alternative, unproven treatments that would benefit from the publicity generated by the studies. One cannot see that some of the authors have lucrative book contracts and speaking tours that require making claims for dramatic effects of mind-body treatments, claims that could not possibly be supported by transparent reporting of the results of these studies. As we will see, one of the studies was actually conducted in collaboration with Deepak Chopra and with money from his institution. That would definitely raise flags in the skeptic community. But the dubious tie might be missed by patients and their families vulnerable to unwarranted claims and unrealistic expectations of what can be obtained outside of conventional medicine, with its chemotherapy, surgery, and pharmaceuticals.

Based on what I found probing these six trials, I can suggest some further rules of thumb. (1) Don’t assume, for articles about health effects of alternative treatments, that all relevant conflicts of interest are disclosed. Check the setting in which the study was conducted and whether an integrative [complementary and alternative, meaning mostly unproven] care setting was used for recruiting or running the trial. Not only would this represent potential bias on the part of the authors, it would represent selection bias in recruitment of patients and in their responsiveness to placebo effects consistent with the marketing themes of these settings. (2) Google authors and see if they have lucrative pop psychology book contracts, TED talks, or speaking gigs at positive psychology or complementary and alternative medicine gatherings. None of these lucrative activities are typically expected to be disclosed as conflicts of interest, but all require making strong claims that are not supported by available data. Such rewards are perverse incentives for authors to distort and exaggerate positive findings and to suppress negative findings in peer-reviewed reports of clinical trials. (3) Check and see if known quacks have prepared recruitment videos for the study, informing patients what will be found. (Seriously, I was tipped off to look, and I found exactly that.)

  5. Look for the usual suspects. A surprisingly small, tight, interconnected group is generating this research. You could look the authors up on Google or Google Scholar, or browse through my previous blog posts and see what I have said about them. As I will point out in my next blog post, one got withering criticism for her claim that drinking carbonated sodas, but not sweetened fruit drinks, shortened your telomeres, so that drinking soda was worse than smoking. My colleagues and I re-analyzed the data of another of the authors. Contrary to what he claimed, we found no evidence that pursuing meaning, rather than pleasure, in your life affected gene expression related to immune function. We also showed that substituting randomly generated data worked as well as what he got from blood samples in replicating his original results. I don’t think it is ad hominem to point out that both of these authors have a history of making implausible claims. It speaks to source credibility.
  6. Check and see if there is a trial registration for a study, but don’t stop there. You can quickly check with PubMed whether a report of a randomized trial is registered. Trial registration is intended to ensure that investigators commit themselves in advance to a primary outcome or maybe two; you can then check whether that is what they emphasized in their paper. You can also check to see if what is said in the report of the trial fits with what was promised in the protocol. Unfortunately, I could find only one of these trials that was registered. The trial registration was vague on what outcome variables would be assessed and did not mention the outcome emphasized in the published paper (!). The registration also said the sample would be larger than what was reported in the published study. When researchers have difficulty in recruitment, their study is often compromised in other ways. I’ll show how this study was compromised.

Well, it looks like applying these generally useful rules of thumb is not always so easy with these studies. I think the small sample size across all of the studies would be enough to decide this research has yet to yield meaningful results and certainly does not support the claims that are being made.

But readers who are motivated to put in the time to probe deeper will turn up strong signs of p-hacking and questionable research practices.

  7. Check the report of the randomized trial and see if you can find any declaration of one or two primary outcomes and a limited number of secondary outcomes. What you will find instead is that the studies always have more outcome variables than patients receiving these interventions. The opportunities for cherry-picking positive findings and discarding the rest are huge, especially because it is so hard to assess what data were collected but not reported. (The arithmetic behind this worry is sketched just after this list.)
  8. Check and see if you can find tables of unadjusted primary and secondary outcomes. Honest and transparent reporting involves giving readers a look at simple statistics so they can decide if results are meaningful. For instance, if effects on stress and depressive symptoms are claimed, are the results impressive and clinically relevant? In almost all cases, there is no peeking allowed. Instead, authors provide analyses and statistics with lots of adjustments made. They break lots of rules in doing so, especially with such a small sample. These authors are virtually assured of getting results to crow about.
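The arithmetic behind that cherry-picking worry is straightforward. If an intervention truly does nothing and a trial measures k outcomes, the chance of at least one nominally significant result at p < .05 is 1 - 0.95^k (treating the outcomes, for simplicity, as independent):

```python
# Chance of at least one "significant" result among k independent null outcomes.
for k in (1, 5, 10, 20, 50):
    print(f"{k:>2} outcomes: P(at least one p < .05) = {1 - 0.95 ** k:.2f}")
```

With 20 outcomes the chance is about .64; with 50 it is over .90. A trial with more outcome variables than patients is virtually guaranteed something to crow about.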

Famously, Joe Simmons and Leif Nelson hilariously published claims that briefly listening to the Beatles’ “When I’m 64” left students a year and a half younger than if they had been assigned to listen to “Kalimba.” Simmons and Nelson knew this was nonsense, but their intent was to show what researchers can do if they have free rein in how they analyze their data and what they report. They revealed the tricks they used, but those tricks were minor league and amateurish compared to what the authors of these trials consistently did in claiming that yoga, tai chi, and mindfulness modified expression of DNA.

Stay tuned for my next blog post, where I go through the six studies. But consider this if you or a loved one has to make an immediate decision about whether to plunge into the world of woo-woo unproven medicine in hopes of altering DNA expression: I will show that the authors of these studies did not get the results they claimed. But who should care if they did? The effects were laughably trivial. As the authors of the review about which I have been complaining noted:

One other problem to consider are the various environmental and lifestyle factors that may change gene expression in similar ways to MBIs [Mind-Body Interventions]. For example, similar differences can be observed when analyzing gene expression from peripheral blood mononuclear cells (PBMCs) after exercise. Although at first there is an increase in the expression of pro-inflammatory genes due to regeneration of muscles after exercise, the long-term effects show a decrease in the expression of pro-inflammatory genes (55). In fact, 44% of interventions in this systematic review included a physical component, thus making it very difficult, if not impossible, to discern between the effects of MBIs from the effects of exercise. Similarly, food can contribute to inflammation. Diets rich in saturated fats are associated with pro-inflammatory gene expression profile, which is commonly observed in obese people (56). On the other hand, consuming some foods might reduce inflammatory gene expression, e.g., drinking 1 l of blueberry and grape juice daily for 4 weeks changes the expression of the genes related to apoptosis, immune response, cell adhesion, and lipid metabolism (57). Similarly, a diet rich in vegetables, fruits, fish, and unsaturated fats is associated with anti-inflammatory gene profile, while the opposite has been found for Western diet consisting of saturated fats, sugars, and refined food products (58). Similar changes have been observed in older adults after just one Mediterranean diet meal (59) or in healthy adults after consuming 250 ml of red wine (60) or 50 ml of olive oil (61). However, in spite of this literature, only two of the studies we reviewed tested if the MBIs had any influence on lifestyle (e.g., sleep, diet, and exercise) that may have explained gene expression changes.

How about taking tango lessons instead? You would at least learn dance steps, get exercise, and decrease any social isolation. And so what if there were no benefits beyond those of taking up these other activities?



Calling out pseudoscience, radically changing the conversation about Amy Cuddy’s power posing paper

Part 1: Reviewed as the clinical trial that it is, the power posing paper should never have been published.

Has too much already been written about Amy Cuddy’s power pose paper? The conversation should not be stopped until its focus shifts and we change our ways of talking about psychological science.

The dominant narrative is now that a junior scientist published an influential paper on power posing and was subject to harassment and shaming by critics, pointing to the need for greater civility in scientific discourse.

Attention has shifted away from the scientific quality of the paper and the dubious products the paper has been used to promote, and onto the behavior of its critics.

Amy Cuddy and powerful allies are given forums to attack and vilify critics, accusing them of damaging the environment in which science is done and discouraging prospective early career investigators from entering the field.

Meanwhile, Amy Cuddy commands large speaking fees and has a top-selling book claiming the original paper provides strong science for simple behavioral manipulations altering mind-body relations and producing socially significant behavior.

This misrepresentation of psychological science does potential harm to consumers and the reputation of psychology among lay persons.

This blog post is intended to restart the conversation with a reconsideration of the original paper as a clinical and health psychology randomized trial (RCT) and, on that basis, identifying the kinds of inferences that are warranted from it.

In the first of a two post series, I argue that:

The original power pose article in Psychological Science should never have been published.

-Basically, we have a therapeutic analog intervention delivered in two 1-minute manipulations by unblinded experimenters who had flexibility in what they did, what they communicated to participants, and which data they chose to analyze and how.

-It’s unrealistic to expect that two 1-minute behavioral manipulations would have robust and reliable effects on salivary cortisol or testosterone 17 minutes later.

-It’s absurd to assume that the hormones mediated changes in behavior in this context.

-If Amy Cuddy retreats to the idea that she is simply manipulating “felt power,” we are solidly in the realm of trivial nonspecific and placebo effects.

The original power posing paper

Carney DR, Cuddy AJ, Yap AJ. Power posing: Brief nonverbal displays affect neuroendocrine levels and risk tolerance. Psychological Science. 2010 Oct 1;21(10):1363-8.

The Psychological Science article can be construed as a brief mind-body intervention consisting of two 1-minute behavioral manipulations. Central to the attention that the paper attracted is the argument that this manipulation affected psychological state and social performance via its effects on the neuroendocrine system.

The original study is, in effect, a disguised randomized clinical trial (RCT) of a biobehavioral intervention. Once this is recognized, a host of standards come into play for reporting this study and interpreting the results.


All major journals and publishers, including the Association for Psychological Science, have adopted the Consolidated Standards of Reporting Trials (CONSORT). Any submission of a manuscript reporting a clinical trial is required to be accompanied by a checklist indicating where the article reports particular details of how the trial was conducted. Item 1 on the checklist specifies that both the title and abstract indicate the study was a randomized trial. This is important and intended to aid readers in evaluating the study, but also for the study to be picked up in systematic searches for reviews that depend on screening of titles and abstracts.

I can find no evidence that Psychological Science adheres to CONSORT. For instance, my colleagues and I provided a detailed critique of a widely promoted study of loving-kindness meditation that was published in Psychological Science the same year as Cuddy’s power pose study. We noted that it was actually a poorly reported null trial with switched outcomes. With that recognition, we went on to identify serious conceptual, methodological and statistical problems. After overcoming considerable resistance, we were able  to publish a muted version of our critique. Apparently reviewers of the original paper had failed to evaluate it in terms of it being an RCT.

The submission of the completed CONSORT checklist has become routine in most journals considering manuscripts for studies of clinical and health psychology interventions. Yet, additional CONSORT requirements that developed later about what should be included in abstracts are largely being ignored.

It would be unfair to single out Psychological Science and the Cuddy article for noncompliance to CONSORT for abstracts. However, the checklist can be a useful frame of reference for noting just how woefully inadequate the abstract was as a report of a scientific study.

CONSORT for abstracts

Hopewell S, Clarke M, Moher D, Wager E, Middleton P, Altman DG, Schulz KF, CONSORT Group. CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration. PLOS Medicine. 2008 Jan 22;5(1):e20.

Journal and conference abstracts should contain sufficient information about the trial to serve as an accurate record of its conduct and findings, providing optimal information about the trial within the space constraints of the abstract format. A properly constructed and well-written abstract should also help individuals to assess quickly the validity and applicability of the findings and, in the case of abstracts of journal articles, aid the retrieval of reports from electronic databases.

Even if CONSORT for abstracts did not exist, we could argue that readers, starting with the editor and reviewers, were faced with an abstract making extraordinary claims that required better substantiation. A lack of basic details disarmed them from evaluating these claims.

In effect, the abstract reduces the study to an experimercial for products about to be marketed in corporate talks and workshops, but let’s persist in evaluating it as the abstract of a scientific study.

Humans and other animals express power through open, expansive postures, and they express powerlessness through closed, contractive postures. But can these postures actually cause power? The results of this study confirmed our prediction that posing in high-power nonverbal displays (as opposed to low-power nonverbal displays) would cause neuroendocrine and behavioral changes for both male and female participants: High-power posers experienced elevations in testosterone, decreases in cortisol, and increased feelings of power and tolerance for risk; low-power posers exhibited the opposite pattern. In short, posing in displays of power caused advantaged and adaptive psychological, physiological, and behavioral changes, and these findings suggest that embodiment extends beyond mere thinking and feeling, to physiology and subsequent behavioral choices. That a person can, by assuming two simple 1-min poses, embody power and instantly become more powerful has real-world, actionable implications.

I don’t believe I have ever encountered in an abstract the extravagant claims with which this abstract concludes. But readers are not provided any basis for evaluating the claim until the Methods section. Undoubtedly, many holding opinions about the paper did not read that far.


Forty-two participants (26 females and 16 males) were randomly assigned to the high-power-pose or low-power-pose condition.

Testosterone levels were in the normal range at both Time 1 (M = 60.30 pg/ml, SD = 49.58) and Time 2 (M = 57.40 pg/ml, SD = 43.25). As would be suggested by appropriately taken and assayed samples (Schultheiss & Stanton, 2009), men were higher than women on testosterone at both Time 1, F(1, 41) = 17.40, p < .001, r = .55, and Time 2, F(1, 41) = 22.55, p < .001, r = .60. To control for sex differences in testosterone, we used participant’s sex as a covariate in all analyses. All hormone analyses examined changes in hormones observed at Time 2, controlling for Time 1. Analyses with cortisol controlled for testosterone, and vice versa.2

Too small a study to provide an effect size

Hold on! First, only 42 participants (26 females and 16 males) would readily be recognized as insufficient for an RCT, particularly in an area of research without past RCTs.

After decades of witnessing the accumulation of strong effect sizes from underpowered studies, many of us have reacted by requiring 35 participants per group as the minimum acceptable level for a generalizable effect size. Actually, that could be an overly liberal criterion. Why?

Many RCTs are underpowered, yet a lack of enforcement of preregistration allows positive results by redefining the primary outcomes after results are known. A psychotherapy trial with 30 or fewer patients in the smallest cell has less than a 50% probability of detecting a moderate-sized significant effect, even if it is present (Coyne, Thombs, & Hagedoorn, 2010). Yet an examination of the studies mustered for treatments deemed evidence-supported by APA Division 12 indicates that many studies were too underpowered to be reliably counted as evidence of efficacy, but were included without comment about this problem. Taking an overview, it is striking the extent to which the literature continues to depend on small, methodologically flawed RCTs conducted by investigators with strong allegiances to one of the treatments being evaluated. Yet which treatment is preferred by investigators is a better predictor of the outcome of a trial than the specific treatment being evaluated (Luborsky et al., 2006).

Earlier, my colleagues and I had argued for the non-cumulative nature of evidence from small RCTs:

Kraemer, Gardner, Brooks, and Yesavage (1998) propose excluding small, underpowered studies from meta-analyses. The risk of including studies with inadequate sample size is not limited to clinical and pragmatic decisions being made on the basis of trials that cannot demonstrate effectiveness when it is indeed present. Rather, Kraemer et al. demonstrate that inclusion of small, underpowered trials in meta-analyses produces gross overestimates of effect size due to substantial, but unquantifiable confirmatory publication bias from non-representative small trials. Without being able to estimate the size or extent of such biases, it is impossible to control for them. Other authorities voice support for including small trials, but generally limit their argument to trials that are otherwise methodologically adequate (Sackett & Cook, 1993; Schulz & Grimes, 2005). Small trials are particularly susceptible to common methodological problems…such as lack of baseline equivalence of groups; undue influence of outliers on results; selective attrition and lack of intent-to-treat analyses; investigators being unblinded to patient allotment; and not having a pre-determined stopping point so investigators are able to stop a trial when a significant effect is present.
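Kraemer et al.’s point about small trials is easy to demonstrate with a toy simulation. The numbers below are illustrative assumptions only: a modest true effect (d = 0.3), trials with 20 patients per arm, and a literature that “publishes” only the trials that come out positive and statistically significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d_true, n_per_arm = 0.3, 20
published = []
for _ in range(10_000):
    control = rng.normal(0.0, 1.0, n_per_arm)
    treated = rng.normal(d_true, 1.0, n_per_arm)
    _, p = stats.ttest_ind(treated, control)
    pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
    d_observed = (treated.mean() - control.mean()) / pooled_sd
    # Only positive, significant trials get written up and published.
    if p < 0.05 and d_observed > 0:
        published.append(d_observed)
print(f"true d = {d_true}; mean published d = {np.mean(published):.2f}; "
      f"only {len(published) / 10_000:.0%} of trials 'published'")
```

Under these assumptions, the published trials report an average effect roughly two to three times the true one, and a meta-analysis of such a literature inherits that bias, just as Kraemer et al. warn.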

In the power posing paper, sex was controlled for in all analyses because a peek at the data revealed baseline sex differences in testosterone dwarfing any other differences. What do we make of investigators conducting a study that depends on testosterone mediating a behavioral manipulation who did not anticipate large baseline sex differences in testosterone?

In a PubPeer comment leading up to this post, I noted:

We are then told “men were higher than women on testosterone at both Time 1, F(1, 41) = 17.40, p < .001, r = .55, and Time 2, F(1, 41) = 22.55, p < .001, r = .60. To control for sex differences in testosterone, we used participant’s sex as a covariate in all analyses. All hormone analyses examined changes in hormones observed at Time 2, controlling for Time 1. Analyses with cortisol controlled for testosterone, and vice versa.”

The findings alluded to in the abstract should be recognizable as weird and uninterpretable. Most basically, how could the 16 males be distributed across the two groups so that the authors could confidently say that differences held for both males and females? Especially when all analyses control for sex? Sex is highly correlated with testosterone, so an analysis that controlled for both variables, sex and testosterone, would probably not generalize to testosterone without such controls.

We are never given the basic statistics in the paper to independently assess what the authors are doing, not even the correlation between cortisol and testosterone, only differences in Time 2 cortisol controlling for Time 1 cortisol, Time 1 testosterone, and gender. These multivariate statistics are not very generalizable in a sample of 42 participants distributed across two groups, and certainly not for the 26 females and 16 males taken separately.
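To make the problem with the 16 males concrete: under simple random assignment of 42 participants into two groups of 21, the number of males landing in a given group follows a hypergeometric distribution, and lopsided sex splits are not rare. A quick check (pure probability, no data required):

```python
from scipy import stats

# Number of the 16 males who land in one group of 21, out of 42 participants.
males_in_group = stats.hypergeom(M=42, n=16, N=21)
for k in range(4, 13):
    print(f"{k:>2} males in the group: P = {males_in_group.pmf(k):.3f}")
# Chance of a split at least as lopsided as 11 males vs. 5 (either direction):
print(f"P(11-5 split or worse) = {2 * males_in_group.cdf(5):.2f}")
```

Roughly one such trial in eight would end up with an 11-to-5 split or worse, and with so few males, any claim that effects “held for both males and females” rests on tiny, possibly quite unbalanced subsamples.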

The behavioral manipulation

The original paper reports:

Participants’ bodies were posed by an experimenter into high-power or low-power poses. Each participant held two poses for 1 min each. Participants’ risk taking was measured with a gambling task; feelings of power were measured with self-reports. Saliva samples, which were used to test cortisol and testosterone levels, were taken before and approximately 17 min after the power-pose manipulation.

And then elaborates:

To configure the test participants into the poses, the experimenter placed an electrocardiography lead on the back of each participant’s calf and underbelly of the left arm and explained, “To test accuracy of physiological responses as a function of sensor placement relative to your heart, you are being put into a certain physical position.” The experimenter then manually configured participants’ bodies by lightly touching their arms and legs. As needed, the experimenter provided verbal instructions (e.g., “Keep your feet above heart level by putting them on the desk in front of you”). After manually configuring participants’ bodies into the two poses, the experimenter left the room. Participants were videotaped; all participants correctly made and held either two high-power or two low-power poses for 1 min each. While making and holding the poses, participants completed a filler task that consisted of viewing and forming impressions of nine faces.

The behavioral task and subjective self-report assessment

Measure of risk taking and powerful feelings. After they finished posing, participants were presented with the gambling task. They were endowed with $2 and told they could keep the money—the safe bet—or roll a die and risk losing the $2 for a payoff of $4 (a risky but rational bet; odds of winning were 50/50). Participants indicated how “powerful” and “in charge” they felt on a scale from 1 (not at all) to 4 (a lot).

An imagined bewildered review from someone accustomed to evaluating clinical trials

Although the authors don’t seem to know what they’re doing, we have an underpowered therapy analogue study with extraordinary claims. It’s unconvincing that two 1-minute behavioral manipulations would change subsequent psychological states and behavior in any way with extra-laboratory implications.

The manipulation poses a puzzle to research participants, challenging them to figure out what is being asked of them. The $2 gambling task presumably is meant to simulate effects on real-world behavior. But the low stakes could mean that participants believed the task evaluated whether they “got” the purpose of the intervention and behaved accordingly. Within that perspective, the unvalidated subjective self-report rating scale would serve as a clue to the intentions of the experimenter and an opportunity to show the participants were smart. The  manipulation of putting participants  into a low power pose is even more unconvincing as a contrasting active intervention or a control condition.  Claims that this manipulation did anything but communicate experimenter expectancies are even less credible.

This is a very weak form of evidence: a therapy analogue study with a brief, low-intensity behavioral manipulation followed by assessments of outcomes that might just inform participants of what they needed to do to look smart (i.e., demand characteristics). Add in that the experimenters were unblinded and undoubtedly had flexibility in how they delivered the intervention and what they said to participants. As a grossly underpowered trial, the study cannot make a contribution to the literature, and it certainly cannot contribute a credible effect size.

Furthermore, if the authors had even a basic understanding of gender differences in social status or sex differences in testosterone, they would have stratified the study with respect to participant gender, not attempted to obtain control by post hoc statistical manipulation.

I could comment on signs of p-hacking and widespread signs of inappropriate naming, use, and interpretation of statistics, but why bother? There are no vital signs of a publishable paper here.

Is power posing salvaged by fashionable hormonal measures?

Perhaps the skepticism of the editor and reviewers was overcome by the introduction of mind-body explanations of what some salivary measures supposedly showed. Otherwise, we would be left with a single subjective self-report measure and a behavioral task susceptible to demand characteristics and nonspecific effects.

We recognize that the free availability of powerful statistical packages risks people using them without any idea of the appropriateness of their use or interpretation. The same observation should be made of the ready availability of means of collecting spit samples from research participants to be sent off to outside laboratories for biochemical analysis.

The clinical health psychology literature is increasingly filled with studies incorporating easily collected saliva samples intended to establish that psychological interventions influence mind-body relations. Such measures have become especially common in attempts to demonstrate that mindfulness meditation and even tai chi can have beneficial effects on physical health and even cancer outcomes.

Such measures are often inaccurately described as “biomarkers,” rather than merely as biological measurements, and their inclusion seldom teaches us anything that is generalizable within participants or across studies.

Let’s start with salivary-based cortisol measures.

A comprehensive review  suggests that:

  • A single measurement on a participant  or a pre-post pair of assessments would not be informative.
  • Single measurements are unreliable and large intra-and inter-individual differences not attributable to intervention can be in play.
  • Minor variations in experimental procedures can have large, unwanted effects.
  • The current standard is the cortisol awakening response and the diurnal slope assessed over more than one day, which would not make sense for the effects of two 1-minute behavioral manipulations.
  • Even with sophisticated measurement strategies there is low agreement across and even within studies and low agreement with behavioral and self-report data.
  • The idea that collecting saliva samples would serve the function the investigators intended is an unscientific but attractive illusion.

Another relevant comprehensive theoretical review and synthesis of cortisol reactivity was available at the time the power pose study was planned. The article identifies no basis for anticipating that experimenters putting participants into 1-minute expansive poses would lower cortisol. And certainly no basis for assuming that putting participants into a 1-minute slumped position would raise cortisol. Or what such findings could possibly mean.

But we are clutching at straws. The authors’ interpretations of their hormonal data depend on bizarre post hoc decisions about how to analyze their data in a small sample in which participant sex is treated in incomprehensible fashion. The process of trying to explain spurious results risks giving the results a credibility that the authors have not earned for them. And don’t even try to claim we are getting signals of hormonal mediation from this study.

Another system failure: The incumbent advantage given to a paper that should not have been published.

Even when publication is based on inadequate editorial oversight and review, any likelihood of correction is diminished once the published results have been blessed as “peer reviewed” and accorded an incumbent advantage over whatever follows.

A succession of editors has protected the power pose paper from post-publication peer review, which has been relegated to other journals and social media, including PubPeer and blogs.

Soon after publication of the power pose paper, a critique was submitted to Psychological Science, but it was desk rejected. The editor informally communicated to the author that the critique read like a review and that the original article had already been peer reviewed.

The critique by Steven J. Stanton nonetheless eventually appeared in Frontiers in Behavioral Neuroscience and is worth a read.

Stanton took seriously the science being invoked in the claims of the power pose paper.

A sampling:

Carney et al. (2010) collapsed over gender in all testosterone analyses. Testosterone conforms to a bimodal distribution when including both genders (see Figure 13; Sapienza et al., 2009). Raw testosterone cannot be considered a normally distributed dependent or independent variable when including both genders. Thus, Carney et al. (2010) violated a basic assumption of the statistical analyses that they reported, because they used raw testosterone from pre- and post-power posing as independent and dependent variables, respectively, with all subjects (male and female) included.
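Stanton’s bimodality point is easy to verify in simulation. Here is a minimal sketch with hypothetical values (not Carney et al.’s data; the means, spreads, and sample sizes are assumptions chosen only for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical salivary testosterone (pg/mL): females cluster well below
# males, so pooling the sexes yields a bimodal mixture, not one normal
# distribution.
female = rng.normal(loc=60, scale=15, size=21)
male = rng.normal(loc=120, scale=25, size=21)
pooled = np.concatenate([female, male])

# Each sex alone tends to pass a normality test; the pooled sample tends
# to fail it, which is Stanton's point about analyzing raw testosterone
# from a mixed-sex sample.
print(stats.shapiro(female))   # typically p > .05
print(stats.shapiro(male))     # typically p > .05
print(stats.shapiro(pooled))   # typically p < .05 -> non-normal
```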


Mean cortisol levels for all participants were reported as 0.16 ng/mL pre-posing and 0.12 ng/mL post-posing, thus showing that for all participants there was an average decrease of 0.04 ng/mL from pre- to post-posing, regardless of condition. Yet, Figure 4 of Carney et al. (2010) shows that low-power posers had mean cortisol increases of roughly 0.025 ng/mL and high-power posers had mean cortisol decreases of roughly 0.03 ng/mL. It is unclear given the data in Figure 4 how the overall cortisol change for all participants could have been a decrease of 0.04 ng/mL.
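The arithmetic behind Stanton’s second point can be checked on the back of an envelope. Assuming roughly equal numbers in the two conditions (the exact ns barely matter), the grand mean change must sit between the two condition means:

```python
# Approximate per-condition changes read off Figure 4 of Carney et al.
low_power_change = +0.025    # ng/mL
high_power_change = -0.030   # ng/mL

# With equal ns, the overall change is the simple average of the two,
# which is nowhere near the reported overall decrease of 0.04 ng/mL.
grand_mean_change = (low_power_change + high_power_change) / 2
print(grand_mean_change)     # -0.0025 ng/mL
```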

Another editor of Psychological Science received a critical comment from Marcus Crede and Leigh A. Phillips. After the first round of reviews, Crede and Phillips removed references to changes between the published power pose paper and earlier drafts that they had received from the first author, Dana Carney. However, Crede and Phillips withdrew their critique when asked to respond to a review by Amy Cuddy in a second resubmission.

The critique is now forthcoming in Social Psychological and Personality Science:

Revisiting the Power Pose Effect: How Robust Are the Results Reported by Carney, Cuddy and Yap (2010) to Data Analytic Decisions

The article investigates the effects of the data analytic choices that enabled p-hacking in the original paper. An excerpt from the abstract:

In this paper we use multiverse analysis to examine whether the findings reported in the original paper by Carney, Cuddy, and Yap (2010) are robust to plausible alternative data analytic specifications: outlier identification strategy; the specification of the dependent variable; and the use of control variables. Our findings indicate that the inferences regarding the presence and size of an effect on testosterone and cortisol are highly sensitive to data analytic specifications. We encourage researchers to routinely explore the influence of data analytic choices on statistical inferences and also encourage editors and reviewers to require explicit examinations of the influence of alternative data analytic specifications on the inferences that are drawn from data.
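For readers who have not seen the method, a multiverse analysis simply re-runs the same comparison under every defensible combination of analytic choices and asks whether the conclusion survives. Below is a toy sketch of the idea using simulated data and the three dimensions of choice Crede and Phillips examined (outlier rule, dependent variable specification, use of control variables); it is an illustration of the technique, not their code:

```python
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 42
group = np.repeat([0, 1], n // 2)         # 0 = low power, 1 = high power
pre = rng.normal(100, 20, n)              # made-up hormone values
post = pre + rng.normal(0, 10, n)         # no true effect built in

results = {}
# The analytic "multiverse": every combination of plausible choices.
for outlier_sd, dv, covary_pre in itertools.product(
        [None, 2.0, 3.0],                 # outlier rule: none, ±2 SD, ±3 SD
        ["change", "post"],               # dependent variable specification
        [False, True]):                   # adjust for baseline or not
    y = post - pre if dv == "change" else post
    keep = np.ones(n, dtype=bool)
    if outlier_sd is not None:
        keep = np.abs(y - y.mean()) < outlier_sd * y.std()
    yk, gk, prek = y[keep], group[keep], pre[keep]
    if covary_pre and dv == "post":
        b = np.polyfit(prek, yk, 1)       # crude adjustment: residualize on pre
        yk = yk - np.polyval(b, prek)
    t, p = stats.ttest_ind(yk[gk == 1], yk[gk == 0])
    results[(outlier_sd, dv, covary_pre)] = p

# With no true effect, p-values still scatter widely across specifications;
# a robust effect would stay significant in (nearly) every universe.
for spec, p in results.items():
    print(spec, round(p, 3))
```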

Dana Carney, the first author of the paper, has now posted an explanation of why she no longer believes the originally reported findings are genuine and why “the evidence against the existence of power poses is undeniable.” She discloses a number of important confounds and important “researcher degrees of freedom” in the analyses reported in the published paper.

Coming Up Next

A different view of Amy Cuddy’s TED talk in terms of its selling of pseudoscience to consumers and its acknowledgment of a strong debt to Cuddy’s adviser Susan Fiske.

A disclosure of some of the financial interests that distort discussion of the scientific flaws of the power pose.

How the reflexive response of the replicationados inadvertently reinforced the illusion that the original power pose study provided meaningful effect sizes.

How Amy Cuddy and her allies marshalled the resources of the Association for Psychological Science to vilify and intimidate critics of bad science and of the exploitation of consumers by psychological pseudoscience.

How journalists played into this vilification.

What needs to be done to avoid a future fiasco for psychology like the power pose phenomenon and protect reformers of the dissemination of science.

Note: Time to reiterate that all opinions expressed here are solely those of Coyne of the Realm and not necessarily of PLOS blogs, PLOS One or his other affiliations.

Jane Brody promoting the pseudoscience of Barbara Fredrickson in the New York Times

Journalists’ coverage of positive psychology and health is often shabby, even in prestigious outlets like The New York Times.

Jane Brody’s latest installment of the benefits of being positive on health relied heavily on the work of Barbara Fredrickson that my colleagues and I have thoroughly debunked.

All of us need to recognize that studies of the effects of positive psychology interventions are often disguised randomized controlled trials.

With that insight, we need to evaluate this research in terms of reporting standards like CONSORT and declarations of conflicts of interest.

We need to be more skeptical about the ability of small changes in behavior to profoundly improve health.

When in doubt, assume that much of what we read in the media about positivity and health is false or at least exaggerated.

Jane Brody starts her article in The New York Times by describing how most mornings she is “grinning from ear to ear, uplifted not just by my own workout but even more so” by her interaction with toddlers on the way home from where she swims. When I read Brody’s “Turning Negative Thinkers Into Positive Ones,” I was not left grinning ear to ear. I was left profoundly bummed.

I thought real hard about what was so unsettling about Brody’s article. I now have some clarity.

I don’t mind suffering even pathologically cheerful people in the morning. But I do get bothered when they serve up pseudoscience as the real thing.

I had expected to be served up Brody’s usual recipe of positive psychology pseudoscience concocted to coerce readers into heeding her Barnum advice about how they should lead their lives. “Smile or die!” Apologies to my friend Barbara Ehrenreich for putting the retitling of her book outside North America to use here. I invoke the phrase because Jane Brody makes the case that unless we do what she says, we risk hurting our health and shortening our lives. So we better listen up.

What bummed me most this time was that Brody was drawing on the pseudoscience of Barbara Fredrickson that my colleagues and I have worked so hard to debunk. We took the trouble of obtaining data sets for two of her key papers for reanalysis. We were dismayed by the quality of the data. To start with, we uncovered carelessness at the level of data entry that undermined her claims. But her basic analyses and interpretations did not hold up either.

Fredrickson publishes exaggerated claims about dramatic benefits of simple positive psychology exercises. She is very effective in blocking or muting the publication of criticism and getting on with hawking her wares. My colleagues and I have talked to others who similarly met considerable resistance from editors in getting detailed critiques and re-analyses published. Fredrickson is also aided by uncritical journalists like Jane Brody, who promote her weak and inconsistent evidence as strong stuff. That sells a lot of positive psychology merchandise, like self-help books and workshops, to needy and vulnerable people.

If it is taken seriously, Fredrickson’s research concerns the health effects of behavioral intervention. Yet her findings are presented in a way that does not readily allow their integration with the rest of the health psychology literature. It would be difficult, for instance, to integrate Fredrickson’s randomized trials of loving-kindness meditation with other research because she makes it almost impossible to isolate effect sizes in a way that would allow them to be combined with other studies in a meta-analysis. Moreover, Fredrickson has repeatedly published contradictory claims from the same data set without acknowledging the duplicate publication. [Please read on. I will document all of these claims before the post ends.]

The need of self-help gurus to generate support for their dramatic claims in lucrative positive psychology self-help products is never acknowledged as a conflict of interest.  It should be.

Just imagine if someone had a contract based on a book prospectus promising that the claims of their last pop psychology book would be surpassed. Such books inevitably paint life too simply, with simple changes in behavior having profound and lasting effects unlike anything obtained in the randomized trials of clinical and health psychology. Readers ought to be informed that the pressure to meet the demands of a lucrative book contract could generate a strong confirmation bias. Caveat emptor, caveat auditor, but how about at least informing readers and letting them decide whether following the money influences their interpretation of what they read?

Psychology journals almost never require disclosures of conflicts of interest of this nature. I am campaigning to make that practice routine, with nondisclosure of such financial benefits treated as tantamount to scientific misconduct. I am calling for readers to take to social media when these disclosures do not appear in scientific journals, where they should be featured prominently, and for editors to be held responsible for non-enforcement. I can cite Fredrickson’s work as a case in point, but there are many other examples, inside and outside of positive psychology.

Back to Jane Brody’s exaggerated claims for Fredrickson’s work.

I lived for half a century with a man who suffered from periodic bouts of depression, so I understand how challenging negativism can be. I wish I had known years ago about the work Barbara Fredrickson, a psychologist at the University of North Carolina, has done on fostering positive emotions, in particular her theory that accumulating “micro-moments of positivity,” like my daily interaction with children, can, over time, result in greater overall well-being.

The research that Dr. Fredrickson and others have done demonstrates that the extent to which we can generate positive emotions from even everyday activities can determine who flourishes and who doesn’t. More than a sudden bonanza of good fortune, repeated brief moments of positive feelings can provide a buffer against stress and depression and foster both physical and mental health, their studies show.

“Research…demonstrates” (?). Brody is feeding stupid-making pablum to readers. Fredrickson’s kind of research may produce evidence one way or the other, but it is too strong a claim, an outright illusion, to even begin suggesting that it “demonstrates” (proves) what follows in this passage.

Where, outside of tabloids and self-help products, do we find immodest claims that one or a few poor-quality studies “demonstrate” anything?

Negative feelings activate a region of the brain called the amygdala, which is involved in processing fear and anxiety and other emotions. Dr. Richard J. Davidson, a neuroscientist and founder of the Center for Healthy Minds at the University of Wisconsin — Madison, has shown that people in whom the amygdala recovers slowly from a threat are at greater risk for a variety of health problems than those in whom it recovers quickly.

Both he and Dr. Fredrickson and their colleagues have demonstrated that the brain is “plastic,” or capable of generating new cells and pathways, and it is possible to train the circuitry in the brain to promote more positive responses. That is, a person can learn to be more positive by practicing certain skills that foster positivity.

We are knee-deep in neuro-nonsense. Try asking a serious neuroscientist about the claims that this duo have “demonstrated that the brain is ‘plastic,’” or that practicing certain positivity skills changes the brain with the health benefits they claim via Brody, or that they are studying “amygdala recovery” associated with reduced health risk.

For example, Dr. Fredrickson’s team found that six weeks of training in a form of meditation focused on compassion and kindness resulted in an increase in positive emotions and social connectedness and improved function of one of the main nerves that helps to control heart rate. The result is a more variable heart rate that, she said in an interview, is associated with objective health benefits like better control of blood glucose, less inflammation and faster recovery from a heart attack.

I will dissect this key claim about loving-kindness meditation and vagal tone/heart rate variability shortly.

Dr. Davidson’s team showed that as little as two weeks’ training in compassion and kindness meditation generated changes in brain circuitry linked to an increase in positive social behaviors like generosity.

We will save discussing Richard Davidson for another time. But really, Jane, just two weeks to better health? Where is the generosity center in brain circuitry? I dare you to ask a serious neuroscientist and embarrass yourself.

“The results suggest that taking time to learn the skills to self-generate positive emotions can help us become healthier, more social, more resilient versions of ourselves,” Dr. Fredrickson reported in the National Institutes of Health monthly newsletter in 2015.

In other words, Dr. Davidson said, “well-being can be considered a life skill. If you practice, you can actually get better at it.” By learning and regularly practicing skills that promote positive emotions, you can become a happier and healthier person. Thus, there is hope for people like my friend’s parents should they choose to take steps to develop and reinforce positivity.

In her newest book, “Love 2.0,” Dr. Fredrickson reports that “shared positivity — having two people caught up in the same emotion — may have even a greater impact on health than something positive experienced by oneself.” Consider watching a funny play or movie or TV show with a friend of similar tastes, or sharing good news, a joke or amusing incidents with others. Dr. Fredrickson also teaches “loving-kindness meditation” focused on directing good-hearted wishes to others. This can result in people “feeling more in tune with other people at the end of the day,” she said.

Brody ends with eight things Fredrickson and others endorse to foster positive emotions. (Why only eight recommendations? Why not come up with ten and make them commandments?) These include “Do good things for other people” and “Appreciate the world around you.” Okay, but do Fredrickson and Davidson really show that engaging in these activities has immediate and dramatic effects on our health? I have examined their research and I doubt it. I think the larger problem, though, is the suggestion that physically ill people facing shortened lives risk being blamed for being bad people: they obviously did not do these eight things, or else they would be healthy.

If Brody were selling herbal supplements or coffee enemas, we would readily label the quackery. We should do the same for advice about psychological practices that are promised to transform lives.

Brody’s sloppy links to support her claims: Love 2.0

Journalists who talk of “science” and respect their readers will provide links to their actual sources in the peer-reviewed scientific literature. That way, readers who are motivated can independently review the evidence. Especially in an outlet as prestigious as The New York Times.

Jane Brody is outright promiscuous in the links that she provides, which are often to secondary or tertiary sources. The first link provided for her discussion of Fredrickson’s Love 2.0 is actually to a somewhat negative review of the book.

Fredrickson builds her case by expanding on research that shows how sharing a strong bond with another person alters our brain chemistry. She describes a study in which best friends’ brains nearly synchronize when exchanging stories, even to the point where the listener can anticipate what the storyteller will say next. Fredrickson takes the findings a step further, concluding that having positive feelings toward someone, even a stranger, can elicit similar neural bonding.

This leap, however, is not supported by the study and fails to bolster her argument. In fact, most of the evidence she uses to support her theory of love falls flat. She leans heavily on subjective reports of people who feel more connected with others after engaging in mental exercises such as meditation, rather than on more objective studies that measure brain activity associated with love.

I would go even further than the reviewer. Fredrickson builds her case by very selectively drawing on the literature, choosing only a few studies that fit. Even then, the studies fit only with considerable exaggeration and distortion of their findings. She exaggerates the relevance and strength of her own findings. In other cases, she says things that have no basis in anyone’s research.

I came across Love 2.0: How Our Supreme Emotion Affects Everything We Feel, Think, Do, and Become (Unabridged), which sells for $17.95. The product description reads:

We all know love matters, but in this groundbreaking book positive emotions expert Barbara Fredrickson shows us how much. Even more than happiness and optimism, love holds the key to improving our mental and physical health as well as lengthening our lives. Using research from her own lab, Fredrickson redefines love not as a stable behemoth, but as micro-moments of connection between people – even strangers. She demonstrates that our capacity for experiencing love can be measured and strengthened in ways that improve our health and longevity. Finally, she introduces us to informal and formal practices to unlock love in our lives, generate compassion, and even self-soothe. Rare in its scope and ambitious in its message, Love 2.0 will reinvent how you look at and experience our most powerful emotion.

There is a mishmash of language games going on here. Fredrickson’s redefinition of love is not based on her research. Her claim that love is ‘really’ micro-moments of connection between people, even strangers, is a weird re-definition. Attempt to read her book, if you have time to waste.

You will quickly see that much of what she says makes no sense for long-term relationships that are solid but beyond the honeymoon stage. Ask partners in long-term relationships and they will undoubtedly report lacking lots of such “micro-moments of connection.” I doubt it is adaptive for people seeking to build long-term relationships to have the yardstick that if lots of such micro-moments don’t keep coming all the time, the relationship is in trouble. But it is Fredrickson who is selling the strong claims, and the burden is on her to produce the evidence.

If you try to take Fredrickson’s work seriously, you wind up seeing that she has a rather superficial view of close relationships and can’t seem to distinguish them from what goes on between strangers in drunken one-night stands. But that is supposed to be revolutionary science.

We should not confuse much of what Fredrickson emphatically states with testable hypotheses. Many statements sound more like marketing slogans – what Joachim Kruger and his student Thomas Mairunteregger identify as the McDonaldization of positive psychology. Like a Big Mac, Fredrickson’s Love 2.0 requires a lot of imagination to live up to its advertisement.

Fredrickson’s love the supreme emotion vs ‘Trane’s Love Supreme

Where Fredrickson’s selling of love as the supreme emotion is not simply an advertising slogan, it is a bad summary of the research on love and health. John Coltrane makes no empirical claim about love being supreme. But listening to him is effective self-soothing after taking Love 2.0 seriously and trying to figure it out. Simply enjoy, and don’t worry about what it does for your positivity ratio or micro-moments, shared or alone.

Fredrickson’s study of loving-kindness meditation

Jane Brody, like Fredrickson herself, depends heavily on a study of loving-kindness meditation in proclaiming the wondrous, transformative health benefits of being loving and kind. After obtaining Fredrickson’s data set and reanalyzing it, my colleagues – James Heathers, Nick Brown, and Harrison Friedman – and I arrived at a very different interpretation of her study. As we first encountered it, the study was:

Kok, B. E., Coffey, K. A., Cohn, M. A., Catalino, L. I., Vacharkulksemsuk, T., Algoe, S. B., . . . Fredrickson, B. L. (2013). How positive emotions build physical health: Perceived positive social connections account for the upward spiral between positive emotions and vagal tone. Psychological Science, 24, 1123-1132.

Consolidated Standards of Reporting Trials (CONSORT) are widely accepted for at least two reasons. First, clinical trials should be clearly identified as such in order to ensure that the results are recognized and available in systematic searches to be integrated with other studies. CONSORT requires that RCTs be clearly identified in the titles and abstracts. Once RCTs are labeled as such, the CONSORT checklist becomes a handy tally of what needs to be reported.

It is only in supplementary material that the Kok and Fredrickson paper is identified as a clinical trial. Only in that supplement is the primary outcome identified, even in passing. No means are reported anywhere in the paper or supplement. Results are presented in terms of what Kok and Fredrickson term “a variant of a mediational, parallel process, latent-curve model.” Basic statistics needed for its evaluation are left to readers’ imagination. Figure 1 in the article depicts the awe-inspiring parallel-process mediational model that guided the analyses. We showed the figure to a number of statistical experts, including Andrew Gelman. While some elements were readily recognizable, the overall figure was not, especially the mysterious large dot (a causal pathway roundabout?) near the top.

So, not only might the study not be detected as an RCT, it also lacks the relevant information that could be used for calculating effect sizes.

Furthermore, when studies are labeled as RCTs, we immediately seek protocols published ahead of time that specify the basic elements of design, analyses, and primary outcomes. At Psychological Science, studies with protocols are unusual enough to get the authors awarded a badge. In the clinical and health psychology literature, protocols are increasingly common, like flushing a toilet after using a public restroom: no one runs up and thanks you, “Thank you for flushing/publishing your protocol.”

If Fredrickson and her colleagues are going to be using the study to make claims about the health benefits of loving kindness meditation, they have a responsibility to adhere to CONSORT and to publish their protocol. This is particularly the case because this research was federally funded and results need to be transparently reported for use by a full range of stakeholders who paid for the research.

We identified a number of other problems and submitted a manuscript based on a reanalysis of the data. Our manuscript was promptly rejected by Psychological Science. The associate editor, Batja Mesquita, noted that two of my co-authors, Nick Brown and Harris Friedman, had co-authored a paper resulting in a partial retraction of Fredrickson’s positivity ratio paper.

Brown NJ, Sokal AD, Friedman HL. The Complex Dynamics of Wishful Thinking: The Critical Positivity Ratio. American Psychologist. 2013 Jul 15.

I won’t go into the details, except to say that Nick and Harris, along with Alan Sokal, unambiguously established that Fredrickson’s positivity ratio of 2.9013 positive to negative experiences was a fake fact. Fredrickson had been promoting the number as an “evidence-based guideline” of a ratio acting as a “tipping point beyond which the full impact of positive emotions becomes unleashed.” Once Brown and his co-authors overcame strong resistance to getting their critique published, their paper garnered a lot of attention in social and conventional media. There is a hilariously funny account available at Nick Brown Smelled Bull.

Batja Mesquita argued that the previously published critique discouraged her from accepting our manuscript. To do so, she would be participating in “a witch hunt” and

 The combatant tone of the letter of appeal does not re-assure me that a revised commentary would be useful.

Welcome to one-sided tone policing. We appealed her decision, but Editor Eric Eich indicated there was no appeal process at Psychological Science, contrary to the requirements of the Committee on Publication Ethics (COPE).

Eich relented after I shared an email to my coauthors in which I threatened to take the whole issue to social media, where there would be no peer review in the traditional, outdated sense of the term. Numerous revisions of the manuscript were submitted, some of them in response to reviews by Fredrickson and Kok, who did not want the paper published. A year passed before our paper was accepted and appeared on the website of the journal. You can read our paper here. I think you can see that the fatal problems are obvious.

Heathers JA, Brown NJ, Coyne JC, Friedman HL. The elusory upward spiral: A reanalysis of Kok et al. (2013). Psychological Science. 2015 May 29:0956797615572908.

In addition to the original paper not adhering to CONSORT, we noted

  1. There was no effect of assignment to the loving-kindness meditation vs. no-treatment control group on the key physiological variable, cardiac vagal tone. This is a thoroughly disguised null trial.
  2. Kok and Fredrickson claimed that there was an effect of meditation on cardiac vagal tone, but any appearance of an effect was due to reduced vagal tone in the control group, which cannot readily be explained.
  3. Kok and Fredrickson essentially interpreted changes in cardiac vagal tone as a surrogate outcome for more general changes in physical health. However, other researchers have noted that observed changes in cardiac vagal tone are not consistently related to changes in other health variables and are susceptible to variations in experimental conditions that have nothing to do with health.
  4. No attention was given to whether participants assigned to the loving-kindness meditation condition actually practiced it with any frequency or fidelity, even though the article reported that such data had been collected.

Point 2 is worth elaborating. Participants in the control condition received no intervention. Their repeated assessment of cardiac vagal tone/heart rate variability was essentially a test-retest reliability check of what should have been a stable physiological characteristic. Yet participants assigned to this no-treatment condition showed as much change as the participants who were assigned to meditation, but in the opposite direction. Kok and Fredrickson ignored this and attributed all differences to meditation. Houston, we have a problem, a big one, with unreliability of measurement in this study.
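A toy simulation shows how easily measurement noise alone produces exactly this pattern: two groups, no intervention effect whatsoever, and mean change scores that drift apart in opposite directions. All numbers here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n_per_group = 26
trait = rng.normal(50, 5, 2 * n_per_group)   # a stable "vagal tone" trait

# Two measurements of the same stable trait with substantial measurement
# error and no intervention in between.
noise_sd = 5
time1 = trait + rng.normal(0, noise_sd, 2 * n_per_group)
time2 = trait + rng.normal(0, noise_sd, 2 * n_per_group)
change = time2 - time1

meditation, control = change[:n_per_group], change[n_per_group:]
# With unreliable measurement, the two groups routinely drift in opposite
# directions by chance; attributing the whole gap to meditation mistakes
# noise for an intervention effect.
print(meditation.mean(), control.mean(), meditation.mean() - control.mean())
```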

We could not squeeze all of our critique into our word limit, but James Heathers, who is an expert on cardiac vagal tone/heart rate variability, elaborated elsewhere.

  • The study was underpowered from the outset, and the sample size decreased from 65 to 52 due to missing data.
  • Cardiac vagal tone is unreliable except with careful control of the conditions in which measurements are obtained, multiple measurements on each participant, and a much larger sample size. None of these conditions were met.
  • There were numerous anomalies in the data, including some participants included without baseline data, improbable baseline or follow-up scores, and improbable changes. These alone would invalidate the results.
  • Despite not reporting basic statistics, the article was full of graphs, impressive to the uninformed, but useless to readers attempting to make sense of what was done and with what results.

We later learned that the same data had been used for another published paper. There was no cross-citation and the duplicate publication was difficult to detect.

Kok, B. E., & Fredrickson, B. L. (2010). Upward spirals of the heart: Autonomic flexibility, as indexed by vagal tone, reciprocally and prospectively predicts positive emotions and social connectedness. Biological Psychology, 85, 432–436. doi:10.1016/j.biopsycho.2010.09.005

Pity the poor systematic reviewer and meta-analyst trying to make sense of this RCT and integrate it with the rest of the literature concerning loving-kindness meditation.

This was not our only experience of obtaining data for a paper crucial to Fredrickson’s claims and having difficulty publishing our findings. We obtained data for claims that she and her colleagues had solved the classical philosophical problem of whether we should pursue pleasure or meaning in our lives. Pursuing pleasure, they argue, will adversely affect genomic transcription.

We found we could redo the extremely complicated analyses and replicate the original findings, but there were errors in the original data entry that entirely shifted the results when corrected. Furthermore, we could replicate the original findings when we substituted data from a random number generator for the data collected from study participants. After struggles similar to those we experienced with Psychological Science, we succeeded in getting our critique published.
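A toy demonstration of how a complex pipeline can “replicate” with random input: when an aggregate one-sample t-test is run across per-gene regression coefficients, and the genes share variance, the coefficients are correlated, the test’s independence assumption fails, and pure noise yields “significant” results far more often than 5% of the time. This is a sketch of the statistical problem, not a rerun of the original analyses; the sample sizes and variance structure are assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subj, n_genes, n_sims = 80, 53, 500

false_positives = 0
for _ in range(n_sims):
    wellbeing = rng.normal(size=n_subj)            # random "predictor"
    shared = rng.normal(size=n_subj)               # genes share variance
    genes = shared[:, None] + rng.normal(size=(n_subj, n_genes))

    # Per-gene regression slope of expression on the random predictor.
    slopes = np.array([np.polyfit(wellbeing, genes[:, g], 1)[0]
                       for g in range(n_genes)])

    # The questionable step: a one-sample t-test across the 53 slopes,
    # treating correlated genes as 53 independent observations.
    _, p = stats.ttest_1samp(slopes, 0.0)
    false_positives += p < 0.05

print(false_positives / n_sims)   # far above the nominal 5%
```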

The original paper

Fredrickson BL, Grewen KM, Coffey KA, Algoe SB, Firestine AM, Arevalo JM, Ma J, Cole SW. A functional genomic perspective on human well-being. Proceedings of the National Academy of Sciences. 2013 Aug 13;110(33):13684-9.

Our critique

Brown NJ, MacDonald DA, Samanta MP, Friedman HL, Coyne JC. A critical reanalysis of the relationship between genomics and well-being. Proceedings of the National Academy of Sciences. 2014 Sep 2;111(35):12705-9.

See also:

Nickerson CA. No Evidence for Differential Relations of Hedonic Well-Being and Eudaimonic Well-Being to Gene Expression: A Comment on Statistical Problems in Fredrickson et al. (2013). Collabra: Psychology. 2017 Apr 11;3(1).

A partial account of the reanalysis is available in:

Reanalysis: No health benefits found for pursuing meaning in life versus pleasure. PLOS Blogs Mind the Brain

Wrapping it up

Strong claims about health effects require strong evidence.

  • Evidence produced in randomized trials needs to be reported according to established conventions like CONSORT, with clear labeling of duplicate publications.
  • When research is conducted with public funds, these responsibilities are increased.

I have often identified health claims in high profile media like The New York Times and The Guardian. My MO has been to trace the claims back to the original sources in peer reviewed publications, and evaluate both the media reports and the quality of the primary sources.

I hope that I am arming citizen scientists for engaging in these activities independent of me and even to arrive at contradictory appraisals to what I offer.

  • I don’t think I can expect to get many people to ask for data and perform independent analyses and certainly not to overcome the barriers my colleagues and I have met in trying to publish our results. I share my account of some of those frustrations as a warning.
  • I still think I can offer some take away messages to citizen scientists interested in getting better quality, evidence-based information on the internet.
  • Assume most of the claims readers encounter about psychological states and behavior being simply changed and profoundly influencing physical health are false or exaggerated. When in doubt, disregard the claims and certainly don’t retweet or “like” them.
  • Ignore journalists who do not provide adequate links for their claims.
  • Learn to identify generally reliable sources and take journalists off the list when they have made extravagant or undocumented claims.
  • Appreciate the financial gains to be made by scientists who feed journalists false or exaggerated claims.

Advice to citizen scientists who are cultivating more advanced skills:

Some key studies that Brody invokes in support of her claims being science-based are poorly conducted and reported clinical trials that are not labeled as such. This is quite common in positive psychology, but you need to cultivate skills to even detect that this is what is going on. Even prestigious psychology journals are often lax in labeling studies as RCTs and in enforcing reporting standards. Authors’ conflicts of interest are ignored.

It is up to you to

  • Identify when the claims you are being fed should have been evaluated in a clinical trial.
  • Be skeptical when the original research is not clearly identified as a clinical trial but nonetheless compares participants who received the intervention with those who did not.
  • Be skeptical when CONSORT is not followed and there is no published protocol.
  • Be skeptical of papers published in journals that do not enforce these requirements.


I think I have provided enough details for readers to decide for themselves whether I am unduly influenced by my experiences with Barbara Fredrickson and her data. She and her colleagues have differing accounts of her research and of the events I have described in this blog.

As a disclosure, I receive money for writing these blog posts, less than $200 per post. I am also marketing a series of e-books, including Coyne of the Realm Takes a Skeptical Look at Mindfulness and Coyne of the Realm Takes a Skeptical Look at Positive Psychology.

Maybe I am just making a fuss to attract attention to these enterprises. Maybe I am just monetizing what I have been doing for years virtually for free. Regardless, be skeptical. But to get more information and get on a mailing list for my other blogging, go to and sign up.

Unmasking Jane Brody’s “A Positive Outlook May Be Good for Your Health” in The New York Times

A recipe for coercing ill people with positive psychology pseudoscience in the New York Times

  • Judging by the play she gets in social media and the hundreds of comments on her articles in The New York Times, Jane Brody has a successful recipe for using positive psychology pseudoscience to bolster down-home advice you might’ve gotten from your grandmother.
  • Her recipe might seem harmless enough, but her articles are directed at people struggling with chronic and catastrophic physical illnesses. She offers them advice.
  • The message is that persons with physical illness should engage in self-discipline, practice positive psychology exercises – or else they are threatening their health and shortening their lives.
  • People struggling with physical illness have enough to do already. The admonition that they individually and collectively should do more, that they should become more self-disciplined, is condescending and presumptuous.
  • Jane Brody’s carrot is basically a stick. The implied threat is simply coercive: that people with chronic illness are not doing what they can to improve their physical health unless they engage in these exercises.
  • It takes a careful examination of Jane Brody’s sources to discover that the “scientific basis” for this positive psychology advice is quite weak. In many instances it is patently junk, pseudoscience.
  • The health benefits claimed for positivity are unfounded.
  • People with chronic illness are often desperate or simply vulnerable to suggestions that they can and should do more. They are being misled by this kind of article in what is supposed to be a trusted source of quality news, The New York Times, not The Daily News.
  • There is a sneaky, ill-concealed message that persons with chronic illness will obtain wondrous benefits by just adopting a positive attitude – even a hint that cancer patients will live longer.

In my blog post about positive psychology and health, I try to provide tools so that consumers can probe for themselves the usually false and certainly exaggerated claims that are being showered on them.

However, in the case of Jane Brody’s articles, we will see that the task is difficult because she draws on a selective sampling of the literature in which researchers generate junk self-promotional claims.

That’s a general problem with the positive psychology “science” literature, but the solution for journalists like Jane Brody is to seek independent evaluation of claims from outside the positive psychology community. Journalists, did you hear that message?

The article, along with its 100s of comments from readers, is available here:

A Positive Outlook May Be Good for Your Health by Jane E.Brody

The article starts with some clichéd advice about being positive. Brody seems to be on the side of the autonomy of her readers. She makes seemingly derogatory comments that the advice is “cockeyed optimism.” [Don’t you love that turn of phrase? I’m sure to borrow it in the future.]

“Look on the sunny side of life.”

“Turn your face toward the sun, and the shadows will fall behind you.”

“Every day may not be good, but there is something good in every day.”

“See the glass as half-full, not half-empty.”

Researchers are finding that thoughts like these, the hallmarks of people sometimes called “cockeyed optimists,” can do far more than raise one’s spirits. They may actually improve health and extend life.

See?  The clever putdown of this advice was just a rhetorical device, just a set up for what follows. Very soon Brody is delivering some coercive pseudoscientific advice, backed by the claim that “there is no longer any doubt” and that the links between positive thinking and health benefits are “indisputable.”

There is no longer any doubt that what happens in the brain influences what happens in the body. When facing a health crisis, actively cultivating positive emotions can boost the immune system and counter depression. Studies have shown an indisputable link between having a positive outlook and health benefits like lower blood pressure, less heart disease, better weight control [Emphasis added.].

I found the following passage particularly sneaky and undermining of people with cancer.

Even when faced with an incurable illness, positive feelings and thoughts can greatly improve one’s quality of life. Dr. Wendy Schlessel Harpham, a Dallas-based author of several books for people facing cancer, including “Happiness in a Storm,” was a practicing internist when she learned she had non-Hodgkin’s lymphoma, a cancer of the immune system, 27 years ago. During the next 15 years of treatments for eight relapses of her cancer, she set the stage for happiness and hope, she says, by such measures as surrounding herself with people who lift her spirits, keeping a daily gratitude journal, doing something good for someone else, and watching funny, uplifting movies. Her cancer has been in remission now for 12 years.

“Fostering positive emotions helped make my life the best it could be,” Dr. Harpham said. “They made the tough times easier, even though they didn’t make any difference in my cancer cells.”

Sure, Jane Brody is careful to avoid the explicit claim that a positive attitude somehow is connected to the cancer being in remission for 12 years, but the implication is there. Brody pushes the advice with a hint of the transformation available to cancer patients, if only they follow the advice.

After all, Jane Brody had just earlier asserted that positive attitude affects the immune system and this well-chosen example happens to be a cancer of the immune system.

Jane Brody immediately launches into a description of a line of research conducted by a positive psychology group at Northwestern University and University of California San Francisco.

Taking her cue from the investigators, Brody blurs the distinction between findings based in correlational studies and the results of intervention studies in which patients actually practiced positive psychology exercises.

People with new diagnoses of H.I.V. infection who practiced these skills carried a lower load of the virus, were more likely to take their medication correctly, and were less likely to need antidepressants to help them cope with their illness.

But Brody’s sins as a journalist are worse than that. With a great deal of difficulty, I have chased her claims back into the literature. I found some made-up facts.

In my literature search, I could find only one study from these investigators that seemed directly related to these claims. The mediocre retrospective correlational study was mainly focused on the use of psychostimulants, but it included a crude 6-item summary measure of positive states of mind.

The authors didn’t present the results in a simple way that allows direct independent examination of whether positive affect is related to other outcomes in any simple fashion. They did not report the simple correlations needed to determine whether their measure was simply a measure of depressive symptoms turned on its head. They certainly had the data, but did not report it. Instead, they present some multivariate analyses that do not show impressive links. Any direct links to viral load are not shown and presumably are not there, although the investigators tested statistically for them. Technically speaking, I would write off the findings to measurement and specification error, certainly not worthy of reporting in The New York Times.

Less technically speaking, Brody is leading up to using HIV as an exemplar illness where cultivating positivity can do so much. But if this study is worth anything at all, it is to illustrate that even correlationally, positive affect is not related to much, other than – no surprise – alternative measures of positive affect.

Brody then goes on to describe in detail an intervention study. You’d never know from her description that her source of information is not a report of the results of the intervention study, but a promissory protocol that supposedly describes how the intervention study was going to be done.

I previously blogged about this protocol. At first, I thought it was praiseworthy that a study of a positive psychology intervention for health had even complied with the requirement that studies be preregistered and have a protocol available. Most such studies do not, but they are supposed to do that. In plain English, protocols are supposed to declare ahead of time what researchers are going to do and precisely how they are going to evaluate whether an intervention works. That is because, notoriously, researchers are inclined to say later they were really trying to do something else and to pick another outcome that makes the intervention look best.

But then I got corrected by James Heathers on Facebook. Duh, he had looked at the date the protocol was published.

He pointed out that this protocol was actually published years after collection of data had begun. The researchers already had a lot to peek at. Rather than identifying just a couple of variables on which the investigators were prepared to stake their claim that the intervention was effective, the protocol listed 25 variables that would be examined as outcomes (!) in order to pick one or two.

So I updated what I said in my earlier blog. I pointed out that the published protocol was misleading: it was posted after the researchers were able to see how their study was unfolding and to change their plans accordingly. The vagueness of the protocol gave the authors lots of wiggle room for selectively reporting and hyping their findings with confirmation bias. They would later take advantage of this when they actually published the results of their study.
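The consequence of a 25-outcome menu is simple arithmetic. Treating the outcomes as independent (an assumption made only for illustration; correlated outcomes lower the figure somewhat, but the selective-reporting problem remains):

```python
# Probability that at least one of 25 null outcomes crosses p < .05
# purely by chance, if the investigators are free to pick afterwards.
alpha, k = 0.05, 25
p_at_least_one = 1 - (1 - alpha) ** k
print(f"{p_at_least_one:.2f}")   # about 0.72
```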

The researchers studied 159 people who had recently learned they had H.I.V. and randomly assigned them to either a five-session positive emotions training course or five sessions of general support. Fifteen months past their H.I.V. diagnosis, those trained in the eight skills maintained higher levels of positive feelings and fewer negative thoughts related to their infection.

Brody is not being accurate here. When the authors finally got around to publishing the results, they told a very different story, if you probe carefully. Even with the investigators doing a lot of spinning, they showed null results, no effects for the intervention. Appearances to the contrary were created by the investigators ignoring what they actually reported in their tables. If you go to my earlier blog post, I point this out in detail, so you can see for yourself.

Brody goes on to describe the regimen that the published study failed to show was effective.

An important goal of the training is to help people feel happy, calm and satisfied in the midst of a health crisis. Improvements in their health and longevity are a bonus. Each participant is encouraged to learn at least three of the eight skills and practice one or more each day. The eight skills are:

■ Recognize a positive event each day.

■ Savor that event and log it in a journal or tell someone about it.

■ Start a daily gratitude journal.

■ List a personal strength and note how you used it.

■ Set an attainable goal and note your progress.

■ Report a relatively minor stress and list ways to reappraise the event positively.

■ Recognize and practice small acts of kindness daily.

■ Practice mindfulness, focusing on the here and now rather than the past or future.

For chrissakes, this is a warmed-over version of Émile Coué de la Châtaigneraie’s autosuggestion: “Every day, in every way, I’m getting better and better.” Surely contemporary positive psychology’s science of health can do better than that. To Coué’s credit, he gave away his advice for free. He did not charge for his coaching, even if he was giving away something he had no evidence would improve people’s physical health.

Dr. Moskowitz said she was inspired by observations that people with AIDS, Type 2 diabetes and other chronic illnesses lived longer if they demonstrated positive emotions. She explained, “The next step was to see if teaching people skills that foster positive emotions can have an impact on how well they cope with stress and their physical health down the line.”

She listed as the goals improving patients’ quality of life, enhancing adherence to medication, fostering healthy behaviors, and building personal resources that result in increased social support and broader attention to the good things in life.

Let me explain why I am offended here. None of these activities have been shown to improve the health of persons with newly diagnosed HIV. It’s reasonable to assume that newly diagnosed persons have a lot with which to contend. It’s a bad time to give them advice to clutter their life with activities that will not make a difference in their health.

The published study was able to recruit and retain a sample of persons with newly diagnosed HIV because it paid them well to keep coming. I’ve worked with this population before, in a study aimed at helping them solve specific practical problems that they said got in the way of their adherence.

Many persons with newly diagnosed HIV are low income and are unemployed or marginally employed. They will enroll in studies to get the participant fees. When I lived in the San Francisco Bay area, I recall one patient telling a recruiter from UCSF that he was too busy and unable to make a regular visit to the medical center for the intervention, but he would be willing to accept being in the study if he was assigned to the control group. It did not involve attending intervention sessions and would give him a little cash.

Based on my clinical and research experience, I don’t believe that such patients would regularly show up for this kind of useless positive psychology treatment without getting paid, particularly if they were informed of the actual results of this misrepresented study.

Gregg De Meza, a 56-year-old architect in San Francisco who learned he was infected with H.I.V. four years ago, told me that learning “positivity” skills turned his life around. He said he felt “stupid and careless” about becoming infected and had initially kept his diagnosis a secret.

“When I entered the study, I felt like my entire world was completely unraveling,” he said. “The training reminded me to rely on my social network, and I decided to be honest with my friends. I realized that to show your real strength is to show your weakness. No pun intended, it made me more positive, more compassionate, and I’m now healthier than I’ve ever been.”

I object to this argument by quotes-from-an-unrepresentative-patient. The intervention did not have the intended effect, and it is misleading to showcase somebody who claims to have turned his life around.

Jane Brody proceeds with some more fake facts.

In another study among 49 patients with Type 2 diabetes, an online version of the positive emotions skills training course was effective in enhancing positivity and reducing negative emotions and feelings of stress. Prior studies showed that, for people with diabetes, positive feelings were associated with better control of blood sugar, an increase in physical activity and healthy eating, less use of tobacco and a lower risk of dying.

The study was so small and underpowered, aside from being methodologically flawed, that even if such effects were actually present, they would usually be missed: the study simply did not have enough patients to achieve significance.
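A rough power calculation makes the point, using a two-sample t-test as a stand-in for the design and assuming the 49 patients split roughly evenly between arms:

```python
from statsmodels.stats.power import TTestIndPower

# With ~24-25 participants per arm, power to detect even a moderate
# standardized effect (d = 0.5) at alpha = .05 falls well short of the
# conventional 80% benchmark, so real effects would usually be missed.
analysis = TTestIndPower()
for d in (0.2, 0.5, 0.8):
    power = analysis.power(effect_size=d, nobs1=25, ratio=1.0, alpha=0.05)
    print(f"d = {d}: power = {power:.2f}")
```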

In a pilot study of 39 women with advanced breast cancer, Dr. Moskowitz said an online version of the skills training decreased depression among them. The same was true with caregivers of dementia patients.

“None of this is rocket science,” Dr. Moskowitz said. “I’m just putting these skills together and testing them in a scientific fashion.”

It’s not rocket science, it’s misleading hogwash.

In a related study of more than 4,000 people 50 and older published last year in the Journal of Gerontology, Becca Levy and Avni Bavishi at the Yale School of Public Health demonstrated that having a positive view of aging can have a beneficial influence on health outcomes and longevity. Dr. Levy said two possible mechanisms account for the findings. Psychologically, a positive view can enhance belief in one’s abilities, decrease perceived stress and foster healthful behaviors. Physiologically, people with positive views of aging had lower levels of C-reactive protein, a marker of stress-related inflammation associated with heart disease and other illnesses, even after accounting for possible influences like age, health status, sex, race and education than those with a negative outlook. They also lived significantly longer.

This is even deeper into the woo. Give me a break, Jane Brody. Stop misleading people with chronic illness with false claims and fake facts. Adopting these attitudes will not prevent dementia.

Don’t believe me? I previously debunked these patently false claims in detail. You can see my critique here.

Here is what the original investigators claimed about Alzheimer’s:

We believe it is the stress generated by the negative beliefs about aging that individuals sometimes internalize from society that can result in pathological brain changes,” said Levy. “Although the findings are concerning, it is encouraging to realize that these negative beliefs about aging can be mitigated and positive beliefs about aging can be reinforced, so that the adverse impact is not inevitable.”

I exposed the voodoo statistics on which this claim is based. I concluded:

The authors develop their case that stress is a significant cause of Alzheimer’s disease with reference to some largely irrelevant studies by others, but depend on a preponderance of studies that they themselves have done with the same dubious small samples and dubious statistical techniques. Whether you do a casual search with Google scholar or a more systematic review of the literature, you won’t find stress processes of the kind the authors invoke among the usual explanations of the development of the disease.

Basically, the authors are arguing that if you hold views of aging like “Old people are absent-minded” or “Old people cannot concentrate well,” you will experience more stress as you age, and this will accelerate development of Alzheimer’s disease. They then go on to argue that because these attitudes are modifiable, you can take control of your risk for Alzheimer’s by adopting a more positive view of aging and aging people.

Nonsense, utter nonsense.

Let chronically ill people and those facing cancer adopt whatever attitude is comfortable or natural for them. It’s a bad time to ask for change, particularly when there isn’t any promised benefit in improved health or prolonged life.

Rather than Jane Brody’s recipe for positive psychology improving your health, I strongly prefer Lila Downs’s La Cumbia del Mole.

It is great on chicken. If it does not extend your life, it will give you some moments of happiness, but you will have to adjust the spices to your personal taste.

I will soon be offering e-books providing skeptical looks at positive psychology, as well as mindfulness. As in this blog post, I will take claims I find in the media and trace them back to the scientific studies on which they are based. I will show you what I see so you can see it too.

Sign up at my new website to get advance notice of the forthcoming e-books and web courses, as well as upcoming blog posts at this and other blog sites. You can even advance-order one or all of the e-books.

 Lots to see at Come see…

Unintended consequences of universal mindfulness training for schoolchildren?

This is the first installment of what will be a series of occasional posts about the UK Mindfulness All-Party Parliamentary Group report, Mindful Nation.

  • Mindful Nation is seriously deficient as a document supposedly arguing for policy based on evidence.
  • The professional and financial interests of lots of people involved in preparation of the document will benefit from implementation of its recommendations.
  • After an introduction, I focus on two studies singled out in Mindful Nation as offering support for the benefits of mindfulness training for school children.
  • Results of the group’s cherry-picked studies do not support implementation of mindfulness training in the schools, but they inadvertently highlight some issues.
  • Investment in universal mindfulness training in the schools is unlikely to yield measurable, socially significant results, but will serve to divert resources from schoolchildren more urgently in need of effective intervention and support.
  • Mindful Nation is another example of delivery of low-intensity services to mostly low-risk persons to the detriment of those in greatest and most urgent need.

The launch event for the Mindful Nation report billed it as the “World’s first official report” on mindfulness.

Mindful Nation is a report written by the UK Mindfulness All-Party Parliamentary Group.

The Mindfulness All-Party Parliamentary Group (MAPPG) was set up to:

  • review the scientific evidence and current best practice in mindfulness training
  • develop policy recommendations for government, based on these findings
  • provide a forum for discussion in Parliament for the role of mindfulness and its implementation in public policy.

The Mindfulness All-Party Parliamentary Group describes itself as impressed by the levels of both popular and scientific interest, and says it launched an inquiry to consider the potential relevance of mindfulness to a range of urgent policy challenges facing government.

Don’t get confused by this being a government-commissioned report. The report stands in sharp contrast to one commissioned by the US government in terms of the unbalanced constitution of the committee undertaking the review, the lack of transparency in the search for relevant literature, and the methodology for rating and interpreting the quality of available evidence.

Compare the claims of Mindful Nation to a comprehensive systematic review and meta-analysis prepared for the US Agency for Healthcare Research and Quality (AHRQ) that reviewed 18,753 citations, and found only 47 trials (3%) that included an active control treatment. The vast majority of studies available for inclusion had only a wait-list or no-treatment control group and so exaggerated any estimate of the efficacy of mindfulness.

Although the US report was available to those preparing the UK Mindful Nation report, no mention is made of either the full contents of the report or the resulting publication in a peer-reviewed journal. Instead, the UK Mindful Nation report emphasized narrative and otherwise unsystematic reviews, and meta-analyses that did not adequately control for bias.

When the abridged version of the AHRQ report was published in JAMA Internal Medicine, an accompanying commentary raised issues even more applicable to the Mindful Nation report:

The modest benefit found in the study by Goyal et al begs the question of why, in the absence of strong scientifically vetted evidence, meditation in particular and complementary measures in general have become so popular, especially among the influential and well educated…What role is being played by commercial interests? Are they taking advantage of the public’s anxieties to promote use of complementary measures that lack a base of scientific evidence? Do we need to require scientific evidence of efficacy and safety for these measures?

The members of the UK Mindfulness All-Party Parliamentary Group were selected for their positive attitude towards mindfulness. The collection of witnesses they called to hearings was saturated with advocates of mindfulness and those having professional and financial interests in arriving at a positive view. There is no transparency in terms of how studies or testimonials were selected, but the bias is notable. Many of the scientific studies were methodologically poor, if there was any methodology at all. Many were strongly stated, but weakly substantiated, opinion pieces. Authors often included those having financial interests in obtaining positive results, but with no acknowledgment of conflict of interest. The glowing testimonials were accompanied by smiling photos and were unanimous in their praise of the transformative benefits of mindfulness.

As Mark B. Cope and David B. Allison concluded about obesity research, such a packing of the committee and a highly selective review of the literature leads to a “distortion of information in the service of what might be perceived to be righteous ends.” [I thank Tim Caulfield for calling this quote to my attention.]

Mindfulness in the schools

The recommendations of Mindful Nation are:

  1. The Department for Education (DfE) should designate, as a first step, three teaching schools [116] to pioneer mindfulness teaching, co-ordinate and develop innovation, test models of replicability and scalability, and disseminate best practice.
  2. Given the DfE’s interest in character and resilience (as demonstrated through the Character Education Grant programme and its Character Awards), we propose a comparable Challenge Fund of £1 million a year to which schools can bid for the costs of training teachers in mindfulness.
  3. The DfE and the Department of Health (DOH) should recommend that each school identifies a lead in schools and in local services to co-ordinate responses to wellbeing and mental health issues for children and young people [117]. Any joint training for these professional leads should include a basic training in mindfulness interventions.
  4. The DfE should work with voluntary organisations and private providers to fund a freely accessible, online programme aimed at supporting young people and those who work with them in developing basic mindfulness skills [118].
Payoff of Mindful Nation to Oxford Mindfulness Centre will be huge.

Leading up to these recommendations, the report outlined an “alarming crisis” in the mental health of children and adolescents and proposed:

Given the scale of this mental health crisis, there is real urgency to innovate new approaches where there is good preliminary evidence. Mindfulness fits this criterion and we believe there is enough evidence of its potential benefits to warrant a significant scaling-up of its availability in schools.

Think of all the financial and professional opportunities that proponents of mindfulness involved in preparation of this report have garnered for themselves.

Mindfulness to promote executive functioning in children and adolescents

For the remainder of the blog post, I will focus on the two studies cited in support of the following statement:

What is of particular interest is that those with the lowest levels of executive control [73] and emotional stability [74] are likely to benefit most from mindfulness training.

The terms “executive control” and “emotional stability” were clarified:

Many argue that the most important prerequisites for child development are executive control (the management of cognitive processes such as memory, problem solving, reasoning and planning) and emotion regulation (the ability to understand and manage the emotions, including and especially impulse control). These main contributors to self-regulation underpin emotional wellbeing, effective learning and academic attainment. They also predict income, health and criminality in adulthood [69]. American psychologist, Daniel Goleman, is a prominent exponent of the research [70] showing that these capabilities are the biggest single determinant of life outcomes. They contribute to the ability to cope with stress, to concentrate, and to use metacognition (thinking about thinking: a crucial skill for learning). They also support the cognitive flexibility required for effective decision-making and creativity.

Actually, Daniel Goleman is the former editor of the pop magazine Psychology Today and an author of numerous pop books.

The first cited paper.

73 Flook L, Smalley SL, Kitil MJ, Galla BM, Kaiser-Greenland S, Locke J, et al. Effects of mindful  awareness practices on executive functions in elementary school children. Journal of Applied School Psychology. 2010;26(1):70-95.

Journal of Applied School Psychology is a Taylor & Francis journal, formerly known as Special Services in the Schools (1984–2002). Its Journal Impact Factor is 1.30.

One of the authors of the article, Susan Kaiser-Greenland, is a mindfulness entrepreneur, as seen on her website, which describes her as an author, public speaker, and educator on the subject of sharing secular mindfulness and meditation with children and families. Her books are The Mindful Child: How to Help Your Kid Manage Stress and Become Happier, Kinder, and More Compassionate and Mindful Games: Sharing Mindfulness and Meditation with Children, Teens, and Families, and the forthcoming The Mindful Games Deck: 50 Activities for Kids and Teens.

This article represents the main research available on Kaiser-Greenland’s Inner Kids program and figures prominently in her promotion of her products.

The sample consisted of 64 children assigned to either mindful awareness practices (MAPs; n = 32) or a control group consisting of a silent reading period (n = 32).

The MAPs training used in the current study is a curriculum developed by one of the authors (SKG). The program is modeled after classical mindfulness training for adults and uses secular and age appropriate exercises and games to promote (a) awareness of self through sensory awareness (auditory, kinesthetic, tactile, gustatory, visual), attentional regulation, and awareness of thoughts and feelings; (b) awareness of others (e.g., awareness of one’s own body placement in relation to other people and awareness of other people’s thoughts and feelings); and (c) awareness of the environment (e.g., awareness of relationships and connections between people, places, and things).

A majority of exercises involve interactions among students and between students and the instructor.


The primary EF outcomes were the Metacognition Index (MI), Behavioral Regulation Index (BRI), and Global Executive Composite (GEC), as reported by teachers and parents.

Wikipedia presents the results of this study as:

The program was delivered for 30 minutes, twice per week, for 8 weeks. Teachers and parents completed questionnaires assessing children’s executive function immediately before and following the 8-week period. Multivariate analysis of covariance on teacher and parent reports of executive function (EF) indicated an interaction effect between baseline EF score and group status on posttest EF. That is, children in the group that received mindful awareness training who were less well regulated showed greater improvement in EF compared with controls. Specifically, those children starting out with poor EF who went through the mindful awareness training showed gains in behavioral regulation, metacognition, and overall global executive control. These results indicate a stronger effect of mindful awareness training on children with executive function difficulties.

The finding that both teachers and parents reported changes suggests that improvements in children’s behavioral regulation generalized across settings. Future work is warranted using neurocognitive tasks of executive functions, behavioral observation, and multiple classroom samples to replicate and extend these preliminary findings.

What I discovered when I scrutinized the study.

This study is unblinded: the students, and the teachers and parents providing subjective ratings of them, were well aware of the group to which each student had been assigned. We are not given any correlations among or between their ratings, so we don’t know whether a single global subjective factor (easy or difficult child, well-behaved or not) was operating for teachers, parents, or both.

It is unclear for what features of the mindfulness training the comparison reading group offers control or equivalence. The two groups differ in the positive expectations, attention, and support they receive, and these differences are likely to be reflected in the parent and teacher ratings. There is a high likelihood that any differences in outcomes are nonspecific, rather than the product of some active and distinct ingredient of mindfulness training. In any comparison with the students assigned to reading time, students assigned to mindfulness training have the benefit of any active ingredient it might have, as well as any nonspecific, placebo ingredients.

This is exceedingly weak design, but one that dominates evaluations of mindfulness.

Note too that with only 32 students per group, this is a seriously underpowered study. It has less than a 50% probability of detecting a moderate-sized effect if one is present. And because a larger effect size is needed to achieve statistical significance with such a small sample, any statistically significant effects will necessarily be large, even if they are unlikely to replicate in a larger sample. That is the paradox of small samples that we need to understand in these situations.
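To make the power claim concrete, here is a minimal sketch in Python (my own illustration, not anything from the paper), assuming a conventional two-sided, two-sample t-test at alpha = .05; the figures of n = 32 per group and a moderate effect of Cohen’s d = 0.5 come from the discussion above:

```python
# A minimal power check, assuming a two-sided two-sample t-test at alpha = .05
# (the study itself used more complex analyses; this is only an illustration).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power to detect a moderate effect (d = 0.5) with 32 students per group
power = analysis.power(effect_size=0.5, nobs1=32, alpha=0.05, ratio=1.0)
print(f"Power with n = 32 per group: {power:.2f}")  # roughly 0.50

# The flip side: the smallest effect detectable with conventional 80% power
detectable = analysis.solve_power(nobs1=32, alpha=0.05, power=0.80, ratio=1.0)
print(f"Effect size needed for 80% power: d = {detectable:.2f}")  # roughly 0.71
```

Only a large effect (d around 0.7) would reliably reach significance with samples this small, which is exactly why any significant result in such a study is likely to be inflated.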

Not surprisingly, there were no differences between the mindfulness and reading control groups on any outcome variable, whether rated by parents or teachers. Nonetheless, the authors rescued their claims for an effective intervention with:

However, as shown by the significance of interaction terms, baseline levels of EF (GEC reported by teachers) moderated improvement in posttest EF for those children in the MAPs group compared to children in the control group. That is, on the teacher BRIEF, children with poorer initial EF (higher scores on BRIEF) who went through MAPs training showed improved EF subsequent to the training (indicated by lower GEC scores at posttest) compared to controls.
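For readers unfamiliar with this kind of moderation claim, here is a minimal sketch (simulated data and hypothetical variable names, not the authors’ code or data) of the baseline-by-group interaction test being described:

```python
# A sketch of a baseline-by-group moderation test on simulated data.
# Group labels, variable names, and all numbers here are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 64  # 32 per group, matching the study's sample size

df = pd.DataFrame({
    "group": np.repeat(["MAPs", "control"], n // 2),
    "baseline_ef": rng.normal(50, 10, n),  # simulated baseline BRIEF-style scores
})
# Simulate post-test scores in a null world: no group effect, no interaction
df["posttest_ef"] = 0.8 * df["baseline_ef"] + rng.normal(0, 8, n)

# The coefficient on the baseline_ef:C(group) term tests whether the group
# effect depends on baseline EF -- the claim the authors rest their case on
model = smf.ols("posttest_ef ~ baseline_ef * C(group)", data=df).fit()
print(model.summary().tables[1])
```

With samples this small, an interaction term like this is even more underpowered than the main effect, and a significant interaction driven by a handful of extreme scores is exactly what the figures below suggest.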

Similar claims were made about parent ratings. But let’s look at figure 3 depicting post-test scores. These are from the teachers, but results for the parent ratings are essentially the same.

[Figure 3: post-test teacher BRIEF scores by baseline quartile]

Note the odd scaling of the X axis. The data are divided into four quartiles and then the middle half is collapsed so that there are three data points. I’m curious about what is being hidden. Even with the sleight-of-hand, it appears that scores for the intervention and control groups are identical except for the top quartile. It appears that just a couple of students in the control group are accounting for any appearance of a difference. But keep in mind that the upper quartile is only a matter of eight students in each group.

This scatter plot is further revealing:

[Scatter plot of teacher BRIEF scores]

It appears that the differences limited to the upper quartile are due to a couple of outlier control students. Without them, even the post-hoc differences found in the upper quartile between intervention and control groups would likely disappear.

Basically, what we are seeing is that most students do not show any benefit whatsoever from mindfulness training over being in a reading group. It is not surprising that students who were not particularly elevated on the variables of interest do not register an effect. That is a common ceiling effect in such universally delivered interventions in general population samples.

Essentially, if we focus on the designated outcome variables, we are wasting the students’ time as well as that of the staff. Think of what could be done if the same resources were applied in more effective ways. A couple of students in this study were outliers with low executive function. We don’t know how else they otherwise differ. Neither in the study, nor in the validation of these measures, is much attention given to their discriminant validity, i.e., what variables influence the ratings that shouldn’t. I suspect strongly that there are global, nonspecific aspects to both parent and teacher ratings, such that they are influenced by other aspects of these couple of students’ engagement with their classroom environment, and perhaps other environments.

I see little basis for the authors’ self-congratulatory conclusion:

The present findings suggest that mindfulness introduced in a general education setting is particularly beneficial for children with EF difficulties.


Introduction of these types of awareness practices in elementary education may prove to be a viable and cost-effective way to improve EF processes in general, and perhaps specifically in children with EF difficulties, and thus enhance young children’s socio-emotional, cognitive, and academic development.

Maybe the authors started with this conviction, and it was unshaken by disappointing findings.

Or consider the statement made in Mindful Nation:

What is of particular interest is that those with the lowest levels of executive control [73] and emotional stability [74] are likely to benefit most from mindfulness training.

But we have another study that is cited for this statement.

74. Huppert FA, Johnson DM. A controlled trial of mindfulness training in schools: The importance of practice for an impact on wellbeing. The Journal of Positive Psychology. 2010; 5(4):264-274.

The first author, Felicia Huppert, is founder and director of the Well-being Institute and Emeritus Professor of Psychology at the University of Cambridge, as well as a member of the academic staff of the Institute for Positive Psychology and Education at the Australian Catholic University.

This study involved 173 boys, aged 14 and 15, from a private Catholic school.

The Journal of Positive Psychology is not known for its high methodological standards. A look at its editorial board suggests a high likelihood that submitted manuscripts will be reviewed by sympathetic reviewers who publish their own methodologically flawed studies, often with results that serve undeclared conflicts of interest.

The mindfulness training was based on the program developed by Kabat-Zinn and colleagues at the University of Massachusetts Medical School (Kabat-Zinn, 2003). It comprised four 40 minute classes, one per week, which presented the principles and practice of mindfulness meditation. The mindfulness classes covered the concepts of awareness and acceptance, and the mindfulness practices included bodily awareness of contact points, mindfulness of breathing and finding an anchor point, awareness of sounds, understanding the transient nature of thoughts, and walking meditation. The mindfulness practices were built up progressively, with a new element being introduced each week. In some classes, a video clip was shown to highlight the practical value of mindful awareness (e.g. “The Last Samurai”, “Losing It”). Students in the mindfulness condition were also provided with a specially designed CD, containing three 8-minute audio files of mindfulness exercises to be used outside the classroom. These audio files reflected the progressive aspects of training which the students were receiving in class. Students were encouraged to undertake daily practice by listening to the appropriate audio files. During the 4-week training period, students in the control classes attended their normal religious studies lessons.

A total of 155 participants had complete data at baseline and 134 at follow-up (78 in the mindfulness and 56 in the control condition). Any student who had missing data at either time point was simply dropped from the analysis. The effects of this statistical decision are difficult to track in the paper. Regardless, there was a lack of any difference between the intervention and control groups on any of a host of outcome variables, none of which was designated as a primary outcome.

Actual practicing of mindfulness by students was inconsistent.

One third of the group (33%) practised at least three times a week, 34.8% practised more than once but less than three times a week, and 32.7% practised once a week or less (of whom 7 respondents, 8.4%, reported no practice at all). Only two students reported practicing daily. The practice variable ranged from 0 to 28 (number of days of practice over four weeks). The practice variable was found to be highly skewed, with 79% of the sample obtaining a score of 14 or less (skewness = 0.68, standard error of skewness = 0.25).
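As a quick back-of-the-envelope check (mine, not the authors’), the reported standard error of skewness is consistent with the usual approximation of the square root of 6/n; the n below is hypothetical, inferred from the quoted percentages:

```python
# Rule-of-thumb check on the reported skewness statistics (my illustration).
# The quote implies roughly 83 students reported practice (7 students = 8.4%).
import math

n = 83  # hypothetical, inferred from "7 respondents, 8.4%"
se_skew_approx = math.sqrt(6 / n)
print(f"Approximate SE of skewness: {se_skew_approx:.2f}")  # ~0.27, near the reported 0.25

# The reported skewness of 0.68 is more than 2.5 standard errors above zero,
# consistent with the authors' description of a highly skewed practice variable.
print(f"Skewness / SE: {0.68 / 0.25:.1f}")  # ~2.7
```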

The authors rescue their claim of a significant effect for the mindfulness intervention with highly complex multivariate analyses with multiple control variables, in which within-group outcomes for students assigned to mindfulness were related to the extent to which students actually practiced mindfulness. Even without controlling for the numerous (and post hoc) multiple comparisons, results were still largely nonsignificant.
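To see why the uncorrected multiple comparisons matter, here is a generic illustration (not the study’s actual analysis) of how quickly the familywise false-positive rate grows with the number of tests:

```python
# The probability of at least one false positive across k independent tests,
# each conducted at alpha = .05 with no correction (generic illustration).
alpha = 0.05
for k in (1, 5, 10, 20):
    familywise = 1 - (1 - alpha) ** k
    print(f"{k:2d} tests -> P(at least one false positive) = {familywise:.2f}")
# 1 -> 0.05, 5 -> 0.23, 10 -> 0.40, 20 -> 0.64
```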

One simple conclusion that can be drawn is that despite a lot of encouragement, there was little actual practice of mindfulness by these relatively well-off students in a relatively highly resourced school setting. We could hardly expect results to improve with wider dissemination to schools with fewer resources and less privileged students.

The authors conclude:

The main finding of this study was a significant improvement on measures of mindfulness and psychological well-being related to the degree of individual practice undertaken outside the classroom.

Recall that Mindful Nation cited the study in the following context:

What is of particular interest is that those with the lowest levels of executive control [73] and emotional stability [74] are likely to benefit most from mindfulness training.

These are two methodologically weak studies with largely null findings. They are hardly the basis for launching a national policy implementing universal mindfulness in the schools.

As noted in the US AHRQ report, despite the huge number of studies of mindfulness that have been conducted, few involved a test with an adequate control group, so there is little evidence that mindfulness has any advantage over an active treatment. Neither of these studies disturbs that conclusion, although both are spun, in the original papers and in the Mindful Nation report, as positive. Both papers were published in journals where the reviewers were likely to be overly sympathetic and not attentive to serious methodological and statistical problems.

The committee writing Mindful Nation arrived at conclusions consistent with their prior enthusiasm for mindfulness and their vested interest in it. They sorted through evidence to find what supported their pre-existing assumptions.

Like the UK resilience programs, the recommendations of Mindful Nation put considerable resources into the delivery of services to a large population unlikely to reach the threshold of need required to register a socially and clinically significant effect. On a population level, results of the implementation are doomed to fall short of its claims. The many fewer students in need of more timely, intensive, and tailored services are left underserved. Their presence is ignored or, worse, invoked to justify the delivery of services to the larger group, with the needy students not benefiting.

In this blog post, I have mainly focused on two methodologically poor studies. But for the selection of these particular studies, I depended on the search conducted by the authors of Mindful Nation and the emphasis that was given to these two studies for some sweeping claims in the report. I will continue writing about the recommendations of Mindful Nation. I welcome reader feedback, particularly from readers whose enthusiasm for mindfulness is offended. But I urge them not simply to go to Google, cherry-pick an isolated study, and ask me to refute its claims.

Rather, we need to pay attention to the larger literature concerning mindfulness, its serious methodological problems, and the sociopolitical forces and vested interests that preserve a strong confirmation bias, both in the “scientific” literature and its echoing in documents like Mindful Nation.

Promoting a positive psychology self-help book with a Wikipedia entry

This edition of Mind the Brain continues an odd and fascinating story of an aggressive promotion of a positive psychology self-help book. In this chapter, I tell how the promotion is being aided by the author’s son creating a laudatory Wikipedia entry.

 The story can simply be appreciated as amusing. Or it can be used to raise the consciousness of readers concerning just what is involved in the promotion of sciencey self-help books. The story could raise readers’ level of skepticism about what they might have previously seen as a spontaneous outpouring of enthusiasm for the launch of books.

 The story can also be used to raise questions about the blurry lines between science, self-promotion of persons who traffic in the label of being a scientist, and commercial profitability.

 Is the science behind positive psychology self-help books being shaped and even distorted in the way it appears in the peer-reviewed literature and social media in order to make books and other commercial products like workshops and training for coaches more profitable? Do we need more routine declarations of conflicts of interest in scientific publications of persons writing self-help books?

I wonder how many people have ever thought of inventing a term and having a Wikipedia entry written for it in order to appropriate – claim personal credit for – a cherry-picked literature. Having redefined the relevant scientific literature, such a clever person can then select and scrub the literature so that it shines brilliantly with positive findings, excluding a considerable amount of negative findings and work done by others. All in the service of promoting a self-help book. Clever or crass?

Staking a claim on a piece of the scientific literature as your own.

Appropriating an area of research under your new label, such as mental contrasting or grit, allows you to take charge of what studies to include as relevant and what to exclude. Others outside of your laboratory who take your appropriation seriously will miss a potentially larger relevant literature when they attempt a search with a standard electronic bibliographic source like Google Scholar or Web of Science using the existing terms that the new one replaced. They are not searching your concept, only the old one.

Naïve PhD students who were inspired to investigate the renamed, appropriated concept will need to cite the author’s work. Critics who are motivated to challenge the confirmatory bias included under the rubric of the new term will be faced with the problem that they did not actually investigate it, only an alternative topic for which they are trying to claim relevance.

Step 1: Appropriate the literature, with a novel renaming of a corner of the scientific literature.

Step 2: Write a self-help book.

Step 3: Get your son to write an entry for Wikipedia promoting the concept. A loving son who will please his mom by citing her for 19 of the 20 citations included in the Wikipedia entry.

Some background.

I was persuaded by an extraordinary publicity campaign to purchase a self-help book, Rethinking Positive Thinking. With stories in prominent media outlets bearing titles like

 The Case Against Positive Thinking

I thought I was buying a long-overdue critique of positive psychology. Instead, the book represents a clever repackaging of the familiar wild claims of positive psychology gurus that life transformations await anyone doing their exercises. In the case of Rethinking Positive Thinking, the pitch is that positive fantasies are not enough: one needs only a simple and superficial consideration of the obstacles to achieving them and of what could be done. Rather than any elaborate process of problem definition and consideration of coping options, the book calls for swift application of a WOOP exercise (Wish, Outcome, Obstacle, Plan).

I quickly saw that WOOP is just a reheating of common old stuff in the self-help and clinical literature, like, for instance, the familiar Stop and Think of problem-solving therapy.

I read the book to the end on a long train ride, but from the outset I found that it was being misrepresented as evidence-based. Over a series of blog posts, I am exploring the book’s promotion and the bad science in which it is grounded.

Some of what is claimed as the science behind this book is not peer-reviewed. Readers have no opportunity to go to an outside source and decide for themselves whether claims are valid, bolstered in their confidence that the sources at least survived peer review. Some of what passes for the science behind the book likely predates the conception of the book and any deal with publishers. But some papers that are cited have a distinct quality of being experimercials, concocted as part of creating a marketing advantage for the book as more sciencey than its competitors. We’ll come back to that in a later blog post.

The author of the book coined the term mental contrasting and the acronym WOOP to selectively appropriate and represent part of a larger literature concerning implementation intentions and positive fantasies. Relying on the author’s work alone, along with that of her husband, one would get the impression that together they have developed a whole literature that has produced results uniformly consistent with their theory and supportive of their self-help products.

Checking with Wikipedia

Only late in my investigation did I come across a Wikipedia entry for mental contrasting.

The Wikipedia entry prominently displays an exclamation point with a warning and a plea:

This entry contains content that is written like an advertisement. Please help improve it by removing promotional content and inappropriate external links, and by adding encyclopedic content written from a neutral point of view. (April 2015)

The entry stakes out the self-help book author’s claim of the invention:

Mental contrasting (MC) is a problem-solving strategy and motivational tool that leads to selective behavior modification.[1] It was introduced by psychologist Gabriele Oettingen in 2001.[2]

There are 20 references included for the entry. Nineteen are to the work of the author of the self-help book.

How the Wikipedia entry got there was a matter of mystery and speculation until it occurred to me to click on the View History link for the entry.

It revealed that the entry had been created by Anton Gollwitzer, described as a contributor who does not have a Wikipedia user page. He happens to have the same last name as the husband of the author of the self-help book. [*] Anton created his entry just at the time the self-help book was published.

Clicking on the talk link for him, we immediately come to a warning:

Speedy deletion of “Woop (Scientific Strategy)”

A page you created, Woop (Scientific Strategy), has been tagged for deletion, as it meets one or more of the criteria for speedy deletion; specifically, you removed all content from the page or otherwise requested its deletion.

You are welcome to contribute content which complies with our content policies and any applicable inclusion guidelines. However, please do not simply re-create the page with the same content. You may also wish to read our introduction to editing and guide to writing your first article.

Thank you. — Rrburke (talk) 17:55, 27 October 2014 (UTC),

This was followed by another entry:

Your contributed article, WOOP (scientific strategy)

which began:

Hello, I noticed that you recently created a new page, WOOP (scientific strategy). First, thank you for your contribution; Wikipedia relies solely on the efforts of volunteers such as you. Unfortunately, the page you created covers a topic on which we already have a page – Mental contrasting. Because of the duplication, your article has been tagged for speedy deletion. Please note that this is not a comment on you personally and we hope you will continue helping to improve Wikipedia. If the topic of the article you created is one that interests you, then perhaps you would like to help out at Mental contrasting – you might like to discuss new information at the article’s talk page.

It was then followed by another entry:

Managing a conflict of interest

That began:

Hello, AntonGollwitzer. We welcome your contributions to Wikipedia, but if you are affiliated with some of the people, places or things you have written about on Wikipedia, you may have a conflict of interest or close connection to the subject.

All editors are required to comply with Wikipedia’s neutral point of view content policy. People who are very close to a subject often have a distorted view of it, which may cause them to inadvertently edit in ways that make the article either too flattering or too disparaging. People with a close connection to a subject are not absolutely prohibited from editing about that subject, but they need to be especially careful about ensuring their edits are verified by reliable sources and writing with as little bias as possible.

If you are very close to a subject, here are some ways you can reduce the risk of problems:

Avoid or exercise great caution when editing or creating articles related to you, your organization, or its competitors, as well as projects and products they are involved with.

Avoid linking to the Wikipedia article or website of your organization in other articles (see Wikipedia:Spam).

Exercise great caution so that you do not accidentally breach Wikipedia’s content policies.

This is getting more embarrassing. And then comes another entry:

Nomination of WOOP (scientific strategy) for deletion

A discussion is taking place as to whether the article WOOP (scientific strategy) is suitable for inclusion in Wikipedia according to Wikipedia’s policies and guidelines or whether it should be deleted.

The article will be discussed at Wikipedia: Articles for deletion/WOOP (scientific strategy) until a consensus is reached, and anyone is welcome to contribute to the discussion. The nomination will explain the policies and guidelines which are of concern. The discussion focuses on high-quality evidence and our policies and guidelines.

Users may edit the article during the discussion, including to improve the article to address concerns raised in the discussion. However, do not remove the article-for-deletion notice from the top of the article. DGG ( talk ) 04:11, 29 March 2015 (UTC)

I can’t wait to see where all this is going. But is anyone else offended by this misuse of Wikipedia?


*I was wrapping up this blog post when I did a Google Scholar search that I should have done earlier. I found that when I entered the names Anton Gollwitzer and Gabriele Oettingen, the first citation was

Gollwitzer, A., Oettingen, G., Kirby, T. A., Duckworth, A. L., & Mayer, D. (2011). Mental contrasting facilitates academic performance in school children. Motivation and Emotion, 35(4), 403-412.

Angela Duckworth provided a wildly enthusiastic endorsement of the book.

“I was once asked by educators to identify the single most effective intervention for improving self-control. Every scientist I spoke to referred me to the work summarized here – masterfully and with incomparable insight and warmth. Read this brilliant book and then go out and do what Gabriele Oettingen recommends. It will change the way you think about making your dreams come true.”

Duckworth has her own contract for a self-help book. Similar to Oettingen, she appropriated an existing literature under her term grit. Maybe Oettingen will return the favor of Duckworth’s endorsement and do the same for her. What a wonderful mutual admiration society the positive psychology community is.