Flawed meta-analysis reveals just how limited the evidence is for mapping meditation onto specific regions of the brain

The article put meaningless but reassuring effect sizes into the literature, where these numbers will be widely and uncritically cited.


“The only totally incontrovertible conclusion is that much work remains to be done…”.


Authors of a systematic review and meta-analysis of functional neuroanatomical studies (fMRI and PET) of meditation were exceptionally frank in acknowledging problems in relating the practice of meditation to differences in specific regions of the brain. However, they did not adequately deal with problems hiding in plain sight. These problems should have discouraged the authors from integrating this literature into a meta-analysis and from expressing the strength of the association between meditation and the brain as a small set of moderate effect sizes.


An amazing set of overly small studies, with evidence that null findings are being suppressed.

Many in the multibillion-dollar mindfulness industry are naive about or simply indifferent to what constitutes quality evidence. Their false confidence that “meditation changes the brain” can be bolstered by selective quotes from this review seemingly claiming that the associations are well established and practically significant. Readers who are more sophisticated may nonetheless be misled by this review, unless they read beyond the abstract and with appropriate skepticism.

Read on. I suspect you will be as surprised as I was about the small quantity and poor quality of the literature relating the practice of meditation to specific areas of the brain. The colored pictures of the brain widely used to illustrate discussions of meditation are premature and misleading.

As noted in another article:

Brightly coloured brain scans are a media favourite as they are both attractive to the eye and apparently easy to understand but in reality they represent some of the most complex scientific information we have. They are not maps of activity but maps of the outcome of complex statistical comparisons of blood flow that unevenly relate to actual brain function. This is a problem that scientists are painfully aware of but it is often glossed over when the results get into the press.

The article is:

Fox KC, Dixon ML, Nijeboer S, Girn M, Floman JL, Lifshitz M, Ellamil M, Sedlmeier P, Christoff K. Functional neuroanatomy of meditation: A review and meta-analysis of 78 functional neuroimaging investigations. Neuroscience & Biobehavioral Reviews. 2016 Jun 30;65:208-28.

Abstract.

Keep in mind how few readers go beyond an abstract in forming an impression of what an article shows. More readers “know” what the meta-analysis found solely from reading the abstract than from reading both the article and the supplementary material.

Meditation is a family of mental practices that encompasses a wide array of techniques employing distinctive mental strategies. We systematically reviewed 78 functional neuroimaging (fMRI and PET) studies of meditation, and used activation likelihood estimation to meta-analyze 257 peak foci from 31 experiments involving 527 participants. We found reliably dissociable patterns of brain activation and deactivation for four common styles of meditation (focused attention, mantra recitation, open monitoring, and compassion/loving-kindness), and suggestive differences for three others (visualization, sense-withdrawal, and non-dual awareness practices). Overall, dissociable activation patterns are congruent with the psychological and behavioral aims of each practice. Some brain areas are recruited consistently across multiple techniques—including insula, pre/supplementary motor cortices, dorsal anterior cingulate cortex, and frontopolar cortex—but convergence is the exception rather than the rule. A preliminary effect-size meta-analysis found medium effects for both activations (d = 0.59) and deactivations (d = −0.74), suggesting potential practical significance. Our meta-analysis supports the neurophysiological dissociability of meditation practices, but also raises many methodological concerns and suggests avenues for future research.

The positive claims in the abstract

“…Found reliably dissociable patterns of brain activation and deactivation for four common styles of meditation.”

“Dissociable activation patterns are congruent with the psychological and behavioral aims of each practice.”

“Some brain areas are recruited consistently across multiple techniques”

“A preliminary effect-size meta-analysis found medium effects for both activations (d = 0.59) and deactivations (d = −0.74), suggesting potential practical significance.”

“Our meta-analysis supports the neurophysiological dissociability of meditation practices…”

And the hedges and qualifications in the abstract

“Convergence is the exception rather than the rule”

“[Our meta-analysis] also raises many methodological concerns and suggests avenues for future research.”

Why was this systematic review and meta-analysis undertaken now?

A figure provided in the article showed a rapid accumulation of studies of mindfulness in the brain in the past few years, with over 100 studies now available.

However, the authors’ systematic search yielded “78 functional neuroimaging (fMRI and PET) studies of meditation,” and they “used activation likelihood estimation to meta-analyze 257 peak foci from 31 experiments involving 527 participants.” Only about a third of the studies identified in the search provided usable data.

What did the authors want to accomplish?

Taken together, our central aims were to: (i) comprehensively review and meta-analyze the existing functional neuroimaging studies of meditation (using the meta-analytic method known as activation likelihood estimation, or ALE), and compare consistencies in brain activation and deactivation both within and across psychologically distinct meditation techniques; (ii) examine the magnitude of the effects that characterize these activation patterns, and address whether they suggest any practical significance; and (iii) articulate the various methodological challenges facing the emerging field of contemplative neuroscience (Caspi and Burleson, 2005; Thompson, 2009; Davidson, 2010; Davidson and Kaszniak, 2015), particularly with respect to functional neuroimaging studies of meditation.

As stated elsewhere in the article:

Our central hypothesis was a simple one: meditation practices distinct at the psychological level (Ψ) may be accompanied by dissociable activation patterns at the neurophysiological level (Φ). Such a model describes a ‘one-to-many’ isomorphism between mind and brain: a particular psychological state or process is expected to have many neurophysiological correlates from which, ideally, a consistent pattern can be discerned (Cacioppo and Tassinary, 1990).

The assumption is that meditating versus not-meditating brains should be characterized by distinct, observable neurophysiological patterns. There should also be distinct, enduring changes in the brains of people who have been practicing meditation for some time.

I would wager that many meditation enthusiasts believe that links to specific regions are already well established. Confronted with evidence to the contrary, they would suggest that links between the experience of meditating and changes in the brain are predictable and are waiting to be found. It is that kind of confidence that leads to the significance chasing and confirmatory bias currently infecting this literature.

Types of meditation available for study

Quantitative analyses focused on four types of meditation. Additional types of meditation did not have sufficient studies and so were examined qualitatively. Some studies of the four provided within-group effect sizes, whereas other studies provided between-group effect sizes.

Focused attention (7 studies)

Directing attention to one specific object (e.g., the breath or a mantra) while monitoring and disengaging from extraneous thoughts or stimuli (Harvey, 1990, Hanh, 1991, Kabat-Zinn, 2005, Lutz et al., 2008b, Wangyal and Turner, 2011).

Mantra recitation (8 studies)

Repetition of a sound, word, or sentence (spoken aloud or silently in one’s head) with the goals of calming the mind, maintaining focus, and avoiding mind-wandering.

Open monitoring (10 studies)

Bringing attention to the present moment and impartially observing all mental contents (thoughts, emotions, sensations, etc.) as they naturally arise and subside.

Loving-kindness/compassion (6 studies)

Loving-kindness meditation involves:

Generating feelings of kindness, love, and joy toward themselves, then progressively extend these feelings to imagined loved ones, acquaintances, strangers, enemies, and eventually all living beings (Harvey, 1990, Kabat-Zinn, 2005, Lutz et al., 2008a).

Similar but not identical, compassion meditation

Takes this practice a step further: practitioners imagine the physical and/or psychological suffering of others (ranging from loved ones to all humanity) and cultivate compassionate attitudes and responses to this suffering.

In addition to these four types of meditation, three others can be identified, but so far have only limited studies of the brain: Visualization, Sense-withdrawal and Non-dual awareness practices.

A dog’s breakfast: A table of the included studies quickly reveals a meta-analysis in deep trouble

[Table of included studies from the article]

This is not a suitable collection of studies to enter into a meta-analysis with any expectation that a meaningful, generalizable effect size will be obtained.

Most studies (14) furnish only pre-post, within-group effects for mindfulness practiced by long-time practitioners. Of these 14 studies, there are two outliers with 20 and 31 practitioners. Otherwise, sample sizes range from 4 to 14.

There are 11 studies furnishing between-group comparisons between experienced and novice meditators. The number of participants in the smaller cell, not the overall sample size, is key for the power of between-group effect sizes. In these 11 studies, the smaller cell ranged from 10 to 22 participants.
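For intuition, here is a minimal power sketch in Python; the effect size of d = 0.5 and the cell sizes are assumptions for illustration, not estimates from these studies:

```python
from statsmodels.stats.power import TTestIndPower

# Hedged sketch: the power of a two-group comparison is governed mainly by
# the smaller cell, not the total sample. Assumed numbers: d = 0.5, alpha = .05.
power = TTestIndPower()

balanced = power.power(effect_size=0.5, nobs1=20, ratio=1.0, alpha=0.05)  # 20 vs 20
lopsided = power.power(effect_size=0.5, nobs1=10, ratio=3.0, alpha=0.05)  # 10 vs 30

print(f"20 vs 20 participants: power = {balanced:.2f}")
print(f"10 vs 30 participants: power = {lopsided:.2f}")  # lower, despite the same total n
```

Both designs use 40 participants in total, but the lopsided design has less power because the smaller cell dominates.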

It is well known that one should not combine within- and between-group effect sizes in a meta-analysis. Pre-post, within-group differences capture not only the effects of the active ingredients of an intervention, but also nonspecific effects of the conditions under which data are gathered, including regression to the mean. These within-group differences will typically overestimate between-group differences. Adding a comparison group and calculating between-group differences has the potential to control for nonspecific effects, if the comparison condition is appropriate.
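To make the distinction concrete, here is a minimal simulation sketch; all numbers are assumptions for illustration, not estimates from the review. It assumes meditation itself does nothing, while a nonspecific session effect shifts everyone’s post-test signal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed scenario: true treatment effect = 0, but habituation, expectancy,
# and regression to the mean shift everyone's measured signal at post-test.
n = 12                  # a typical cell size in this literature
nonspecific_shift = 0.5

pre = rng.normal(0.0, 1.0, n)
post = pre + nonspecific_shift + rng.normal(0.0, 0.5, n)

# Within-group (pre-post) d absorbs the nonspecific shift...
diff = post - pre
d_within = diff.mean() / diff.std(ddof=1)

# ...whereas a between-group d against controls who sat through the same
# scanning session does not.
control_post = rng.normal(0.0, 1.0, n) + nonspecific_shift
pooled_sd = np.sqrt((post.var(ddof=1) + control_post.var(ddof=1)) / 2)
d_between = (post.mean() - control_post.mean()) / pooled_sd

print(f"within-group d:  {d_within:.2f}")   # large, despite no true effect
print(f"between-group d: {d_between:.2f}")  # near zero
```

Run repeatedly, the within-group d hovers around 1.0 while the between-group d hovers around zero. Averaging the two kinds of estimate together, as this meta-analysis does, mixes apples and oranges.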

The effect sizes based on between-group differences in these studies have their own problems as estimates of the effects of meditation on the brain. Participants were not randomized to the groups, but were selected because they were already either experienced or novice meditators. Yet these two groups could differ on many variables that cannot be controlled: meditation could be confounded with other lifestyle variables, such as sleeping better or having a better diet. There might be pre-existing differences in the brain that made it easier for the experienced meditators to commit to long-term practice. The authors acknowledge these problems late in the article, but only after discussing the effect sizes they obtained as having substantive importance.

There is good reason to be skeptical that these poorly controlled between-group differences are directly comparable to whatever changes would occur in experienced meditators’ brains in the course of practicing meditation.

It has been widely appreciated that neuroimaging studies are typically grossly underpowered, with low reproducibility of findings as the result. Having too few participants in a study will likely yield false negatives because of an inability to detect effects of the size that can realistically be expected. A small sample means a stronger association is needed to reach significance.

Yet whatever positive (i.e., significant) findings are obtained will of necessity be large, likely exaggerated, and unlikely to be reproducible with a larger sample.
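A small simulation shows this winner’s curse at work; the parameters (a modest true effect of d = 0.3 and cells of n = 10) are assumptions chosen to resemble this literature, not estimates from it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Assumed scenario: a modest true effect studied with tiny cells.
true_d, n, n_studies = 0.3, 10, 10_000

significant_ds = []
for _ in range(n_studies):
    meditators = rng.normal(true_d, 1.0, n)
    controls = rng.normal(0.0, 1.0, n)
    _, p = stats.ttest_ind(meditators, controls)
    if p < 0.05:  # suppose only "positive" studies reach publication
        pooled_sd = np.sqrt((meditators.var(ddof=1) + controls.var(ddof=1)) / 2)
        significant_ds.append((meditators.mean() - controls.mean()) / pooled_sd)

print(f"true effect:                   d = {true_d}")
print(f"share of studies significant:  {len(significant_ds) / n_studies:.0%}")
print(f"mean d among significant ones: {np.mean(significant_ds):.2f}")
```

With these numbers, only about one study in ten reaches significance, and those that do report an average effect roughly three times the true one.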

Another problem with such small cell sizes is that it cannot be ruled out that effects are due to one or more participants’ differences in brain size or anatomy. One outlier, or a small subgroup of them, could drive all significant findings in an already small sample. The assumption that statistical techniques can smooth out these interindividual differences depends on having much larger samples.

It has been noted elsewhere:

Brains are different so the measure in corresponding voxels across subjects may not sample comparable information.

How did the samples get so small? Neuroanatomical studies are expensive, but why did Lazar et al (2000) have 5 rather than 6 participants, or why did Davanger et al have only 4? Were data from some participants dropped after a peek at the data? Were studies compromised by authors not being able to recruit the intended numbers of participants and having to relax entry criteria? What selection bias is there in these small samples? We just don’t know.

I am reminded of all the contentious debate that occurred when psychoanalysts insisted on mixing uncontrolled case series with randomized trials in the same meta-analyses of psychotherapy. My colleagues and I showed this introduces great distortion into the literature. Undoubtedly, the same is occurring in these studies of meditation, but there is so much else wrong with this meta-analysis.

The authors acknowledge that in calculating effect sizes, they combined studies measuring cerebral blood flow (positron emission tomography; PET) and blood oxygenation level (functional magnetic resonance imaging; fMRI). Furthermore, the meta-analyses combined studies that varied in the experimental tasks for which neuroanatomical data were obtained.

One problem is that even studies examining a similar form of meditation might be comparing a meditation practice to very different baseline or comparison tasks and conditions. However, collapsing across numerous different baselines or control conditions is a common (in fact, usually inevitable) practice in meta-analyses of functional neuroimaging studies…

So, there are other important sources of heterogeneity between these studies.

A generic forest plot. This article did not provide one.

It’s a pity that the authors did not provide a forest plot [How to read a forest plot] graphically showing the confidence intervals around the effect sizes being entered into the meta-analysis.

But the authors did provide a funnel plot that I found shocking. [Recommendations for examining and interpreting funnel plots] I have never seen one like it, except when someone has constructed an artificial funnel plot to make a point.

[Funnel plot from the article]

Notice two things about this funnel plot. First, rather than a smooth, unbroken distribution, studies with effect sizes between −.45 and +.45 are entirely missing. Second, studies with smaller sample sizes have the largest effect sizes, whereas the smallest effect sizes all come from the larger samples.
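That is the signature of suppressed null findings. As a hedged sketch, assume for illustration that there is no true effect at all and that only results reaching p < .05 get published; a few lines of simulation reproduce both features:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)

# Assumed scenario: true effect = 0; only significant results survive.
n = rng.integers(5, 47, size=500)   # per-study sample sizes, as in this literature
se = np.sqrt(2.0 / n)               # rough standard error of Cohen's d
d = rng.normal(0.0, se)             # observed effects scattered around zero

published = np.abs(d / se) > 1.96   # suppression of null findings
plt.scatter(d[published], n[published], s=12)
plt.axvspan(-0.45, 0.45, alpha=0.1) # the band where the review's plot is empty
plt.xlabel("observed effect size (d)")
plt.ylabel("sample size")
plt.title("Simulated funnel plot when null findings are suppressed")
plt.show()
```

The surviving studies form a hole around zero, and the biggest effects come from the smallest samples.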

For me, this adds to the overwhelming evidence that something has gone wrong in this literature and that any effect sizes should be ignored. There must have been considerable suppression of null findings, so the large effects from smaller studies will not generalize. Yet the authors find the differences between small and larger sample studies encouraging:

This suggests, encouragingly, that despite potential publication bias or inflationary bias due to neuroimaging analysis methods, nonetheless studies with larger samples tend to converge on similar and more reasonable (medium) effect sizes. Although such a conclusion is tentative, the results to date (Fig. 6) suggest that a sample size of approximately n = 25 is sufficient to reliably produce effect sizes that accord with those reported in studies with much larger samples (up to n = 46).

I and others have long argued that studies of this small sample size in evaluating psychotherapy should be left as pilot feasibility studies and not used to generate effect sizes. I think the same logic applies to this literature.

Distinctive patterns of regional activation and deactivation

The first part of the results section is devoted to studies examining particular forms of meditation. In assessing the apparent consistency of results, one needs to keep in mind the small number of studies being examined and the considerable differences among them. For instance, results presented for focused attention combine three between-group comparisons with four within-group studies. Focused attention ranges from pre-post meditation differences in experienced Tibetan Buddhist practitioners to differences between novice and experienced practitioners of mindfulness-based stress reduction (MBSR). In almost all cases, statistically significant differences are found in both activation and deactivation regions that would make a lot of sense in terms of the functions known to be associated with them. There is a high ratio of significant findings to the number of participants and comparisons, yet little noting of anomalous brain regions identified by significant effects and little discussion of anomalies.

Meta-analysis of focused attention studies resulted in 2 significant clusters of activation, both in prefrontal cortex (Table 3; Fig. 2). Activations were observed in regions associated with the voluntary regulation of thought and action, including the premotor cortex (BA 6; Fig. 2b) and dorsal anterior cingulate cortex (BA 24; Fig. 2a). Slightly sub-threshold clusters were also observed in the dorsolateral prefrontal cortex (BA 8/9; Fig. 2c) and left mid-insula (BA 13; Fig. 2e); we display these somewhat sub-threshold results here because of the obvious interest of these findings in practices that involve top-down focusing of attention, typically focused on respiration. We also observed clusters of deactivation in regions associated with episodic memory and conceptual processing, including the ventral posterior cingulate cortex (BA 31; Fig. 2d) and left inferior parietal lobule (BA 39; Fig. 2f).

How can such meaningful, practically significant findings be obtained when so many conditions militate against finding them? John Ioannidis once remarked that in hot areas of research, consistency of positive findings from small studies often reflects only the strength of the bias with which they are sought. The strength of findings will decrease when larger, more methodologically sophisticated studies become available, conducted by investigators who are less committed to getting confirmation.

The article concludes:

Many have understandably viewed the nascent neuroscience of meditation with skepticism (Andresen, 2000; Horgan, 2004), but recent years have seen an increasing number of high-quality, controlled studies that are suitable for inclusion in meta-analyses and that can advance our cumulative knowledge of the neural basis of various meditation practices (Tang et al., 2015). With nearly a hundred functional neuroimaging studies of meditation now reported, we can conclude with some confidence that different practices show relatively distinct patterns of brain activity, and that the magnitude of associated effects on brain function may have some practical significance. The only totally incontrovertible conclusion, however, is that much work remains to be done to confirm and build upon these initial findings.

“Increasing number of high-quality, controlled studies that are suitable for inclusion in meta-analyses”? “Conclude with some confidence”? “Relatively distinct patterns”? “Some practical significance”?

In all of this premature enthusiasm about findings relating the practice of meditation to activation of particular regions of the brain and deactivation of others, we should not lose track of some other issues.

Although the authors talk about mapping relationships between psychological states and regions of the brain, none of the studies is of sufficient size to document such relationships, given their expected magnitude based on what is typically found between psychological states and other biological variables.

Many differences between techniques could be artifactual, due to a technique altering breathing, involving verbalization, or requiring focused attention. Observed differences in the brain regions activated and deactivated might simply reflect these task features without being related to psychological functioning.

Even if an association were found, it would be a long way from establishing that the association reflected a causal mechanism, rather than simply being correlational or even artifactual. Think of the analogy of discovering a relationship between the amount of sweat produced while exercising and weight loss, and concluding that the weight loss was due to sweating it out.

We still have not established that meditation has more psychological and physical health benefits than other active interventions with presumably different mechanisms. After lots of studies, we still don’t know whether mindfulness meditation is anything more than a placebo. While I was finishing up this blog post, I came across a new study:

The limited prosocial effects of meditation: A systematic review and meta-analysis. 

Although we found a moderate increase in prosociality following meditation, further analysis indicated that this effect was qualified by two factors: type of prosociality and methodological quality. Meditation interventions had an effect on compassion and empathy, but not on aggression, connectedness or prejudice. We further found that compassion levels only increased under two conditions: when the teacher in the meditation intervention was a co-author in the published study; and when the study employed a passive (waiting list) control group but not an active one. Contrary to popular beliefs that meditation will lead to prosocial changes, the results of this meta-analysis showed that the effects of meditation on prosociality were qualified by the type of prosociality and methodological quality of the study. We conclude by highlighting a number of biases and theoretical problems that need addressing to improve quality of research in this area. [Emphasis added].


Talking back to “Talking Therapy Can Literally Rewire the Brain”

This edition of Mind the Brain was prompted by an article in Huffington Post, Talking Therapy Can Literally Rewire the Brain.

The title is lame on two counts: “literally” and any suggestion that psychotherapy does something distinctive to the brain, much less “rewiring” it.

I gave the journalist the benefit of the doubt and assumed that the editor applied the title to the article without the journalist’s permission. I know from talking to journalists that this is a source of enormous frustration when it happens. But in this instance, the odd title came directly from a press release from King’s College London (Study reveals for first time that talking therapy changes the brain’s wiring), which concerned an article published in the Nature Publishing Group journal, Translational Psychiatry.

Hmm, authors from King’s College and published in a Nature journal suggest this might be a serious piece of science worth giving a closer look. In the end, I was reminded not to make too much of authors’ affiliation and where they publish.

I poked fun on Twitter at the title of the Huffington Post article.

The retweets and likes drifted into a discussion of neuroscientists saying they really didn’t know much about the brain. Somebody threw in a link to an excellent short YouTube video by NeuroSkeptic on that topic that I highly recommend.

Anyway, I found serious problems with the Huffington Post article that should have been sufficient reason to stop there. Nonetheless, I proceeded, and the problems got compounded when I turned to the press release with its direct quotes from the author. I wasn’t long into the Translational Psychiatry article before I appreciated that its abstract was misleading in claiming that there were 22 patients in the study. That is a small number, but if the abstract had stated the actual number, which was 15 patients, readers would have been warned not to take too seriously the complicated multivariate statistics that were coming up.

How did a prestigious journal like Translational Psychiatry allow authors to misrepresent their sample size? I would shortly be even more puzzled about why the article was even published in Translational Psychiatry, although I formed some unflattering hypotheses about that journal. I’ll end with those hypotheses.

Talking To A Therapist Can Literally Rewire Your Brain (Huffington Post)

The opening sentence would raise the skepticism of an informed reader:

If you can change the way you think, you can change your brain.

If I accept that statement, it’s going to be with a broad stretching of it to meaninglessness. “If you can change the way you think…” covers lots of territory. If the statement is going to remain correct, then the phrase “change your brain” is going to have to be similarly broad. If the journalist wants to make a big deal of this claim, she would have to concede that reading my blog changes her brain.

That’s the conclusion of a new study, which finds that challenging unhealthy thought patterns with the help of a therapist can lead to measurable changes in brain activity.

Okay, we now know that at least a specific study with brain measurements is being discussed.

But then

In the study, psychologists at King’s College London show that Cognitive Behavioral Therapy strengthens certain healthy brain connections in patients with psychosis. This heightened connectivity was associated with long-term reductions in psychotic symptoms and recovery eight years later, according to the findings, which were published online Tuesday in the journal Translational Psychiatry.

“Over six months of therapy, we found that connections between certain key brain regions became stronger,” Dr. Liam Mason, a clinical psychologist at King’s College and the study’s lead author, told The Huffington Post in an email. “What we are really excited about here is that these stronger connections lead to long-term improvements in people’s symptoms and overall recovery across the eight years that we followed them up.”

A lot of skepticism is warranted here. The article seems to be claiming that changes in brain function observed in the short term with cognitive behavior therapy for psychosis [CBTp] were associated with long-term changes over an extraordinary eight years.

The problems with this? First, CBTp is not known to be particularly effective, even in the short term. Second, there is a lot of heterogeneity under the umbrella of “psychosis,” and in eight years a person who has had that label appropriately applied will have a lot of experiences: recovery and relapse, and certainly other mental health treatments. How, in all that noise and confusion, can a signal be detected that a psychotherapy that isn’t particularly effective explains any long-term improvement?

[Skeptical about my claim that CBTp is ineffective? See Effect of a missing clinical trial on what we think about cognitive behavior therapy and the slides about Cochrane reviews from a longer PowerPoint presentation.]

Any discussion of how CBTp works and what long-term improvements it predicts has to get past considerable evidence that CBTp doesn’t work any better than nonspecific supportive treatments. Without short-term effects, how can it have long-term effects?

[Slide: Cochrane review of CBT for psychosis]

There is no acknowledgment in the Huffington Post article of the lack of efficacy of CBTp. Instead, we have a strong assumption that CBTp works and that the scientific paper under discussion is important because it shows that CBTp works so strongly as to have observable long-term effects.

The journalist claims that the present scientific paper builds on an earlier one:

In the original study, patients with psychosis underwent brain imaging both before and after three months of CBT. The patients’ brains were scanned while they looked at images of faces expressing different emotions. After undergoing CBT, the patients showed marked increases in brain activity. Specifically, the brain scans showed heightened connections between the amygdala, the brain region involved in fear and threat processing, and the prefrontal cortex, which is responsible for reasoning and thinking rationally ― suggesting that the patients had an improved ability to accurately perceive social threats.

“We think that this change may be important in allowing people to consciously re-think immediate emotional reactions,” Mason said.

Readers can click back to my earlier blog post, Sex and the single amygdala: A tale almost saved by a peek at the data. The same experimental paradigm was used to study whether amygdala activity predicted changes in the number of sexual partners over time. In that particular study, p-hacking, significance chasing, and selective reporting were used by the authors to create the illusion of important findings. If you visit my blog post, check out the comments that ridiculed the study, including those from two very bright undergraduates.

We don’t need to detour into a technical discussion of functional magnetic resonance imaging (fMRI) data to make a couple of points. The authors of the present study used a rather standard experimental paradigm, and the focus on the amygdala concerned some quite nonspecific psychological processes.

The authors of the present study soon concede this:

There’s a good chance that similar brain changes also occur in CBT patients being treated for anxiety and depression, Mason said.

“There is research showing that some of the same connections may also be strengthened by CBT for anxiety disorders,” he explained.

But wait: isn’t the lead author also saying, in the Huffington Post article and in the title of the press release, that this is a first-ever study?

For the present purposes, we need only to dispense with any notion that we’re talking about a rewiring of the brain known to be specifically associated with psychosis or even that there is reason to expect that such “rewiring” could be expected to predict long-term outcome of psychosis.

Reading further, we find that the study only involved following 15 patients from a larger study, contrary to the misleading abstract’s claim of 22.

Seriously, are we being asked to get worked up about an fMRI study with only 15 patients? Yup.

The researchers found that heightened connectivity between the amygdala and prefrontal cortex was associated with long-term recovery from psychosis. The exciting finding marks the first time scientists have been able to demonstrate that brain changes resulting from psychotherapy may be responsible for long-term recovery from mental illness.

What is going on here? The journalist next gives free rein to the lead author to climb up on a soapbox and proclaim the agenda behind all of these claims:

The findings challenge the “brain bias” in psychiatry, an institutional focus on physical brain differences over psychological factors in mental illness. Thanks to this common bias, many psychiatrists are prone to recommending medication to their clients rather than psychological treatments such as CBT.

But medication has been proven to be effective for psychosis; CBTp has not.

“Psychological therapy can lead to changes in the mechanics of the brain,” Mason said. “This is especially important for conditions like psychosis which have traditionally been viewed as ‘brain diseases’ that require medication or even surgery.”

“Mechanics of the brain”? Now we have escalated from “literally rewiring” to “changes in the mechanics.” Dude, we are talking about an fMRI study. Do you think we have been transported to an auto repair shop?

“This research challenges the notion that the existence of physical brain differences in mental health disorders somehow makes psychological factors or treatments less important,” Mason added in a statement.

Clicking on the link takes one to a Science Daily article, which churnals (plagiarizes) a press release from King’s College London.

The Press Release: Study reveals for first time that talking therapy changes the brain’s wiring

There is not much in this press release that has not been regurgitated in the Huffington Post article, except for some more soapbox preaching:

Unfortunately, previous research has shown that this ‘brain bias’ can make clinicians more likely to recommend medication but not psychological therapies. This is especially important in psychosis, where only one in ten people who could benefit from psychological therapies are offered them.”

But CBTp, the most evaluated psychotherapy for psychosis, has not been shown to be effective by itself. Sure, patients suffering from psychosis need a lot of support, efforts to maintain positive expectations, and opportunities to talk about their experience. But in direct comparisons with such support provided by professionals or by peers, CBTp has not been shown to be more effective.

The researchers now hope to confirm the results in a larger sample, and to identify the changes in the brain that differentiate people who experience improvements with CBT from those who do not. Ultimately, the results could lead to better, and more tailored, treatments for psychosis, by allowing researchers to understand what determines whether psychological therapies are effective.

Sure, we are to give a high priority to examining the mechanism by which CBT, which has not been proven effective, works its magic.

Translational Psychiatry: Brain connectivity changes occurring following cognitive behavioural therapy for psychosis predict long-term recovery

[This will be a quick tour, only highlighting some of the many problems that I found. I welcome readers probing the open access article and posting what they find.]

The Abstract misrepresents the study as having 22 patients, when it actually only had data from 15.

The Introduction largely focuses on previous work of the author group. If you bothered to check, none of it involves randomized trials, despite making claims of efficacy for CBTp. No reference is made to a large body of literature finding a lack of effectiveness for CBTp. In particular, there is no mention of the Cochrane reviews.

A close reading of the Methods indicates that what are claimed to be “objective clinical outcomes” are actually unblinded, retrospective ratings of case notes by two raters, including the first author. Unblinded ratings, particularly by an investigator, are an important source of bias in studies of CBTp and lead to exaggerated estimates of outcome.

An additional measure with inadequate validation was obtained at the 7- to 8-year follow-up:

Questionnaire about the Process of Recovery (QPR,31), a service-user led instrument that follows theoretical models of recovery and provides a measure of constructs such as hope, empowerment, confidence, connectedness to others.

All patients came from clinical studies conducted by the author group that did not involve randomization. Rather, assignment to CBTp was based on providers identifying patients “deemed as suitable for CBTp.” There is considerable risk of bias if patient data are treated as if they arose in a randomized trial. I previously raised issues about the inadequacy of the routine care provided to psychotic patients, both in terms of its clinical adequacy and its meaningfulness as a control/comparison group, because of its lack of nonspecific factors.

All patients assigned to CBTp were receiving medication and other services. A table revealed that receipt of other services was strongly correlated with recovery status. Yet the authors are attempting to attribute any recovery across the eight years to the brief course of CBTp at the beginning. Obviously, the study is hopelessly confounded and no valid inferences are possible. This alone should have gotten the study rejected.

There were data available from control subjects at follow-up, including fMRI data, but they were excluded from the present report. That is unfortunate, because these data would allow at least minimal evaluation of whether CBTp versus remaining in routine care had any difference in outcomes and – importantly – if the fMRI data similarly predicted the outcomes of patients not receiving CBTp.

Data Analysis indicates one-tailed, multivariate statistical tests that are quite inappropriate and essentially meaningless with such a small data set. Bonferroni corrections, which were inconsistently applied, offer no salvation.

With such small samples and multivariate statistics, a focus on p-values is inappropriate, but the authors do just that, reporting p < .04 and p < .06, the latter being treated as significant. The hypothesis that this might represent significance chasing is supported when the supplementary data tables are examined. When I showed them to a neuroscientist, his first response was that they were painful to look at.
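A quick simulation shows why this pattern is so easy to produce with n = 15. The numbers below (20 candidate connectivity measures, all pure noise) are assumptions for illustration, not the paper’s actual variables:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Assumed scenario: 15 patients, 20 noise "connectivity" measures, and an
# unrelated recovery score. Choosing the tail after seeing the data makes a
# nominal one-tailed p < .05 equivalent to a two-tailed p < .10.
n_patients, n_measures, n_sims = 15, 20, 2_000

hits = 0
for _ in range(n_sims):
    measures = rng.normal(size=(n_patients, n_measures))
    outcome = rng.normal(size=n_patients)
    p_two = [stats.pearsonr(measures[:, j], outcome)[1] for j in range(n_measures)]
    hits += any(p < 0.10 for p in p_two)

print(f"chance of >= 1 'significant' noise predictor: {hits / n_sims:.0%}")  # ~88%
```

Pure noise yields at least one nominally significant “biomarker” in almost nine out of ten datasets of this shape.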

I could go on but…

Why did the authors bother with this study? Why did King’s College London publicize the study with a press release? Why was it published in Nature’s Translational Psychiatry without the editor or the reviewers catching obvious flaws?

The authors had some data lying around, selected out post hoc a subset of patients, and applied retrospective ratings and inappropriate statistics. There is no evidence of a protocol for an a priori hypothesis being pursued, but there is strong circumstantial evidence of p-hacking, significance chasing, and selective reporting. This is not a valid study, not even an experimercial; it is a political, public relations effort.

Statements in the King’s College press release echoed in the Huffington Post indicate a clear ideological agenda. Anyone who knows anything about psychiatry, neuroscience, or cognitive behavior therapy for psychosis is unlikely to be persuaded. Anyone who examines the supplementary statistical tables armed with minimal statistical sophistication will be unimpressed, if not shocked. We can assume that as a group, these people would quickly leave the conversation about cognitive behavior therapy for psychosis literally rewiring the brain, if they ever got engaged.

The authors were not engaging relevant audiences in intelligent conversation. I can only presume that they were targeting naive, vulnerable patients and their families having to make difficult decisions about treatment for psychosis, and that the authors were preaching to the anti-psychiatry crowd. One of the authors also appears as an author of Understanding Psychosis, a strongly non-evidence-based advocacy of cognitive behavior therapy for psychosis, delivered with hostility towards medication and psychiatrists (see my critique). I did not know that about this author until I read the materials I’ve been reviewing. It is an important bit of information and speaks to the author’s objectivity and credibility.

Obviously, the press office of King’s College London depends a lot, maybe almost entirely, on the credibility of authors associated with that institution. Maybe next time they should seek an independent evaluation. Or maybe they are just interested in publicity about research of any kind.

But why was this article published in the seemingly prestigious Nature journal, Translational Psychiatry? It should be noted that this journal is open access, but with exceptionally pricey Article Processing Charges (APCs) of £2,400/$3,900/€2,800. Apparently, adequate screening and appropriate peer review are not included in these costs. These authors have purchased a lot of prestige. Moreover, if you want to complain about their work in a letter to the editor, you have to pay $900. So the authors have effectively insulated themselves from critics. Of course, there is always blogging, PubMed Commons, and PubPeer for post-publication peer review.

I previously blogged about another underpowered, misreported study claiming to have identified a biomarker blood test for depression. The authors were explicitly advertising that they were seeking commercial backers for their blood test. They published in Translational Psychiatry. Maybe that’s the place to go for placing outlandish claims into open access – where anybody can be reached – with a false assurance of prestige protected by rigorous peer review.


Remission of suicidal ideation by magnetic seizure therapy? Neuro-nonsense in JAMA: Psychiatry

A recent article in JAMA: Psychiatry:

Sun Y, Farzan F, Mulsant BH, Rajji TK, Fitzgerald PB, Barr MS, Downar J, Wong W, Blumberger DM, Daskalakis ZJ. Indicators for remission of suicidal ideation following magnetic seizure therapy in patients with treatment-resistant depression. JAMA Psychiatry. 2016 Mar 16.

was accompanied by an editorial commentary:

Camprodon JA, Pascual-Leone A. Multimodal Applications of Transcranial Magnetic Stimulation for Circuit-Based Psychiatry. JAMA: Psychiatry. 2016 Mar 16.

Together, the article and commentary can be studied as:

  • An effort by the authors and the journal itself to promote prematurely a treatment for reducing suicide.
  • A payback to sources of financial support for the authors. Both groups have industry ties that provide them with consulting fees, equipment, grants, and other unspecified rewards. One author has a patent that should increase in value as a result of this article and commentary.
  • A bid for successful applications to new grant initiatives with a pledge of allegiance to the NIMH Research Domain Criteria (RDoC).

After considering just how bad the science and reporting are:

We have sufficient reason to ask how this promotional campaign came about. Why was this article accepted by JAMA: Psychiatry? Why was it deemed worthy of comment?

I think a skeptical look at this article would lead to a warning label:

Warning: Results reported in this article are neither robust nor trustworthy, but considerable effort has gone into promoting them as innovative and even breakthrough. Skepticism warranted.

As we will see, the article is seriously flawed as a contribution to neuroscience, identification of biomarkers, treatment development, and suicidology, but we can nonetheless learn a lot from it in terms of how to detect such flaws when they are more subtle. If nothing else, your skepticism will be raised about articles accompanied by commentaries in prestigious journals and you will learn tools for probing such pairs of articles.


This article involves intimidating technical details and awe-inspiring figures.

[Figure 1 from the article]

Yet, as in some past blog posts concerning neuroscience and the NIMH RDoC, we will gloss over some technical details that would be readily interpreted by experts. I would welcome comments and critiques from experts.

I nonetheless expect readers to agree when they have finished this blog post that I have demonstrated that you don’t have to be an expert to detect neurononsense and crass publishing of articles that fit vested interests.

The larger trial from which these patients were drawn is registered as:

ClinicalTrials.gov. Magnetic Seizure Therapy (MST) for Treatment Resistant Depression, Schizophrenia, and Obsessive Compulsive Disorder. NCT01596608.

Because this article is strikingly lacking in crucial details or details in places where we would expect to find them, it will be useful at times to refer to the trial registration.

The title and abstract of the article

As we will soon see, the title, Indicators for remission of suicidal ideation following MST in patients with treatment-resistant depression, is misleading. The article has too small a sample and too inappropriate a design to establish anything as a reproducible “indicator.”

That the article is going to fail to deliver is already apparent in the abstract.

The abstract states:

 Objective  To identify a biomarker that may serve as an indicator of remission of suicidal ideation following a course of MST by using cortical inhibition measures from interleaved transcranial magnetic stimulation and electroencephalography (TMS-EEG).

Design, Setting, and Participants  Thirty-three patients with TRD were part of an open-label clinical trial of MST treatment. Data from 27 patients (82%) were available for analysis in this study. Baseline TMS-EEG measures were assessed within 1 week before the initiation of MST treatment using the TMS-EEG measures of cortical inhibition (ie, N100 and long-interval cortical inhibition [LICI]) from the left dorsolateral prefrontal cortex and the left motor cortex, with the latter acting as a control site.

Interventions The MST treatments were administered under general anesthesia, and a stimulator coil consisting of 2 individual cone-shaped coils was used.

Main Outcomes and Measures Suicidal ideation was evaluated before initiation and after completion of MST using the Scale for Suicide Ideation (SSI). Measures of cortical inhibition (ie, N100 and LICI) from the left dorsolateral prefrontal cortex were selected. N100 was quantified as the amplitude of the negative peak around 100 milliseconds in the TMS-evoked potential (TEP) after a single TMS pulse. LICI was quantified as the amount of suppression in the double-pulse TEP relative to the single-pulse TEP.

Results  Of the 27 patients included in the analyses, 15 (56%) were women; mean (SD) age of the sample was 46.0 (15.3) years. At baseline, patients had a mean SSI score of 9.0 (6.8), with 8 of 27 patients (30%) having a score of 0. After completion of MST, patients had a mean SSI score of 4.2 (6.3) (pre-post treatment mean difference, 4.8 [6.7]; paired t26 = 3.72; P = .001), and 18 of 27 individuals (67%) had a score of 0 for a remission rate of 53%. The N100 and LICI in the frontal cortex—but not in the motor cortex—were indicators of remission of suicidal ideation with 89% accuracy, 90% sensitivity, and 89% specificity (area under the curve, 0.90; P = .003).

Conclusions and Relevance  These results suggest that cortical inhibition may be used to identify patients with TRD who are most likely to experience remission of suicidal ideation following a course of MST. Stronger inhibitory neurotransmission at baseline may reflect the integrity of transsynaptic networks that are targeted by MST for optimal therapeutic response.

Even viewing the abstract alone, we can see this article is in trouble. It claims to identify a biomarker of response to a course of magnetic seizure therapy (MST). That is an extraordinary claim when the study only started with 33 patients, of whom only 27 remain for analysis. Furthermore, at the initial assessment of suicidal ideation, eight of the 27 patients did not have any and so could show no benefit of treatment.

Any results could be substantially changed by any of the four excluded patients being recovered for analysis, or by any of the 27 included patients being dropped from analyses as an outlier. Statistical controls for potential confounds will produce spurious results because of overfit equations, with even one confound. We also know well that in situations requiring control of possible confounding factors, control of only one is rarely sufficient and often produces worse results than leaving variables unadjusted.

Identification of any biomarkers is unlikely to be reproducible in larger, more representative samples. Any claims about the performance characteristics of the biomarkers (accuracy, sensitivity, specificity, area under the curve) are likely to capitalize on sampling error and chance in ways that are unlikely to be reproducible.
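Here is a minimal sketch of that optimism under assumed conditions: 27 patients, a binary remission label, and a few candidate measures that are pure noise (none of this is the study’s actual data). Scoring a classifier on the same patients it was fit to produces an impressive in-sample AUC; held-out evaluation collapses it back toward chance:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(4)

# Assumed scenario: 27 patients, 4 noise "EEG measures", made-up labels.
n, n_features = 27, 4
X = rng.normal(size=(n, n_features))
y = np.repeat([0, 1], [13, 14])  # 13 nonremitters, 14 remitters

model = LogisticRegression()

# Apparent performance: fit and score on the same 27 patients.
in_sample_auc = roc_auc_score(y, model.fit(X, y).predict_proba(X)[:, 1])

# Honest performance: score each patient while held out of fitting.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
held_out = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
cv_auc = roc_auc_score(y, held_out)

print(f"in-sample AUC from noise:       {in_sample_auc:.2f}")  # well above 0.5
print(f"cross-validated AUC from noise: {cv_auc:.2f}")         # near 0.5
```

The reported 89% accuracy and AUC of 0.90 need to be read with this in-sample optimism in mind.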

Nonetheless, the accompanying figures are dazzling, even if not readily interpretable or representative of what would be found in another sample.

Comparison of the article to the trial registration.

According to the trial registration, the study started in February 2012 and the registration was received in May 2012. There were unspecified changes as recently as this month (March 2016), and final collection of primary outcome data is expected in December 2016.

Primary outcome

The registration indicates that patients will have been diagnosed with severe major depression, schizophrenia or obsessive compulsive disorder. The primary outcome will depend on diagnosis. For depression it is the Hamilton Rating Scale for Depression.

There is no mention of suicidal ideation as either a primary or secondary outcome.

Secondary outcomes

According to the registration, outcomes include (1) cognitive functioning as measured by episodic memory and non-memory cognitive functions; (2) changes in neuroimaging measures of brain structure and activity derived from fMRI and MRI from baseline to 24th treatment or 12 weeks, whichever comes sooner.

Comparison with the article suggests that some important neuroimaging assessments proposed in the registration were compromised: (1) only baseline measures were obtained, and without MRI or fMRI; and (2) the article states

Although magnetic resonance imaging (MRI)–guided TMS-EEG is more accurate than non–MRI-guided methods, the added step of obtaining an MRI for every participant would have significantly slowed recruitment for this study owing to the pressing need to begin treatment in acutely ill patients, many of whom were experiencing suicidal ideation. As such, we proceeded with non–MRI-guided TMS-EEG using EEG-guided methods according to a previously published study.

Treatment

The article provides some details of the magnetic seizure treatment:

The MST treatments were administered under general anesthesia using a stimulator machine (MagPro MST; MagVenture) with a twin coil. Methohexital sodium (n = 14), methohexital with remifentanil hydrochloride (n = 18), and ketamine hydrochloride (n = 1) were used as the anesthetic agents. Succinylcholine chloride was used as the neuromuscular blocker. Patients had a mean (SD) seizure duration of 45.1 (21.4) seconds. The twin coil consists of 2 individual cone-shaped coils. Stimulation was delivered over the frontal cortex at the midline position directly over the electrode Fz according to the international 10-20 system.36 Placing the twin coil symmetrically over electrode Fz results in the centers of the 2 coils being over F3 and F4. Based on finite element modeling, this configuration produces a maximum induced electric field between the 2 coils, which is over electrode Fz in this case.37 Patients were treated for 24 sessions or until remission of depressive symptoms based on the 24-item Hamilton Rating Scale for Depression (HRSD) (defined as an HRSD-24 score ≤10 and 60% reduction in symptoms for at least 2 days after the last treatment).38 These remission criteria were standardized from previous ECT depression trials.39,40 Further details of the treatment protocol are available,30 and comprehensive clinical and neurophysiologic trial results will be reported separately.

The article intended to refer the reader to the trial registration for further description of the treatment, but the superscript citation in the article is inaccurate. Regardless, given other deviations from the registration, readers can’t tell whether there were any deviations from what was proposed. In the registration, seizure therapy was described as involving:

100% machine output at between 25 and 100 Hz, with coil directed over frontal brain regions, until adequate seizure achieved. Six treatment sessions, at a frequency of two or three times per week will be administered. If subjects fail to achieve the pre-defined criteria of remission at that point, the dose will be increased to the maximal stimulator output and 3 additional treatment sessions will be provided. This will be repeated a total of 5 times (i.e., maximum treatment number is 24). 24 treatments is typically longer than a conventional ECT treatment course.

One important implication concerns this treatment being proposed as resolving suicidal ideation. It takes place over a considerable period of time, and patients who die by suicide notoriously break contact before doing so. It would seem that a required 24 treatments delivered on an outpatient basis would provide ample opportunities for breaks in contact – including from demoralization because so many treatments are needed in some cases – and therefore for death by suicide.

But a protocol that involves continuing treatment until a prespecified reduction in the Hamilton Rating Scale for Depression is achieved ensures that there will be a drop in suicidal ideation. Interview-based Hamilton depression ratings and suicidal ideation are highly correlated.

There is no randomization, or even an adequate description of patient accrual in terms of the population from which the patients came. There is no control group and therefore no control for nonspecific factors. In terms of nonspecific effects, the treatment subjects patients to an elaborate, intrusive ritual, starting with electroencephalographic (EEG) assessment [http://www.mayoclinic.org/tests-procedures/eeg/basics/definition/prc-20014093].

The ritual will undoubtedly have strong nonspecific factors associated with it – instilling positive expectations and providing considerable personal attention.

The article’s discussion of results

The discussion opens with some strong claims, unjustified by the modesty of the study and the likelihood that its specific results are not reproducible:

We found that TMS-EEG measures of cortical inhibition (ie, the N100 and LICI) in the frontal cortex, but not in the motor cortex, were strongly correlated with changes in suicidal ideation in patients with TRD who were treated with MST. These findings suggest that patients who benefitted the most from MST demonstrated the greatest cortical inhibition at baseline. More important, when patients were divided into remitters and nonremitters based on their SSI score, our results show that these measures can indicate remission of suicidal ideation from a course of MST with 90% sensitivity and 89% specificity.

The discussion contains a Pledge of Allegiance to the research domain criteria approach that is not actually a reflection of the results at hand. Among the many things that we knew before the study was done, and that were not shown by the study, is that suicidal ideation is so closely linked to hopelessness, negative affect, and attentional biases that it is best seen as a surrogate measure of depression, rather than a marker for risk of suicidal acts or death by suicide.


Wave that RDoC flag and maybe you will attract money from NIMH.

Our results also support the research domain criteria approach, that is, that suicidal ideation represents a homogeneous symptom construct in TRD that is targeted by MST. Suicidal ideation has been shown to be linked to hopelessness, negative affect, and attentional biases. These maladaptive behaviors all fall under the domain of negative valence systems and are associated with the specific constructs of loss, sustained threat, and frustrative nonreward. Suicidal ideation may represent a better phenotype through which to understand the neurobiologic features of mental illnesses. In this case, variations in GABAergic-mediated inhibition before MST treatment explained much of the variance for improvements in suicidal ideation across individuals with TRD.

Debunking ‘a better phenotype through which to understand the neurobiologic features of mental illnesses.’

  • Suicide is not a disorder or a symptom, but an infrequent, difficult to predict and complex act that varies greatly in nature and circumstances.
  • While some features of the brain or brain functioning may be correlated with eventual death by suicide, most of the identifications they provide of persons at risk of eventually dying by suicide will be false positives.
  • In the United States, access to a firearm is a reliable proximal cause of suicide and is likely to be more so than anything in the brain. However, this basic observation is not consistent with American politics and can lead to grant applications not being funded.

In an important sense,

  • It’s not what’s going on in the brain, but what’s going in the interpersonal context of the brain, in terms of modifiable risk for death by suicide.

The editorial commentary

On the JAMA: Psychiatry website, the article and the editorial commentary contain sidebar links to each other. It is only in the last two paragraphs of a 14-paragraph commentary that the target article is mentioned. However, the commentary ends with a resounding celebration of the innovation this article represents [emphasis added]:

Sun and colleagues10 report that 2 different EEG measures of cortical inhibition (a negative evoked potential in the EEG that happens approximately 100 milliseconds after a stimulus or event of interest and long-interval cortical inhibition) evoked by TMS to the left dorsolateral prefrontal cortex, but not to the left motor cortex, predicted remission of suicidal ideation with great sensitivity and specificity. This study10 illustrates the potential of multimodal TMS to study physiological properties of relevant circuits in neuropsychiatric populations. Significantly, it also highlights the anatomical specificity of these measures because the predictive value was exclusive to the inhibitory properties of prefrontal circuits but not motor systems.

Multimodal TMS applications allow us to study the physiology of human brain circuitry noninvasively and with causal resolution, expanding previous motor applications to cognitive, behavioral, and affective systems. These innovations can significantly affect psychiatry at multiple levels, by studying disease-relevant circuits to further develop systems for neuroscience models of disease and by developing tools that could be integrated into clinical practice, as they are in clinical neurophysiology clinics, to inform decision making, the differential diagnosis, or treatment planning.

Disclosures of conflicts of interest

The article’s conflict of interest disclosure statement is longer than the abstract.

conflict of interest disclosure

The disclosure for the conflicts of interest for the editorial commentary is much shorter but nonetheless impressive:

editorial commentary disclosures

How did this article get into JAMA: Psychiatry with an editorial comment?

Editorial commentaries are often provided by reviewers who simply check the box on the reviewers’ form indicating their willingness to provide a comment. For reviewers who already have a conflict of interest, this provides an additional one: a non-peer-reviewed paper in which they can promote their interest.

Alternatively, commentators are simply picked by an editor who judges an article worthy of special recognition. It is noteworthy that at least one of the associate editors of JAMA: Psychiatry is actively campaigning for a particular direction for suicide research funded by NIMH, as seen in an editorial comment of his own that I recently discussed. One of the authors of the paper currently under discussion was until recently a senior member of this associate editor’s department, before departing to become Chair of the Department of Psychiatry at the University of Toronto.

Essentially, the authors of the paper and the authors of the commentary are providing carefully constructed advertisements for themselves and their agenda. They have the opportunity to do so because their agenda is consistent with that of at least one of the editors, if not the journal itself.

The Committee on Publication Ethics (COPE) requires that non-peer-reviewed material in ostensibly peer-reviewed journals be labeled as such. This requirement is seldom met.

The journal further promoted this article by providing 10 free continuing medical education credits for reading it.

I could go on much longer identifying other flaws in this paper and its editorial commentary. I could raise other objections to the article being published in JAMA: Psychiatry. But out of mercy for the authors, the editor, and my readers, I’ll stop here.

I would welcome comments about other flaws.

Special thanks to Bernard “Barney” Carroll for his helpful comments and encouragement, but all opinions expressed and all factual errors are my own responsibility.

Is risk of Alzheimer’s Disease reduced by taking a more positive attitude toward aging?

Unwarranted claims that “modifiable” negative beliefs cause Alzheimer’s disease lead to blaming persons who develop Alzheimer’s disease for not having been more positive.

Lesson: A source’s impressive credentials are no substitute for independent critical appraisal of what sounds like junk science and is.

More lessons on how to protect yourself from dodgy claims in press releases of prestigious universities promoting their research.

If you judge the credibility of health-related information based on the credentials of the source, this article is a clear winner:

Levy BR, Ferrucci L, Zonderman AB, Slade MD, Troncoso J, Resnick SM. A Culture–Brain Link: Negative Age Stereotypes Predict Alzheimer’s Disease Biomarkers. Psychology and Aging. Dec 7, 2015, No Pagination Specified. http://dx.doi.org/10.1037/pag0000062


As noted in the press release from Yale University, two of the authors are from Yale School of Medicine, another is a neurologist at Johns Hopkins School of Medicine, and the remaining three authors are from the US National Institute on Aging (NIA), including NIA’s Scientific Director.

The press release Negative beliefs about aging predict Alzheimer’s disease in Yale-led study declared:

“Newly published research led by the Yale School of Public Health demonstrates that individuals who hold negative beliefs about aging are more likely to have brain changes associated with Alzheimer’s disease.

“The study suggests that combatting negative beliefs about aging, such as elderly people are decrepit, could potentially offer a way to reduce the rapidly rising rate of Alzheimer’s disease, a devastating neurodegenerative disorder that causes dementia in more than 5 million Americans.

The press release posited a novel mechanism:

“We believe it is the stress generated by the negative beliefs about aging that individuals sometimes internalize from society that can result in pathological brain changes,” said Levy. “Although the findings are concerning, it is encouraging to realize that these negative beliefs about aging can be mitigated and positive beliefs about aging can be reinforced, so that the adverse impact is not inevitable.”

A Google search reveals over 40 stories about the study in the media. Provocative titles of the media coverage suggest a children’s game of telephone or Chinese whispers in which distortions accumulate with each retelling.

Negative beliefs about aging tied to Alzheimer’s (Waltonian)

Distain for the elderly could increase your risk of Alzheimer’s (FinancialSpots)

Lack of respect for elderly may be fueling Alzheimer’s epidemic (Telegraph)

Negative thoughts speed up onset of Alzheimer’s disease (Tech Times)

Karma bites back: Hating on the elderly may put you at risk of Alzheimer’s (LA Times)

How you feel about your grandfather may affect your brain health later in life (Men’s Health News)

Young people pessimistic about aging more likely to develop Alzheimer’s later on (Health.com)

Looking forward to old age can save you from Alzheimer’s (Canonplace News)

If you don’t like old people, you are at higher risk of Alzheimer’s, study says (RedOrbit)

If you think elderly people are icky, you’re more likely to get Alzheimer’s (HealthLine)

In defense of the authors of this article as well as journalists, it is likely that editors added the provocative titles without obtaining approval of the authors or even the journalists writing the articles. So, let’s suspend judgment and write off sometimes absurd titles to editors’ need to establish they are offering distinctive coverage, when they are not necessarily doing so. That’s a lesson for the future: if we’re going to criticize media coverage, better focus on the content of the coverage, not the titles.

However, a number of these stories have direct quotes from the study’s first author. Unless the media coverage is misattributing direct quotes to her, she must have been making herself available to the media.

Was the article such an important breakthrough offering new ways in which consumers could take control of their risk of Alzheimer’s by changing beliefs about aging?

No, not at all. In the following analysis, I’ll show that judging the credibility of claims based on the credentials of the sources can be seriously misleading.

What is troubling about this article and its well-organized publicity effort is that information is being disseminated that is misleading and potentially harmful, with the prestige of Yale and NIA attached.

Before we go any further, you can take your own look at a copy of the article in the American Psychological Association journal Psychology and Aging here, the Yale University press release here, and a fascinating post-publication peer review at PubPeer that I initiated as peer 1.

Ask yourself: if you encountered coverage of this article in the media, would you have been skeptical? If so what were the clues?

The article is yet another example of trusted authorities exploiting entrenched cultural beliefs that the mind-body connection can be harnessed in some mysterious way to combat or prevent physical illness. As Anne Harrington details in her wonderful book, The Cure Within, this psychosomatic hypothesis has a long and checkered history, and gets continually reinvented and misapplied.

We see an example of this in claims that attitude can conquer cancer. What’s the harm of such illusions? If people can be led to believe they have such control, they are set up for blame from themselves and from those around them when they fail to fend off and control the outcome of disease by sheer mental power.

The myth of “fighting spirit” overcoming cancer has survived despite the accumulation of excellent contradictory evidence. Cancer patients are vulnerable to blaming themselves, or to being blamed by loved ones, when they do not “win” the fight against cancer. They are also subject to unfair exhortations to fight harder as their health situation deteriorates.

From the satirical Onion

What I saw when I skimmed the press release and the article

  • The first alarm went off when I saw that causal claims were being made from a modest-sized correlational study. This should set off anyone’s alarms.
  • The press release and the discussion section of the article refer to this as a “first ever” study. One does not seek nor expect to find robust “first ever” discoveries in such a small data set.
  • The authors do not provide evidence that their key measure of “negative stereotypes” is a valid measure of either stereotyping or likelihood of experiencing stress. They don’t even show it is related to concurrent reports of stress.
  • Like a lot of measures with a negative tone to their items, this one is affected by what Paul Meehl calls the crud factor. Whatever is being measured in this study cannot be distinguished from a full range of confounds that were not even assessed in this study.
  • The mechanism by which effects of this self-report measure somehow get manifested in changes in the brain lacks evidence and is highly dubious.
  • There was no presentation of actual data or basic statistics. Instead, there were only multivariate statistics that require at least some access to basic statistics for independent evaluation.
  • The authors resorted to cheap statistical strategies that play to readers’ confirmation bias: reliance on one-tailed rather than two-tailed tests of significance; use of a discredited backwards elimination method for choosing control variables; and exploration of too many control/covariate variables, given their modest sample size.
  • The analyses that are reported do not accurately depict what is in the data set, nor generalize to other data sets.

The article

The authors develop their case that stress is a significant cause of Alzheimer’s disease with reference to some largely irrelevant studies by others, but they depend on a preponderance of studies that they themselves have done, with the same dubious small samples and dubious statistical techniques. Whether you do a casual search with Google Scholar or a more systematic review of the literature, you won’t find stress processes of the kind the authors invoke among the usual explanations of the development of the disease.

Basically, the authors are arguing that if you hold views of aging like “Old people are absent-minded” or “Old people cannot concentrate well,” you will experience more stress as you age, and this will accelerate development of Alzheimer’s disease. They then go on to argue that because these attitudes are modifiable, you can take control of your risk for Alzheimer’s by adopting a more positive view of aging and aging people.

The authors used their measure of negative aging stereotypes in other studies, but do not provide the usual evidence of convergent and discriminant validity needed to establish that the measure assesses what is intended. Basically, we should expect authors to show that a measure they have developed is related in expected ways to existing measures of similar constructs (convergent validity), but not related to measures of constructs from which it should be distinct (discriminant validity).

Psychology has a long history of researchers claiming that their “new” self-report measures containing negatively toned items assess distinct concepts, despite high correlations with other measures of negative emotion as well as lots of confounds. I poked fun at this unproductive tradition in a presentation, Negative emotions and health: why do we keep stalking bears, when we only find scat in the woods?

The article reported two studies. The first tested whether participants holding more negative age stereotypes would have significantly greater loss of hippocampal volume over time. The study involved 52 individuals selected from a larger cohort enrolled in the brain-neuroimaging program of the Baltimore Longitudinal Study of Aging.

Readers are given none of the basic statistics that would be needed to interpret the complex multivariate analyses. Ideally, we would be given an opportunity to see how the independent variable, negative age stereotypes, is related to the other data available on the subjects, so we could get some sense of whether we are starting with basic, meaningful associations.

Instead the authors present the association between negative age stereotyping and hippocampal volume only in the presence of multiple control variables:

Covariates consisted of demographics (i.e., age, sex, and education) and health at time of baseline-age-stereotype assessment, (number of chronic conditions on the basis of medical records; well-being as measured by a subset of the Chicago Attitude Inventory); self-rated health, neuroticism, and cognitive performance, measured by the Benton Visual Retention Test (BVRT; Benton, 1974).

Readers cannot tell why these variables and not others were chosen. Adding or dropping a few variables could produce radically different results. And there are just too many variables being considered: with only 52 research participants, spurious findings that do not generalize to other samples are highly likely.

I was astonished when the authors announced that they were relying on one-tailed statistical tests. This is widely condemned as unnecessary and misleading.

Basically, every time the authors report a significance level in this article, you need to double the number to get what would be obtained with a more conventional two-tailed test. So, if they proudly declare that results are significant at p = .046, the results are actually (non)significant, p = .092. I know, we should not make such a fuss about significance levels, but journals do. We’re being set up to be persuaded the results are significant, when they are not by conventional standards.
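To see concretely what the one-tailed maneuver buys, here is a minimal Python sketch (my own illustration, using the t statistic the article reports for its second study, t(59) = 1.71; scipy is assumed):

```python
# One-tailed vs. two-tailed p-value for the reported t(59) = 1.71.
from scipy import stats

t, df = 1.71, 59
p_one_tailed = stats.t.sf(t, df)           # survival function: 1 - cdf
p_two_tailed = 2 * stats.t.sf(abs(t), df)  # double it for two tails

print(f"one-tailed: p = {p_one_tailed:.3f}")   # ~0.046, crosses .05
print(f"two-tailed: p = {p_two_tailed:.3f}")   # ~0.092, does not
```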

So the authors’ sins against proper statistical technique and transparent reporting accumulate: no presentation of basic associations; reporting of one-tailed tests; use of multivariate statistics inappropriate for a sample that is so small. Now let’s add another one: in their multivariate regressions, the authors relied on a potentially deceptive backward elimination:

Backward elimination, which involves starting with all candidate variables, testing the deletion of each variable using a chosen model comparison criterion, deleting the variable (if any) that improves the model the most by being deleted, and repeating this process until no further improvement is possible.

The authors assembled their candidate control/covariate variables and used a procedure that checks them statistically and drops some from consideration, based on whether they fail to add to the significance of the overall equation. This procedure is condemned because the variables that are retained in the equation capitalize on chance. Particular variables that could be theoretically relevant are eliminated simply because they fail to add anything statistically in the context of the other variables being considered. In the context of a different set of variables, these same discarded variables might have been retained.

The final regression equation had fewer control/covariates than when the authors started. Statistical significance is then calculated on the basis of the small number of variables remaining, not the number that were picked over, and so results will artificially appear stronger. Again, this is potentially quite misleading to the unwary reader.
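To make the chance capitalization concrete, here is a small Python simulation (my own sketch, not the authors’ code or data): backward elimination run on pure noise, with a sample size and candidate-covariate pool similar to the first study’s, and a hypothetical lax retention criterion.

```python
# Backward elimination applied to pure noise: with n = 52 and eight
# candidate covariates, the procedure often retains "useful" noise anyway.
import numpy as np
import statsmodels.api as sm

def backward_eliminate(y, X, keep_if_p_below=0.15):
    """Drop the least significant covariate until every survivor passes."""
    cols = list(range(X.shape[1]))
    while cols:
        fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
        pvals = np.asarray(fit.pvalues)[1:]   # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] < keep_if_p_below:
            break                             # all survivors look "useful"
        cols.pop(worst)
    return cols

rng = np.random.default_rng(0)
n, k, runs = 52, 8, 500
hits = sum(
    bool(backward_eliminate(rng.normal(size=n), rng.normal(size=(n, k))))
    for _ in range(runs)
)
print(f"simulations retaining at least one pure-noise covariate: {hits}/{runs}")
```

Whatever survives looks respectable in the final table, because readers never see the candidates that were picked over and discarded.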

The authors nonetheless concluded:

As predicted, participants holding more-negative age stereotypes, compared to those holding more-positive age stereotypes, had a significantly steeper decline in hippocampal volume

The second study:

examined whether participants holding more negative age stereotypes would have significantly greater accumulation of amyloid plaques and neurofibrillary tangles.

The outcome was a composite-plaques-and-tangles score, and the predictor was the same negative age stereotypes measure from the first study. These measurements were obtained from 74 research participants upon death and autopsy. The same covariates were used in stepwise regression with backward elimination. Once again, the statistical test was one-tailed.

Results were:

As predicted, participants holding more-negative age stereotypes, compared to those holding more-positive age stereotypes, had significantly higher composite-plaques-and-tangles scores, t(1,59) = 1.71, p = .046, d = 0.45, adjusting for age, sex, education, self-rated health, well-being, and number of chronic conditions.

Aha! Now we see why the authors committed themselves to a one-tailed test. With a conventional two-tailed test, these results would not be significant. Given the prevailing confirmation bias, aversion to null findings, and obsession with significance levels, this article probably would not have been published without the one-tailed test.

The authors’ stirring overall conclusion from the two studies:

By expanding the boundaries of known environmental influences on amyloid plaques, neurofibrillary tangles, and hippocampal volume, our results suggest a new pathway to identifying mechanisms and potential interventions related to Alzheimer’s disease

PubPeer discussion of this paper [https://pubpeer.com/publications/16E68DE9879757585EDD8719338DCD]

Comments accumulated for a couple of days on PubPeer after I posted some concerns about the first study. All of the comments were quite smart; some directly validated points that I had been thinking about, while others took the discussion in new directions, either statistically or because the commentators knew more about neuroscience.

Using a mechanism available at PubPeer, I sent emails to the first author of the paper, the statistician, and one of the NIA personnel inviting them to make comments also. None have responded so far.

Tom Johnstone, a commentator who exercised the option of identifying himself, noted the reliance on inferential statistics in the absence of reporting basic relationships. He also noted that the criterion used to drop covariates was lax. Apparently familiar with neuroscience, he expressed doubts that the results had any clinical significance or relevance to the functioning of the research participants.

Another commentator complained of the small sample size, the use of one-tailed statistical tests without justification, the “convoluted list of covariates,” and the “taboo” strategy for selecting covariates to be retained in the regression equation. This commentator also noted that the authors had examined the effect of outliers, conducting analyses both with and without the most extreme case. While exclusion did not affect the overall pattern of results, it dramatically changed the significance level, highlighting the susceptibility of such a small sample to chance variation or sampling error.

Who gets the blame for misleading claims in this article?

There’s a lot of blame to go around. By exaggerating the size and significance of any effects, the first author increases the chances of publication and of further funding to pursue what is seen as a “tantalizing” association. But it’s the job of editors and peer reviewers to protect the readership from such exaggerations, and maybe to protect the author from herself. They failed, maybe because exaggerated findings are consistent with the journal’s agenda of increasing citations by publishing newsworthy rather than trustworthy findings. The study statistician, Martin Slade, obviously knew that misleading, less-than-optimal statistics were used; why didn’t he object? Finally, I think the NIA staff, particularly Luigi Ferrucci, the Scientific Director of NIA, should be singled out for the irresponsibility of attaching their names to such misleading claims. Why did they do so? Did they not read the manuscript? I will regularly present instances of NIH staff endorsing dubious claims, such as here. The mind-over-disease, psychosomatic hypothesis gets a lot of support not warranted by the evidence. Perhaps NIH officials in general see this as a way of attracting research monies from Congress. Regardless, I think NIH officials have a responsibility to see that consumers are not misled by junk science.

This article at least provided the opportunity for an exercise that should raise skepticism and convince consumers at all levels – other researchers, clinicians, policymakers, and those who suffer from Alzheimer’s disease and those who care for them – that we just cannot sit back and let trusted sources do our thinking for us.

Sex and the single amygdala: A tale almost saved by a peek at the data

So sexy! Was bringing up ‘risky sex’ merely a strategy to publish questionable and uninformative science?

My continuing question: Can skeptics who are not specialists, but who are science-minded and have some basic skills, learn to quickly screen and detect questionable science in the journals and in media coverage?

“You don’t need a weatherman to know which way the wind blows.” – Bob Dylan

I hope so. One goal of my blogging is to arouse readers’ skepticism and provide them some tools so that they can decide for themselves what to believe, what to reject, and what needs a closer look or a check against trusted sources.

Skepticism is always warranted in science, but it is particularly handy when confronting the superficial application of neuroscience to every aspect of human behavior. Neuroscience is increasingly being brought into conversations to sell ideas and products when it is neither necessary nor relevant. Many claims about how the brain is involved are false or exaggerated not only in the media, but in the peer-reviewed journals themselves.

A while ago I showed how a neuroscientist and a workshop guru teamed up to try to persuade clinicians with functional magnetic resonance imaging (fMRI) data that a couples therapy was more sciencey than the rest. Although I took a look at some complicated neuroscience, a lot of my reasoning [1, 2, 3] merely involved applying basic knowledge of statistics and experimental design. I raised sufficient skepticism to dismiss the neuroscientist and psychotherapy guru’s claims, even putting aside the excellent specialist insights provided by Neurocritic and his friend Magneto.

In this issue of Mind the Brain, I’m pursuing another tip from Neurocritic about some faulty neuroscience in need of debunking.

The paper

Victor, E. C., Sansosti, A. A., Bowman, H. C., & Hariri, A. R. (2015). Differential Patterns of Amygdala and Ventral Striatum Activation Predict Gender-Specific Changes in Sexual Risk Behavior. The Journal of Neuroscience, 35(23), 8896-8900.

Unfortunately, the paper is behind a pay wall. If you can’t get it through a university library portal, you can send a request for a PDF to the corresponding author, elizabeth.victor@duke.edu.

The abstract

Although the initiation of sexual behavior is common among adolescents and young adults, some individuals express this behavior in a manner that significantly increases their risk for negative outcomes including sexually transmitted infections. Based on accumulating evidence, we have hypothesized that increased sexual risk behavior reflects, in part, an imbalance between neural circuits mediating approach and avoidance in particular as manifest by relatively increased ventral striatum (VS) activity and relatively decreased amygdala activity. Here, we test our hypothesis using data from seventy 18- to 22-year-old university students participating in the Duke Neurogenetics Study. We found a significant three-way interaction between amygdala activation, VS activation, and gender predicting changes in the number of sexual partners over time. Although relatively increased VS activation predicted greater increases in sexual partners for both men and women, the effect in men was contingent on the presence of relatively decreased amygdala activation and the effect in women was contingent on the presence of relatively increased amygdala activation. These findings suggest unique gender differences in how complex interactions between neural circuit function contributing to approach and avoidance may be expressed as sexual risk behavior in young adults. As such, our findings have the potential to inform the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.

My thought processes

Hmm, sexual risk behavior – meaning number of partners? How many new partners during a follow-up period constitutes “risky,” and does it matter whether safe sex was practiced? Well, ignoring these issues and calling it “sexual risk behavior” allows the authors to claim relevance to hot topics like HIV prevention….

But let’s cut to the chase: I’m always skeptical about a storyline depending on a three-way statistical interaction. These effects are highly unreliable, particularly in a sample size of only N = 70. I’m suspicious when investigators stake their claims ahead of time on a three-way interaction, rather than something simpler. I will be looking for evidence that they started with this hypothesis in mind, rather than cooking it up after peeking at the data.
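A back-of-envelope calculation shows why such interactions are so fragile. The sketch below is my own illustration; the figures of 24 men out of 70 participants come from the methods section discussed later, and the high/low splits mirror the groupings in the authors’ figures:

```python
# A three-way interaction implicitly splits the sample into 2 x 2 x 2 = 8
# cells. With N = 70 (24 men, 46 women) and high/low splits on VS and
# amygdala activation, per-cell counts are tiny.
for gender, n in [("men", 24), ("women", 46)]:
    per_cell = n / 4   # 2 (VS) x 2 (amygdala) cells within each gender
    print(f"{gender}: about {per_cell:.0f} participants per cell")
# men: ~6 per cell; women: ~12 per cell. One or two unusual participants
# can determine which way any given cell tips.
```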

Three-way interactions involve dividing a sample up into eight boxes, in this case 2 x 2 x 2. Such interactions can be mind-boggling to interpret, and this one is no exception:

Although relatively increased VS activation predicted greater increases in sexual partners for both men and women, the effect in men was contingent on the presence of relatively decreased amygdala activation and the effect in women was contingent on the presence of relatively increased amygdala activation.

And then the “simple” interpretation?

These findings suggest unique gender differences in how complex interactions between neural circuit function contributing to approach and avoidance may be expressed as sexual risk behavior in young adults.

And the public health implications?

As such, our findings have the potential to inform the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.

Just how should these data inform public health strategies beyond what we knew before we stumbled upon this article? Really, should we stick people’s heads in a machine and gather fMRI data before offering them condoms? Should we encourage computer dating services to post, along with a recent headshot, recent fMRI images showing that prospective dates do not have their risky behavior center in the amygdala activated? Or encourage young people to get their heads examined with an fMRI before deciding whether it’s wise to sleep with somebody new?

So it’s difficult to see the practical relevance of these findings, but let’s stick around and consider the paragraph that Neurocritic singled out.

The paragraph

The majority of the sample reported engaging in vaginal sex at least once in their lifetime (n = 42, 60%). The mean number of vaginal sexual partners at baseline was 1.28 (SD = 0.68). The mean increase in vaginal sexual partners at the last follow-up was 0.71 (SD = 1.51). There were no significant differences between men and women in self-reported baseline or change in self-reported number of sexual partners (t = 0.05, p = 0.96; t = 1.02, p = 0.31, respectively). Although there was not a significant association between age and self-reported number of partners at baseline (r = 0.17, p = 0.16), younger participants were more likely to report a greater increase in partners over time (r = 0.24, p = 0.04). Notably, distribution analyses revealed two individuals with outlying values (3 SD from M; both subjects reported an increase in 8 partners between baseline and follow up). Given the low rate of sexual risk behavior reported in the sample, these outliers were not excluded, as they likely best represent young adults engaging in sexual risk behavior.

What triggers skepticism?

This paragraph is quite revealing if we just ponder it a bit.

First, notice that there is only a single significant correlation (p = .04), and it comes from a subsidiary analysis. Differences between men and women were examined, yielding no significant findings for either baseline number of sexual partners or change over the length of the observation. However, disregarding that, the authors went on to explore changes in number of partners over time in relation to age and, bingo, there was their p = 0.04.

Whoa! Age was never mentioned in the abstract. We are now beyond the 2 x 2 x 2 interaction mentioned in the abstract and rooting through another dimension, younger versus older.

But, worse, getting that significance required retaining two participants who each had eight new sexual partners during the follow-up period. The decision to retain these participants was made after the pattern of results had been examined with and without the outliers. The authors say so, and essentially admit that they decided because it made a better story.

The only group means and standard deviations reported include these two participants. Even with them included, the average number of new sexual partners was less than one over the whole follow-up. We have no idea whether that one new partner was risky or not. It’s a safer assumption that having eight new partners is risky, but even that we don’t know for sure.

Keep in mind for future reference: Investigators are supposed to make decisions about outliers without reference to the fate of the hypothesis being studied. And knowing nothing about this particular study, most authorities would say if two people out of 70 are way out there on a particular variable that otherwise has little variance, you should exclude them.

It is considered a Questionable Research Practice to make decisions about inclusion/exclusion based on what story the outcome of this decision allows the authors to tell. It is p-hacking, and significance chasing.
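To see how much leverage two such cases can have in a sample of 70, here is a small simulation in Python (invented data, purely illustrative; the study’s raw data are not public):

```python
# Two planted outliers in n = 70 can manufacture a "significant"
# correlation out of pure noise.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 70
x = rng.normal(size=n)                      # e.g., a brain activation measure
y = rng.poisson(0.5, size=n).astype(float)  # skewed count outcome, mostly 0s and 1s

y[np.argsort(x)[-2:]] = 8.0                 # plant two cases with 8 new partners

r_with, p_with = stats.pearsonr(x, y)
keep = y < 8
r_without, p_without = stats.pearsonr(x[keep], y[keep])

print(f"with the outliers:    r = {r_with:.2f}, p = {p_with:.4f}")
print(f"without the outliers: r = {r_without:.2f}, p = {p_without:.4f}")
```

Whether such an analysis crosses p < .05 hinges almost entirely on the two planted cases – exactly the decision the authors report having made after examining the results.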

And note the distribution of numbers of vaginal sex partners. Twenty-eight participants had had none by the end of the study. Most accumulated less than one new partner during the follow-up, and even that mean number was distorted by the two participants with eight partners each. Hmm, it is going to be hard to get multivariate statistics to work appropriately when we get to the fancy neuroscience data. We could go off on discussions of multivariate normal or Poisson distributions, or just think a bit.
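A quick sketch makes the point. The vector below is invented, but it is one distribution of 70 scores that roughly reproduces the reported summary statistics (M = 0.71, SD = 1.51):

```python
# A zero-heavy count outcome: the mean is propped up by two extreme cases.
import numpy as np

gain = np.array([0]*40 + [1]*22 + [2]*4 + [3]*2 + [8]*2)   # n = 70, hypothetical
print(f"mean = {gain.mean():.2f}, sd = {gain.std(ddof=1):.2f}, "
      f"median = {np.median(gain):.0f}")
print(f"mean without the two 8s = {gain[gain < 8].mean():.2f}")
```

The median is zero, and dropping the two extreme cases pulls the mean down substantially. Ordinary correlation and regression machinery assumes nothing like this distribution.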

We can do a little detective work and determine that one outlier was male, the other female. (*1) Let’s go back to our eight little boxes of participants involved in the interpretation of the three-way interaction. It is going to make a great difference exactly which boxes the deviant male and female are dropped into, or whether they are left out.

And think about sampling issues. What if, for reasons having nothing to do with the study, neither of these outliers had shown up? Or if only one of them had shown up, it would skew the results in a particular direction, depending on whether that participant was the male or the female.

Okay, if we were wasting our time continuing to read the article after finding what we did in the abstract, we are certainly wasting more of our time by continuing after reading this paragraph. But let’s keep poking around as an educational exercise.

The rest of the methods and results sections

We learn from the methods section that there was an ethnically diverse sample with a highly variable follow-up, from zero days to 3.19 years (M = 188.72 d, SD = 257.15; range = 0 d–3.19 years). And there were only 24 men in the sample of 70 participants used for this paper.

We don’t know whether these two outliers had eight new sexual partners within a week of the first assessment, or whether they were captured by extending the study to over three years. That matters somewhat, but we also have to worry whether this was an appropriate sample and length of follow-up for such a study – with so few participants in the first place, and even fewer who had had sex by the end of the study. The mean follow-up of about six months, with its huge standard deviation, suggests there is not a lot of evidence of risky behavior, at least in terms of casual vaginal sex.

This is all getting very funky.

So I wondered about the larger context of the study, with increasing doubts that the authors had gone to all this trouble just to test an a priori hypothesis about risky sex.

We are told that the larger context is the ongoing “Duke Neurogenetics Study (DNS), which assesses a wide range of behavioral and biological traits.” The extensive list of inclusions and exclusions suggests a much more ambitious study. If we had more time, we could look up the Duke Neurogenetics Study and see if that is the case. But I have a strong suspicion that the study was not organized around the specific research questions of this paper. (*2) I really can’t tell without any preregistration of this particular paper, but I certainly have questions about how much Hypothesizing After the Results are Known (HARKing) is going on here, in the refining of hypotheses and measures and in decisions about which data to report.

Further explorations of the results section

I remind readers that I know little about fMRI data. Put that aside, and we can still discover some interesting things reading through the brief results section.

Main effects of task

As expected, our fMRI paradigms elicited robust affect-related amygdala and reward-related VS activity across the entire parent sample of 917 participants (Fig. 1). In our substudy sample of 70 participants, there were no significant effects of gender (t(70) values < 0.88, p values >0.17) or age (r values < 0.22; p values > 0.07) on VS or amygdala activity in either hemisphere.

Hmm, let’s focus on the second sentence first. The authors tell us absolutely nothing is going on in terms of differences in amygdala and reward-related VS activity in relation to age and gender in the sample of 70 participants in the current study. In fact, we don’t even need to know what “amygdala and reward-related VS activity” is to wonder why the first sentence of this paragraph directs us to a graph not of the 70 participants, but of a larger sample of 917 participants. And when we go to Figure 1, we see some wild, wowie-zowie, hit-the-reader-between-the-eyes differences (in technical terms, intraocular trauma) for women. And claims of p < 0.000001, twice. But wait! One might think significance of that magnitude would have to come from the 917 participants, except that the labeling of the x-axis must come from the substudy of the 70 participants for whom data concerning number of sex partners were collected. Maybe the significance comes from the anchoring of one of the graph lines by the one way-out outlier.

Note that the outlier woman with eight partners anchors the blue line for High Left Amygdala. Without inclusion of that single woman, the nonsignificant trends between women with High Left Amygdala versus women with Low Left Amygdala would be reversed.

The authors make much of the differences between Figure 1, showing results for women, and Figure 2, showing results for men. The comparison seems dramatic except that, once again, the one outlier sends the red line for Low Left Amygdala off from the blue line for High Left Amygdala. Otherwise, there is no story to tell. Mind-boggling, but I think we can safely conclude that something is amiss in these Frankenstein graphs.

Okay, we should stop beating a corpse of an article. There are no vital signs left.

Alternatively, we could probe the section on Poisson regressions and note some details. There is the flash of some strings of zeros in the p values, but it all seems complicated, and then we are warned off with “no factors survive Bonferroni correction.” In the next paragraph, we get to the exploration of dubious interactions. And there is the final insult of the authors bringing in a two-way interaction trending toward significance among men, p = .051.

But we are never told how all this would lead, as promised at the end of the abstract, “to the development of novel, gender-specific strategies that may be more effective at curtailing sexual risk behavior.”

Rushing through the discussion section, we note the disclosure that

The nature of these unexpected gender differences is unclear and warrants further consideration.

So, the authors confess that they did not start with the expectation of finding a gender difference. They had nothing to report from a subset of data from an ambitious project put together for other purposes, with a follow-up (and even an experimental task) ill-suited to the research question. They made a decision to include two outliers, salvaged some otherwise weak and inconsistent differences, and then constructed a story that depended on their inclusion. Bingo: they could survive confirmation bias and get published.

Readers might have been left with just their skepticism about the three-way interaction described in the abstract. However, the authors implicated themselves by disclosing in the article their examination of the distribution and their reasons for including the outliers. Then they further disclosed that they did not start with a hypothesis about gender differences.

Why didn’t the editor and reviewers at Journal of Neuroscience (impact factor 6.344) do their job and cry foul? Questionable research practices (QRPs) are brought to us courtesy of questionable publication practices (QPPs).

And then we end with the confident:

These limitations notwithstanding, our current results suggest the importance of considering gender-specific patterns of interactions between functional neural circuits supporting approach and avoidance in the expression of sexual risk behavior in young adults.

Yet despite this vague claim, the authors still haven’t explained how this research could be translated to practice.

Takeaway points for the future.

Without a tip from Neurocritic, I might not otherwise have zeroed in on the dubious complex statistical interaction on which the storyline in the abstract depended. I also benefited from the authors, for whatever reason, telling us that they had peeked at the data, and telling us further in the discussion that they had not anticipated the gender difference. With current standards for transparency and no preregistration of such studies, it would have been easy to miss what was done, because the authors did not need to alert us. Until more and better standards are enforced, we just need to be extra skeptical of claims about the application of neuroscience to everyday life.

Trust your skepticism.

Apply whatever you know about statistics and experimental methods. You probably know more than you think you do.

Beware of modest-sized neuroscience studies for which authors develop storylines from patterns they discover in their data, not from a priori hypotheses suggested by theory. If you keep looking around in the scientific literature and the media coverage of it, I think you will find a lot of this QRP and QPP.

Don’t go into a default believe-it mode just because an article is peer-reviewed.

Notes

  1. If both the outliers were of the same gender, it would have been enough for that gender to have had significantly more sex partners than the other.
  2. Later we are told in the Discussion section that the particular stimuli for which fMRI data were available were not chosen for relevance to the research question claimed for this paper.

We did not measure VS and amygdala activity in response to sexually provocative stimuli but rather to more general representations of reward and affective arousal. It is possible that variability in VS and amygdala activity to such explicit stimuli may have different or nonexistent gender-specific patterns that may or may not map onto sexual risk behaviors.

Special thanks to Neurocritic for suggesting this blog post and for feedback, as well as to Neuroskeptic, Jessie Sun, and Hayley Jach for helpful feedback. However, @CoyneoftheRealm bears sole responsibility for any excesses or errors in this post.