An open-minded, skeptical look at the success of “zero suicides”: Any evidence beyond the rhetoric?

  • Claims are spreading across social media that a goal of zero suicides can be achieved by radically re-organizing resources in health systems and communities. Extraordinary claims require extraordinary evidence.
  • I thoroughly searched for evidence backing claims of “zero suicides” being achieved.
  • The claims came up short. Expectations were initially raised by some statistics and a provocative graph, but any persuasiveness quickly dissipated under scrutiny. Lesson: abstract numbers and graphs are not necessarily quality evidence, and dazzling ones can obscure a lack of evidence.
  • The goal of “zero suicides” has attracted the support of Pharma and generated programs around the world, with little fidelity to the original concept developed in the Henry Ford Health System in Detroit. In many contexts in which it is now invoked, “zero suicides” is a vacuous buzz term, not a coherent organizational strategy.
  • Preventing suicide is a noble goal to which a lot of emotion gets attached. It also creates lucrative financial opportunities and attracts vested interests which often simply repackage existing programs for resale.
  • How can anyone oppose the idea that we should eliminate suicide? Clever sloganeering can stifle criticism and suppress embarrassing evidence to the contrary.
  • Yet we should not be bullied, nor distracted by slogans, from our usual skeptical insistence that those who make strong claims bear the burden of providing strong evidence.
  • Deaths by suicide are statistically infrequent, poorly predicted events that occur in troubled contexts of interpersonal and institutional breakdown. These aspects can frustrate efforts to eliminate suicide entirely – or even accurately track these deaths.
  • Eliminating deaths by suicide is only very loosely analogous to wiping out polio, and many pitfalls await those who get confused by a false equivalence.
  • Pursuit of the goal of “zero suicides,” particularly in under-resourced and not well-organized community settings can have unintended, negative consequences.
  • “Zero suicides” is likely a fad, to be replaced by next year’s fashion or maybe a few years after.
  • We need to step back and learn from the rise and fall of slogans and the unintended impact on distribution of scarce resources and the costs to human well-being.
  • My takeaway message is that increasingly sophisticated and even coercive communications about clinical and public health policies often harness the branding of prestigious medical journals. Interpreting these claims requires a matching skepticism, critical thinking skills, and renewed demands for evidence.

Beginning the search for evidence for the slogan “Zero Suicide.”

Numerous gushy tweets about achieving “zero suicides” drew me into a search for more information. I easily traced the origins of the campaign to a program at the Henry Ford Health System, a Detroit-based HMO, but the concept has now gone thoroughly international. My first Google Scholar search did not yield quality evidence from any program evaluations, but a subsequent Google search produced exceptionally laudatory and often self-congratulatory statements.

I briefly diverted my efforts to contacting authorities whom I expected might comment on “zero suicides.” Some indicated that a lack of familiarity prevented them from commenting, but others were as evasive as establishment Republicans asked about Donald Trump. One expert, however, was forthcoming with an interesting article, which proved to have just the right tone. I recommend:

Kutcher S, Wei Y, Behzadi P. School- and Community-Based Youth Suicide Prevention Interventions: Hot Idea, Hot Air, or Sham? The Canadian Journal of Psychiatry. 2016 Jul 12:0706743716659245.

Continuing my search, I found numerous links to other articles, including a laudatory Medical News and Perspectives opinion piece in JAMA behind a readily circumvented paywall. There was also a more accessible source branded by the New England Journal of Medicine.

Clicking on these links, I found editorial and even blatantly promotional material, not randomized trials or other quality evidence.

This kind of non-evidence-based publicity in highly visible medical journals is extraordinary in itself, although not unprecedented. Increasingly, the brand of particular medical journals is sold and harnessed to bestow special credibility on political and financial interests, as seen in 1 and 2.

NEJM Catalyst: How We Dramatically Reduced Suicide.

 NEJM Catalyst is described as bringing

Health care executives, clinician leaders, and clinicians together to share innovative ideas and practical applications for enhancing the value of health care delivery.

[Zero suicide takeaway graphic, from NEJM Catalyst]

The claim of “zero suicides” originated in the Perfect Depression Care initiative in a division of the Henry Ford Health System.

The audacious goal of zero suicides was part of the Behavioral Health Services division’s larger goal to develop a system of perfect care for depression. Our roadmap for transformation was the Quality Chasm report, which defined six dimensions of perfect care: safety, timeliness, effectiveness, efficiency, equity, and patient-centeredness. We set perfection goals and metrics for each dimension, with zero suicides being the perfection goal for effectiveness. Very quickly, however, our team seized on zero suicides as the overarching goal for our entire transformation.

The strategies:

We used three key strategies to achieve this goal. The first two — improving access to care and restricting access to lethal means of suicide — are evidence-based interventions to reduce suicide risk. While we had pursued these strategies in the past, setting the target at zero suicides injected our team with gumption. To improve access to care, we developed, implemented, and tested new models of care, such as drop-in group visits, same-day evaluations by a psychiatrist, and department-wide certification in cognitive behavior therapy. This work, once messy and arduous for the PDC team, became creative, fun, and focused. To reduce access to lethal means of suicide, we partnered with patients and families to develop new protocols for weapons removal. We also redesigned the structure and content of patient encounters to reflect the assumption that every patient with a mental illness, even if that illness is in remission, is at increased risk of suicide. Therefore, we eliminated suicide screens and risk stratification tools that yielded non-actionable results, freeing up valuable time. Eventually, each of these approaches was incorporated into the electronic health record as decision support.

The third strategy:

…The pursuit of perfection was not possible without a just culture for our internal team. Ultimately, we found this the most important strategy in achieving zero suicides. Since our goal was to achieve radical transformation, not just to tweak the margins, PDC staff couldn’t justly be punished if they came up short on these lofty goals. We adopted a root cause analysis process that treated suicide events equally as tragedies and learning opportunities.

Process of patient care described in JAMA

What happens to a patient being treated in the context of Perfect Depression Care is described in the JAMA piece:

Each patient seen through the BHS is first assessed and stratified on the basis of suicide risk: acute, moderate, or low. “Everyone is at risk. It’s just a matter of whether it’s acute or whether it requires attention but isn’t emergent,” said Coffey. A patient considered to be at high risk undergoes a psychiatric evaluation the same day. A patient at low risk is evaluated within 7 days. Group sessions for patients also allow individuals to connect and offer support to one another, not unlike the supportive relationships between sponsors and “sponsees” in 12-step programs

The claim of Zero Suicides, in numbers and a graph

…A dramatic and statistically significant 80% reduction in suicide that has been maintained for over a decade, including one year (2009) when we actually achieved the perfection goal of zero suicides (see the figure below). During the PDC initiative, the annual HMO network membership ranged from 182,183 to 293,228, of which approximately 60% received care through Behavioral Health Services. From 1999 to 2010, there were 160 suicides among HMO members. In 1999, as we launched PDC, the mean annual suicide rate for these mental health patients was 110.3 per 100,000. During the 11 years of the initiative, the mean annual suicide rate dropped to 36.21 per 100,000. This decrease is statistically significant and, moreover, took place while the suicide rate actually increased among non–mental health patients and among the general population of the state of Michigan.

[Figure: Improved suicide rates among Henry Ford Medical Group HMO members]

[This graph conflicts somewhat with a graph in NEJM Catalyst, which indicates that the health care system had zero suicides in 2008 and sustained this through the first quarter of 2010.]

It is clear that rates of suicide fluctuate greatly from year to year in the health system. It also appears from the graph that for most years of the program, rates of suicide among patients in the Henry Ford Health System were substantially greater than those of the general population of Michigan, which were relatively flat. Any comparison between the program and general statistics for the state of Michigan is not particularly informative. Michigan is a state of enormous health care disparities, and during this period it had a large uninsured population. Demographics differ greatly, and patients receiving care within an HMO were a substantially more privileged group than the general population of Michigan. There was also a lot of annual movement in and out of the Henry Ford Health System. At any one time, only 60% of the patients within the health system were enrolled in the behavioral health program in which the depression initiative occurred.

A substantial proportion of suicides occur among individuals not previously known to health systems. Such persons are more represented in the statistics for the state of Michigan. Another substantial proportion of suicides occur in individuals with weakened or recently broken contact with health systems. We don’t know how the statistics reported for the health system accommodated biased departures from the health system or simply missing data. We don’t know whether behavior related to risk of suicide affected migration into the health care system or into the smaller group receiving behavioral healthcare through the health system. For instance, what became of patients with a psychiatric disorder and a comorbid substance use disorder? Those who were incarcerated?

Basically, the success of the program is not obvious within the noisy fluctuation of suicides within the Henry Ford Health System or the smaller behavioral health program. We cannot control for basic confounding factors, for selective enrollment and disenrollment in the health care system, or even for the expulsion of persons at risk from the behavioral health system.
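The scale of this noise is easy to underestimate. As a rough sketch, annual counts of a rare event in a stable population behave approximately like Poisson draws, whose standard deviation is the square root of the mean. The per-year mean below is derived from the reported 160 suicides over 12 years; everything else is a simplifying assumption for illustration:

```python
import random
from math import sqrt

random.seed(42)

# Reported figure: 160 suicides over 12 years, i.e. ~13.3 per year on average.
mean_annual = 160 / 12

# For a Poisson process the standard deviation is sqrt(mean): ~3.7 here,
# so swings of several suicides between years need no explanation at all.
sd = sqrt(mean_annual)

def poisson_draw(lam):
    """One Poisson count: sum exponential inter-arrival times until t > 1."""
    count, t = 0, 0.0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return count
        count += 1

years = [poisson_draw(mean_annual) for _ in range(12)]
print(f"mean {mean_annual:.1f}, sd {sd:.1f}")
print("simulated annual counts:", years, "range:", min(years), "-", max(years))
```

Against fluctuation of this size, picking out a genuine program effect requires far more than eyeballing a favourable year.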

 “Zero suicides” as a literal and serious goal?

The NEJM Catalyst article gave the originator of the program free rein for self-praise.

The most unexpected hurdles were skepticism that perfection goals like zero suicides were reasonable or feasible (some objected that it was “setting us up for failure”), and disbelief in the dramatic improvements obtained (we heard comments like “results from quality improvement projects aren’t scientifically rigorous”). We addressed these concerns by ensuring the transparency of our results and lessons, by collaborating with others to continually improve our methodological issues, and by supporting teams across the world who wish to pursue similar initiatives.

Our team challenged this assumption and asked, If zero is not the right goal for suicide occurrence, then what number is? Two? Twelve? Which twelve? In spite of its radicalism — indeed because of it — the goal of zero suicides became the galvanizing force behind an effort that achieved one of the most dramatic and sustained reductions in suicide in the clinical literature.

Will the Henry Ford program prove sustainable?

Edward Coffey moved on to become President, CEO, and Chief of Staff at the Menninger Clinic 18 months before his article in NEJM Catalyst. I am curious which aspects of his Zero Suicides/Perfect Depression Care program are still maintained at Henry Ford. As described, the program was designed with admirably short waiting times for referral to behavioral healthcare. If the program persists as originally described, many professionals are kept vigilant and engaged in activities to reduce suicide without any statistical likelihood of having the opportunity to actually prevent one.

In decades of work within health systems, I have found that once demonstration projects have run their initial course, their goals are replaced by new organizational ones and resources are redistributed. Sooner or later, competing demands for scarce resources are promoted by new slogans.

What if Perfect Depression Care has to compete for scarce resources with Perfect Diabetes Care or alleviation of gross ethnic disparities in cardiovascular outcomes?

A lot of well-meant slogans ultimately have unintended, negative consequences. “Make pain the 5th vital sign” led to more attention being paid to previously ignored and poorly managed pain. This was followed by mandated routine assessment and intervention, which led to unnecessary procedures and an unprecedented epidemic of addiction and death from prescribed opioids. “Stamp out distress” has led to mandated screening and intervention programs for psychological distress in cancer care, with high rates of antidepressant prescription without proper diagnosis or follow-up.

If taken literally and seriously, a lofty but abstract goal like Zero Suicide becomes a threat to any “just culture” in a healthcare organization. If the slogan is still taken seriously as resources are inevitably withdrawn, a culture of blame will emerge, along with pressures to distort easily manipulated statistics. Patients posing threats to the goal of zero suicides will be excluded from the system, with unknown but likely negative consequences for their morbidity and mortality.

Bottom line – we can’t have slogan-driven healthcare policies that conflict with evidence and will likely have negative consequences.

 Enter Big Pharma

Not unexpectedly, Big Pharma is getting involved in promoting Zero Suicides:

Eli Lilly and Company Foundation donates $250,000 to expand Community Health Network’s Zero Suicides prevention initiative,

Major gift will save Hoosier lives through a suicide prevention network that responds to a critical Indiana healthcare issue.

 According to press coverage, the funds will go to:

The Lilly Foundation donation also provides resources needed to build a Central Indiana crisis network that will include Indiana’s schools, foster care system, juvenile justice program, primary and specialty healthcare providers, policy makers and suicide survivors. These partners will be trained to identify people at risk of attempting suicide, provide timely intervention and quickly connect them with Community’s crisis providers. Indiana’s state government is a key partner in building the statewide crisis network.

I’m sure this effort is good for the profits of Pharma. Dissemination of screening programs into settings that are not directly connected to quality depression care is inevitably ineffective. The main healthcare consequences are an increase in antidepressant prescriptions without appropriate diagnoses, patient education, and follow-up. Substantial overtreatment results when people who would otherwise not be seeking treatment are identified without proper diagnosis. Care for depression in the community is hardly Perfect Depression Care.

It is great publicity for Eli Lilly and the community receiving the gift will surely be grateful.

Launching Zero Suicides in English communities and elsewhere

My academic colleagues in the UK assure me that we can simply dismiss an official UK government press release from Nick Clegg about the goal of zero suicides. It has been rendered obsolete by subsequent political events. A number of them commented that they had never taken it seriously, regardless.

Nick Clegg calls for new ambition for zero suicides across the NHS

The claims in the press release stand in stark contrast to long waiting times for mental health services and important gaps in responses to serious mental health crises, including lethal suicide attempts. However, another web link leads to an announcement:

Centre for Mental Health was commissioned by the East of England Strategic Clinical Networks to evaluate activity taking place in four local areas in the region through a pilot programme to extend suicide prevention into communities.

The ‘zero suicide’ initiative is based on an approach developed by Dr Ed Coffey in Detroit, Michigan. The approach aims to prevent suicides by creating a more open environment for people to talk about suicidal thoughts and enabling others to help them. It particularly aims to reach people who have not been reached through previous initiatives and to address gaps in existing provision.

Four local areas in the East of England (Bedfordshire, Cambridgeshire & Peterborough, Essex and Hertfordshire) were selected in 2013 as pathfinder sites to develop new approaches to suicide prevention. Centre for Mental Health evaluated the work of the sites during 2015.

The evaluation found an impressive range of activities that had taken suicide prevention activities out into local communities. They included:

• Training key public service staff such as GPs, police officers, teachers and housing officers
• Training others who may encounter someone at risk of taking their own life, such as pub landlords, coroners, private security staff, faith groups and gym workers
• Creating ‘community champions’ to put local people in control of activities
• Putting in place practical suicide prevention measures in ‘hot spots’ such as bridges and railways
• Working with local newspapers, radio and social media to raise awareness in the wider community
• Supporting safety planning for people at risk of suicide, involving families and carers throughout the process
• Linking with local crisis services to ensure people get speedy access to evidence-based treatments.

The report noted that some of the people who received the training had already saved lives:

“I saved a man’s life using the skills you taught us on the course. I cannot find words to properly express the gratitude I have for that. Without the training I would have been in bits. It was a very public place, packed with people – but, to onlookers, we just looked like two blokes sitting on a bench talking.”

“Déjà vu all over again”, as Yogi Berra would say. This effort also recalls Bill Murray in the movie Groundhog Day, where he is trapped into repeating the same day over and over again.

A few years ago I was a scientific advisor for a European Union-funded project to disseminate multilevel suicide prevention programs across Europe. One UK site was among those targeted in this report. Implementation of the EU program had already failed before the plates of snacks were removed from a poorly attended event: the effort collapsed quickly because it failed to attract the support of local GPs.

Years later, I recognize many of the elements of what we tried to implement, described in language almost identical to ours. There is no mention of the training materials we left behind or of the quick failure of our attempt at implementation.

Many of the proposed measures in the UK plan serve to generate publicity and have no evidence that they reduce suicides. For instance, training people in the community who might conceivably come in contact with a suicidal person accomplishes little other than producing good publicity. Uptake of such training is abysmally low and is not likely to affect the probability that a person in a suicidal crisis will encounter anyone who can make a difference.

Broad efforts to increase uptake of mental health services in the UK strain a system already suffering from unacceptably long waiting times for services. People with any likelihood of attempting suicide, however poorly predicted, are likely to be lost among persons seeking services with less serious or pressing needs.

Thoughts I have accumulated from years of evaluating depression screening programs and suicide intervention efforts

Staying mobilized around preventing suicide is difficult because it is an infrequent event and most activations of resources will prove to be false positives.
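The base-rate arithmetic behind this point is worth making explicit. With wholly hypothetical numbers (none are from any program discussed here), even an implausibly accurate screening tool flags mostly false positives when the event is rare:

```python
# All figures are hypothetical, chosen only to illustrate the base-rate problem.
base_rate = 0.0005     # 50 per 100,000: annual suicide risk in the screened group
sensitivity = 0.80     # the tool flags 80% of true cases
specificity = 0.90     # and clears 90% of non-cases

true_positives = base_rate * sensitivity
false_positives = (1 - base_rate) * (1 - specificity)

# Positive predictive value: of everyone flagged, how many are true cases?
ppv = true_positives / (true_positives + false_positives)
print(f"PPV: {ppv:.2%}")   # well under 1%: roughly 250 false alarms per true case
```

Whatever the exact inputs, the qualitative conclusion is robust: with an event this infrequent, nearly every activation of resources is a false alarm.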

It can be tedious and annoying for both staff and patients to keep focused on an infrequent event, particularly for the vast majority of patients who rightfully believe they are not at risk for suicide.

Resources can be drained off from less frequent but higher-risk situations that require sustained intensity of response, pragmatic innovation, and flexibility of rules.

Heightened efforts to detect mental health problems increase access for people already successfully accessing services and decrease resources for those needing special efforts. The net result can be an increase in disparities.

Suicide data are easily manipulated by ignoring selective loss to follow-up. Many suicides occur at breaks in the system, where getting follow-up data is also problematic.

Finally, death by suicide is a health outcome that is multiply determined. It does not lend itself to targeted public health approaches like eliminating polio, tempting though invoking the analogy may be.

Postscript

It is likely that I have exposed anyone reaching this postscript to a new and disconcerting perspective. What I have been saying is discrepant with the publicity about “zero suicides” available in the media. The portrayal of “zero suicides” is quite persuasive because it is sophisticated and well-crafted. Its dissemination is well resourced and often financed by individuals and institutions with barely discernible, if at all, conflicts of financial and political interest. Just try to find any dissenters or skeptical assessments.

My takeaway message: it’s best to process claims about suicide prevention with a high level of skepticism, an insistent demand for evidence, and a preparedness for discovering that seemingly well-trusted sources are not without agendas. They are usually providing propaganda rather than evidence-based arguments.

A skeptical look at The Lancet behavioural activation versus CBT for depression (COBRA) study

A skeptical look at:

Richards DA, Ekers D, McMillan D, Taylor RS, Byford S, Warren FC, Barrett B, Farrand PA, Gilbody S, Kuyken W, O’Mahen H. et al. Cost and Outcome of Behavioural Activation versus Cognitive Behavioural Therapy for Depression (COBRA): a randomised, controlled, non-inferiority trial. The Lancet. 2016 Jul 23.

 

All the Queen’s horses and all the Queen’s men (and a few women) can’t put a flawed depression trial back together again.

Were they working below their pay grade? The 14 authors of the study collectively have impressive expertise. They claim to have obtained extensive consultation in designing and implementing the trial. Yet they produced:

  • A study doomed from the start by serious methodological problems, incapable of yielding any scientifically valid and generalizable results.
  • Instead, they produced tortured results that pander to policymakers seeking an illusory cheap fix.

 

Why the interests of persons with mental health problems are not served by translating the hype from a wasteful project into clinical practice and policy.

Maybe you were shocked and awed, as I was, by the publicity campaign mounted by The Lancet on behalf of a terribly flawed article in The Lancet Psychiatry about whether locked inpatient wards fail suicidal patients.

It was a minor league effort compared to the campaign orchestrated by the Science Media Centre for a recent article in The Lancet. The study concerned a noninferiority trial of behavioural activation (BA) versus cognitive behaviour therapy (CBT) for depression. The message echoing through social media without any critical response was that behavioural activation for depression delivered by minimally trained mental health workers was cheaper but just as effective as cognitive behavioural therapy delivered by clinical psychologists.

Reflecting the success of the campaign, the immediate reactions to the article are like nothing I have recently seen. Here are the published altmetrics for an article with an extraordinary overall score of 696 (!) as of August 24, 2016.

[Image: Altmetric score summary for the article]

 

Here is the press release.

Here is the full article reporting the study, which nobody in the Twitter storm seems to have consulted.

Here is some news coverage.

Here are supplementary materials.

Here is the well-orchestrated, uncritical response from tweeters, UK academics, and policy makers.


The Basics of the study

The study was an open-label, two-armed non-inferiority trial of behavioural activation therapy (BA) versus cognitive behavioural therapy (CBT) for depression, with no non-specific comparison/control treatment.

The primary outcome was depression symptoms measured with the self-report PHQ-9 at 12 months.

Delivery of both BA and CBT followed written manuals for a maximum of 20 60-minute sessions over 16 weeks, but with the option of four additional booster sessions if the patients wanted them. Receipt of eight sessions was considered an adequate exposure to the treatments.

The BA was delivered by

Junior mental health professionals —graduates trained to deliver guided self-help interventions, but with neither professional mental health qualifications nor formal training in psychological therapies—delivered an individually tailored programme re-engaging participants with positive environmental stimuli and developing depression management strategies.

CBT, in contrast, was delivered by

Professional or equivalently qualified psychotherapists, accredited as CBT therapists with the British Association of Behavioural and Cognitive Psychotherapy, with a postgraduate diploma in CBT.

The interpretation provided by the journal article:

Junior mental health workers with no professional training in psychological therapies can deliver behavioural activation, a simple psychological treatment, with no lesser effect than CBT has and at less cost. Effective psychological therapy for depression can be delivered without the need for costly and highly trained professionals.

A non-inferiority trial

An NHS website explains non-inferiority trials:

The objective of non-inferiority trials is to compare a novel treatment to an active treatment with a view of demonstrating that it is not clinically worse with regards to a specified endpoint. It is assumed that the comparator treatment has been established to have a significant clinical effect (against placebo). These trials are frequently used in situations where use of a superiority trial against a placebo control may be considered unethical.

I have previously critiqued [1, 2] noninferiority psychotherapy trials. I will simply reproduce a passage here:

Noninferiority trials (NIs) have a bad reputation. Consistent with a large literature, a recent systematic review of NI HIV trials  found the overall methodological quality to be poor, with a high risk of bias. The people who brought you CONSORT saw fit to develop special reporting standards for NIs  so that misuse of the design in the service of getting publishable results is more readily detected.

Basically, an NI RCT commits investigators and readers to accepting null results as support for a new treatment because it is no worse than an existing one. Suspicions are immediately raised as to why investigators might want to make that point.

Noninferiority trials are very popular among Pharma companies marketing rivals to popular medications. They use noninferiority trials to show that their brand is no worse than the already popular medication. But by not including a nonspecific control group, the trialists don’t bother to show that either of the medications is more effective than placebo under the conditions in which they were administered in these trials. Often, the medication dominating the market had achieved FDA approval for advertising with evidence of being only modestly effective. So, potatoes are noninferior to spuds.
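For readers unfamiliar with the mechanics, the non-inferiority verdict reduces to a confidence-interval check against a pre-specified margin. The numbers below are invented for illustration and are not taken from the COBRA report:

```python
# Hypothetical numbers only; the actual COBRA margin and analysis may differ.
margin = 1.9        # pre-specified non-inferiority margin, in PHQ-9 points
mean_diff = 0.3     # observed mean difference (new treatment minus reference)
se = 0.6            # standard error of that difference
z = 1.96            # multiplier for a two-sided 95% confidence interval

upper_bound = mean_diff + z * se
noninferior = upper_bound < margin

print(f"upper 95% CI bound: {upper_bound:.2f} vs margin {margin}")
print("declared non-inferior:", noninferior)
# Note the asymmetry: this rules out "worse by more than the margin",
# but with no placebo arm it says nothing about whether either treatment
# outperformed doing nothing at all.
```

The design choice to omit a non-specific control arm is exactly what makes the conclusion so weak: the entire comparison is anchored to a reference treatment whose effectiveness in this population is assumed, not demonstrated.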

Compounding the problems of a noninferiority trial many times over

Let’s not dwell on this trial being a noninferiority trial, although I will return to the problem of knowing what would happen in the absence of either intervention or with a credible, nonspecific control group. Let’s focus instead on some other features of the trial that seriously compromised an already compromised trial.

Essentially, we will see that the investigators reached out to primary care patients who were mostly already receiving treatment with antidepressants, but likely without the support, positive expectations, or even adherence necessary to obtain benefit. By providing these nonspecific factors, any psychological intervention would be likely to prove effective in the short run.

The total amount of treatment offered substantially exceeded what is typically provided in clinical trials of CBT. However, uptake and actual receipt of treatment are likely to be low in such a population recruited by outreach rather than actively seeking treatment. So, noise is introduced by offering so much treatment.

A considerable proportion of primary care patients identified as depressed won’t accept treatment or will not accept the full intensity available. However, without careful consideration of data that are probably not available for this trial, it will be ambiguous whether the amount of treatment received by particular patients represented dropping out prematurely or simply stopping once they were satisfied with the benefits they had received. Undoubtedly, failures to receive a minimal intensity of treatment, and the overall amounts of treatment received by particular patients, are substantial and complexly determined, but nonrandom and differing between patients.

Dropping out of treatment is often associated with dropping out of a study, with further data not being available at follow-up. These conditions set the stage for considerable challenges in analyzing and generalizing from whatever data are available. Clearly, the assumption of data being missing at random will be violated. Yet that is the key assumption required by multivariate statistical strategies that attempt to compensate for incomplete data.

Twelve months – the time point designated for assessment of primary outcomes – is likely to exceed the duration of a depressive episode in a primary care population, which is approximately nine months. In the absence of a nonspecific active comparison/control or even a waitlist control group, recovery that would have occurred in the absence of treatment will be ascribed to the two active interventions being studied.

Twelve months is also likely to extend substantially beyond the end of any treatment being received, and so effects of any active treatments are likely to dissipate. The design allowed for up to four booster sessions. However, access to booster sessions was not controlled: it was not assigned and cannot be assumed to be random. As we will see when we examine the CONSORT flowchart for the study, there was no increase in the number of patients receiving an adequate exposure to psychotherapy from 6 to 12 months. That likely indicates that most active treatment had ended within the first six months.

Focusing on 12-month outcomes, rather than six-month outcomes, increases the unreliability of any analyses because more outcomes will be missing at 12 months than were available at six months.

Taken together, the excessively long 12-month follow-up designated for the primary outcome and the unusually large amount of treatment being offered, but not necessarily accepted, create substantial problems: missing data that cannot be compensated for by typical imputation and multivariate methods; difficulties interpreting results in terms of the amount of treatment actually received; and difficulties comparing the primary outcomes to those of typical trials of psychotherapy offered to patients seeking psychotherapy.

The authors’ multivariate analysis strategy was inappropriate, given the amount of missing data and the violation of the assumption that data were missing at random.

Surely the more experienced of the 14 authors of The Lancet article should have anticipated these problems and the low likelihood that this study would produce generalizable results.

Recruitment of patients

The article states:

 We recruited participants by searching the electronic case records of general practices and psychological therapy services for patients with depression, identifying potential participants from depression classification codes. Practices or services contacted patients to seek permission for researcher contact. The research team interviewed those that responded, provided detailed information on the study, took informed written consent, and assessed people for eligibility.

Eligibility criteria

Eligible participants were adults aged 18 years or older who met diagnostic criteria for major depressive disorder assessed by researchers using a standard clinical interview (Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition [SCID]9). We excluded people at interview who were receiving psychological therapy, were alcohol or drug dependent, were acutely suicidal or had attempted suicide in the previous 2 months, or were cognitively impaired, or who had bipolar disorder or psychosis or psychotic symptoms.

Table 3 Patient Characteristics reveals a couple of things about co-treatment with antidepressants that must be taken into consideration in evaluating the design and interpreting results.

antidepressant stratification

So, the investigators did not wait for patients to refer themselves or to be referred by physicians to the trial; they reached out to them. Applying their exclusion criteria, the investigators obtained a sample that mostly had been prescribed antidepressants, with no indication that the prescription had ended. The length of time for which the 70% of patients on antidepressants had been taking them was highly skewed, with a mean of 164 weeks and a median of 19. These figures strain credibility. I have reached out to the authors asking whether there is an error in the table and await clarification.

We cannot assume that patients whose records indicate they were prescribed an antidepressant were refilling their prescriptions at the time of recruitment, were faithfully adhering, or were even being monitored. The length of time since initial prescription increases skepticism about whether there was adequate exposure to antidepressants at the time of recruitment to the study.

The inadequacy of antidepressant treatment in routine primary care

Refilling of first prescriptions of antidepressants in primary care, adherence, and monitoring and follow-up by providers are notoriously low.

Guideline-congruent treatment with antidepressants in the United States requires a five-week follow-up visit, which is only infrequently received in routine care.

Rates of improvement in depression associated with prescription of an antidepressant in routine care approximate those achieved with pill placebo in antidepressant trials. The reasons for this are complex, but center on depression being of mild to moderate severity in primary care. Perhaps more important, the attention, provision of positive expectations, and support offered in routine primary care are lower than what is provided in the blinded pill-placebo condition of clinical trials. In blinded trials, neither the provider nor the patient knows whether active medication or a pill placebo is being administered. The famous NIMH Collaborative Study found, not surprisingly, that response in the pill-placebo condition was predicted by the quality of the therapeutic alliance between patient and provider.

In The Lancet study, readers are not provided with important baseline characteristics of the patients that are crucial to interpreting the results and their generalizability. We don’t know the baseline or subsequent adequacy of antidepressant treatment or of the quality of the routine care being provided for it. Given that antidepressants are not the first-line treatment for mild to moderate depression, we don’t know why these patients were not receiving psychotherapy. We don’t know even whether the recruited patients were previously offered psychotherapy and with what uptake, except that they were not receiving it two months prior to recruitment.

There is a fascinating missing story about why these patients were not receiving psychotherapy at the start of the study and why and with what accuracy they were described as taking antidepressants.

Readers are not told what happened to antidepressant treatment during the trial. To what extent did patients who were not receiving antidepressants begin doing so? As a result of the more frequent contact and support provided in the psychotherapy, to what extent was there improvement in adherence, as well as in ongoing support and attention from primary care providers?

Depression identified in primary care is a highly heterogeneous condition, more so than among patients recruited from treatment in specialty mental health settings. Much of the depression has only the minimum number of symptoms required for a diagnosis, or one more. The reliability of diagnosis is therefore lower than in specialty mental health settings. Much of the depression and anxiety identified with semi-structured research instruments in populations not selected for having sought treatment resolves itself without formal intervention.

The investigators were using less than ideal methods to recruit patients from a population in which major depressive disorder is highly heterogeneous and subject to recovery in the absence of treatment by the time point designated for assessment of the primary outcome. They did not sufficiently address the problem of a high level of co-treatment having been prescribed long before the beginning of the study. They did not even assess the extent to which that prescribed treatment was accompanied by patient adherence or provider monitoring and follow-up. The 12-month follow-up allowed the influence of many factors beyond the direct effects of the active ingredients of the two interventions being compared in the absence of a control group.

decline in scores

Examination of a table presented in the supplementary materials suggests that most change occurred in the first six months after enrollment and little thereafter. We don’t know the extent to which there was any treatment beyond the first six months or what effect it had. In a population with clinically significant depression drawn from specialty care, some deterioration can be expected after withdrawal of active treatment. In a primary care population, such a graph could be produced in large part by the recovery from depression that would be observed in the absence of active treatment.

 

Cost-effectiveness analyses reported in the study address the wrong question. These analyses only considered the relative cost of the two active treatments, leaving unaddressed the more basic question of whether it is cost-effective to offer either treatment at this intensity. It might be more cost-effective to have a person with even less mental health training contact patients, inquire about adherence, side effects, and clinical outcomes, and prompt patients to accept another appointment with the GP if an algorithm indicates that would be appropriate.

The intensity of treatment being offered and received

The 20 sessions plus 4 booster sessions of psychotherapy offered in this trial is considerably more than the 12 to 16 sessions offered in the typical RCT for depression. Having more sessions available than is typical introduces complications. Results are not comparable to what is found in trials offering less treatment. And in a primary care population not actively seeking psychotherapy for depression, there is a further complication: many patients will not access the full 20 sessions. There will be difficulties interpreting results in terms of intensity of treatment because of the heterogeneity of reasons for receiving less. Effectively, offering so much therapy to a group less inclined to accept psychotherapy introduces a lot of noise into attempts to make sense of the data, particularly when cost-effectiveness is at issue.

This excerpt from the CONSORT flowchart demonstrates the multiple problems associated with offering so much treatment to a population that was not actively seeking it and yet needing twelve-month data for interpreting the results of a trial.

CONSORT chart

The number of patients who had no data at six months increased by 12 months. There was apparently no increase in the number of patients receiving an adequate exposure to psychotherapy between six and 12 months.

Why the interests of people with mental health problems are not served by translating the results claimed by these investigators into clinical practice

The UK National Health Service (NHS) seriously underfunds mental health services. Patients referred for psychotherapy from primary care face waiting periods that often exceed the expected length of an episode of depression in primary care. Simply waiting for depression to remit without treatment is not necessarily cost-effective, because of the unneeded suffering, role impairment, and associated social and personal costs of an episode that persists. Moreover, there is a subgroup of depressed patients in primary care who need more intensive or different treatment. Guidelines recommending assessment after five weeks are not usually reflected in actual clinical practice.

There’s a desperate search for ways in which costs can be further reduced in the NHS. The Lancet study is being interpreted to suggest that more expensive clinical psychologists can be replaced by less expensive and less trained mental health workers. Accepted uncritically and literally, the message is that clinical psychologists working half-time on particular common clinical problems can be replaced by less expensive mental health workers achieving the same effects in the same amount of time.

The pragmatic translation of these claims into practice is to replace clinical psychologists with cheaper mental health workers. I don’t think it’s cynical to anticipate the NHS seizing upon an opportunity to reduce costs while ignoring effects on overall quality of care.

Care for the severely mentally ill in the NHS is already seriously compromised for other reasons. Patients experiencing an acute or chronic breakdown in psychological and social functioning often do not get the minimal support and contact time needed to avoid more intensive and costly interventions like hospitalization. I think it would be naïve to expect that the resources freed up by replacing a substantial portion of clinical psychologists with minimally trained mental health workers would be put into addressing the unmet needs of the severely mentally ill.

Although not always labeled as such, some form of behavioural activation (BA) is integral to stepped care approaches to depression in primary care. Before being prescribed antidepressants or referred to psychotherapy, patients are encouraged to increase pleasant activities. In Scotland, they may even be given free movie passes for participating in cleanup of parks.

A stepped care approach is attractive, but evaluation of cost effectiveness is complicated by consideration of the need for adequate management of antidepressants for those patients who go on to that level of care.

If we are considering a sample of primary care patients mostly already receiving antidepressants, the relevant comparator is introduction of a depression care manager.

Furthermore, there are issues in the adequacy of addressing the needs of patients who do not benefit from lower-intensity care. Is the lack of improvement with low levels of care adequately monitored and addressed? Is escalation in level of care adequately supported so that referrals are completed?

The results of The Lancet study don’t tell us very much about the adequacy of care that patients enrolled in the study were receiving, whether BA is as effective as CBT as a stand-alone treatment, or whether nonspecific treatments would have done as well. We don’t even know whether patients assigned to a waitlist control would have shown as much improvement by 12 months, and we have reason to suspect that many would.

I’m sure that the administrators of the NHS are delighted with the positive reception of this study. I think it should be greeted with considerable skepticism. I am disappointed that the huge resources that went into conducting this study were not put into more informative and useful research.

I end with two questions for the 14 authors – Can you recognize the shortcomings of your study and its interpretation that you have offered? Are you at least a little uncomfortable with the use to which these results will be put?

Study protocol violations, outcomes switching, adverse events misreporting: A peek under the hood

An extraordinary, must-read article is now available open access:

Jureidini, JN, Amsterdam, JD, McHenry, LB. The citalopram CIT-MD-18 pediatric depression trial: Deconstruction of medical ghostwriting, data mischaracterisation and academic malfeasance. International Journal of Risk & Safety in Medicine, vol. 28, no. 1, pp. 33-43, 2016

The authors had access to internal documents written with the belief that they would be left buried in corporate files. However, these documents became publicly available in a class-action product liability suit concerning the marketing of the antidepressant citalopram for treating children and adolescents.

Detailed evidence of ghost writing by industry sponsors has considerable shock value. But there is a broader usefulness to this article allowing us to peek in on the usually hidden processes by which null findings and substantial adverse events are spun into a positive report of the efficacy and safety of a treatment.

We are able to see behind the scenes how an already underspecified protocol was violated, primary and secondary outcomes were switched or dropped, and adverse events were suppressed in order to obtain the kind of results needed for a planned promotional effort and FDA approval for use of the drug in these populations.

We can see how subtle changes in analyses that would otherwise go unnoticed can have a profound impact on clinical and public policy.

In so many other situations, we are left only with our skepticism about results being too good to be true. We are usually unable to evaluate independently investigators’ claims because protocols are unavailable, deviations are not noted, analyses are conducted and reported without transparency. Importantly, there usually is no access to data that would be necessary for reanalysis.

The authors whose work is being criticized are among the most prestigious child psychiatrists in the world. The first author is currently President-elect of the American Academy of Child and Adolescent Psychiatry. The journal is among the top psychiatry journals in the world; a subscription is provided as part of membership in the American Psychiatric Association. Appearing in this journal is thus strategic, because its readership includes many practitioners and clinicians who will simply defer to academics publishing in a journal they respect, without inclination to look carefully.

Indeed, I encourage readers to go to the original article and read it before proceeding further in the blog. Witness the unmasking of how null findings were turned positive. Unless you had been alerted, would you have detected that something was amiss?

Some readers have participated in multisite trials other than as a lead investigator. I ask them to imagine that they had received the manuscript for review and approval and assumed it was vetted by the senior investigators – and only the senior investigators. Would they have subjected it to the scrutiny needed to detect data manipulation?

I similarly ask reviewers for scientific journals whether they would have detected something amiss. Would they have compared the manuscript to the study protocol? Note that when this article was published, they probably would have had to contact the authors or the pharmaceutical company to obtain the protocol.

Welcome to a rich treasure trove

Separate from the civil action that led to these documents and data being released, the federal government later filed criminal charges and False Claims Act allegations against Forest Laboratories. The pharmaceutical company pleaded guilty and accepted a $313 million fine.

Links to the filing and to the federal government’s announcement of a settlement are available in a supplementary blog post at Quick Thoughts. That post also has rich links to the actual emails accessed by the authors, as well as to blog posts by John M Nardo, M.D. that detail the difficulties these authors had publishing the paper we are discussing.

Aside from his popular blog, Dr. Nardo is one of the authors of a reanalysis that was published in The BMJ of a related trial:

Le Noury J, Nardo JM, Healy D, Jureidini J, Raven M, Tufanaru C, Abi-Jaoude E. Restoring Study 329: efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ 2015; 351: h4320

My supplementary blog post contains links to discussions of that reanalysis of data obtained from GlaxoSmithKline, the original publication based on these data, and 30 Rapid Responses to the reanalysis in The BMJ, as well as the federal criminal complaint and the guilty plea of GlaxoSmithKline.

With Dr. Nardo’s assistance, I’ve assembled a full set of materials that should be valuable in stimulating discussion among senior and junior investigators, as well as in student seminars. I agree with Dr. Nardo’s assessment:

I think it’s now our job to insure that all this dedicated work is rewarded with a wide readership, one that helps us move closer to putting this tawdry era behind us… – John Mickey Nardo

The citalopram CIT-MD-18 pediatric depression trial

The original article that we will be discussing is:

Wagner KD, Robb AS, Findling RL, Jin J, Gutierrez MM, Heydorn WE. A randomized, placebo-controlled trial of citalopram for the treatment of major depression in children and adolescents. American Journal of Psychiatry. 2004 Jun 1;161(6):1079-83.

It reports:

An 8-week, randomized, double-blind, placebo-controlled study compared the safety and efficacy of citalopram with placebo in the treatment of children (ages 7–11) and adolescents (ages 12–17) with major depressive disorder.

The results and conclusion:

Results: The overall mean citalopram dose was approximately 24 mg/day. Mean Children’s Depression Rating Scale—Revised scores decreased significantly more from baseline in the citalopram treatment group than in the placebo treatment group, beginning at week 1 and continuing at every observation point to the end of the study (effect size=2.9). The difference in response rate at week 8 between placebo (24%) and citalopram (36%) also was statistically significant. Citalopram treatment was well tolerated. Rates of discontinuation due to adverse events were comparable in the placebo and citalopram groups (5.9% versus 5.6%, respectively). Rhinitis, nausea, and abdominal pain were the only adverse events to occur with a frequency exceeding 10% in either treatment group.

Conclusions: In this population of children and adolescents, treatment with citalopram reduced depressive symptoms to a significantly greater extent than placebo treatment and was well tolerated.

The article ends with an elaboration of what is said in the abstract:

In conclusion, citalopram treatment significantly improved depressive symptoms compared with placebo within 1 week in this population of children and adolescents. No serious adverse events were reported, and the rate of discontinuation due to adverse events among the citalopram-treated patients was comparable to that of placebo. These findings further support the use of citalopram in children and adolescents suffering from major depression.

The study protocol

The protocol for CIT-MD-18, IND Number 22,368, was obtained from Forest Laboratories. It was dated September 1, 1999 and amended April 8, 2002.

The primary outcome measure was the change from baseline to week 8 on the Children’s Depression Rating Scale-Revised (CDRS-R) total score.

Comparison between citalopram and placebo will be performed using three-way analysis of covariance (ANCOVA) with age group, treatment group and center as the three factors, and the baseline CDRS-R score as covariate.

The secondary outcome measures were the Clinical Global Impression severity and improvement subscales, Kiddie Schedule for Affective Disorders and Schizophrenia – depression module, and Children’s Global Assessment Scale.

Comparison between citalopram and placebo will be performed using the same approach as for the primary efficacy parameter. Two-way ANOVA will be used for CGI-I, since improvement relative to Baseline is inherent in the score.

 There was no formal power analysis but:

The primary efficacy variable is the change from baseline in CDRS-R score at Week 8.

Assuming an effect size (treatment group difference relative to pooled standard deviation) of 0.5, a sample size of 80 patients in each treatment group will provide at least 85% power at an alpha level of 0.05 (two-sided).
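The protocol's sample-size claim can be sanity-checked with a quick back-of-envelope calculation. This sketch uses a normal approximation to the power of a two-sided two-sample t-test (an exact noncentral-t calculation would differ slightly in the third decimal):

```python
from math import sqrt, erf

def normal_cdf(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1 + erf(x / sqrt(2)))

def two_sample_power(d, n_per_group, z_crit=1.959964):
    # Normal approximation to the power of a two-sided two-sample t-test;
    # z_crit corresponds to alpha = 0.05, two-sided
    noncentrality = d * sqrt(n_per_group / 2)
    return normal_cdf(noncentrality - z_crit)

power = two_sample_power(0.5, 80)
print(f"power: {power:.3f}")  # roughly 0.885, consistent with "at least 85%"
```

So the protocol's statement checks out for the combined sample, but, as discussed below, nothing in it indicates that the study was powered to detect efficacy in the child and adolescent subgroups separately.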

The deconstruction

 Selective reporting of subtle departures from the protocol could easily have been missed or simply excused as accidental and inconsequential, except that there was unrestricted access to communication within Forest Laboratories and to the data for reanalysis.

3.2 Data

The fact that Forest controlled the CIT-MD-18 manuscript production allowed for selection of efficacy results to create a favourable impression. The published Wagner et al. article concluded that citalopram produced a significantly greater reduction in depressive symptoms than placebo in this population of children and adolescents [10]. This conclusion was supported by claims that citalopram reduced the mean CDRS-R scores significantly more than placebo beginning at week 1 and at every week thereafter (effect size = 2.9); and that response rates at week 8 were significantly greater for citalopram (36%) versus placebo (24%). It was also claimed that there were comparable rates of tolerability and treatment discontinuation for adverse events (citalopram = 5.6%; placebo = 5.9%). Our analysis of these data and documents has led us to conclude that these claims were based on a combination of: misleading analysis of the primary outcome and implausible calculation of effect size; introduction of post hoc measures and failure to report negative secondary outcomes; and misleading analysis and reporting of adverse events.

3.2.1 Mischaracterisation of primary outcome

Contrary to the protocol, Forest’s final study report synopsis increased the study sample size by adding eight of nine subjects who, per protocol, should have been excluded because they were inadvertently dispensed unblinded study drug due to a packaging error [23]. The protocol stipulated: “Any patient for whom the blind has been broken will immediately be discontinued from the study and no further efficacy evaluations will be performed” [10]. Appendix Table 6 of the CIT-MD-18 Study Report [24] showed that Forest had performed a primary outcome calculation excluding these subjects (see our Fig. 2). This per protocol exclusion resulted in a ‘negative’ primary efficacy outcome.

Ultimately however, eight of the excluded subjects were added back into the analysis, turning the (albeit marginally) statistically insignificant outcome (p < 0.052) into a statistically significant outcome (p < 0.038). Despite this change, there was still no clinically meaningful difference in symptom reduction between citalopram and placebo on the mean CDRS-R scores (Fig. 3).

The unblinding error was not reported in the published article.

Forest also failed to follow their protocol stipulated plan for analysis of age-by-treatment interaction. The primary outcome variable was the change in total CDRS-R score at week 8 for the entire citalopram versus placebo group, using a 3-way ANCOVA test of efficacy [24]. Although a significant efficacy value favouring citalopram was produced after including the unblinded subjects in the ANCOVA, this analysis resulted in an age-by-treatment interaction with no significant efficacy demonstrated in children. This important efficacy information was withheld from public scrutiny and was not presented in the published article. Nor did the published article report the power analysis used to determine the sample size, and no adequate description of this analysis was available in either the study protocol or the study report. Moreover, no indication was made in these study documents as to whether Forest originally intended to examine citalopram efficacy in children and adolescent subgroups separately or whether the study was powered to show citalopram efficacy in these subgroups. If so, then it would appear that Forest could not make a claim for efficacy in children (and possibly not even in adolescents). However, if Forest powered the study to make a claim for efficacy in the combined child plus adolescent group, this may have been invalidated as a result of the ANCOVA age-by-treatment interaction and would have shown that citalopram was not effective in children.

A further exaggeration of the effect of citalopram was to report “effect size on the primary outcome measure” of 2.9, which was extraordinary and not consistent with the primary data. This claim was questioned by Martin et al. who criticized the article for miscalculating effect size or using an unconventional calculation, which clouded “communication among investigators and across measures” [25]. The origin of the effect size calculation remained unclear even after Wagner et al. publicly acknowledged an error and stated that “With Cohens method, the effect size was 0.32,” [20] which is more typical of antidepressant trials. Moreover, we note that there was no reference to the calculation of effect size in the study protocol.
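For readers unfamiliar with the convention, Cohen's d is simply the between-group difference in means divided by the pooled standard deviation. The following sketch uses illustrative numbers (not the trial's actual data) chosen to reproduce the corrected figure of 0.32; an effect size of 2.9 would imply group distributions that barely overlap, something never observed in antidepressant trials.

```python
from math import sqrt

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    # Conventional between-group effect size: mean difference / pooled SD
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

# Hypothetical CDRS-R change scores: a 4-point between-group difference
# against a pooled SD of 12.5 yields d = 0.32, typical of antidepressant
# trials; d = 2.9 would require a ~36-point difference at the same SD.
d = cohens_d(-16.0, 12.5, 85, -20.0, 12.5, 89)
print(f"d = {d:.2f}")  # prints d = 0.32
```

Reporting 2.9 as an "effect size" on this scale is thus off by roughly an order of magnitude from any plausible conventional calculation.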

3.2.2 Failure to publish negative secondary outcomes, and undeclared inclusion of Post Hoc Outcomes

Wagner et al. failed to publish two of the protocol-specified secondary outcomes, both of which were unfavourable to citalopram. While CGI-S and CGI-I were correctly reported in the published article as negative [10], (see p1081), the Kiddie Schedule for Affective Disorders and Schizophrenia-Present (depression module) and the Children’s Global Assessment Scale (CGAS) were not reported in either the methods or results sections of the published article.

In our view, the omission of secondary outcomes was no accident. On October 15, 2001, Ms. Prescott wrote: “Ive heard through the grapevine that not all the data look as great as the primary outcome data. For these reasons (speed and greater control) I think it makes sense to prepare a draft in-house that can then be provided to Karen Wagner (or whomever) for review and comments” (see Fig. 1). Subsequently, Forest’s Dr. Heydorn wrote on April 17, 2002: “The publications committee discussed target journals, and recommended that the paper be submitted to the American Journal of Psychiatry as a Brief Report. The rationale for this was the following: … As a Brief Report, we feel we can avoid mentioning the lack of statistically significant positive effects at week 8 or study termination for secondary endpoints” [13].

Instead the writers presented post hoc statistically positive results that were not part of the original study protocol or its amendment (visit-by-visit comparison of CDRS-R scores, and ‘Response’, defined as a score of ≤28 on the CDRS-R) as though they were protocol-specified outcomes. For example, ‘Response’ was reported in the results section of the Wagner et al. article between the primary and secondary outcomes, likely predisposing a reader to regard it as more important than the selected secondary measures reported, or even to mistake it for a primary measure.

It is difficult to reconcile what the authors of the original article reported in terms of adverse events with what our “deconstructionists” found in the unpublished final study report. The deconstruction article also notes that a letter to the editor appearing at the time of publication of the original paper called attention to another citalopram study that remained unpublished, but that was known to be a null study with substantial adverse events.

3.2.3 Mischaracterisation of adverse events

Although Wagner et al. correctly reported that “the rate of discontinuation due to adverse events among citalopram-treated patients was comparable to that of placebo”, the authors failed to mention that the five citalopram-treated subjects discontinuing treatment did so due to one case of hypomania, two of agitation, and one of akathisia. None of these potentially dangerous states of over-arousal occurred with placebo [23]. Furthermore, anxiety occurred in one citalopram patient (and none on placebo) of sufficient severity to temporarily stop the drug, and irritability occurred in three citalopram patients (compared to one placebo). Taken together, these adverse events raise concerns about dangers from the activating effects of citalopram that should have been reported and discussed. Instead Wagner et al. reported that “adverse events associated with behavioral activation (such as insomnia or agitation) were not prevalent in this trial” [10] and claimed that “there were no reports of mania”, without acknowledging the case of hypomania [10].

Furthermore, examination of the final study report revealed that there were many more gastrointestinal adverse events for citalopram than placebo patients. However, Wagner et al. grouped the adverse event data in a way that in effect masked this possibly clinically significantly gastrointestinal intolerance. Finally, the published article also failed to report that one patient on citalopram developed abnormal liver function tests [24].

In a letter to the editor of the American Journal of Psychiatry, Mathews et al. also criticized the manner in which Wagner et al. dealt with adverse outcomes in the CIT-MD-18 data, stating that: “given the recent concerns about the risk of suicidal thoughts and behaviors in children treated with SSRIs, this study could have attempted to shed additional light on the subject” [26]. Wagner et al. responded: “At the time the [CIT-MD-18] manuscript was developed, reviewed, and revised, it was not considered necessary to comment further on this topic” [20]. However, concerns about suicidal risk were prevalent before the Wagner et al. article was written and published [27]. In fact, undisclosed in both the published article and Wagner’s letter-to-the-editor, the 2001 negative Lundbeck study had raised concern over heightened suicide risk [10, 20, 21].

A later blog post will discuss the letters to the editor that appeared shortly after the original study in the American Journal of Psychiatry. But for now, it would be useful to clarify the status of the negative Lundbeck study at that time.

The letter by Barbe published in AJP remarked:

It is somewhat surprising that the authors do not compare their results with those of another trial, involving 244 adolescents (13–18-year-olds), that showed no evidence of efficacy of citalopram compared to placebo and a higher level of self-harm (16 [12.9%] of 124 versus nine [7.5%] of 120) in the citalopram group compared to the placebo group (5). Although these data were not available to the public until December 2003, one would expect that the authors, some of whom are employed by the company that produces citalopram in the United States and financed the study, had access to this information.

The study authors replied:

It may be considered premature to compare the results of this trial with unpublished data from the results of a study that has not undergone the peer-review process. Once the investigators involved in the European citalopram adolescent depression study publish the results in a peer-reviewed journal, it will be possible to compare their study population, methods, and results with our study with appropriate scientific rigor.

Conflict of interest

The authors of the deconstruction study indicate they do not have any conventional industry or speaker’s bureau support to declare, but they have had relevant involvement in litigation. Their disclosure includes:

The authors are not members of any industry-sponsored advisory board or speaker’s bureau, and have no financial interest in any pharmaceutical or medical device company.

Drs. Amsterdam and Jureidini were engaged by Baum, Hedlund, Aristei & Goldman as experts in the Celexa and Lexapro Marketing and Sales Practices Litigation. Dr. McHenry was also engaged as a research consultant in the case. Dr. McHenry is a research consultant for Baum, Hedlund, Aristei & Goldman.

Concluding remarks

I don’t have many illusions about the trustworthiness of the literature reporting clinical trials, whether pharmaceutical or psychotherapy. But I found this deconstruction article quite troubling. Among the authors’ closing observations are:

The research literature on the effectiveness and safety of antidepressants for children and adolescents is relatively small, and therefore vulnerable to distortion by just one or two badly conducted and/or reported studies. Prescribing rates are high and increasing, so that prescribers who are misinformed by misleading publications risk doing real harm to many children, and wasting valuable health resources.

I recommend that readers go to my supplementary blog and review a very similar case concerning the efficacy and harms of paroxetine and imipramine in the treatment of major depression in adolescence. I also recommend another of my blog posts that summarizes action taken by the US government against both Forest Laboratories and GlaxoSmithKline for promoting misleading claims about the efficacy and safety of antidepressants for children and adolescents.

We should scrutinize studies of the efficacy and safety of antidepressants for children and adolescents, because of the weakness of data from relatively small studies with serious difficulties in their methodology and reporting. But we should certainly not stop there. We should critically examine other studies of psychotherapy and psychosocial interventions.

I previously documented [ 1,  2] interference by promoters of the lucrative Triple P Parenting program in the implementation of a supposedly independent evaluation of it, including tampering with plans for data analysis. The promoters then followed up by attempting to block publication of a meta-analysis casting doubt on their claims.

But suppose we are not dealing with the threat of conflicts of interest associated with high financial stakes, as with pharmaceutical companies or a globally promoted psychosocial program. There are still the less obvious conflicts associated with investigator egos and the pressure to produce positive results in order to get refunded. We should require scrutiny of protocols, of whether they were faithfully implemented, and of whether the resulting data were analyzed according to a priori plans. To do that, we need unrestricted access to data and the opportunity to reanalyze it from multiple perspectives.

Results of clinical trials should be examined wherever possible in replications and extensions in new settings. But this frequently requires resources that are unlikely to be available.

We are unlikely ever to see anything for clinical trials resembling replication initiatives such as the Open Science Collaboration’s (OSC) Reproducibility Project: Psychology. The OSC depends on mass replications involving either samples of college students or recruitment from the Internet. Most of the studies involved in the OSC did not have direct clinical or public health implications. In contrast, clinical trials usually do, and they require different approaches to ensure the trustworthiness of claimed findings.

Access to the internal documents of Forest Laboratories revealed a deliberate, concerted effort to produce results consistent with the agenda of vested interests, even where prespecified analyses yielded contradictory findings. There was clear intent. But we don’t need to assume an attempt to deceive and defraud in order to insist on the opportunity to re-examine findings that affect patients and public health. As US Vice President Joseph Biden recently declared, securing advances in biomedicine and public health depends on broad and routine sharing and re-analysis of data.

My usual disclaimer: All views that I express are my own and do not necessarily reflect those of PLOS or other institutional affiliations.

Deep Brain Stimulation: Unproven treatment promoted with a conflict of interest in JAMA: Psychiatry [again]

“Even with our noisy ways and cattle prods in the brain, we have to take care of sick people, now,” – Helen Mayberg

“All of us—researchers, journalists, patients and their loved ones–are desperate for genuine progress in treatments for severe mental illness. But if the history of such treatments teaches us anything, it is that we must view claims of dramatic progress with skepticism, or we will fall prey to false hopes.” – John Horgan

An email alert announced the early release of an article in JAMA: Psychiatry reporting effects of deep brain stimulation (DBS) therapy for depression. The article was accompanied by an editorial commentary.

Oh no! Is an unproven treatment once again being promoted by one of the most prestigious psychiatry journals with an editorial commentary reeking of vested interests?

Indeed it is, but we can use the article and commentary as a way of honing our skepticism about such editorial practices and to learn better where to look to confirm or dispel our suspicions when they arise.

Like many readers of this blog, there was a time when I would turn to a trusted, prestigious source like JAMA: Psychiatry with great expectations. Not being an expert in a particular area like DBS, I would be inclined to accept uncritically what I read. But then I noticed how much of what I read conflicted with what I already knew about research design and basic statistics. Time and time again, this knowledge proved sufficient to detect serious hype, exaggeration, and simply false claims.

The problem was no longer simply one of the authors adopting questionable research practices. It expanded to journals and professional organizations adopting questionable publication practices that fit with financial, political, and other, not strictly scientific agendas.

What is found in the most prestigious biomedical journals is not necessarily the most robust and trustworthy of scientific findings. Rather, content is picked for its ability to be portrayed as innovative and breakthrough medicine. Beyond that, the content is consistent with prevailing campaigns to promote particular viewpoints and themes. Apparently there is no restriction against those who stand to profit most personally being selected to write accompanying commentaries.

We need to recognize that editorial commentaries often receive weak or no peer review. Invitations from editors to provide commentaries are often a matter of a shared nonscientific agenda and simple cronyism.

Coming to these conclusions, I have been on a mission to learn better how to detect hype and hokum and I have invited readers of my blog posts to come along.

This installment builds on my recent discussion of an article claiming remission of suicidal ideation by magnetic seizure therapy. Like the editorial commentary accompanying that previous JAMA: Psychiatry article, the commentary discussed here had an impressive conflict of interest disclosure. The disclosure alone probably would not have prompted me to search the Internet for other material about one of the authors. Yet, a search revealed information that is quite relevant to our interpretation of the new article and its commentary. We can ponder whether this information should have been withheld. I think it should have been disclosed.

The lesson I learned is that a higher level of vigilance is needed to interpret highly touted article-commentary combos in prestigious journals, unless we are simply going to dismiss them as advertisements or propaganda rather than highlights of solid biomedical science.

Sadly, though, this exercise convinced me that efforts to scrutinize claims by turning to seemingly trustworthy supplementary sources can provide a misleading picture.

The article under discussion is:

Bergfeld IO, Mantione M, Hoogendoorn MC, et al. Deep Brain Stimulation of the Ventral Anterior Limb of the Internal Capsule for Treatment-Resistant Depression: A Randomized Clinical Trial. JAMA Psychiatry. Published online April 06, 2016. doi:10.1001/jamapsychiatry.2016.0152.

The commentary is:

Mayberg HS, Riva-Posse P, Crowell AL. Deep Brain Stimulation for Depression: Keeping an Eye on a Moving Target. JAMA Psychiatry. Published online April 06, 2016. doi:10.1001/jamapsychiatry.2016.0173.

The trial registration is

Deep Brain Stimulation in Treatment-refractory patients with Major Depressive Disorder.

Pursuing my skepticism by searching on the Internet, I immediately discovered a series of earlier blog posts about DBS by Neurocritic [1] [2] [3] that saved me a lot of time and directed me to still other useful sources. I refer to what I learned from Neurocritic in this blog post. But as always, all opinions are entirely my responsibility, along with misstatements and any inaccuracies.

But what I learned immediately from Neurocritic is that DBS is a hot area of research, even if it continues to produce disappointing outcomes.

DBS had a commitment of $70 million from President Obama’s Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative. Premised on the causes of psychopathology lying in precise, isolated neural circuitry, it is the poster child of the Research Domain Criteria (RDoC) of former NIMH director Thomas Insel. Neurocritic points to Insel’s promotion of “electroceuticals” like DBS in his NIMH Director’s Blog 10 Best of 2013:

The key concept: if mental disorders are brain circuit disorders, then successful treatments need to tune circuits with precision. Chemicals may be less precise than electrical or cognitive interventions that target specific circuits.

The randomized trial of deep brain stimulation for depression.

The objective of the trial was:

To assess the efficacy of DBS of the ventral anterior limb of the internal capsule (vALIC), controlling for placebo effects with active and sham stimulation phases.

Inclusion criteria were a diagnosis of major depressive disorder designated as being treatment resistant (TRD) on the basis of

A failure of at least 2 different classes of second-generation antidepressants (eg, selective serotonin reuptake inhibitor), 1 trial of a tricyclic antidepressant, 1 trial of a tricyclic antidepressant with lithium augmentation, 1 trial of a monoamine oxidase inhibitor, and 6 or more sessions of bilateral electroconvulsive therapy.

Twenty-five patients with TRD from 2 Dutch hospitals first received surgery that implanted four contact electrodes deep within their brains. The electrodes were attached to tiny wires leading to a battery-powered pulse generator implanted under their collar bones.

The standardized DBS treatment started after a three-week recovery from the surgery. Brain stimulation was continuous from one week after surgery; at three weeks, patients began visits with psychiatrists or psychologists, at first on a biweekly basis, but later less frequently.

At the visits, level of depression was assessed and adjustments were made to various parameters of the DBS, such as the specific site targeted in the brain, voltage, and pulse frequency and amplitude. Treatment continued until optimization – either four weeks of sustained improvement on depression rating scales or the end of the 52-week period. In the original protocol, this phase of the study was limited to six months, but it was extended after experience with a few patients. Six patients went even longer than the 52 weeks to achieve optimization.

Once optimization was achieved, patients were randomized to a crossover phase in which they received two blocks of six weeks of either continued active or sham treatment that involved simply turning off the stimulation. Outcomes were classified in terms of investigator-rated changes in the 17-item Hamilton Depression Rating Scale.

The outcome of the open-label phase of the study was the change of the investigator-rated HAM-D-17 score (range, 0-52) from baseline to T2. In addition, we classified patients as responders (≥50% reduction of HAM-D-17 score at T2 compared with baseline) or nonresponders (<50% reduction of HAM-D-17 score at T2 compared with baseline). Remission was defined as a HAM-D-17 score of 7 or less at T2. The primary outcome measure of the randomized, double-blind crossover trial was the difference in HAM-D-17 scores between the active and sham stimulation phases. In a post hoc analysis, we tested whether a subset of nonresponders showed a partial response (≥25% but <50% reduction of HAM-D-17 score at T2 compared with baseline).
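The outcome definitions quoted above are simple threshold rules on the HAM-D-17 score. As a minimal sketch (a hypothetical helper, not the investigators' code, under one reading of the quoted rules in which the remission cutoff takes precedence):

```python
def classify_hamd(baseline, t2):
    """Classify an outcome from 17-item HAM-D scores, per the quoted rules."""
    reduction = (baseline - t2) / baseline  # fractional reduction from baseline
    if t2 <= 7:                   # remission: HAM-D-17 score of 7 or less at T2
        return "remission"
    if reduction >= 0.50:         # response: >=50% reduction from baseline
        return "response"
    if reduction >= 0.25:         # post hoc partial response: >=25% but <50%
        return "partial response"
    return "nonresponse"

print(classify_hamd(22, 10))  # (22 - 10) / 22 ≈ 55% reduction → response
```

Note that by these thresholds a patient with a low baseline score can meet the remission cutoff without a 50% reduction, which is worth keeping in mind when small samples are sliced into response categories.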

Results

Clinical outcomes. The mean time to first response in responders was 53.6 (50.6) days (range, 6-154 days) after the start of treatment optimization. The mean HAM-D-17 scores decreased from 22.2 (95%CI, 20.3-24.1) at baseline to 15.9 (95% CI, 12.3-19.5) at T2.

An already small sample shrank further from initial assessment of eligibility to retention at the end of the crossover study. Of the 52 patients assessed for eligibility, 23 were ineligible and four refused. Once the optimization phase of the trial started, four patients withdrew for lack of effect. Another five could not be randomized in the crossover phase: three because of an unstable psychiatric status, one because of fear of worsening symptoms, and one because of physical health. So, the randomized phase of the trial consisted of nine patients randomized to active treatment and then sham, and another seven randomized to sham and then active treatment.

The crossover to sham treatment did not go as planned. Of the nine (three responders and six nonresponders) randomized to the active-then-sham condition, all had to be crossed over early – one because the patient requested a crossover, two because of a gradual increase in symptoms, and three because of logistics. Of the seven patients assigned to sham-first (four responders and three nonresponders), all had to be crossed over within a day because of increases in symptoms.

I don’t want to get lost in the details here. But we are getting into small numbers with nonrandom attrition, imbalanced assignment of responders versus nonresponders in the randomization, and the breakdown of the planned sham treatment. From what I’ve read elsewhere about DBS, I don’t think that providers or patients were blinded to the sham treatment. Patients should be able to feel the shutting off of the stimulator.

Adverse events. DBS has safety issues. Serious adverse events included severe nausea during surgery (1 patient), suicide attempt (4 patients), and suicidal ideation (2 patients). Two nonresponders died several weeks after they withdrew from the study and DBS had been stopped (1 suicide, 1 euthanasia). Two patients developed full blown mania during treatment and another patient became hypomanic.

The article’s Discussion claims

We found a significant reduction of depressive symptoms following vALIC DBS, resulting in response in 10 patients (40%) and partial response in 6 (24%) patients with TRD.

Remission was achieved in 5 (20%) patients. The randomized active-sham phase study design indicates that reduction of depressive symptoms cannot be attributed to placebo effects…

Conclusions

This trial shows efficacy of DBS in patients with TRD and supports the possible benefits of DBS despite a previous disappointing randomized clinical trial. Further specification of targets and the most accurate setting optimization as well as larger randomized clinical trials are necessary.

A clinical trial starting with 25 patients does not have much potential to shift our confidence in the efficacy of DBS. Any hope of doing so was further dashed when the sample was reduced to the 16 patients (seven responders and nine nonresponders) who remained for the investigators’ attempted randomization to an active treatment versus sham comparison. Then the sham condition could not be maintained as planned in the protocol for any patient.

The authors interpreted the immediate effects of shifting to sham treatment as ruling out any placebo effect. However, it is likely that shutting off the stimulator was noticeable to the patients, and the immediacy of the effects speaks to the likelihood of an effect driven by the strong expectations of patients with intolerable depression having their hope taken away. Some of the immediate response could have been a nocebo response.

Helen Mayberg and colleagues’ invited commentary

The commentary attempted to discourage a pessimistic assessment of DBS based on the difficulties implementing the original plans for the study as described in the protocol.

A cynical reading of the study by Bergfeld et al1 might lead to the conclusion that the labor-intensive and expert-driven tuning of the DBS device required for treatment response makes this a nonviable clinical intervention for TRD. On the contrary, we see a tremendous opportunity to retrospectively characterize the various features that best define patients who responded well to this treatment. New studies could test these variables prospectively.

The substantial deviation from protocol that occurred after only two patients were entered into the trial was praised in terms of the authors’ “tenacious attempts to establish a stable response”:

We appreciate the reality of planning a protocol with seemingly conservative time points based on the initial patients, only to find these time points ultimately to be insufficient. The authors’ tenacious attempts to establish a stable response by extending the optimization period from the initial protocol using 3 to 6 months to a full year is commendable and provides critical information for future trials.

Maybe, but I think the need for this important change, along with the other difficulties that were encountered in implementing the study, speak to a randomized controlled trial of DBS being premature.

Conflict of Interest Disclosures: Dr Mayberg has a paid consulting agreement with St Jude Medical Inc, which licensed her intellectual property to develop deep brain stimulation for the treatment of severe depression (US 2005/0033379A1). The terms of this agreement have been reviewed and approved by Emory University in accordance with their conflict of interest policies. No other disclosures were reported.

Helen Mayberg’s declaration of interest clearly identifies her as someone who is not a detached observer, but who would benefit financially and professionally from any strengthening of the claims for the efficacy of DBS. We are alerted by this declaration, but I think there were some things not mentioned in the article or editorial about Helen Mayberg’s work that would have influenced her credibility even more had they been known.

Helen Mayberg’s anecdotes and statistics about the success of DBS

Mayberg has been attracting attention for over a decade with her contagious exuberance for DBS. A 2006 article in the New York Times by David Dobbs started with a compelling anecdote of one of Mayberg’s patients being able to resume a normal life after previous ineffective treatments for severe depression. The story reported success with 8 of 12 patients treated with DBS:

They’ve re-engaged their families, resumed jobs and friendships, started businesses, taken up hobbies old and new, replanted dying gardens. They’ve regained the resilience that distinguishes the healthy from the depressed.

Director of NIMH Tom Insel chimed in:

“People often ask me about the significance of small first studies like this,” says Dr. Thomas Insel, who as director of the National Institute of Mental Health enjoys an unparalleled view of the discipline. “I usually tell them: ‘Don’t bother. We don’t know enough.’ But this is different. Here we know enough to say this is something significant. I really do believe this is the beginning of a new way of understanding depression.”

A 2015 press release from Emory University, Targeting depression with deep brain stimulation, gives another anecdote of a dramatic treatment success.

Okay, we know to be skeptical about university press releases, but then there are the dramatic anecdotes and even numbers in a news article in Science, Short-Circuiting Depression, that borders on an infomercial for Mayberg’s work.


Since 2003, Mayberg and others have used DBS in area 25 to treat depression in more than 100 patients. Between 30% and 40% of patients do “extremely well”—getting married, going back to work, and reclaiming their lives, says Sidney Kennedy, a psychiatrist at Toronto General Hospital in Canada who is now running a DBS study sponsored by the medical device company St. Jude Medical. Another 30% show modest improvement but still experience residual depression. Between 20% and 25% do not experience any benefit, he says. People contemplating brain surgery might want better odds, but patients with extreme, relentless depression often feel they have little to lose. “For me, it was a last resort,” Patterson says.

By making minute adjustments in the positions of the electrodes, Mayberg says, her team has gradually raised its long-term response rates to 75% to 80% in 24 patients now being treated at Emory University.

A chronically depressed person or someone who cares for a depressed person might be motivated to go on the Internet and try to find more information about Mayberg’s trial. A website for Mayberg’s BROADEN (BROdmann Area 25 DEep brain Neuromodulation) study once provided a description of the study, answers to frequently asked questions, and an opportunity to register for screening for the study. However, it is no longer accessible through Google or other search engines. You can reach an archived version of the website through a link provided by Neurocritic, but its links are no longer functional.

Neurocritic’s blog posts about Mayberg and DBS

If you are lucky, a Google search for Mayberg deep brain stimulation, might bring you to any of three blog posts by Neurocritic [1] [2] [3] that have rich links and provide a very different story of Mayberg and DBS.

One link takes you to the trial registration for Mayberg’s BROADEN study: A Clinical Evaluation of Subcallosal Cingulate Gyrus Deep Brain Stimulation for Treatment-Resistant Depression. The updated trial registration indicates that the study will end in September 2017 and that the study is ongoing but not recruiting participants.

This information should have been updated, as should other publicity about Mayberg’s BROADEN study. Namely, as Neurocritic documents, St. Jude Medical, the company attempting to commercialize DBS by funding the study, terminated it after futility analyses indicated that further enrollment of patients had only a 17% probability of achieving a significant effect. At the point the trial was terminated, 125 patients had been enrolled.

Neurocritic also provides a link to an excellent, open access review paper:

Morishita T, Fayad SM, Higuchi MA, Nestor KA, Foote KD. Deep brain stimulation for treatment-resistant depression: systematic review of clinical outcomes. Neurotherapeutics. 2014 Jul 1;11(3):475-84.

The article reveals that although there are 22 published studies of DBS for treatment-resistant depression, only three are randomized trials, one of which was completed with null results. Two – including Mayberg’s BROADEN trial – were discontinued because futility analyses indicated that a finding of efficacy for the treatment was unlikely.

Finally, Neurocritic  also provides a link to a Neurotech Business Report, Depressing Innovation:

The news that St. Jude Medical failed a futility analysis of its BROADEN trial of DBS for treatment of depression cast a pall over an otherwise upbeat attendance at the 2013 NANS meeting [see Conference Report, p7]. Once again, the industry is left to pick up the pieces as a promising new technology gets set back by what could be many years.

It’s too early to assess blame for this failure. It’s tempting to wonder if St. Jude management was too eager to commence this trial, since that has been a culprit in other trial failures. But there’s clearly more involved here, not least the complexity of specifying the precise brain circuits involved with major depression. Indeed, Helen Mayberg’s own thinking on DBS targeting has evolved over the years since the seminal paper she and colleague Andres Lozano published in Neuron in 2005, which implicated Cg25 as a lucrative target for depression. Mayberg now believes that neuronal tracts emanating from Cg25 toward medial frontal areas may be more relevant [NBR Nov13 p1]. Research that she, Cameron McIntyre, and others are conducting on probabilistic tractography to identify the patient-specific brain regions most relevant to the particular form of depression the patient is suffering from will likely prove to be very fruitful in the years ahead.

So, we have a heavily hyped, unproven treatment for which the only randomized trials have either been null or terminated following a futility analysis. Helen Mayberg, a patent holder associated with one of those trials, was an inappropriate choice to provide commentary on another, more modestly sized trial that itself ran into numerous difficulties suggesting it was premature. Moreover, I find it outrageous that so little effort has been made to correct the record concerning her BROADEN trial, or even to acknowledge its closing in the JAMA: Psychiatry commentary.

Untold numbers of depressed patients who don’t get expected benefits from available treatments are being misled with false hope from anecdotes and statistics from a trial that was ultimately terminated.

I find troubling what my exercise showed can happen when someone motivated by skepticism goes to the Internet and tries to get additional information about the JAMA: Psychiatry paper. They could be careful to rely only on seemingly credible sources – a trial registration and a Science article. The Science article is not peer-reviewed, but it nonetheless carries the credibility of appearing in the premier and respected Science. The trial registration has not been updated with valuable information, and the Science article gives no indication of how it is contradicted by better quality evidence. So, they would be misled.


Remission of suicidal ideation by magnetic seizure therapy? Neuro-nonsense in JAMA: Psychiatry

A recent article in JAMA: Psychiatry:

Sun Y, Farzan F, Mulsant BH, Rajji TK, Fitzgerald PB, Barr MS, Downar J, Wong W, Blumberger DM, Daskalakis ZJ. Indicators for remission of suicidal ideation following magnetic seizure therapy in patients with treatment-resistant depression. JAMA Psychiatry. 2016 Mar 16.

was accompanied by an editorial commentary:

Camprodon JA, Pascual-Leone A. Multimodal Applications of Transcranial Magnetic Stimulation for Circuit-Based Psychiatry. JAMA: Psychiatry. 2016 Mar 16.

Together, the article and commentary can be studied as:

  • An effort by the authors and the journal itself to promote prematurely a treatment for reducing suicide.
  • A payback to sources of financial support for the authors. Both groups have industry ties that provide them with consulting fees, equipment, grants, and other unspecified rewards. One author holds a patent that should increase in value as a result of this article and commentary.
  • A bid for successful applications to new grant initiatives with a pledge of allegiance to the NIMH Research Domain Criteria (RDoC).

After considering just how bad the science and reporting are:

We have sufficient reason to ask: how did this promotional campaign come about? Why was this article accepted by JAMA: Psychiatry? Why was it deemed worthy of comment?

I think a skeptical look at this article would lead to a warning label:

Warning: Results reported in this article are neither robust nor trustworthy, but considerable effort has gone into promoting them as innovative and even breakthrough. Skepticism warranted.

As we will see, the article is seriously flawed as a contribution to neuroscience, identification of biomarkers, treatment development, and suicidology, but we can nonetheless learn a lot from it in terms of how to detect such flaws when they are more subtle. If nothing else, your skepticism will be raised about articles accompanied by commentaries in prestigious journals and you will learn tools for probing such pairs of articles.


This article involves intimidating technical details and awe-inspiring figures.

[Figure 1 of the article, two panels]

Yet, as in some past blog posts concerning neuroscience and the NIMH RDoC, we will gloss over some technical details, which would be readily interpreted by experts. I would welcome the comments and critiques from experts.

I nonetheless expect readers to agree when they have finished this blog post that I have demonstrated that you don’t have to be an expert to detect neurononsense and crass publishing of articles that fit vested interests.

The larger trial from which these patients were drawn is registered as:

ClinicalTrials.gov. Magnetic Seizure Therapy (MST) for Treatment Resistant Depression, Schizophrenia, and Obsessive Compulsive Disorder. NCT01596608.

Because this article strikingly lacks crucial details, or omits details in places where we would expect to find them, it will be useful at times to refer to the trial registration.

The title and abstract of the article

As we will soon see, the title, Indicators for remission of suicidal ideation following MST in patients with treatment-resistant depression, is misleading. The article has too small a sample and too inappropriate a design to establish anything as a reproducible “indicator.”

That the article is going to fail to deliver is already apparent in the abstract.

The abstract states:

 Objective  To identify a biomarker that may serve as an indicator of remission of suicidal ideation following a course of MST by using cortical inhibition measures from interleaved transcranial magnetic stimulation and electroencephalography (TMS-EEG).

Design, Setting, and Participants  Thirty-three patients with TRD were part of an open-label clinical trial of MST treatment. Data from 27 patients (82%) were available for analysis in this study. Baseline TMS-EEG measures were assessed within 1 week before the initiation of MST treatment using the TMS-EEG measures of cortical inhibition (ie, N100 and long-interval cortical inhibition [LICI]) from the left dorsolateral prefrontal cortex and the left motor cortex, with the latter acting as a control site.

Interventions The MST treatments were administered under general anesthesia, and a stimulator coil consisting of 2 individual cone-shaped coils was used.

Main Outcomes and Measures Suicidal ideation was evaluated before initiation and after completion of MST using the Scale for Suicide Ideation (SSI). Measures of cortical inhibition (ie, N100 and LICI) from the left dorsolateral prefrontal cortex were selected. N100 was quantified as the amplitude of the negative peak around 100 milliseconds in the TMS-evoked potential (TEP) after a single TMS pulse. LICI was quantified as the amount of suppression in the double-pulse TEP relative to the single-pulse TEP.

Results  Of the 27 patients included in the analyses, 15 (56%) were women; mean (SD) age of the sample was 46.0 (15.3) years. At baseline, patients had a mean SSI score of 9.0 (6.8), with 8 of 27 patients (30%) having a score of 0. After completion of MST, patients had a mean SSI score of 4.2 (6.3) (pre-post treatment mean difference, 4.8 [6.7]; paired t26 = 3.72; P = .001), and 18 of 27 individuals (67%) had a score of 0 for a remission rate of 53%. The N100 and LICI in the frontal cortex—but not in the motor cortex—were indicators of remission of suicidal ideation with 89% accuracy, 90% sensitivity, and 89% specificity (area under the curve, 0.90; P = .003).

Conclusions and Relevance  These results suggest that cortical inhibition may be used to identify patients with TRD who are most likely to experience remission of suicidal ideation following a course of MST. Stronger inhibitory neurotransmission at baseline may reflect the integrity of transsynaptic networks that are targeted by MST for optimal therapeutic response.

Even viewing the abstract alone, we can see this article is in trouble. It claims to identify a biomarker of response to a course of magnetic seizure therapy (MST). That is an extraordinary claim when the study started with only 33 patients, of whom only 27 remained for analysis. Furthermore, at the initial assessment of suicidal ideation, eight of the 27 patients did not have any, and so could show no benefit of treatment.

Any results could be substantially changed by any of the six excluded patients being recovered for analysis, or by any of the 27 included patients being dropped from the analyses as an outlier. Statistical adjustment for potential confounds will produce spurious results because of overfitted equations, even with a single confound. We also know well that in situations requiring control of possible confounding factors, controlling for only one is rarely sufficient and often produces worse results than leaving variables unadjusted.

Identification of any biomarkers is unlikely to be reproducible in larger, more representative samples. Any claimed performance characteristics of the biomarkers (accuracy, sensitivity, specificity, area under the curve) are likely to capitalize on sampling error and chance in ways that are unlikely to be reproducible.
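To make the small-sample worry concrete, here is a toy simulation (entirely my own, not from the article): with 27 patients, 18 of them “remitters” as in the study, even a completely uninformative “biomarker” appears to classify well above chance once the cut-point is chosen on the same data it is evaluated on.

```python
import random

random.seed(1)

def best_split_accuracy(marker, outcome):
    """Best accuracy over all cut-points and directions, chosen on this sample."""
    best = 0.0
    for cut in marker:
        for direction in (True, False):
            correct = sum(((m >= cut) == direction) == o
                          for m, o in zip(marker, outcome))
            best = max(best, correct / len(outcome))
    return best

n, n_remit, trials = 27, 18, 500
accs = []
for _ in range(trials):
    marker = [random.gauss(0, 1) for _ in range(n)]       # pure noise "biomarker"
    outcome = [True] * n_remit + [False] * (n - n_remit)  # 18 of 27 "remitters"
    random.shuffle(outcome)
    accs.append(best_split_accuracy(marker, outcome))

avg = sum(accs) / trials
print(f"mean in-sample accuracy of a pure-noise marker: {avg:.2f}")
```

Noise alone routinely beats the 67% base rate of remission when the cut-point is free to be picked in-sample; performance claims from n = 27 need out-of-sample validation before they can be believed.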

Nonetheless, the accompanying figures are dazzling, even if not readily interpretable or representative of what would be found in another sample.

Comparison of the article to the trial registration

According to the trial registration, the study started in February 2012 and the registration was received in May 2012. There were unspecified changes as recently as this month (March 2016), and final collection of primary outcome data is expected in December 2016.

Primary outcome

The registration indicates that patients will have been diagnosed with severe major depression, schizophrenia or obsessive compulsive disorder. The primary outcome will depend on diagnosis. For depression it is the Hamilton Rating Scale for Depression.

There is no mention of suicidal ideation as either a primary or secondary outcome.

Secondary outcomes

According to the registration, outcomes include (1) cognitive functioning as measured by episodic memory and non-memory cognitive functions; (2) changes in neuroimaging measures of brain structure and activity derived from fMRI and MRI from baseline to 24th treatment or 12 weeks, whichever comes sooner.

Comparison with the article suggests that some important neuroimaging assessments proposed in the registration were compromised: (1) only baseline measures were obtained, without MRI or fMRI; and (2) the article states:

Although magnetic resonance imaging (MRI)–guided TMS-EEG is more accurate than non–MRI-guided methods, the added step of obtaining an MRI for every participant would have significantly slowed recruitment for this study owing to the pressing need to begin treatment in acutely ill patients, many of whom were experiencing suicidal ideation. As such, we proceeded with non–MRI-guided TMS-EEG using EEG-guided methods according to a previously published study.

Treatment

The article provides some details of the magnetic seizure treatment:

The MST treatments were administered under general anesthesia using a stimulator machine (MagPro MST; MagVenture) with a twin coil. Methohexital sodium (n = 14), methohexital with remifentanil hydrochloride (n = 18), and ketamine hydrochloride (n = 1) were used as the anesthetic agents. Succinylcholine chloride was used as the neuromuscular blocker. Patients had a mean (SD) seizure duration of 45.1 (21.4) seconds. The twin coil consists of 2 individual cone-shaped coils. Stimulation was delivered over the frontal cortex at the midline position directly over the electrode Fz according to the international 10-20 system.36 Placing the twin coil symmetrically over electrode Fz results in the centers of the 2 coils being over F3 and F4. Based on finite element modeling, this configuration produces a maximum induced electric field between the 2 coils, which is over electrode Fz in this case.37 Patients were treated for 24 sessions or until remission of depressive symptoms based on the 24-item Hamilton Rating Scale for Depression (HRSD) (defined as an HRSD-24 score ≤10 and 60% reduction in symptoms for at least 2 days after the last treatment).38 These remission criteria were standardized from previous ECT depression trials.39,40 Further details of the treatment protocol are available,30 and comprehensive clinical and neurophysiologic trial results will be reported separately.

The article apparently intended to refer the reader to the trial registration for further description of the treatment, but the superscript citation is inaccurate. Regardless, given other deviations from the registration, readers cannot tell whether the treatment deviated from what was proposed. In the registration, seizure therapy was described as involving:

100% machine output at between 25 and 100 Hz, with coil directed over frontal brain regions, until adequate seizure achieved. Six treatment sessions, at a frequency of two or three times per week will be administered. If subjects fail to achieve the pre-defined criteria of remission at that point, the dose will be increased to the maximal stimulator output and 3 additional treatment sessions will be provided. This will be repeated a total of 5 times (i.e., maximum treatment number is 24). 24 treatments is typically longer that a conventional ECT treatment course.

One important implication concerns this treatment being proposed as a way of resolving suicidal ideation: it takes place over a considerable period of time. Patients who die by suicide notoriously break contact before doing so. A required course of up to 24 treatments delivered on an outpatient basis would provide ample opportunities for breaks in contact – including through demoralization because so many treatments are needed – and therefore for death by suicide.

But a protocol that involves continuing treatment until a prespecified reduction in the Hamilton Depression Rating Scale is achieved virtually assures that there will be a drop in suicidal ideation: scores on the interview-based Hamilton depression rating scale and suicidal ideation are highly correlated.

There is no randomization, nor even an adequate description of patient accrual in terms of the population from which the patients came. There is no control group, and therefore no control for nonspecific factors. In terms of nonspecific effects, the treatment subjects patients to an elaborate, intrusive ritual, starting with electroencephalographic (EEG) assessment [http://www.mayoclinic.org/tests-procedures/eeg/basics/definition/prc-20014093].

The ritual will undoubtedly carry strong nonspecific factors – instilling positive expectations and providing considerable personal attention.

The article’s discussion of results

The discussion opens with some strong claims, unjustified given the modesty of the study and the likelihood that its specific results are not reproducible:

We found that TMS-EEG measures of cortical inhibition (ie, the N100 and LICI) in the frontal cortex, but not in the motor cortex, were strongly correlated with changes in suicidal ideation in patients with TRD who were treated with MST. These findings suggest that patients who benefitted the most from MST demonstrated the greatest cortical inhibition at baseline. More important, when patients were divided into remitters and nonremitters based on their SSI score, our results show that these measures can indicate remission of suicidal ideation from a course of MST with 90% sensitivity and 89% specificity.

The discussion contains a Pledge of Allegiance to the research domain criteria (RDoC) approach that is not actually a reflection of the results at hand. Among the many things we knew before the study was done – and that were not shown by the study – is that suicidal ideation is so tightly linked to hopelessness, negative affect, and attentional biases that it is best seen as a surrogate measure of depression, rather than a marker of risk for suicidal acts or death by suicide.

Wave that RDoC flag and maybe you will attract money from NIMH.

Our results also support the research domain criteria approach, that is, that suicidal ideation represents a homogeneous symptom construct in TRD that is targeted by MST. Suicidal ideation has been shown to be linked to hopelessness, negative affect, and attentional biases. These maladaptive behaviors all fall under the domain of negative valence systems and are associated with the specific constructs of loss, sustained threat, and frustrative nonreward. Suicidal ideation may represent a better phenotype through which to understand the neurobiologic features of mental illnesses. In this case, variations in GABAergic-mediated inhibition before MST treatment explained much of the variance for improvements in suicidal ideation across individuals with TRD.

Debunking ‘a better phenotype through which to understand the neurobiologic features of mental illnesses.’

  • Suicide is not a disorder or a symptom, but an infrequent, difficult to predict and complex act that varies greatly in nature and circumstances.
  • While some features of the brain or brain functioning may be correlated with eventual death by suicide, most identifications they provide of persons at risk of eventually dying by suicide will be false positives.
  • In the United States, access to a firearm is a reliable proximal cause of suicide and is likely to be more so than anything in the brain. However, this basic observation is not consistent with American politics and can lead to grant applications not being funded.

In an important sense,

  • It’s not what’s going on in the brain, but what’s going in the interpersonal context of the brain, in terms of modifiable risk for death by suicide.
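The base-rate arithmetic behind the false-positive point can be made explicit. Here is a minimal sketch using Bayes’ rule; the sensitivity and specificity are those claimed in the article, while the yearly population suicide rate of roughly 13 per 100,000 is an assumed round figure for illustration.

```python
def positive_predictive_value(sens, spec, base_rate):
    """P(event | positive test), computed via Bayes' rule."""
    true_pos = sens * base_rate
    false_pos = (1 - spec) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# 90% sensitivity and 89% specificity, as claimed in the article, applied to
# an assumed population base rate of ~13 suicides per 100,000 per year.
ppv = positive_predictive_value(0.90, 0.89, 13 / 100_000)
print(f"PPV: {ppv:.4f}")  # roughly 0.001 -- about 999 of every 1,000 positives are false
```

Even a marker this “accurate” would overwhelmingly flag people who will never die by suicide, which is why abstract sensitivity and specificity figures say little about clinical utility.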

The editorial commentary

On the JAMA: Psychiatry website, the article and the editorial commentary each contain sidebar links to the other. It is only in the last two paragraphs of a 14-paragraph commentary that the target article is mentioned. Nevertheless, the commentary ends with a resounding celebration of the innovation this article represents [emphasis added]:

Sun and colleagues10 report that 2 different EEG measures of cortical inhibition (a negative evoked potential in the EEG that happens approximately 100 milliseconds after a stimulus or event of interest and long-interval cortical inhibition) evoked by TMS to the left dorsolateral prefrontal cortex, but not to the left motor cortex, predicted remission of suicidal ideation with great sensitivity and specificity. This study10 illustrates the potential of multimodal TMS to study physiological properties of relevant circuits in neuropsychiatric populations. Significantly, it also highlights the anatomical specificity of these measures because the predictive value was exclusive to the inhibitory properties of prefrontal circuits but not motor systems.

Multimodal TMS applications allow us to study the physiology of human brain circuitry noninvasively and with causal resolution, expanding previous motor applications to cognitive, behavioral, and affective systems. These innovations can significantly affect psychiatry at multiple levels, by studying disease-relevant circuits to further develop systems for neuroscience models of disease and by developing tools that could be integrated into clinical practice, as they are in clinical neurophysiology clinics, to inform decision making, the differential diagnosis, or treatment planning.

Disclosures of conflicts of interest

The article’s statement disclosing conflicts of interest is longer than the abstract.

[Screenshot: the article’s conflicts of interest disclosure]

The disclosure for the conflicts of interest for the editorial commentary is much shorter but nonetheless impressive:

[Screenshot: the editorial commentary’s disclosures]

How did this article get into JAMA: Psychiatry with an editorial comment?

Editorial commentaries are often provided by reviewers who simply check the box on the reviewers’ form indicating their willingness to provide a comment. For reviewers who already have a conflict of interest, this provides an additional one: a non-peer-reviewed paper in which they can promote their interests.

Alternatively, commentators are simply picked by an editor who judges an article worthy of special recognition. It is noteworthy that at least one of the associate editors of JAMA: Psychiatry is actively campaigning for a particular direction for NIMH-funded suicide research, as seen in an editorial comment of his own that I recently discussed. One of the authors of the paper under discussion was until recently a senior member of this associate editor’s department, before departing to become Chair of the Department of Psychiatry at the University of Toronto.

Essentially, the authors of the paper and the authors of the commentary are providing carefully constructed advertisements for themselves and their agenda. The opportunity to do so arises from their consistency with the agenda of at least one of the editors, if not the journal itself.

The Committee on Publication Ethics (COPE) requires that non-peer-reviewed material in ostensibly peer-reviewed journals be labeled as such. This requirement is seldom met.

The journal further promoted this article by providing 10 free continuing medical education credits for reading it.

I could go on much longer identifying other flaws in this paper and its editorial commentary. I could raise other objections to the article being published in JAMA:Psychiatry. But out of mercy for the authors, the editor, and my readers, I’ll stop here.

I would welcome comments about other flaws.

Special thanks to Bernard “Barney” Carroll for his helpful comments and encouragement, but all opinions expressed and all factual errors are my own responsibility.

Getting realistic about changing the direction of suicide prevention research

A recent JAMA: Psychiatry article makes some important points about the difficulties addressing suicide as a public health problem before sliding into the authors’ promotion of their personal agendas.

Christensen H, Cuijpers P, Reynolds CF. Changing the Direction of Suicide Prevention Research: A Necessity for True Population Impact. JAMA Psychiatry. 2016.

This issue of Mind the Brain:

  • Reviews important barriers to effective approaches to reducing suicide, as cited in the editorial.
  • Discusses editorials in general as a form of privileged access publishing by which non-peer-reviewed material makes its way into ostensibly peer reviewed journals.
  • Identifies the self-promotional and personal agendas of the authors reflected in the editorial.
  • Notes that the leading means of death by suicide in the United States is not even mentioned, much less addressed, in this editorial. I’ll discuss the politics behind this and why its absence reduces the editorial to a venture in triviality – except that it is also a call to waste millions of dollars.

Barriers to reducing mortality by suicide

Prevention of death by suicide is an important public health and clinical goal because of suicide’s contribution to overall mortality, the seeming senselessness of the act, and its costs at personal and social levels. Yet, as a relatively infrequent event, death by suicide resists prediction and effective preventive intervention.

Evidence concerning the formidable barriers to reducing death by suicide inevitably clashes with the strong emotional appeals and political agendas of those demanding suicide intervention programs.

Skeptics encounter stiff resistance and even vilification when they insist that clinical and social policy concerning suicide should be based on evidence.

A skeptic soon finds that trying to contest emotional and political appeals quickly becomes like trying to counter Ted Cruz or Donald Trump with evidence contradicting their proposals for dealing with terrorism or immigration. This is particularly true after suicides by celebrities or a cluster of suicides by teenagers in a community. Who wants to pay attention to evidence when emotions are high and tears are flowing?

See my recent blog post, Preventing Suicide in All the Wrong Ways for some inconvenient truths about suicide and suicide prevention.

The JAMA: Psychiatry article’s identification of barriers

The JAMA: Psychiatry article identifies some key barriers to progress in reducing deaths due to suicide [bullet points added to direct quotes]:

  • Suicide rates in most Western countries have not decreased in the last decade, a finding that compares unfavorably with the progress made in other areas, such as breast and skin cancers, human immunodeficiency virus, and automobile accidents, for which the rates have decreased by 40% to 80%.
  • Preventing suicide is not easy. The base rate of suicide is low, making it hard to determine which individuals are at risk.
  • Our current approach to the epidemiologic risk factors has failed because prediction studies have no clinical utility—even the highest odds ratio is not informative at the individual level.
  • Decades of research on predicting suicides failed to identify any new predictors, despite the large numbers of studies.
  • A previous suicide attempt is our best marker of a future attempt, but 60% of suicides are by persons who had made no previous attempts.
  • Although recent studies in cognitive neuroscience have shed light on the cognitive “lesions” that underlie suicide risk, especially deficits in executive functioning, we have no biological markers of suicide risk, or indeed of any mental illness.
  • People at risk of suicide do not seek help. Eighty percent of people at risk have been in contact with health services prior to their attempts, but they do not identify themselves, largely because they do not think that they need help.
  • As clinicians, we know something about the long-term risk factors for suicide, but we are much less able to disambiguate short-term risk or high-risk factors from the background of long-term risk factors.

How do editorials come about? Not peer review!

 Among the many privileges of being editor-in-chief or associate editors of journals is the opportunity to commission articles that do not undergo peer review. Editors and their friends are among the regular recipients of these gifts that largely escape scrutiny.

Editorials often provide a free opportunity for self-citation and promotion of agendas. Over the years, I’ve noticed that editorials are frequently used to increase the likelihood that particular research topics will become a priority for funding, or that particular ideas will be given an advantage in the competition for funding.

Editorials offer great opportunities for self-citation. If an editorial in a prestigious journal cites articles published in less prestigious places, readers will often cite those articles without bothering to examine the original sources. This is a way of lending false authority to poor-quality or irrelevant evidence.

Not only do authors of commissioned articles get to say what they wish without peer review, they can also restrict what is said in reply. Journals are less willing to publish letters to the editor about editorials than about empirical papers. They often give the writers of the editorial veto power over what criticism is published, and they always give those writers the last word in any exchange.

So, editorials and commentaries can be free sweet plums if you know how to use them strategically.

The authors

Helen Christensen, PhD Black Dog Institute, University of New South Wales, Randwick, New South Wales, Australia.

Pim Cuijpers, PhD Department of Clinical, Neuro, and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands

Charles F. Reynolds III, MD Department of Psychiatry and Neurology, Western Psychiatric Institute and Clinic, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.

The authors’ agendas

Helen Christensen

Helen Christensen is the Chief Scientist and Director of the Black Dog Institute, which is described on its website:

Our unique approach incorporates clinical services with our cutting-edge research, our health professional training and community education programs. We combine expertise in clinical management with innovative research to develop new, and more effective, strategies for people living with mental illness. We also place emphasis on teaching people to recognise the symptom of poor mental health in themselves and others, as well as providing them with the right psychological tools to hold the black dog at bay.

A key passage in the JAMA: Psychiatry editorial references her work.

Modeling studies have shown that if all evidence-based suicide prevention strategies were integrated into 1 multifaceted systems approach, about 20% to 25% of all suicides might be prevented.

Here is the figure from the editorial:

[Figure from the editorial: estimated effects of suicide prevention strategies]

The paper cited would be better characterized as an advocacy piece than as a balanced systematic review.

Most fundamentally, Christensen makes the mistake of summing attributable risks to obtain a grand total of what would be accomplished if an entire set of risk factors were addressed.

The problem is that attributable risks are dubious estimates derived from correlational analyses that assume the entire correlation coefficient represents modifiable risk. Such estimates ignore confounding. Adding together attributable risks calculated in this manner yields a grossly inflated view of how much a phenomenon can be controlled: the risk factors are themselves correlated and share common confounds. That is why it is bad science to sum them.
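A toy calculation (my own illustrative numbers, not Christensen’s) shows why the summation is illegitimate:

```python
# Suppose three risk factors each carry an individually estimated
# population-attributable fraction (PAF) of 0.45. In practice they are
# correlated -- they largely flag the same troubled people.
pafs = [0.45, 0.45, 0.45]

naive_total = sum(pafs)
print(f"naively summed PAF: {naive_total:.2f}")  # 1.35 -> "135% of suicides preventable"

# Even under the generous assumption of independent factors, the correct
# combination is multiplicative and can never exceed 1:
joint_paf = 1.0
for p in pafs:
    joint_paf *= (1 - p)
joint_paf = 1 - joint_paf
print(f"joint PAF assuming independence: {joint_paf:.3f}")  # about 0.83

# With correlated factors sharing confounds, the true joint figure is
# smaller still -- the naive sum is not just inflated but impossible.
```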

Christensen identifies the top three modifiable risks for suicide as: (1) training general practitioners in the detection and treatment of suicidal risk, and notably depression; (2) training gatekeepers, such as school personnel, police (and in some contexts, clergy), who might have contact with persons on the verge of dying by suicide; and (3) psychosocial treatments, namely psychotherapy.

Training of general practitioners and gatekeepers has not been shown to be an effective way of reducing rates of suicide. #EvidencePlease. I’ve been an external scientific advisor to over a decade of programs in Europe that emphasized these strategies. We will soon be publishing the last of our disappointing results.

Think of it: for training of police to be effective in averting deaths by suicide, an officer must be on the scene in circumstances where the training can be used to prevent someone from dying – say, by jumping from a bridge or by a self-inflicted gunshot wound. The likelihood is low that a sufficiently trained officer will be in the right place at the right time, with enough time and control of the situation to prevent a death. An officer who had received the training would likely encounter only a few such situations, if any, in an entire career.

The problem of death by suicide being an infrequent event that is poorly predicted again rears its ugly head.

Christensen also makes the dubious assumption that more ready availability of psychotherapy will substantially reduce the risk of suicide. The problem is that persons who die by suicide are often in contact with professionals, but they either break contact shortly before death or never disclose their intentions.

Christensen provides a sizable estimate for the reduction in risk of suicide achievable by means restriction. Yet I suspect that she underestimates the influence of this potentially modifiable factor.

She focuses on restricting access to prescription medications used in suicides by overdose. I don’t know whether the death-by-overdose data hold even for Australia, but the relevant means needing restriction in the United States is access to firearms. I will say more about that later.

So, Christensen makes use of the editorial to sell her pet ideas, while her institute markets training.

Pim Cuijpers

Pim Cuijpers doesn’t cite himself, and doesn’t need to. He is rapidly accumulating a phenomenal record of publications and citations. But he is an advocate for large-scale programs incorporating technology, notably the Internet, to reduce suicide. His interests are reflected in passages like:

Large-scale trials are also needed. Even if we did all of these things, large-scale research programs with millions of people are required, and technology by itself will not be enough. Although new large trials show that the effects of community programs can be effective,1,6 studies need to be bigger, combining all evidence-based medical and community strategies, using technology effectively to reduce costs of identification and treatment.

And

Help-seeking may well be assisted by using social media. Online social networks such as Facebook can be used to provide peer support and to change community attitudes in the ways already used by marketing industries. We can use the networks of “influencers” to modify attitudes and behavior in specific high-risk groups, such as the military, where suicide rates are high, or “captive audiences” in schools.

Disseminating effective programs is no longer difficult using online mental health programs. Although some early suicide apps and websites have been tested, better online interventions are needed that can respond to temporal fluctuations in suicide risk. The power of short-term prediction tools should be combined with the timely delivery of unobtrusive online or app personalized programs. However, if these development are not supported by government or industry and implemented at a population level, they will remain missed opportunities.

[Image: a “suicide is preventable” poster]
100% PREVENTABLE BY WHOM?

Pim Cuijpers is based in the Netherlands and writes at a time when the European Research Council’s enthusiasm for funding large-scale suicide prevention programs – especially expensive ones requiring millions of participants – is waning. Such studies have been going on for over a decade, and the yield is not impressive.

The projects on which I consulted adopted the reasonable assumption that, because suicide is a rare event, a population of 500,000 would not be sufficient to detect a statistically significant reduction in suicide rates of less than 30%. Given all the extraneous events that can impinge on comparisons between intervention and control sites during the period in which the intervention could conceivably be influential, even that is too low an estimate of the sample that would be needed.

The larger the sample, the greater the likelihood of extraneous influences, the greater the likelihood that the intervention will not prove effective at the key moments when it is needed to avert a death by suicide, and the greater the cost. See more about this here.
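The arithmetic behind such sample-size worries can be sketched with the standard two-proportion formula. This is my own back-of-envelope calculation; the 13 per 100,000 base rate and the 25% reduction are illustrative assumptions, not figures from the projects discussed.

```python
import math

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for comparing two proportions
    (standard two-proportion formula; z-values fixed for alpha=.05, power=.80)."""
    z_a, z_b = 1.96, 0.8416
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return num / (p1 - p2) ** 2

base = 13 / 100_000      # assumed yearly suicide rate (illustrative round figure)
reduced = base * 0.75    # a 25% reduction -- an optimistic effect size
n = n_per_arm(base, reduced)
print(f"{n:,.0f} people per arm for one year of follow-up")  # well over a million
```

Well over a million people per arm, for a single year of follow-up, before any allowance for clustering, contamination, or the extraneous site differences noted above.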

Pim Cuijpers has been quite influential in developing and evaluating web-based and app-based interventions. But after initial enthusiasm, the field is learning that such resources are not effective if left unattended – users need a sense that they are in some sort of human relationship within which their consistent use of the technology is monitored and appreciated, as seen in appropriate feedback. Pim Cuijpers has contributed the valuable concept of supportive accountability. I have borrowed it to explain what is missing when primary care physicians simply give depressed patients a password to an Internet program and leave it at that, expecting they will obtain any benefit.

Evaluations of such technology have been limited to whether it reduces depressive symptoms. It is as much a leap from evidence of such reductions, when they occur, to claims about preventing suicide as it is from evidence that psychotherapy reduces depressive symptoms to the claim that psychotherapy prevents suicide.

Enlisting users of Facebook to monitor and report expressions of suicidality is not evidence based. It is regarded by some as a disaster, and a consumer group is circulating a petition demanding that such practices stop. A critical incident was the arrest of a man over a fake suicide message.

Charles F. Reynolds

Although Charles Reynolds does not discuss his own study in the text of the editorial, he nonetheless cites it.

I have critiqued the study elsewhere. It was funded in a special review only because of political pressure from Senator Harry Reid. The senator’s father had died by suicide shortly after a visit to a primary care physician. Harry Reid required that Congress fund a study showing that improving the detection and treatment of suicidality in the elderly by primary care physicians would reduce suicide.

I was called by an NIMH program officer when I failed to submit a letter of intent for that initiative. I told her it was a boondoggle, because no one could show a reduction in suicides by targeting physician behavior. She didn’t disagree, but said a project would have to be funded. She ended up as a co-author on the PROSPECT paper. You don’t often see program officers getting authorship on papers from projects they fund.

The resulting PROSPECT study involved 20 primary care practices in three regions of the Northeastern United States. In the course of the intervention study, one patient in the intervention group died by suicide and two patients, one in each of the intervention and control groups, made serious attempts. A multimillion-dollar study confronted the low incidence of suicide, even among the elderly. Furthermore, the substantial baseline differences among the practices dwarfed any differences in suicidal ideation between the intervention and control groups. And as discussed elsewhere [  ], suicidal ideation is a surrogate endpoint that can be changed by factors that do not alter risk for suicide. No one advocating more money for this kind of study would want to get into the details of this one.

 

So, the editorial acknowledges the difficulties of studying and preventing suicide as a public health issue. It suggests that an unprecedentedly large study costing millions of dollars would be necessary if progress is to be made. There are formidable barriers to implementing an intervention in a large population of the complexity the editorial suggests is necessary. Just look at the problems that PROSPECT encountered.

Who will set the direction of suicide prevention research?

The editorial opens with a citation of a blog by the then Director of NIMH

Insel T. Director’s Blog: Targeting suicide. National Institutes of Health website. Posted April 2, 2015.

The blog calls for a large increase in funding for research concerning suicide and its prevention. The definition of the problem is shaped by politics more than evidence. But at least the blog post is more candid than the editorial in making a passing reference to the leading means of suicide in the United States, firearms.

51 percent of suicide deaths in the U.S. were by firearms. Research has already demonstrated that reducing access to lethal means (including gun locks and barriers on bridges) can reduce death rates.

Great, but surely death by firearms deserves more mention than a passing reference to locks on guns if the Director of NIMH is serious about asking Congress for a massive increase in funding for suicide research. Or is he being smart in avoiding the issue, and even brave in the passing reference that he makes to firearms?

Firearms deserve not only mention, but thoughtful analysis. But in the United States, it is politically dangerous and could threaten future funding. So we talk about other things.

Banning research on the role of firearms in suicide

For a source that is much more honest, evidence-based, and well argued than this JAMA Psychiatry editorial, I recommend A Psychiatrist Debunks the Biggest Myths Surrounding Gun Suicides.

In 1996, Congress imposed a ban on research concerning the effects of gun ownership on public health, including suicide.

In the spring of 1996, the National Rifle Association and its allies set their sights on the Centers for Disease Control and Prevention for funding increasingly assertive studies on firearms ownership and the effects on public health. The gun rights advocates claimed the research veered toward advocacy and covered such ideological ground as to be effectively useless.

At first, the House tried to close down the CDC’s entire, $46 million National Center for Injury Prevention. When that failed, [Congressman Jay] Dickey, for whom the Dickey amendment is named, stepped in with an alternative: strip $2.6 million that the agency had spent on gun studies that year. The money would eventually be re-appropriated for studies unrelated to guns. But the far more damaging inclusion was language that stated, “None of the funds made available for injury prevention and control at the Centers for Disease Control and Prevention may be used to advocate or promote gun control.”

Dickey proclaimed victory — an end, he said at the time, to the CDC’s attempts “to raise emotional sympathy” around gun violence. But the agency spent the subsequent years petrified of doing any research on gun violence, making the costs of the amendment clear even to Dickey himself.

He said the law was over-interpreted. Now, he looks at simple advances in highway safety — safety barriers, for example — and wonders what could have been done for guns.

The Dickey amendment does not specifically ban NIMH from investigating the role of firearms in suicide, but I think Tom Insel and all NIMH directors before and after him get the message.

Recently an effort to repeal the Dickey amendment failed:

Just hours before the mass shooting in San Bernardino on Wednesday, physicians gathered on Capitol Hill to demand an end to the Dickey Amendment restricting federal funding for gun violence research. Members of Doctors for America, the American College of Preventative Medicine, the American Academy of Pediatrics and others presented a petition against the research ban signed by more than 2,000 doctors.

“Gun violence is probably the only thing in this country that kills so many people, injures so many people, that we are not actually doing sufficient research on,” Dr. Alice Chen, the executive director of Doctors for America, told The Huffington Post.

Well over half a million people have died by firearms since 1996, when the ban on gun violence research was enacted, according to a HuffPost calculation of data through 2013 from Centers for Disease Control and Prevention. According to its sponsors, the Dickey Amendment was supposed to tamp down funding for what the National Rifle Association and other critics claimed was anti-gun advocacy research by the CDC’s National Center for Injury Prevention. In effect, it stopped federal gun violence research almost entirely.

So, why didn’t the Associate Editor of JAMA Psychiatry, Charles Reynolds, exercise his editorial prerogative and support this effort to repeal the Dickey amendment, rather than lining up with his co-authors in a call for more wasteful research that avoids this important issue?

Study: Switching from antidepressants to mindfulness meditation increases relapse

  • A well-designed recent study found that patients with depression in remission who switch from maintenance antidepressants to mindfulness meditation without continuing medication had an increase in relapses.
  • The study is better designed and more transparently reported than a recent British study, but will get none of the British study’s attention.
  • The well-orchestrated promotion of mindfulness raises issues about the lack of checks and balances between investigators’ vested interest, supposedly independent evaluation, and the making of policy.

The study

Huijbers MJ, Spinhoven P, Spijker J, Ruhé HG, van Schaik DJ, van Oppen P, Nolen WA, Ormel J, Kuyken W, van der Wilt GJ, Blom MB. Discontinuation of antidepressant medication after mindfulness-based cognitive therapy for recurrent depression: randomised controlled non-inferiority trial. The British Journal of Psychiatry. 2016 Feb 18 (Epub ahead of print).

The study is currently behind a pay wall and does not appear to have a press release. These two factors will not contribute to it getting the attention it deserves.

But the protocol for the study is available here.

Huijbers MJ, Spijker J, Donders AR, van Schaik DJ, van Oppen P, Ruhé HG, Blom MB, Nolen WA, Ormel J, van der Wilt GJ, Kuyken W. Preventing relapse in recurrent depression using mindfulness-based cognitive therapy, antidepressant medication or the combination: trial design and protocol of the MOMENT study. BMC Psychiatry. 2012 Aug 27;12(1):1.

And the trial registration is here

Mindfulness Based Cognitive Therapy and Antidepressant Medication in Recurrent Depression. ClinicalTrials.gov: NCT00928980

The abstract

Background

Mindfulness-based cognitive therapy (MBCT) and maintenance antidepressant medication (mADM) both reduce the risk of relapse in recurrent depression, but their combination has not been studied.

Aims

To investigate whether MBCT with discontinuation of mADM is non-inferior to MBCT+mADM.

Method

A multicentre randomised controlled non-inferiority trial (ClinicalTrials.gov: NCT00928980). Adults with recurrent depression in remission, using mADM for 6 months or longer (n = 249), were randomly allocated to either discontinue (n = 128) or continue (n = 121) mADM after MBCT. The primary outcome was depressive relapse/recurrence within 15 months. A confidence interval approach with a margin of 25% was used to test non-inferiority. Key secondary outcomes were time to relapse/recurrence and depression severity.

Results

The difference in relapse/recurrence rates exceeded the non-inferiority margin and time to relapse/recurrence was significantly shorter after discontinuation of mADM. There were only minor differences in depression severity.

Conclusions

Our findings suggest an increased risk of relapse/recurrence in patients withdrawing from mADM after MBCT.

Translation?


A comment by Deborah Apthorp suggested that the original title, Switching from antidepressants to mindfulness meditation increases relapse, was incorrect. Checking, I realized that the abstract provided for the article was confusing, but the study did indeed show that MBCT alone led to more relapses than MBCT plus continued medication.

Here is what is said in the actual introduction to the article:

The main aim of this multicentre, noninferiority effectiveness trial was to examine whether patients who receive MBCT for recurrent depression in remission could safely withdraw from mADM, i.e. without increased relapse/recurrence risk, compared with the combination of these interventions. Patients were randomly allocated to MBCT followed by discontinuation of mADM or MBCT+mADM. The study had a follow-up of 15 months. Our primary hypothesis was that discontinuing mADM after MBCT would be non-inferior, i.e. would not lead to an unacceptably higher risk of relapse/ recurrence, compared with the combination of MBCT+mADM.

Here is what is said in the discussion:

The findings of this effectiveness study reflect an increased risk of relapse/recurrence for patients withdrawing from mADM after having participated in MBCT for recurrent depression.

So, to be clear, the sequence was that patients were randomized either to MBCT without antidepressants or to MBCT with continuing antidepressants. Patients were then followed up for 15 months. Patients who received MBCT without the antidepressants had significantly more relapses/recurrences in the follow-up period than those who received MBCT with antidepressants.

The study addresses the question of whether patients with remitted depression on maintenance antidepressants who were randomized to receive mindfulness-based cognitive therapy (MBCT) alone have poorer outcomes than those randomized to remain on their antidepressants.

The study found that poorer outcomes – more relapses – were experienced by patients switching to MBCT versus those remaining on antidepressants plus MBCT.

Strengths of the study

The patients were carefully assessed with validated semi-structured interviews to verify that they had recurrent past depression, were in current remission, and were taking their antidepressants. This assessment has an advantage over past studies that depended on less reliable primary-care physicians’ records to ascertain eligibility. There’s ample evidence that primary-care physicians often do not make systematic assessments in deciding whether or not to put patients on antidepressants.

The control group. The comparison/control group continued on antidepressants after they were assessed by a psychiatrist who made specific recommendations.

Power analysis. Calculation of sample size for this study was based on a noninferiority design. That meant that the investigators wanted to establish whether, within a particular limit (25%), switching to MBCT produced poorer outcomes.

A conventional clinical trial is designed to see if the null hypothesis of no difference between intervention and control groups can be rejected. As a noninferiority trial, this study instead tested whether shifting patients to MBCT alone resulted in an unacceptable rise in relapses and recurrences, with the margin set at 25%. Noninferiority trials are explained here.
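To make the confidence interval approach concrete, here is a minimal sketch in Python. The counts are made up for illustration, not taken from the trial; only the 25% margin and the arm sizes (128 discontinuers, 121 continuers) come from the abstract.

```python
import math

def noninferiority_check(events_new, n_new, events_ref, n_ref, margin=0.25, z=1.96):
    """Compute the 95% CI for the risk difference (new minus reference arm)
    and report whether non-inferiority holds at the given margin."""
    p_new = events_new / n_new
    p_ref = events_ref / n_ref
    diff = p_new - p_ref
    # Normal-approximation standard error for a difference in proportions
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    lo, hi = diff - z * se, diff + z * se
    # Non-inferior only if the entire CI lies below the allowed excess-relapse margin
    return (lo, hi), hi < margin

# Hypothetical counts: 60 relapses among 128 discontinuers vs 40 among 121 continuers
(ci_lo, ci_hi), noninferior = noninferiority_check(60, 128, 40, 121)
print(ci_lo, ci_hi, noninferior)
```

With these made-up numbers the upper bound of the CI crosses the 25% margin, so non-inferiority fails; and because the lower bound is also above zero, discontinuation is actually significantly worse, which is the same logical structure as the trial's conclusion.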

Change in plans for the study

The protocol for the study originally proposed a more complex design. Patients would be randomized to one of three conditions: (1) continuing antidepressants alone; (2) continuing antidepressants, but with MBCT; or (3) MBCT alone. The problem the investigators encountered was that many patients had a strong preference and did not want to be randomized. So, they conducted two separate randomized trials.

This change in plans was appropriately noted in a modification in the trial registration.

The companion study examined whether adding MBCT to maintenance antidepressants reduced relapses. The study was published first:

Huijbers MJ, Spinhoven P, Spijker J, Ruhé HG, van Schaik DJ, van Oppen P, Nolen WA, Ormel J, Kuyken W, van der Wilt GJ, Blom MB. Adding mindfulness-based cognitive therapy to maintenance antidepressant medication for prevention of relapse/recurrence in major depressive disorder: Randomised controlled trial. Journal of Affective Disorders. 2015 Nov 15;187:54-61.

A copy can be obtained from this repository.

It was a smaller study – 35 patients randomized to MBCT alone and 33 patients randomized to a combination of MBCT and continued antidepressants. There were no differences in relapse/recurrence in 15 months.

An important limitation on generalizability

The patients were recruited from university-based mental health settings. The minority of patients who move from treatment of depression in primary care to specialty mental health settings proportionately include more with moderate to severe depression and with a more defined history of past depression. In contrast, the patients being treated for depression in primary care include more whose depression is mild to moderate and whose current depression and past history have not been systematically assessed. There is evidence that primary-care physicians do not make diagnoses of depression based on a structured assessment. Many patients deemed depressed and in need of treatment will have milder depression and only meet the vaguer, less validated diagnosis of Depression Not Otherwise Specified.

Declaration of interest

The authors indicated no conflicts of interest to declare for either study.

Added February 29: This may be a true statement for the core Dutch researchers who led in conducting the study. However, it is certainly not true for the British collaborator, who may have served as a consultant and got authorship as a result. He has extensive conflicts of interest and gains a lot personally and professionally from the promotion of mindfulness in the UK. Read on.

The previous British study in The Lancet

Kuyken W, Hayes R, Barrett B, Byng R, Dalgleish T, Kessler D, Lewis G, Watkins E, Brejcha C, Cardy J, Causley A. Effectiveness and cost-effectiveness of mindfulness-based cognitive therapy compared with maintenance antidepressant treatment in the prevention of depressive relapse or recurrence (PREVENT): a randomised controlled trial. The Lancet. 2015 Jul 10;386(9988):63-73.

I provided my extended critique of this study in a previous blog post:

Is mindfulness-based therapy ready for rollout to prevent relapse and recurrence in depression?

The study protocol claimed it was designed as a superiority trial, but the authors did not provide the added sample size needed to demonstrate superiority. And they spun null findings, starting in their abstract:

However, when considered in the context of the totality of randomised controlled data, we found evidence from this trial to support MBCT-TS as an alternative to maintenance antidepressants for prevention of depressive relapse or recurrence at similar costs.

What is wrong here? They are discussing null findings as if they had conducted a noninferiority trial with sufficient power to show that differences of a particular size could be ruled out. Lots of psychotherapy trials are underpowered, but null findings from them should not be used to declare that treatments can be substituted for each other.

Contrasting features of the previous study versus the present one

Spinning of null findings. According to the trial registration, the previous study was designed to show that MBCT was superior to maintenance antidepressant treatment in preventing relapse and recurrence. A superiority trial tests the hypothesis that an intervention is better than a control group by a pre-set margin. For a very cool slideshow comparing superiority to noninferiority trials, see here.

Rather than demonstrating that MBCT was superior to routine care with maintenance antidepressant treatment, The Lancet study failed to find significant differences between the two conditions. In an amazing feat of spin, the authors took to publicizing this as a success, claiming that MBCT was equivalent to maintenance antidepressants. Equivalence is a stricter criterion that requires more than null findings – any differences must fall within pre-set (registered) margins. Many null findings represent low power to find significant differences, not equivalence.
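The point that "not significantly different" is not "equivalent" can be shown numerically. This sketch uses entirely hypothetical relapse counts, not The Lancet trial's data: a modest-sized trial yields a non-significant difference whose confidence interval nonetheless includes clinically important differences, so equivalence cannot be claimed.

```python
import math

def risk_diff_ci(e1, n1, e2, n2, z=1.96):
    """95% CI for the difference in event proportions between two arms."""
    p1, p2 = e1 / n1, e2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Hypothetical underpowered trial: 44% vs 48% relapse in two arms of 200 each
lo, hi = risk_diff_ci(88, 200, 96, 200)

significant = lo > 0 or hi < 0              # does the CI exclude zero?
equiv_margin = 0.10                         # a pre-set +/-10% equivalence margin
equivalent = -equiv_margin < lo and hi < equiv_margin  # CI entirely inside margin?
print(lo, hi, significant, equivalent)
```

The difference is not statistically significant, yet the interval still admits a difference of more than 13 percentage points in one direction, so the data are also compatible with one treatment being meaningfully worse. Spinning such a null result as equivalence is exactly the error at issue.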

Patient selection. Patients were recruited from primary care on the basis of records indicating they had been prescribed antidepressants two years earlier. There was no ascertainment of whether the patients were currently adhering to the antidepressants or whether they were getting effective monitoring with feedback.

Poorly matched, nonequivalent comparison/control group. The guidelines that patients with recurrent depression should remain on antidepressants for two years were developed based on studies in tertiary care. It’s likely that many of these patients were never systematically assessed for the appropriateness of treatment with antidepressants, follow-up was spotty, and many patients were not even continuing to take their antidepressants with any regularity.

So, MBCT was being compared to an ill-defined, unknown condition in which some proportion of patients did not need to be taking antidepressants and were not taking them. This routine care also lacked the intensity, positive expectations, attention, and support of the MBCT condition. If an advantage for MBCT had been found – and it was not – it might only have meant that there was nothing specific about MBCT, only the benefits of providing nonspecific elements that were lacking in routine care.

The unknowns. There was no assessment of whether the patients actually practiced MBCT, and so there was further doubt that anything specific to MBCT was relevant. But then again, in the absence of any differences between groups, we may not have anything to explain.

  • Given that we don’t know what proportion of patients were taking an adequate maintenance dose of antidepressants, we don’t know whether any further treatment was needed for them – or for what proportion.
  • We don’t know whether it would have been more cost-effective simply to have a depression care manager recontact patients and determine whether they were still taking their antidepressants and whether they were interested in a supervised tapering.
  • We’re not even given an answer as to the extent to which primary care patients provided with MBCT actually practiced it.

A well-orchestrated publicity campaign to misrepresent the findings. Rather than offering an independent critical evaluation of The Lancet study, press coverage offered the investigators’ preferred spin. As I noted in a previous blog:

The headline of a Guardian column  written by one of the Lancet article’s first author’s colleagues at Oxford misleadingly proclaimed that the study showed

[Guardian headline screenshot]

And that misrepresentation was echoed in the Mental Health Foundation call for mindfulness to be offered through the UK National Health Service –

 

[screenshot: Mental Health Foundation call for NHS mindfulness]

The Mental Health Foundation is offering a 10-session online course for £60 and is undoubtedly prepared for an expanded market.

Declaration of interests

WK [the first author] and AE are co-directors of the Mindfulness Network Community Interest Company and teach nationally and internationally on MBCT. The other authors declare no competing interests.

Like most declarations of conflicts of interest, this one alerts us to something we might be concerned about but does not adequately inform us.

We are not told, for instance, something the authors were likely to know: soon after all the hoopla about the study, the Oxford Mindfulness Centre, which is directed by the first author but not mentioned in the declaration of interest, publicized a massive effort by the Wellcome Trust to roll out its Mindfulness in the Schools project, which provides mindfulness training to children, teachers, and parents.

A recent headline in The Times says it all.

[Times headline screenshot]

Confirmation bias in subsequent citing

It is generally understood that much of what we read in the scientific literature is false or exaggerated due to various Questionable Research Practices (QRPs) leading to confirmation bias in what is reported in the literature. But there is another kind of confirmation bias, associated with the creation of false authority through citation distortion. It’s well documented that proponents of a particular view selectively cite papers in terms of whether the conclusions support their position. Not only are positive findings from original reports exaggerated as they progress through citations; negative findings receive less attention or are simply lost.

Huijbers et al. transparently reported that switching to MBCT alone leads to more relapses in patients who have recovered from depression. I confidently predict that these findings will be cited less often than the poorer-quality The Lancet study, which was spun to create the appearance that it showed MBCT had outcomes equivalent to remaining on antidepressants. I also predict that the Huijbers et al. MBCT study will often be misrepresented when it is cited.

Added February 29: For whatever reason, perhaps because he served as a consultant, the author of The Lancet study is also an author on this paper, which describes a study conducted entirely in the Netherlands. Note, however, that when it comes to the British The Lancet study, this article cites it as replicating past work when it was a null trial. This is an example of creating false authority by distorted citation in action. I can’t judge whether the Dutch authors simply accepted the conclusions offered in the abstract and press coverage of The Lancet study, or whether The Lancet author influenced their interpretation of it.

I would be very curious whether, in his outpouring of subsequent papers on MBCT, the author of The Lancet paper cites this paper, and whether he cites it accurately. Skeptics, join me in watching.

What do I think is going on in the study?

I think it is apparent that the authors selected a group of patients who have remitted from their depression, but who are at risk for relapse and recurrence if they go without treatment. With such chronic, recurring depression, there is evidence that psychotherapy adds little to medication, particularly when patients are showing a clinical response to the antidepressants. However, psychotherapy benefits from antidepressants being added.

But a final point is important – MBCT was never designed as a primary cognitive behavioral therapy for depression. It was intended as a means of patients paying attention to cues suggesting they are sliding back into depression, and taking appropriate action. It’s unfortunate that it has been oversold as something more than this.