When psychotherapy trials have multiple flaws…

Multiple flaws pose more threats to the validity of psychotherapy studies than would be inferred when the individual flaws are considered independently.

We can learn to spot features of psychotherapy trials that are likely to lead to exaggerated claims of efficacy for treatments, or claims that will not generalize beyond the sample being studied in a particular clinical trial. We can look to the adequacy of sample size, and spot what the Cochrane Collaboration has defined as risk of bias with its handy assessment tool.

We can look at the case-mix in the particular sites where patients were recruited. We can examine the adequacy of the diagnostic criteria used for entering patients into a trial. We can examine how well the trial was blinded: not only who assigned patients to particular conditions, but also whether the patients, the treatment providers, and the outcome evaluators knew to which condition particular patients were assigned.

And so on. But what about combinations of these factors?

We typically do not pay enough attention to multiple flaws in the same trial. I include myself among the guilty. We may suspect that flaws are seldom simply additive in their effect, but we don’t consider whether there may even be synergism in their negative effects on the validity of a trial. As we will see in this analysis of a clinical trial, multiple flaws can pose more threats to the validity of a trial than we might infer when the individual flaws are considered independently.

The particular paper we are probing is described in its discussion section as the “largest RCT to date testing the efficacy of group CBT for patients with CFS.” It also takes on added importance because two of the authors, Gijs Bleijenberg and Hans Knoop, are considered leading experts in the Netherlands. The treatment protocol was developed over time by the Dutch Expert Centre for Chronic Fatigue (NKCV, http://www.nkcv.nl; Knoop and Bleijenberg, 2010). Moreover, these senior authors dismiss any criticism and even ridicule critics. This study is cited as support for their overall assessment of their own work.  Gijs Bleijenberg claims:

Cognitive behavioural therapy is still an effective treatment, even the preferential treatment for chronic fatigue syndrome.

But

Not everybody endorses these conclusions, however their objections are mostly baseless.

Spoiler alert

This is a long read blog post. I will offer a summary for those who don’t want to read through it, but who still want the gist of what I will be saying. However, as always, I encourage readers to be skeptical of what I say and to look to my evidence and arguments and decide for themselves.

Authors of this trial stacked the deck to demonstrate that their treatment is effective. They are striving to support the extraordinary claim that group cognitive behavior therapy fosters not only better adaptation, but actually recovery from what is internationally considered a physical condition.

There are some obvious features of the study that contribute to the likelihood of a positive effect, but these features need to be considered collectively, in combination, to appreciate the strength of this effort to guarantee positive results.

This study represents the perfect storm of design features that operate synergistically:

Referral bias – trial conducted in a single specialized treatment setting known for advocating the view that psychological factors maintain physical illness.

Strong self-selection bias of a minority of patients enrolling in the trial seeking a treatment they otherwise cannot get.

Broad, overinclusive diagnostic criteria for entry into the trial.

An active treatment condition carrying a strong message about how patients should respond to outcome assessment, namely with reports of improvement.

An unblinded trial with a waitlist control lacking the nonspecific elements (placebo) that confound the active treatment.

Subjective self-report outcomes.

Specifying a clinically significant improvement that required only that a primary outcome score fall below the threshold needed for entry into the trial.

Deliberate exclusion of relevant objective outcomes.

Avoidance of any recording of negative effects.

Despite the prestige attached to this trial in Europe, the US Agency for Healthcare Research and Quality (AHRQ) excludes this trial from providing evidence for its database of treatments for chronic fatigue syndrome/myalgic encephalomyelitis. We will see why in this post.

The take-away message: although not many psychotherapy trials incorporate all of these factors, most trials have some. We should be more sensitive to when multiple factors occur in the same trial, such as bias in the site of patient recruitment, lack of blinding, lack of balance between active treatment and control condition in terms of nonspecific factors, and subjective self-report measures.

The article reporting the trial is

Wiborg JF, van Bussel J, van Dijk A, Bleijenberg G, Knoop H. Randomised controlled trial of cognitive behaviour therapy delivered in groups of patients with chronic fatigue syndrome. Psychotherapy and Psychosomatics. 2015;84(6):368-76.

Unfortunately, the article is currently behind a paywall. Perhaps readers could contact the corresponding author Hans.knoop@radboudumc.nl and request a PDF.

The abstract

Background: Meta-analyses have been inconclusive about the efficacy of cognitive behaviour therapies (CBTs) delivered in groups of patients with chronic fatigue syndrome (CFS) due to a lack of adequate studies. Methods: We conducted a pragmatic randomised controlled trial with 204 adult CFS patients from our routine clinical practice who were willing to receive group therapy. Patients were equally allocated to therapy groups of 8 patients and 2 therapists, 4 patients and 1 therapist or a waiting list control condition. Primary analysis was based on the intention-to-treat principle and compared the intervention group (n = 136) with the waiting list condition (n = 68). The study was open label. Results: Thirty-four (17%) patients were lost to follow-up during the course of the trial. Missing data were imputed using mean proportions of improvement based on the outcome scores of similar patients with a second assessment. Large and significant improvement in favour of the intervention group was found on fatigue severity (effect size = 1.1) and overall impairment (effect size = 0.9) at the second assessment. Physical functioning and psychological distress improved moderately (effect size = 0.5). Treatment effects remained significant in sensitivity and per-protocol analyses. Subgroup analysis revealed that the effects of the intervention also remained significant when both group sizes (i.e. 4 and 8 patients) were compared separately with the waiting list condition. Conclusions: CBT can be effectively delivered in groups of CFS patients. Group size does not seem to affect the general efficacy of the intervention which is of importance for settings in which large treatment groups are not feasible due to limited referral

The trial registration

http://www.isrctn.com/ISRCTN15823716

Who was enrolled into the trial?

Who gets into a psychotherapy trial is a function of the particular treatment setting of the study, the diagnostic criteria for entry, and patient preferences for getting their care through a trial, rather than what is being routinely provided in that setting.

We need to pay particular attention when patients enter psychotherapy trials hoping they will receive a treatment they prefer and not be assigned to the other condition. Patients may be in a clinical trial for the betterment of science, but in some settings they are willing to enroll because of the probability of getting treatment they otherwise could not get. This in turn affects their evaluation both of the condition in which they get the preferred treatment and of the condition in which they are denied it. Simply put, they register being pleased if they got what they wanted, or displeased if they did not.

The setting is relevant to evaluating who was enrolled in a trial.

The authors’ own outpatient clinic at the Radboud University Medical Center was the site of the study. The group has an international reputation for promoting the biopsychosocial model, in which psychological factors are assumed to be the decisive factor in maintaining somatic complaints.

All patients were referred to our outpatient clinic for the management of chronic fatigue.

There is thus a clear referral bias or case-mix bias, but we are not provided a ready basis for quantifying it or even estimating its effects.

The diagnostic criteria.

The article states:

In accordance with the US Center for Disease Control [9], CFS was defined as severe and unexplained fatigue which lasts for at least 6 months and which is accompanied by substantial impairment in functioning and 4 or more additional complaints such as pain or concentration problems.

Actually, the US Centers for Disease Control and Prevention would now reject this trial because these entry criteria are considered obsolete, overinclusive, and not sufficiently exclusive of other conditions that might be associated with chronic fatigue.*

There is a real paradigm shift happening in America. Both the 2015 IOM report and the Centers for Disease Control and Prevention (CDC) website emphasize post-exertional malaise, that is, getting more ill after any effort, in ME. CBT is no longer recommended by the CDC as treatment.

The only mandatory symptom for inclusion in this study is fatigue lasting 6 months. Most properly, this trial targets chronic fatigue [period] and not the condition, chronic fatigue syndrome.

Current US CDC recommendations (see Box 7-1 from the IOM document) require post-exertional malaise for a diagnosis of myalgic encephalomyelitis (ME).

Patients meeting the current American criteria for ME would be eligible for enrollment in this trial, but it is unclear what proportion of the patients enrolled actually met the American criteria. Because of the over-inclusiveness of the entry diagnostic criteria, it is doubtful whether the results would generalize to an American sample. A look at patient flow into the study will be informative.

Patient flow

Let’s look at what is said in the text, but also in the chart depicting patient flow into the trial for any self-selection that might be revealed.

In total, 485 adult patients were diagnosed with CFS during the inclusion period at our clinic (fig. 1). One hundred and fifty-seven patients were excluded from the trial because they declined treatment at our clinic, were already asked to participate in research incompatible with inclusion (e.g. research focusing on individual CBT for CFS) or had a clinical reason for exclusion (i.e. they received specifically tailored interventions because they were already unsuccessfully treated with individual CBT for CFS outside our clinic or were between 18 and 21 years of age and the family had to be involved in the therapy). Of the 328 patients who were asked to engage in group therapy, 99 (30%) patients indicated that they were unwilling to receive group therapy. In 25 patients, the reason for refusal was not recorded. Two hundred and four patients were randomly allocated to one of the three trial conditions. Baseline characteristics of the study sample are presented in table 1. In total, 34 (17%) patients were lost to follow-up. Of the remaining 170 patients, 1 patient had incomplete primary outcome data and 6 patients had incomplete secondary outcome data.

flow chart

We see that the investigators invited about two thirds (328 of 485) of patients attending the clinic to enroll in the trial. Of these, 124 (38%) refused. We don’t know the reason for some of the refusals, but almost a third of the patients approached declined because they did not want group therapy. The authors were left able to randomize 204 patients, 42% of those coming to the clinic, or less than two thirds of those they actually asked. Of these 204 patients, 170 (83%) remained available for follow-up.
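As a quick check of the arithmetic, the percentages can be reproduced directly from the counts reported in the flow description quoted above (a sketch; all numbers are taken from the trial's own text):

```python
# Patient-flow counts as reported in the trial's own description
diagnosed = 485                      # diagnosed with CFS at the clinic
excluded = 157                       # excluded before being invited
asked = diagnosed - excluded         # invited to engage in group therapy
randomized = 204                     # allocated to one of the three conditions
refused = asked - randomized         # declined (99 unwilling + 25 unrecorded)
lost_to_followup = 34
followed_up = randomized - lost_to_followup

print(f"invited:     {asked} ({asked / diagnosed:.0%} of those diagnosed)")
print(f"refused:     {refused} ({refused / asked:.0%} of those invited)")
print(f"randomized:  {randomized} ({randomized / diagnosed:.0%} of those diagnosed)")
print(f"followed up: {followed_up} ({followed_up / randomized:.0%} of those randomized)")
```

At each step a further slice of the clinic population drops away, which is what makes the final sample a self-selected minority.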

These patients, who received the treatment to which they were randomized and were available for follow-up, are a self-selected minority of the patients coming to the clinic. This self-selection process likely reduced the proportion of patients with myalgic encephalomyelitis. It is estimated that 25% of patients meeting the American criteria are housebound and 75% are unable to work. It is reasonable to infer that patients meeting the full criteria would opt out of a treatment that requires regular attendance at group sessions.

The trial is thus biased toward ambulatory patients with fatigue, not ME. Their fatigue is likely due to some combination of factors such as multiple co-morbidities, as-yet-undiagnosed medical conditions, drug interactions, and the common mild and subsyndromal anxiety and depressive symptoms that characterize primary care populations.

The treatment being evaluated

Group cognitive behavior therapy for chronic fatigue syndrome, either delivered in a small (4 patients and 1 therapist) or larger (8 patients and 2 therapists) group format.

The intervention consisted of 14 group sessions of 2 h within a period of 6 months followed by a second assessment. Before the intervention started, patients were introduced to their group therapist in an individual session. The intervention was based on previous work of our research group [4,13] and included personal goal setting, fixing sleep-wake cycles, reducing the focus on bodily symptoms, a systematic challenge of fatigue-related beliefs, regulation and gradual increase in activities, and accomplishment of personal goals. A formal exercise programme was not part of the intervention.

Patients received a workbook with the content of the therapy. During sessions, patients were explicitly invited to give feedback about fatigue-related cognitions and behaviours to fellow patients. This aspect was introduced to facilitate a pro-active attitude and to avoid misperceptions of the sessions as support group meetings which have been shown to be insufficient for the treatment of CFS.

And note:

In contrast to our previous work [4], we communicated recovery in terms of fatigue and disabilities as general goal of the intervention.

Some impressions of the intensity of this treatment: this is a rather intensive treatment, with patients having considerable opportunities for interaction with providers. This factor alone distinguishes being assigned to the intervention group from being left in the wait-list control group, and could prove powerful. It will be difficult to distinguish intensity of contact from any content or active ingredients of the therapy.

I’ll leave for another time a fuller discussion of the extent to which what was labeled as cognitive behavior therapy in this study is consistent with cognitive therapy as practiced by Aaron Beck and other leaders of the field. However, a few comments are warranted. What is offered in this trial does not sound like cognitive therapy as Americans practice it. It seems to emphasize challenging beliefs and pushing patients to get more active, along with psychoeducational activities. I don’t see indications of the supportive, collaborative relationship in which patients are encouraged to work on what they want to work on, engage in outside activities (homework assignments), and get feedback.

What is missing in this treatment is what Beck calls collaborative empiricism, “a systemic process of therapist and patient working together to establish common goals in treatment, has been found to be one of the primary change agents in cognitive-behavioral therapy (CBT).”

Importantly, in Beck’s approach, the therapist does not assume cognitive distortions on the part of the patient. Rather, in collaboration with the patient, the therapist introduces alternatives to the interpretations that the patient has been making and encourages the patient to consider the difference. In contrast, rather than eliciting goal statements from patients, therapists in this study impose the goal of increased activity. They also seem ready to impose their view that the patients’ fatigue-related beliefs are maladaptive.

The treatment offered in this trial is complex, with multiple components making multiple assumptions that seem quite different from what is called cognitive therapy or cognitive behavioral therapy in the US.

The authors’ communication of recovery from fatigue and disability seems a radical departure not only from cognitive behavior therapy for anxiety and depression and pain, but for cognitive behavior therapy offered for adaptation to acute and chronic physical illnesses. We will return to this “communication” later.

The control group

Patients not randomized to group CBT were placed on a waiting list.

Think about it! What do patients think after having gotten involved in all the inconvenience and burden of a clinical trial in hope of getting treatment, and then being assigned to a control group that just waits? Not only are they going to be disappointed and register that in their subjective evaluations at outcome assessment; patients may also worry about jeopardizing their right to the treatment they are waiting for if they endorse overly positive outcomes. There is a potential for a nocebo effect, compounding the placebo effect of assignment to the CBT active treatment groups.

What are informative comparisons between active treatments and control conditions?

We need to ask more often what inclusion of a control group accomplishes for the evaluation of a psychotherapy. In doing so, we need to keep in mind that psychotherapies do not have effect sizes; only comparisons of psychotherapies with control conditions have effect sizes.

A pre-post evaluation of psychotherapy from baseline to follow-up includes the effects of any active ingredient in the psychotherapy, a host of nonspecific (placebo) factors, and any changes that would have occurred in the absence of the intervention. The last include regression to the mean: patients are more likely to enter a clinical trial now, rather than earlier or later, if there has just been an exacerbation of their symptoms.
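Regression to the mean is easy to demonstrate by simulation. The sketch below uses purely illustrative numbers (nothing here comes from the trial): "patients" are enrolled only when a noisy screening measurement catches them during an exacerbation, and are then re-measured later with no treatment at all.

```python
import random

random.seed(1)

def fatigue_score(trait):
    """A single measurement: stable underlying fatigue plus day-to-day noise."""
    return trait + random.gauss(0, 8)

# A population with stable underlying fatigue levels (arbitrary units)
population = [random.gauss(40, 6) for _ in range(10_000)]

# Enroll only those who score above an entry threshold at screening,
# i.e., those who happen to be measured during an exacerbation
enrolled = []
for trait in population:
    screening = fatigue_score(trait)
    if screening >= 50:
        enrolled.append((trait, screening))

mean_entry = sum(s for _, s in enrolled) / len(enrolled)
# Measure the same patients again later, with no intervention whatsoever
mean_followup = sum(fatigue_score(t) for t, _ in enrolled) / len(enrolled)

print(f"mean score at entry:     {mean_entry:.1f}")
print(f"mean score at follow-up: {mean_followup:.1f}")
# The follow-up mean falls back toward the population mean purely
# through selection, with no treatment effect of any kind
```

A pre-post comparison within the treated arm cannot separate this artifact from the therapy itself; that is one of the things a proper control group is for.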

So, a proper comparison/control condition includes everything that the patients randomized to the intervention group get except for the active treatment. Ideally, the intervention and the comparison/control group are equivalent on all these factors, except the active ingredient of the intervention.

That is clearly not what is happening in this trial. Patients randomized to the intervention group get the intervention, the added intensity and frequency of contact with professionals that the intervention provides, and all the support that goes with it; and the positive expectations that come with getting a therapy that they wanted.

Attempts to evaluate the group CBT versus the wait-list control group involved confounding the active ingredients of the CBT and all these nonspecific effects. The deck is clearly being stacked in favor of CBT.

This may be a randomized trial, but properly speaking, this is not a randomized controlled trial, because the comparison group does not control for nonspecific factors, which are imbalanced.

The unblinded nature of the trial

In RCTs of psychotropic drugs, the ideal is to compare the psychotropic drug to an inert pill placebo, with providers, patients, and evaluators blinded as to whether patients received the psychotropic drug or the comparison pill.

While it is difficult to achieve a comparable level of blindness in a psychotherapy trial, more of an effort to achieve blindness is desirable. For instance, in this trial, the authors took pains to distinguish the CBT from what would have happened in a support group. A much more adequate comparison would therefore be CBT versus either a professionally led or peer-led support group with equivalent amounts of contact time. Further blinding would be possible if patients were told only that two forms of group therapy were being compared. If that were the information available to patients contemplating consenting to the trial, it would not have been so obvious from the outset which assignment was preferable.

Subjective self-report outcomes.

The primary outcomes for the trial were the fatigue subscale of the Checklist Individual Strength, the physical functioning subscale of the Short Form Health Survey (SF-36), and overall impairment as measured by the Sickness Impact Profile (SIP).

Realistically, self-report outcomes are often all that is available in many psychotherapy trials. Commonly these are self-report assessments of anxiety and depressive symptoms, although these may be supplemented by interviewer-based assessments. We don’t have objective biomarkers with which to evaluate psychotherapy.

These three self-report measures are relatively nonspecific, particularly in a population that is not characterized by ME. Self-reported fatigue in a primary care population lacks discriminative validity with respect to pain, anxiety and depressive symptoms, and general demoralization. The measures are susceptible to receipt of support and re-moralization, as well as gratitude for obtaining a treatment that was sought.

Self-report entry criteria include a score of 35 or higher on the fatigue severity subscale. Yet a score of less than 35 on this scale at follow-up is part of what is defined as clinically significant improvement, within a composite score from combined self-report measures.
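To see how low this bar is, consider a hypothetical patient at the margin. This sketch uses illustrative numbers; the only detail taken from the trial is the 35-point entry threshold, and this shows only the fatigue component of the composite criterion.

```python
ENTRY_THRESHOLD = 35   # minimum fatigue severity score required to enter the trial

def fatigue_component_improved(followup_score):
    """The fatigue part of the trial's composite 'clinically significant
    improvement': simply scoring below the entry threshold at follow-up."""
    return followup_score < ENTRY_THRESHOLD

# A hypothetical patient who barely qualified at baseline (score 35) and
# shifts by a single point on the scale now counts as improved
print(fatigue_component_improved(34))   # True
print(fatigue_component_improved(35))   # False: the unchanged patient
```

A one-point shift on a self-report scale is well within measurement noise, yet on this component it moves a patient across the line from "ill enough to enroll" to "clinically significantly improved."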

We know from medical trials that differences can be observed with subjective self-report measures that will not be found with objective measures. Thus, mildly asthmatic patients will fail to distinguish in their subjective self-reports between the effective inhalant albuterol, an inert inhalant, and sham acupuncture, although they will rate all three as better than getting no intervention. However, albuterol shows a strong advantage over the other three conditions on an objective measure, maximum forced expiratory volume in 1 second (FEV1), as assessed with spirometry.

The suppression of objective outcome measures

We cannot let the authors of this trial off the hook for their dependence on subjective self-report outcomes. They are instructing patients that recovery is the goal, which implies that it is an attainable goal. We can reasonably be skeptical about a claim of recovery based on changes in self-report measures. Were the patients actually able to exercise? What was their exercise capacity, as objectively measured? Did they return to work?

These authors have included such objective measurements in past studies, but have not included them as primary outcomes, nor, in some cases, even reported them in the main paper reporting the trial.

Wiborg JF, Knoop H, Stulemeijer M, Prins JB, Bleijenberg G. How does cognitive behaviour therapy reduce fatigue in patients with chronic fatigue syndrome? The role of physical activity. Psychol Med. 2010 Jan 5:1

The senior authors’ review fails to mention their three studies using actigraphy that did not find effects for CBT. I am unaware of any studies that did find enduring effects.

Perhaps this is what they mean when they say the protocol has been developed over time – they removed what they found to be threats to the findings that they wanted to claim.

Dismissing of any need to consider negative effects of treatment

Most psychotherapy trials fail to assess any adverse effects of treatment, but this is usually done discreetly, without mention. In contrast, this article states:

Potential harms of the intervention were not assessed. Previous research has shown that cognitive behavioural interventions for CFS are safe and unlikely to produce detrimental effects.

Patients who meet stringent criteria for ME would be put at risk by pressure to exert themselves. By definition, they are vulnerable to post-exertional malaise (PEM). Any trial of this nature needs to assess that risk. Maybe no adverse effects would be found. If that were so, it would strongly suggest the absence of patients with the appropriate diagnosis.

Timing of assessment of outcomes varied between intervention and control group.

I at first did not believe what I was reading when I encountered this statement in the results section.

The mean time between baseline and second assessment was 6.2 months (SD = 0.9) in the control condition and 12.0 months (SD = 2.4) in the intervention group. This difference in assessment duration was significant (p < 0.001) and was mainly due to the fact that the start of the therapy groups had to be frequently postponed because of an irregular patient flow and limited treatment capacities for group therapy at our clinic. In accordance with the treatment manual, the second assessment was postponed until the fourteenth group session was accomplished. The mean time between the last group session and the second assessment was 3.3 weeks (SD = 3.5).

So, outcomes were assessed for the intervention group shortly after completion of therapy, when nonspecific (placebo) effects would be stronger, but a mean of six months later than for patients assigned to the control condition.

Post-hoc statistical controls are not sufficient to rescue the study from this important group difference, and it compounds other problems in the study.

Take away lessons

Pay more attention to how the limitations of any clinical trial may compound each other, leading the trial to provide exaggerated estimates of the effects of treatment or of the generalizability of the results to other settings.

Be careful of loose diagnostic criteria, because a trial may not generalize to the same criteria applied in settings that differ in patient population or in the availability of different treatments. This is particularly important when a treatment setting has a bias in referrals and only a minority of the patients invited to participate in the trial actually agree and are enrolled.

Ask questions about just what information is obtained by comparing the active treatment group in a study to its control/comparison group. For a start, just what is being controlled, and how might that affect the estimates of the effectiveness of the active treatment?

Pay particular attention to the potent combination of a trial being unblinded, a weak comparison/control, and an active treatment that is not otherwise available to patients.

Note

*The means of determining whether the six months of fatigue might be accounted for by other medical factors was specific to the setting. Note that a review of medical records sufficed for an unknown proportion of patients, with no further examination or medical tests.

The Department of Internal Medicine at the Radboud University Medical Center assessed the medical examination status of all patients and decided whether patients had been sufficiently examined by a medical doctor to rule out relevant medical explanations for the complaints. If patients had not been sufficiently examined, they were seen for standard medical tests at the Department of Internal Medicine prior to referral to our outpatient clinic. In accordance with recommendations by the Centers for Disease Control, sufficient medical examination included evaluation of somatic parameters that may provide evidence for a plausible somatic explanation for prolonged fatigue [for a list, see [9]. When abnormalities were detected in these tests, additional tests were made based on the judgement of the clinician of the Department of Internal Medicine who ultimately decided about the appropriateness of referral to our clinic. Trained therapists at our clinic ruled out psychiatric comorbidity as potential explanation for the complaints in unstructured clinical interviews.

A skeptical look at The Lancet behavioural activation versus CBT for depression (COBRA) study

A skeptical look at:

Richards DA, Ekers D, McMillan D, Taylor RS, Byford S, Warren FC, Barrett B, Farrand PA, Gilbody S, Kuyken W, O’Mahen H, et al. Cost and Outcome of Behavioural Activation versus Cognitive Behavioural Therapy for Depression (COBRA): a randomised, controlled, non-inferiority trial. The Lancet. 2016 Jul 23.

All the Queen’s horses and all the Queen’s men (and a few women) can’t put a flawed depression trial back together again.

Were they working below their pay grade? The 14 authors of the study collectively have impressive expertise. They claim to have obtained extensive consultation in designing and implementing the trial. Yet they produced:

  • A study doomed from the start by serious methodological problems that precluded any scientifically valid and generalizable results.
  • Instead, they produced tortured results that pander to policymakers seeking an illusory cheap fix.

Why the interests of persons with mental health problems are not served by translating the hype from a wasteful project into clinical practice and policy.

Maybe you were shocked and awed, as I was, by the publicity campaign mounted by The Lancet on behalf of a terribly flawed article in The Lancet Psychiatry about whether locked inpatient wards fail suicidal patients.

It was a minor league effort compared to the campaign orchestrated by the Science Media Centre for a recent article in The Lancet. The study was a noninferiority trial of behavioural activation (BA) versus cognitive behaviour therapy (CBT) for depression. The message echoing through social media, without any critical response, was that behavioural activation for depression delivered by minimally trained mental health workers was cheaper but just as effective as cognitive behavioural therapy delivered by clinical psychologists.

Reflecting the success of the campaign, the immediate reactions to the article are like nothing I have recently seen. Here are the published altmetrics for an article with an extraordinary overall score of 696 (!) as of August 24, 2016.


Here is the press release.

Here is the full article reporting the study, which nobody in the Twitter storm seems to have consulted.

some news coverage

Here are supplementary materials.

Here is the well-orchestrated, uncritical response from tweeters, UK academics, and policy makers.


The Basics of the study

The study was an open-label, two-armed non-inferiority trial of behavioural activation therapy (BA) versus cognitive behavioural therapy (CBT) for depression, with no non-specific comparison/control treatment.

The primary outcome was depression symptoms measured with the self-report PHQ-9 at 12 months.

Delivery of both BA and CBT followed written manuals for a maximum of 20 60-minute sessions over 16 weeks, but with the option of four additional booster sessions if the patients wanted them. Receipt of eight sessions was considered an adequate exposure to the treatments.

The BA was delivered by

Junior mental health professionals —graduates trained to deliver guided self-help interventions, but with neither professional mental health qualifications nor formal training in psychological therapies—delivered an individually tailored programme re-engaging participants with positive environmental stimuli and developing depression management strategies.

CBT, in contrast, was delivered by

Professional or equivalently qualified psychotherapists, accredited as CBT therapists with the British Association of Behavioural and Cognitive Psychotherapy, with a postgraduate diploma in CBT.

The interpretation provided by the journal article:

Junior mental health workers with no professional training in psychological therapies can deliver behavioural activation, a simple psychological treatment, with no lesser effect than CBT has and at less cost. Effective psychological therapy for depression can be delivered without the need for costly and highly trained professionals.

A non-inferiority trial

An NHS website explains non-inferiority trials:

The objective of non-inferiority trials is to compare a novel treatment to an active treatment with a view of demonstrating that it is not clinically worse with regards to a specified endpoint. It is assumed that the comparator treatment has been established to have a significant clinical effect (against placebo). These trials are frequently used in situations where use of a superiority trial against a placebo control may be considered unethical.

I have previously critiqued noninferiority psychotherapy trials [1, 2]. I will simply reproduce a passage here:

Noninferiority trials (NIs) have a bad reputation. Consistent with a large literature, a recent systematic review of NI HIV trials  found the overall methodological quality to be poor, with a high risk of bias. The people who brought you CONSORT saw fit to develop special reporting standards for NIs  so that misuse of the design in the service of getting publishable results is more readily detected.

Basically, an NI RCT commits investigators and readers to accepting null results as support for a new treatment because it is no worse than an existing one. Suspicions are immediately raised as to why investigators might want to make that point.

Noninferiority trials are very popular among Pharma companies marketing rivals to popular medications. They use noninferiority trials to show that their brand is no worse than the already popular medication. But by not including a nonspecific control group, the trialists don’t bother to show that either of the medications is more effective than placebo under the conditions in which they were administered in these trials. Often, the medication dominating the market had achieved FDA approval with evidence of being only modestly effective. So, potatoes are noninferior to spuds.
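The logic of a noninferiority comparison can be sketched in a few lines. This is a minimal illustration with invented PHQ-9-style scores and an arbitrary margin, not the trial’s actual data or prespecified margin:

```python
import math
from statistics import mean, stdev

def noninferiority_check(new, ref, margin):
    """Judge noninferiority on a symptom scale where higher = worse.

    Computes the mean difference (new - ref) and a two-sided 95% CI
    using a normal approximation; the new treatment is declared
    noninferior only if the CI's upper bound stays below the
    prespecified margin."""
    diff = mean(new) - mean(ref)
    se = math.sqrt(stdev(new) ** 2 / len(new) + stdev(ref) ** 2 / len(ref))
    ci = (diff - 1.96 * se, diff + 1.96 * se)
    return diff, ci, ci[1] < margin

# Hypothetical 12-month depression scores (illustration only, not trial data)
ba_scores = [8, 10, 7, 9, 11, 8, 10, 9, 7, 10]
cbt_scores = [8, 9, 7, 10, 9, 8, 9, 10, 8, 9]

# The margin of 1.9 points here is a stand-in value for illustration.
diff, ci, noninferior = noninferiority_check(ba_scores, cbt_scores, margin=1.9)
```

Note what the check never asks: whether either arm beat a placebo. A “noninferior” verdict is entirely relative to the comparator, which is exactly the weakness exploited when the comparator’s own efficacy is uncertain.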

Compounding the problems of a noninferiority trial many times over

Let’s not dwell on this trial being a noninferiority trial, although I will return to the problem of knowing what would happen in the absence of either intervention or with a credible, nonspecific control group. Let’s focus instead on some other features of the trial that seriously compromised an already compromised trial.

Essentially, we will see that the investigators reached out to primary care patients who were mostly already receiving treatment with antidepressants, but unlikely with the support, positive expectations, or even adherence necessary to obtain benefit. By providing these nonspecific factors, any psychological intervention would be likely to prove effective in the short run.

The total amount of treatment offered substantially exceeded what is typically provided in clinical trials of CBT. However, uptake and actual receipt of treatment are likely to be low in a population recruited by outreach rather than by actively seeking treatment. So, noise is introduced by offering so much treatment.

A considerable proportion of primary care patients identified as depressed won’t accept treatment or will not accept the full intensity available. However, without careful consideration of data that are probably not available for this trial, it will be ambiguous whether the amount of treatment received by particular patients represented dropping out prematurely or simply stopping when they were satisfied with the benefits they had received. Undoubtedly, failures to receive a minimal intensity of treatment, and the overall amount of treatment received by particular patients, are substantial, complexly determined, nonrandom, and differ between patients.

Dropping out of treatment is often associated with dropping out of a study, with further data not being available for follow-up. These conditions set the stage for considerable challenges in analyzing and generalizing from whatever data are available. Clearly, the assumption of data being missing at random will be violated. But that is the key assumption required by the multivariate statistical strategies that attempt to compensate for incomplete data.

12 months – the time point designated for assessment of primary outcomes – is likely to exceed the duration of a depressive episode in a primary care population, which is approximately 9 months. In the absence of a nonspecific active comparison/control or even a waitlist control group, recovery that would’ve occurred in the absence of treatment will be ascribed to the two active interventions being studied.

12 months is likely to exceed substantially the end of any treatment being received, and so effects of any active treatments are likely to dissipate. The design allowed for up to four booster sessions. However, access to booster sessions was not controlled. It was not assigned and cannot be assumed to be random. As we will see when we examine the CONSORT flowchart for the study, there was no increase in the number of patients receiving an adequate exposure to psychotherapy from 6 to 12 months. That likely indicates that most active treatment had ended within the first six months.

Focusing on 12-month outcomes, rather than six-month outcomes, increases the unreliability of any analyses because more 12-month outcomes will be missing than were available at six months.

Taken together, the excessively long 12-month follow-up designated as the primary outcome point and the unusually large amount of treatment being offered, but not necessarily accepted, create substantial problems: missing data that cannot be compensated for by typical imputation and multivariate methods; difficulties interpreting results in terms of the amount of treatment actually received; and difficulties comparing the primary outcomes to those of typical trials in which psychotherapy is offered to patients actively seeking it.

The authors’ multivariate analysis strategy was inappropriate, given the amount of missing data and the violation of the assumption that data are missing at random.
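To see why this matters, here is a toy simulation, with invented numbers, of outcomes that are missing not at random: the worse a patient is doing, the likelier their 12-month score is missing.

```python
import random

random.seed(0)

# Hypothetical 12-month depression scores for 1,000 patients
# (higher = worse); illustration only, not trial data.
true_scores = [random.gauss(10, 3) for _ in range(1000)]

# Missing NOT at random: the probability that a patient's 12-month
# assessment is missing rises with their symptom score, as when
# dropping out of treatment and of follow-up go together.
observed = [s for s in true_scores
            if random.random() > min(0.9, max(0.0, s) / 25)]

full_mean = sum(true_scores) / len(true_scores)
completer_mean = sum(observed) / len(observed)

# completer_mean understates the sample's true symptom burden, because
# completers are systematically healthier. Imputation methods that
# assume data are missing at random cannot see, let alone correct,
# this selection.
```

The bias is invisible from the observed data alone, which is the point: no amount of multivariate modeling of the completers recovers the missing mechanism.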

Surely the more experienced of the 14 authors of The Lancet article should have anticipated these problems and the low likelihood that this study would produce generalizable results.

Recruitment of patients

The article states:

 We recruited participants by searching the electronic case records of general practices and psychological therapy services for patients with depression, identifying potential participants from depression classification codes. Practices or services contacted patients to seek permission for researcher contact. The research team interviewed those that responded, provided detailed information on the study, took informed written consent, and assessed people for eligibility.

Eligibility criteria

Eligible participants were adults aged 18 years or older who met diagnostic criteria for major depressive disorder assessed by researchers using a standard clinical interview (Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition [SCID]9). We excluded people at interview who were receiving psychological therapy, were alcohol or drug dependent, were acutely suicidal or had attempted suicide in the previous 2 months, or were cognitively impaired, or who had bipolar disorder or psychosis or psychotic symptoms.

Table 3 Patient Characteristics reveals a couple of things about co-treatment with antidepressants that must be taken into consideration in evaluating the design and interpreting results.

antidepressant stratification

So, the investigators did not wait for patients to refer themselves or be referred by physicians to the trial; they reached out to them. Applying their exclusion criteria, the investigators obtained a sample that mostly had been prescribed antidepressants, with no indication that the prescription had ended. The length of time for which 70% of patients had been on antidepressants was highly skewed, with a mean of 164 weeks and a median of 19. These figures strain credibility. I have reached out to the authors to ask whether there is an error in the table and await clarification.

We cannot assume that patients whose records indicate they were prescribed an antidepressant were refilling their prescriptions at the time of recruitment, were faithfully adhering, or were even being monitored. The length of time since the initial prescription increases skepticism about whether there was adequate exposure to antidepressants at the time of recruitment to the study.

The inadequacy of antidepressant treatment in routine primary care

Refilling of first prescriptions of antidepressants in primary care, adherence, and monitoring and follow-up by providers are notoriously low.

Guideline-congruent treatment with antidepressants in the United States requires a five-week follow-up visit, which is only infrequently received in routine care. Even when the five-week follow-up visit is kept, rates of improvement in depression associated with prescription of an antidepressant in routine care approximate those achieved with pill placebo in antidepressant trials. The reasons for this are complex, but center on depression being of mild to moderate severity in primary care. Perhaps more important is that the attention, provision of positive expectations, and support provided in routine primary care are lower than what is provided in the blinded pill-placebo condition in clinical trials. In blinded trials, neither the provider nor the patient knows whether the active medication or a pill placebo is being administered. The famous NIMH National Collaborative Study found, not surprisingly, that response in the pill-placebo condition was predicted by the quality of the therapeutic alliance between patient and provider.

In The Lancet study, readers are not provided with important baseline characteristics of the patients that are crucial to interpreting the results and their generalizability. We don’t know the baseline or subsequent adequacy of antidepressant treatment or the quality of the routine care being provided for it. Given that antidepressants are not the first-line treatment for mild to moderate depression, we don’t know why these patients were not receiving psychotherapy. We don’t even know whether the recruited patients were previously offered psychotherapy and with what uptake, except that they were not receiving it two months prior to recruitment.

There is a fascinating missing story about why these patients were not receiving psychotherapy at the start of the study and why and with what accuracy they were described as taking antidepressants.

Readers are not told what happened to antidepressant treatment during the trial. To what extent did patients who were not receiving antidepressants begin doing so? As a result of the more frequent contact and support provided in the psychotherapy, to what extent was there improvement in adherence, as well as in the ongoing support and attention from primary care providers?

Depression identified in primary care is a highly heterogeneous condition, more so than among patients recruited from treatment in specialty mental health settings. Much of the depression has only the minimum number of symptoms required for a diagnosis, or one more. The reliability of diagnosis is therefore lower than in specialty mental health settings. Much of the depression and anxiety disorders identified with semi-structured research instruments in populations not selected for having sought treatment resolves without formal intervention.

The investigators were using less than ideal methods to recruit patients from a population in which major depressive disorder is highly heterogeneous and subject to recovery in the absence of treatment by the time point designated for assessment of primary outcome. They did not sufficiently address the problem of a high level of co-treatment having been prescribed long before the beginning of the study. They did not even assess the extent to which that prescribed treatment had patient adherence or provider monitoring and follow-up. The 12 month follow-up allowed the influence of lots of factors beyond the direct effects of the active ingredients of the two interventions being compared in the absence of a control group.

decline in scores

Examination of a table presented in the supplementary materials suggests that most change occurred in the first six months after enrollment and little thereafter. We don’t know the extent to which there was any treatment beyond the first six months or what effect it had. In a population with clinically significant depression drawn from specialty care, some deterioration can be expected after withdrawal of active treatment. In a primary care population, such a graph could be produced in large part by the recovery from depression that would be observed in the absence of active treatment.

 

Cost-effectiveness analyses reported in the study address the wrong question. These analyses only considered the relative cost of the two active treatments, leaving unaddressed the more basic question of whether it is cost-effective to offer either treatment at this intensity. It might be more cost-effective to have a person with even less mental health training contact patients, inquire about adherence, side effects, and clinical outcomes, and prompt patients to accept another appointment with the GP if an algorithm indicates that would be appropriate.

The intensity of treatment being offered and received

The 20 sessions plus 4 booster sessions of psychotherapy offered in this trial is considerably more than the 12 to 16 sessions offered in the typical RCT for depression. Having more sessions available than is typical introduces complications. Results are not comparable to what is found in trials offering less treatment. But in a primary care population not actively seeking psychotherapy for depression, there is the further complication that many patients will not access the full 20 sessions. There will be difficulties interpreting results in terms of intensity of treatment because of the heterogeneity of reasons for getting less treatment. Effectively, offering so much therapy to a group that is less inclined to accept psychotherapy introduces a lot of noise in trying to make sense of the data, particularly when cost-effectiveness is an issue.

This excerpt from the CONSORT flowchart demonstrates the multiple problems associated with offering so much treatment to a population that was not actively seeking it and yet needing twelve-month data for interpreting the results of a trial.

CONSORT chart

The number of patients who had no data at six months increased by 12 months. There was apparently no increase in the number of patients receiving an adequate exposure to psychotherapy.

Why the interests of people with mental health problems are not served by the results claimed by these investigators being translated into clinical practice.

The UK National Health Service (NHS) is seriously underfunding mental health services. Patients being referred for psychotherapy from primary care have waiting periods that often exceed the expected length of an episode of depression in primary care. Simply waiting for depression to remit without treatment is not necessarily cost-effective because of the unneeded suffering, role impairment, and associated social and personal costs of an episode that persists. Moreover, there is a subgroup of depressed patients in primary care who need more intensive or different treatment. Guidelines recommending assessment after five weeks are not usually reflected in actual clinical practice.

There’s a desperate search for ways in which costs can be further reduced in the NHS. The Lancet study is being interpreted to suggest that more expensive clinical psychologists can be replaced by less expensive and less trained mental health workers. Uncritically and literally accepted, the message is that clinical psychologists working half-time addressing particular common clinical problems can be replaced by less expensive mental health workers achieving the same effects in the same amount of time.

The pragmatic translation of these claims into practice is to replace clinical psychologists with cheaper mental health workers. I don’t think it’s being cynical to anticipate the NHS seizing upon an opportunity to reduce costs, while ignoring effects on overall quality of care.

Care for the severely mentally ill in the NHS is already seriously compromised for other reasons. Patients experiencing an acute or chronic breakdown in psychological and social functioning often do not get the minimal support and contact time needed to avoid more intensive and costly interventions like hospitalization. I think it would be naïve to expect that the resources freed up by replacing a substantial portion of clinical psychologists with minimally trained mental health workers would be put into addressing the unmet needs of the severely mentally ill.

Although not always labeled as such, some form of BA is integral to stepped care approaches to depression in primary care. Before being prescribed antidepressants or being referred to psychotherapy, patients are encouraged to increase pleasant activities. In Scotland, they may even be given free movie passes for participating in cleanup of parks.

A stepped care approach is attractive, but evaluation of cost effectiveness is complicated by consideration of the need for adequate management of antidepressants for those patients who go on to that level of care.

If we are considering a sample of primary care patients mostly already receiving antidepressants, the relevant comparator is introduction of a depression care manager.

Furthermore, there are issues in the adequacy of addressing the needs of patients who do not benefit from lower intensity care. Is the lack of improvement with low levels of care adequately monitored and addressed? Is the needed escalation in level of care adequately supported so that referrals are completed?

The results of The Lancet study don’t tell us very much about the adequacy of care that patients enrolled in the study were receiving, whether BA is as effective as CBT as a stand-alone treatment, or whether nonspecific treatments would’ve done as well. We don’t even know whether patients assigned to a waitlist control would’ve shown as much improvement by 12 months, and we have reason to suspect that many would.

I’m sure that the administrators of the NHS are delighted with the positive reception of this study. I think it should be greeted with considerable skepticism. I am disappointed that the huge resources that went into conducting this study were not put into more informative and useful research.

I end with two questions for the 14 authors – Can you recognize the shortcomings of your study and its interpretation that you have offered? Are you at least a little uncomfortable with the use to which these results will be put?

Hans Eysenck’s contribution to cognitive behavioral therapy for physical health problems: fraudulent data

  • The centenary of the birth of Hans Eysenck is being marked by honoring his role in bringing clinical psychology to the UK and pioneering cognitive behavior therapy (CBT).
  • There is largely silence about his publishing fraudulent data, editorial misconduct, and substantial undeclared conflicts of interest.
  • The articles in which Eysenck used fraudulent data are no longer cited much, but the influence of his claims which depended on these data remains profound.
  • Eysenck used fraudulent data to argue that CBT could prevent cancer and cardiovascular disease and extend the lives of persons with advanced cancer.
  • He similarly used fraudulent data to advance the claim that psychoanalysis is, unlike smoking, carcinogenic and has other adverse effects on health.
  • Ironically, Eysenck incorporated into his explanations for how CBT works elements of the psychoanalytic thinking that he seemingly detested.

If there is sufficient interest, a follow-up blog post will discuss:

  • Because of Eysenck’s influence, CBT in the UK exaggerates the role of early childhood adversity and attends much less to functional behavioral analysis than American behavior therapy and cognitive behavior therapy do.
  • Both CBT in the UK and some quack therapy approaches make assumptions about mechanism tied to Eysenck’s use of fraudulent data.
  • Consistent with Eysenck’s influence, CBT for physical problems in the UK largely focuses on self-report questionnaire assessments of mechanism of change and of outcome, rather than functional behavioral and objective physical health outcome variables.

Happy Birthday, Hans Eysenck

March 12, 2016 was the centenary of the birth of psychologist Hans Eysenck. The British Psychological Society’s  The Psychologist marked the occasion with release of a free app by which BPS members can access a collection of articles about Hans Eysenck from the archives.  Nonmembers can access the articles here.

The introduction to the collection, Philip Corr’s The centenary of a maverick, states:

Eysenck’s contributions were many, varied and significant, including: the professional development of clinical psychology; the slaying of the psychoanalytical dragon; pioneering behaviour therapy and, thus, helping to usher in the era of cognitive behavioural therapy…

Corr also wrote in the March 30 2016 Times Higher Education:

in defence corr

The articles collected in The Psychologist were written over many years. Together they present an unflattering picture of a controversial man who was shunned by his colleagues, blocked from getting awards, and who would humiliate those with whom he disagreed rather than acknowledge any contradictory evidence. Particularly revealing are Roderick Buchanan’s Looking back: The controversial Hans Eysenck and a review of Buchanan’s book by Eysenck’s son Michael, Playing with fire: The controversial career of Hans J. Eysenck.

However, the collection stops short of acknowledging what was revealed in the early 90s in The BMJ: Eysenck knowingly published fraudulent data to back outrageous claims that CBT prevented cancer and extended the lives of patients with terminal cancer, whereas psychoanalysis was carcinogenic. He published his claims in journals he had founded, liberally self-plagiarizing and duplicate publishing with undeclared conflicts of interest. Eysenck received salary supplements and cash awards from German tobacco companies and from lawyers for the American tobacco companies for these activities.

The BMJ gave psychiatrists Anthony Pelosi and Louis Appleby a forum in the early nineties for criticizing Eysenck, even though the articles they attacked had been published elsewhere. The BMJ Editor Richard Smith followed up, citing Eysenck as an example in raising the question of whether editors should publish research articles in their own journals. Pelosi filed formal charges against Eysenck with the British Psychological Society. But, according to Buchanan’s book:

The BPS investigatory committee deemed it “inappropriate” to set up an investigatory panel to look into the material Pelosi had sent them, and henceforth considered the matter closed. Pelosi disagreed, of course, but was left with little recourse.

In an editorial in The Times, Simon Wessely acknowledged Pelosi and Appleby’s criticism of Eysenck, but said “It would take more than a couple of psychiatrists to ruffle Eysenck.”

Wessely suggested that the matter be dropped: the controversy was distracting everyone from the real progress being made in psychological approaches to cancer, like showing that a fighting spirit extends the lives of cancer patients. There was apparently no further mention in the UK press. Read more here.

Eysenck’s articles involving fraudulent data are seldom cited in the contemporary literature, but the claims the data were used to back remain quite influential. For instance, Eysenck claimed psychological factors presented more risk for cancer than many well-established biological factors. Including Eysenck’s data probably allowed one of the most cited meta-analyses of psychological factors in cancer to pass the threshold of hazard ratios strong enough for publication in the prestigious journal, Nature Clinical Practice: Oncology. Without the inclusion of Eysenck’s data, hazard ratios from methodologically weak studies cluster slightly higher than 1.0, suggesting little association that cannot be explained by confounds. A later blog post will document the broader influence of the Eysenck fraud on psychoneuroimmunology.
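How much leverage a single implausibly strong study can exert on a pooled estimate is easy to demonstrate. The sketch below uses standard fixed-effect inverse-variance pooling of log hazard ratios; the study results are invented for illustration, not taken from the actual meta-analysis:

```python
import math

def pooled_hazard_ratio(studies):
    """Fixed-effect inverse-variance pooling on the log hazard ratio scale.

    `studies` is a list of (hazard_ratio, se_of_log_hr) pairs; each study
    is weighted by the inverse of its variance, so precisely estimated
    studies dominate the pooled result."""
    weights = [1.0 / se ** 2 for _, se in studies]
    log_hrs = [math.log(hr) for hr, _ in studies]
    pooled_log = sum(w * lh for w, lh in zip(weights, log_hrs)) / sum(weights)
    return math.exp(pooled_log)

# Hypothetical studies: methodologically weak results clustering just
# above HR = 1.0, plus one implausibly strong, precisely estimated study.
weak_studies = [(1.05, 0.30), (1.10, 0.25), (0.98, 0.35), (1.08, 0.30)]
outlier = (3.0, 0.10)

hr_without = pooled_hazard_ratio(weak_studies)
hr_with = pooled_hazard_ratio(weak_studies + [outlier])
```

Because inverse-variance weighting rewards precision, one tightly estimated fraudulent study can drag a pooled hazard ratio from near null to apparently impressive, which is exactly the concern about including Eysenck’s data.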

Eysenck’s claims concerning effects of CBT on physical health conditions now similarly go uncited.  However, the idiosyncratic definition he gave to CBT and his claims about the presumed mechanism by which it improved physical health pervade both CBT as defined in the UK and a number of quack treatments in the UK and elsewhere.

It is important to establish the connection between fraudulent data, distinctive features of CBT in the UK, and presumed mechanisms of action in order to open for re-examination the forms that CBT for physical health problems take in the UK and the way in which claims of efficacy are evaluated.

Fraudulent Data

Eysenck repeated tables and text in a number of places, but I will mainly draw on data as he presented them in the journal he founded, Behaviour Research and Therapy [1, 2], which correspond with what he presents elsewhere.

Eysenck’s Croatian collaborator Grossarth-Maticek conducted the therapy and collected the predictor and outcome data. A personality inventory was used to classify participants receiving therapy into four types: a cancer-prone type (Type 1), a coronary heart disease (CHD)-prone type (Type 2), and two healthy types (Types 3 and 4). The typology was derived from quadrants in a 2×2 dichotomization of high versus low rationality versus anti-emotionality, quite different from the dimensions and item content of the Eysenck Personality Questionnaire. Indeed, Roderick Buchanan noted in his biography that “Eysenck had struggled to banish typological concepts in favour of continuous dimensions for most of his career.” Grossarth-Maticek’s questionnaire and typology were later sharply criticized by Eysenck’s son Michael, among many others.

Eysenck and Grossarth-Maticek reported results of individually delivered “creative novation behaviour therapy”:

… Effects of prophylactic behaviour therapy on the cancer-prone and the CHD-prone probands respectively after 13 yr. It will be clear that treatment by means of creative novation behaviour therapy has had a highly significant prophylactic effect, preventing deaths from cancer in probands of Type 1, and death from coronary heart disease in probands of Type 2.

table 3 prophylactic effects

For creative novation behaviour therapy delivered in a group format:

It will be seen that both cancer and CHD mortality are very significantly higher in the control group, as is death from other causes. Incidence rates are also very significantly higher in the control group for cancer, but with a difference below our selected P = 0.01 level of significance for CHD. Most telling is the difference regarding those ‘still living’-79.9% in the therapy group, 23.9% in the control group. The results of the group therapy study support those of the individual therapy group in demonstrating the value of behaviour therapy in preventing death from cancer and CHD, and in lowering the incidence from cancer and possibly from CHD.

table 4 group therapy

Strong effects were reported even when the treatment was delivered as a discussion of a brief pamphlet. The companion paper described this bibliotherapy and provided the pamphlet as an appendix, which is reproduced here.

This statement is given to the proband, who also receives an introductory 1-hr treatment in which the meaning of the statement is explained, application considered, and likely advantages discussed. After the patient has been given time to consider the statement, and apply it to his/her own problems, the therapist spends a further 3-5 hr with the patient, suggesting specific applications of the principles in the statement to the needs of the patient, and his/her particular circumstances.

Six hundred probands received the bibliotherapy and a control group of 500 matched for personality type, smoking, age and sex received no treatment. Another 100 matched patients received a placebo condition in which they met with interviewers to discuss a pamphlet with “psychoanalytic explanation and suggestions.”

I encourage readers to take a look at the pamphlet, which is less than a page long. It ends with:

The most important aims of autonomous self-activation: your aim should always be to produce conditions would make it possible for you to lead a happy and contented life.

The results were:

There are no statistically significant differences between the control group and the placebo group, which may therefore be combined and considered a single control group. Compared with this control group, the treatment group fared significantly better. In the control group, 128 died of cancer, 176 of CHD; in the treatment group only 27 died of cancer, and 47 of CHD. For ‘death from other causes’, the figures are 192 and 115. Clearly the bibliographic method had a very strong prophylactic effect.

table 5 group and biblio

Eysenck and Grossarth-Maticek reported numerous other studies, including one in which 24 matched pairs of patients with inoperable cancer were assigned to either creative novation behaviour therapy or a control group. The patients receiving the behaviour therapy lived five years versus the three years of those in the control group, a difference which was highly significant.

Keep in mind that in these studies that all of the creative novation behaviour therapy sessions were solely provided by Grossarth-Maticek.

But let’s jump to a final in a series of tables constructed to make the argument that psychoanalysis was harmful to physical health.

We are here dealing with three groups. Group I is constituted of patients who terminated their  psychoanalytical treatment after 2 yr or less, and were then treated with behaviour therapy.

Group 2 is a control group matched with the members of group I on age, sex, smoking and personality type. Group 3 is a control group which discontinued psychoanalysis, like Group I, but did not receive behaviour therapy. Members of Group I and 2 do not differ significantly in mortality, but Group 3 has significantly greater mortality than either. Looking again at the percentage of patients still living, we find for Group 1 92, 95 and 95%, for Group 2 96, 89 and 95%, for Group 3 the figures are: 72, 63 and 61%. Clearly behaviour therapy can reverse the negative impact psychoanalysis has on survival.

table 15 psychoanalysis

In a number of places, this is explained in identical words:

Theoretically, this conclusion is not unreasonable. We have shown that stress is a powerful factor in causing cancer and CRD, and it is widely agreed, even among psychoanalysts, that their treatment imposes a considerable strain on patients. The hope is often expressed that finally the treatment will resolve these strains, but there is no evidence to suggest that this is true (Rachman & Wilson, 1980; Eysenk & Martin, 1987). Indeed, there is good evidence that even in cases of mental disorder psychoanalysis often does considerable harm (Mays & Franks, 1985). A theoretical model to account for these negative outcomes of psychoanalysis and psychotherapy generally has been presented elsewhere (Eysenck, 1985); it would apply equally well in the psychosomatic as in the purely psychiatric field.

CBT for physical health problems: a dog’s breakfast approach

Grossarth-Maticek had already formulated his approach and delivered all of the psychotherapy before Eysenck began co-authoring papers with him and promoting him. In a 1982 article without Eysenck as an author, Grossarth-Maticek is quite explicit about the psychoanalytic theory behind his approach:

A central proposition of our research program is that cancer patients are either preoccupied with traumatic events of early childhood or with excessive expectations of the parents during their whole life. They are characterized by intensive internal inhibitions toward expressing feelings and desires. Therefore, we speak of a chronic blockade of expression of feelings and desires. We assume that parents of cancer patients did not respond adequately to the child’s cries for help and these children were obliged very early to do non-conforming daily task. Cancer patients have never learned to express persistent cries for help…

The specific family dynamics in the special educational pattern which block hysterical reactions determine the behavior, which in turn is characterized by excessive persistence of performance of the daily task, disregard of symptoms and lack of aggressiveness in behavior. Through the currents of negative life events (i.e., death of closely connected persons) expressions of loneliness and reactive depression can appear intensively and chronically.

If this is not clear enough:

In our approach we try not to deny the psychoanalytic propositions but to integrate the psychoanalytic research program with social psychological and sociological factors, hereby assuming that they have interactive effects on carcinogenesis.

Strangely, Grossarth-Maticek suggests in this article that the psychoanalytic factors interact with “organic risk factors such as cigarette smoking in the case of lung cancer.” Grossarth-Maticek and Eysenck would soon be receiving tens of thousands of dollars in support from the German tobacco companies and from lawyers for the American tobacco companies to promote the idea that personality caused both smoking and lung cancer, so that any apparent connection between smoking and lung cancer was spurious. Product liability suits against tobacco companies should therefore, the argument went, be dismissed.

In the articles co-authored by Grossarth-Maticek and Eysenck, these roots of what Eysenck repackaged as creative novation behaviour therapy are only hinted at, but they are noticeable to the observant reader in references to the role of dependency and autonomy. Fraudulent data are mustered to show the powerful positive effects of this behaviour therapy versus the toxicity of psychoanalysis.

On page 8 of this article, ten explicitly labeled behavioural techniques are identified as occurring across individual therapy, group therapy, and bibliotherapy:

  • Training for reduction of planned behaviors and initiation of autonomous behavior.
  • Training for cognitive alteration under conditions of relaxation.
  • Training for alternative reactions.
  • Training for the integration of cognition, emotionality, and intuition.
  • Training to achieve stable expression of feelings.
  • Training for potentiating social behavioral control.
  • Training to suppress stress-creating ideas.
  • Training to achieve a behavior-directing hierarchic value structure.
  • Training in the suppression of stress-creating thought.
  • Abolition of dependence reactions.

This approach has only superficial resemblance to American behavioral therapy and CBT. The emphasis on expression of emotional feelings and abolition of dependent reactions is incomprehensible when it is detached from its psychoanalytic roots. The paper refers to behavioral analysis, but interviews about the past, including childhood experiences, are emphasized rather than applied behavioral analysis. The hierarchies of behavior do not correspond to operant approaches, but to a value structure of autonomy versus dependence.

There is also considerable reference to the use of hypnosis to achieve these goals.

In short, neither the goals nor the methods have much relationship to learning theory as it stood when Eysenck was writing, nor to contemporary developments in operant conditioning. His approach is a tortured extension of classical conditioning. Outside of the fraudulent data that Grossarth-Maticek developed and published with Eysenck, there is little basis for assuming that psychological factors were related to physical health in the way the treatment approach postulated.

It should be kept in mind that Eysenck was not a psychotherapist. He actually detested psychotherapy and had generated considerable controversy earlier by arguing that any apparent effects of psychotherapy were due to natural remission. It should also be noted that Eysenck was claiming that creative novation behaviour therapy modified personality traits, even when delivered in a brief pamphlet, in ways that could not be anticipated from his other writings about personality. Finally, the particular personality characteristics that Eysenck was talking about modifying were very different from what he assessed with the Eysenck Personality Inventory.

Only “controversial” and “too good to be true,” or fraud?

Before Eysenck began collaborating with Grossarth-Maticek, there were widespread doubts about the validity of Grossarth-Maticek’s work. In 1973, Grossarth-Maticek’s work had been submitted to the University of Heidelberg as a Habilitation, a second doctoral degree required for a full professorship. It was rejected. One member of the committee, Manfred Amelung, declared the results “too good to be true.” He retained a copy and would later put his knowledge of its details into a devastating critique. According to Buchanan’s biography, Eysenck demanded of Grossarth-Maticek: “you must let me check your data, for if you deceive me I will never forgive you.”

Eysenck gained access to the data set, sometimes directing reanalyses by Grossarth-Maticek and his statistician. Other analyses were done by Eysenck’s statisticians in London. Eysenck’s biographer Buchanan noted “there were ample opportunities to select, tease out, or redirect attention – given a data set that was apparently sprawling chaotic but rich and ambitious….From the mid-1980s, Eysenck did virtually all of the writing for publication in English and presumably exerted a strong editorial control.” Buchanan also notes that the tobacco companies became skeptical not only of the strength of the findings that were reported but also of their inconsistency. They refused to continue supporting Eysenck unless an independent team was set up to check the analyses and the conclusions that Eysenck was drawing from them.

Eysenck single-authored a target article for Psychological Inquiry that reproduced many of the tables that we have been discussing. More than a dozen commentators included the members of the independent team, but also others who did not have access to the data but who examined the tables with forensic attention. The commentary started off with Manfred Amelung, who made use of what he had learned from Grossarth-Maticek’s Habilitation.

Many of the commentators suggested that the intervention studies presented conclusions that were “too good to be true,” not only in terms of the efficacy claimed for the intervention, but also in terms of the negative outcomes claimed for the control group. Other commentators pointed to gross inconsistencies across different reports in terms of methods and results and to clear evidence of manipulation of data, including some patients being counted a number of times, other patients dying twice, Eysenck and Grossarth-Maticek’s improbable ability to obtain matching of intervention patients and controls, and too-perfect predictions. In the end, even Grossarth-Maticek’s Heidelberg statistician expressed concerns that there had been tampering with the data.

Both Grossarth-Maticek and Eysenck got opportunities to respond and were defensive and dismissive of the overwhelming evidence of exaggeration of the results and even fraud.

The exchanges in Psychological Inquiry occurred over two issues. Taken together, the critical commentaries are devastating, but the criticisms became diffuse because commentators focused on different problems. It took a more succinct, pithy critique by Anthony Pelosi and Louis Appleby in The BMJ to bring the crisis of credibility to a head.

Anthony Pelosi and Louis Appleby in The BMJ

In the first round of their two-part attack, Pelosi and Appleby centered on Eysenck and Grossarth-Maticek’s two articles in Behaviour Research and Therapy, but referenced the critiques in Psychological Inquiry. The remarkable effectiveness of these two psychiatrists’ critique depended largely on their pointing out what was hiding in plain sight in the two Behaviour Research and Therapy articles. For instance:

After 13 years, 16 of 50 untreated type 1 subjects had died of a carcinoma. Not one of the 50 cancer prone subjects receiving the psychotherapy died of cancer. The therapy was a genuine panacea, giving equivalent results for type 2 subjects and heart disease. The all cause mortality was over 60% in untreated and 15% in treated subjects. The death rate in the untreated subjects was truly alarming as they began the trial healthy and most were between 40 and 60 years of age.

I encourage readers to compare the Pelosi and Appleby paper to the tables I presented here and see what they missed.

Pelosi and Appleby calculated the effort that would have been required of Grossarth-Maticek if he had – as Eysenck insisted – single-handedly carried out all of the treatment.

It is striking that all the individual and group therapy was given by Professor Grossarth-Maticek. The trials were undertaken between 1972 and 1974 and involved 96 subjects (or perhaps 192 subjects, see below) in at least 20 hours of individual work, and at least 10 groups (245 subjects with 20-25 in each) for six to 15 sessions each. Add to this Grossarth-Maticek’s explanatory introduction to bibliotherapy for 600 people, and it can be seen that the amount of time spent by this single senior academic on his experimental psychotherapies is huge and certainly unprecedented.

They summarized the inconsistencies and contradictions reported in Psychological Inquiry, but then added their own observation that a matching of 192 pairs of intervention and control patients had somehow produced a sample of only 192! They suggested that in the two Behaviour Research and Therapy articles there were at least “10 elaborate misprints or misstatements in the description of the methods” that the editor or reviewers should have caught.

At no point does the word “fraud” or “fraudulent” appear in Pelosi and Appleby’s first article. Rather, they suggest that “Eysenck and Grossarth-Maticek… are:

making claims which, if correct, would make creative novation therapy a vital part of public health policy throughout the world.”

They conclude with:

For these reasons there should be a total reexamination and proper analysis of the original data from this research in an attempt to answer the questions listed above. The authors give their address as the Institute of Psychiatry in London, which must be concerned about protecting its reputation. Therefore the institute should, in our view, assist in this clarification of the meaning of the various studies. There should also be some stern questions asked of the editors of the various journals involved, especially those concerned among the editorial staff of Behaviour Research and Therapy who, in our opinion, have done a disservice to their scientific disciplines, and indeed to Professors Eysenck and Grossarth-Maticek, in allowing this ill considered presentation of research on such a serious topic.

Eysenck’s reply and Pelosi and Appleby’s response

Readers can consult Eysenck’s reply for themselves, but it strikes me as evasive and dismissive. Specific criticisms are not directly answered; instead, Eysenck points to consistency between his results and those of David Spiegel, who had claimed to get even stronger effects in his small study of supportive-expressive therapy for women with metastatic breast cancer. Eysenck argues that, far from demolishing the credibility of his work with Grossarth-Maticek, Pelosi and Appleby only point to the need to fund a replication. Eysenck closes with:

Their critical review, however incorrect, full of errors and misunderstandings, and lacking in objectivity, may have been useful in drawing attention to a large body of work, of both scientific and social relevance, that has been overlooked for too long.

Pelosi and Appleby took Eysenck’s reply as an opportunity to get even more specific in their criticisms:

We are accused of being vague in mentioning many errors, inappropriate analyses, and missing details in the publications on this research programme. We value this opportunity to be more specific, to clarify just a few of the questions raised by ourselves and others, which Eysenck has failed to answer, and to outline additional findings from these authors’ investigations.

After a detailed reply, they wrap up with references to the criticisms that Eysenck received in Psychological Inquiry and, in an ironic note, turn Eysenck’s attacks on proponents of the link between smoking and lung cancer back onto Eysenck himself:

Our concern has been to clarify the methods and analyses of a body of research which, if accurate, would profoundly influence public health policies on cancer and heart disease. Other critics have been more challenging in what they have alleged, and in our opinion the controversy which now surrounds one of academic psychology’s most influential figures constitutes a crisis for the subject itself. The seriousness of the detailed allegations by van der Ploeg, although refuted by Eysenck and Grossarth-Maticek, should in themselves prompt these authors to reexamine their own findings after appropriate further training in the methodology of medical research. Perhaps the most skilfully worded criticism on this subject was made not about Eysenck but by him in a debate on the relation between smoking and cancer. In disputing the findings of Doll and Hill’s epidemiological studies on this association he comments: “What we have found are serious methodological weaknesses in the design of the studies quoted in favour of these theories, statistical errors, and unsubstantiated extrapolations from dubious data to unconfirmed conclusions.” Eysenck owes it to himself and to his discipline to reconsider critically his own work on this subject.

In the more than 20 years since this exchange, Pelosi and Appleby and their ally, editor Richard Smith of The BMJ, have failed to get an appropriate response from the British Psychological Society, King’s College London or the Institute of Psychiatry, the journal Behaviour Research and Therapy, or the Committee on Publication Ethics (COPE). This situation demonstrates the inability of British academia to correct bad and even fraudulent science. It stands as a cautionary note to those of us now attempting to correct what we perceive as bad science: efforts are likely to be futile. On the other hand, the editorship of Behaviour Research and Therapy has passed to an American, Michelle Craske, a professor at UCLA. Perhaps she can be persuaded to make a long overdue correction to the scientific record and remove a serious blemish on the credibility of that journal.

If there is sufficient interest, I will survey the profound influence of the fraudulent work of Eysenck and Grossarth-Maticek in a future blog post.

  • Because of their influence, CBT in the UK gives an exaggerated emphasis to early childhood adversity, and much less to functional behavioural analysis, than American behavior therapy and CBT do.
  • Consistent with Eysenck’s influence, CBT for physical problems in the UK largely focuses on self-report questionnaire assessments of mechanisms of change and of outcome, rather than on functional behavioral and objective physical health outcome variables.

Influences can also be seen in:

Contemporary CBT for physical conditions as practiced in the UK, including CBT for irritable bowel syndrome (IBS), fibromyalgia, and other “all in the head” conditions that are deemed Medically Unexplained Symptoms (MUS) in the UK, as in the PRINCE trial of Trudie Chalder and Simon Wessely.

The “psychosomatic” approach as seen in neurologist Suzanne O’Sullivan’s recent editorial in The Lancet and in her book It’s All in Your Head, which won the 2016 Wellcome Book Prize.

Quack treatments, such as Phil Parker’s Lightning Process, which the UK’s Advertising Standards Authority (ASA) ruled could not be advertised as effective in the treatment of chronic fatigue syndrome/myalgic encephalopathy, multiple sclerosis, or irritable bowel syndrome/digestive issues. The Lightning Process is nonetheless implemented in the UK NHS under the direction of University of Bristol Professor Esther Crawley.

Quack cancer treatments, such as the Simonton visualization method.

More mainstream, but unproven, psychological treatments for cancer, including David Spiegel’s supportive-expressive therapy. Neither Spiegel –nor anyone else– has ever been able to replicate the finding praised by Eysenck, but Spiegel repeats his claims in a recent non-peer-reviewed article in the UK-based Psycho-Oncology and in a closely related article in the BPS’ British Journal of Health Psychology.

More mainstream, but unproven psychological approaches to cancer that claim to improve immune functioning by reducing stress.

Some Scottish readers will understand this message concerning Eysenck’s fraud: The ice cream man cometh.

My usual disclaimer: All views that I express are my own and do not necessarily reflect those of PLOS or other institutional affiliations.

Getting realistic about changing the direction of suicide prevention research

A recent JAMA: Psychiatry article makes some important points about the difficulties addressing suicide as a public health problem before sliding into the authors’ promotion of their personal agendas.

Christensen H, Cuijpers P, Reynolds CF. Changing the Direction of Suicide Prevention Research: A Necessity for True Population Impact. JAMA Psychiatry. 2016.

This issue of Mind the Brain:

  • Reviews important barriers to effective approaches to reducing suicide, as cited in the editorial.
  • Discusses editorials in general as a form of privileged access publishing by which non-peer-reviewed material makes its way into ostensibly peer reviewed journals.
  • Identifies the self-promotional and personal agendas of the authors reflected in the editorial.
  • Notes that the leading means of death by suicide in the United States is not even mentioned, much less addressed, in this editorial. I’ll discuss the politics behind this and why its absence reduces the editorial to a venture in triviality – except that it is also a call to waste millions of dollars.

Barriers to reducing mortality by suicide

Prevention of death by suicide becomes an important public health and clinical goal because of suicide’s contribution to overall mortality, the seeming senselessness of suicide, and its costs at a personal and social level. Yet as a relatively infrequent event, death by suicide resists prediction and effective preventive intervention.

Evidence concerning the formidable barriers to reducing death by suicide inevitably clashes with the strong emotional appeals and political agendas of those demanding suicide intervention programs.

Skeptics encounter stiff resistance and even vilification when they insist that clinical and social policy concerning suicide should be based on evidence.

A skeptic soon finds that trying to contest emotional and political appeals quickly becomes like trying to counter Ted Cruz or Donald Trump with evidence contradicting their proposals for dealing with terrorism or immigration. This is particularly likely after suicides by celebrities or a cluster of suicides by teenagers in a community. Who wants to pay attention to evidence when emotions are high and tears are flowing?

See my recent blog post, Preventing Suicide in All the Wrong Ways for some inconvenient truths about suicide and suicide prevention.

The JAMA: Psychiatry article’s identification of barriers

The JAMA: Psychiatry article identifies some key barriers to progress in reducing deaths due to suicide [bullet points added to direct quotes]:

  • Suicide rates in most Western countries have not decreased in the last decade, a finding that compares unfavorably with the progress made in other areas, such as breast and skin cancers, human immunodeficiency virus, and automobile accidents, for which the rates have decreased by 40% to 80%.
  • Preventing suicide is not easy. The base rate of suicide is low, making it hard to determine which individuals are at risk.
  • Our current approach to the epidemiologic risk factors has failed because prediction studies have no clinical utility—even the highest odds ratio is not informative at the individual level.
  • Decades of research on predicting suicides failed to identify any new predictors, despite the large numbers of studies.
  • A previous suicide attempt is our best marker of a future attempt, but 60% of suicides are by persons who had made no previous attempts.
  • Although recent studies in cognitive neuroscience have shed light on the cognitive “lesions” that underlie suicide risk, especially deficits in executive functioning, we have no biological markers of suicide risk, or indeed of any mental illness.
  • People at risk of suicide do not seek help. Eighty percent of people at risk have been in contact with health services prior to their attempts, but they do not identify themselves, largely because they do not think that they need help.
  • As clinicians, we know something about the long-term risk factors for suicide, but we are much less able to disambiguate short-term risk or high-risk factors from the background of long-term risk factors.
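The base-rate problem behind these points can be made concrete with a back-of-the-envelope Bayes calculation. The numbers below are illustrative assumptions of mine, not figures from the editorial: an annual suicide rate of roughly 13 per 100,000 and a hypothetical screening instrument with 80% sensitivity and 90% specificity, far better than anything actually available:

```python
# Back-of-the-envelope Bayes calculation of the base-rate problem.
# All numbers are illustrative: an annual suicide rate of ~13 per
# 100,000 and a hypothetical screen with 80% sensitivity and 90%
# specificity (far better than any instrument actually available).

def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """P(case | screen positive), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

prevalence = 13 / 100_000          # illustrative annual rate of death by suicide
ppv = positive_predictive_value(0.80, 0.90, prevalence)
print(f"PPV = {ppv:.4%}")          # ≈ 0.1%: almost all positives are false
```

Even under these generous assumptions, roughly 999 of every 1,000 people flagged as high risk would not die by suicide in the following year, which is why even large odds ratios are uninformative at the individual level.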

How do editorials come about? Not peer review!

Among the many privileges of being the editor-in-chief or an associate editor of a journal is the opportunity to commission articles that do not undergo peer review. Editors and their friends are among the regular recipients of these gifts, which largely escape scrutiny.

Editorials often provide a free opportunity for self-citation and promotion of agendas. Over the years, I’ve noticed that editorials are frequently used to increase the likelihood that particular research topics will become a priority for funding or that particular ideas will be given an advantage in the competition for funding.

Editorials offer great opportunities for self-citation. If an editorial in a prestigious journal cites articles published in less prestigious places, readers will often cite those articles without bothering to examine the original source. This is a way of lending false authority to poor-quality or irrelevant evidence.

Not only do authors of commissioned articles get to say what they wish without peer review, they can also restrict what can be said in reply. Journals are less willing to publish letters to the editor about editorials than about empirical papers. They often give the writers of the editorial veto power over what criticism is published, and they always give the writers of the editorial the last word in any exchange.

So, editorials and commentaries can be free sweet plums if you know how to use them strategically.

The authors

Helen Christensen, PhD Black Dog Institute, University of New South Wales, Randwick, New South Wales, Australia.

Pim Cuijpers, PhD Department of Clinical, Neuro, and Developmental Psychology, Vrije Universiteit Amsterdam, the Netherlands

Charles F. Reynolds III, MD Department of Psychiatry and Neurology, Western Psychiatric Institute and Clinic, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania.

The authors’ agendas

Helen Christensen

Helen Christensen is the Chief Scientist and Director of the Black Dog Institute, which is described at its website:

Our unique approach incorporates clinical services with our cutting-edge research, our health professional training and community education programs. We combine expertise in clinical management with innovative research to develop new, and more effective, strategies for people living with mental illness. We also place emphasis on teaching people to recognise the symptom of poor mental health in themselves and others, as well as providing them with the right psychological tools to hold the black dog at bay.

A key passage in the JAMA: Psychiatry editorial references her work.

Modeling studies have shown that if all evidence-based suicide prevention strategies were integrated into 1 multifaceted systems approach, about 20% to 25% of all suicides might be prevented.

Here is the figure from the editorial:

[Figure: suicide prevention strategies]

The paper that is cited would be better characterized as an advocacy piece than as a balanced systematic review.

Most fundamentally, Christensen makes the mistake of summing attributable risk factors to obtain a grand total of what would be accomplished if all of a set of risk factors were addressed.

The problem is that attributable risk factors are dubious estimates derived from correlational analyses that assume the entire correlation coefficient represents a modifiable risk. Such estimates ignore confounding. If one adds together attributable risk factors calculated in this manner, one gets a grossly inflated view of how much a phenomenon can be controlled. The attributable risk factors are themselves correlated, and they share common confounds. That is why it is bad science to combine them.
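A toy calculation can show why. The exposure prevalences and risks below are invented purely for illustration: two correlated “risk factors” that each appear to account for nearly half of all cases when considered one at a time, even though eliminating both would prevent only half:

```python
# Toy example: two correlated binary "risk factors" A and B.
# The joint exposure distribution and risks are invented for
# illustration; each factor doubles risk, and A and B cluster
# in the same people (a shared confound would act the same way).

exposure_prob = {(1, 1): 0.30, (1, 0): 0.05, (0, 1): 0.05, (0, 0): 0.60}
risk = {(1, 1): 0.04, (1, 0): 0.02, (0, 1): 0.02, (0, 0): 0.01}

# Overall risk of the outcome in this toy population.
overall = sum(exposure_prob[c] * risk[c] for c in exposure_prob)

def naive_paf(factor_index: int) -> float:
    """Attributable fraction from one factor's marginal association,
    ignoring its correlation with the other factor."""
    unexposed = [c for c in exposure_prob if c[factor_index] == 0]
    p_unexposed = sum(exposure_prob[c] for c in unexposed)
    risk_unexposed = sum(exposure_prob[c] * risk[c] for c in unexposed) / p_unexposed
    return (overall - risk_unexposed) / overall

# True fraction prevented by removing BOTH exposures at once.
combined_paf = (overall - risk[(0, 0)]) / overall

print(f"naive PAF(A) + naive PAF(B) = {naive_paf(0) + naive_paf(1):.2f}")
print(f"true combined PAF           = {combined_paf:.2f}")
```

The naive marginal fractions sum to about 0.92, because each one absorbs credit for the other correlated factor; the true combined fraction is only 0.50. Summing attributable fractions across correlated, confounded risk factors overstates what prevention can achieve.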

Christensen identifies the top three modifiable risk factors for suicide as (1) training general practitioners in the detection and treatment of suicidal risk, and notably of depression; (2) training gatekeepers, such as school personnel, police (and in some contexts, clergy), who might have contact with persons on the verge of dying by suicide; and (3) psychosocial treatments, namely psychotherapy.

Training of general practitioners and gatekeepers has not been shown to be an effective way of reducing rates of suicide. #Evidenceplease. I’ve been an external scientific advisor to over a decade of programs in Europe that emphasized these strategies. We will soon be publishing the last of our disappointing results.

Think of it: in order to avert a death by suicide, training of police requires that police be on the scene in circumstances where they could use that training to prevent someone from dying by suicide, say, by jumping from a bridge or by a self-inflicted gunshot wound. The likelihood is low that a police officer with sufficient training would be in the right place at the right time, with sufficient time and control of the situation to prevent a death. A police officer who had received the training would likely encounter only a few such situations, if any, in an entire career.

The problem of death by suicide being an infrequent event that is poorly predicted again rears its ugly head.

Christensen also makes the dubious assumption that more ready availability of psychotherapy will substantially reduce the risk of suicide. The problem is that persons who die by suicide are often in contact with professionals, but they either break off the contact shortly before death or never disclose their intentions.

Christensen provides a sizable estimate for the reduction in risk for suicide achievable by means restriction. Yet I suspect that she underestimates the influence of this potentially modifiable factor.

She focuses on restricting access to prescription medications used in suicides by overdose. I don’t know whether the death-by-overdose data hold even for Australia, but the relevant means needing restriction in the United States is access to firearms. I will say more about that later.

So, Christensen makes use of the editorial to sell her pet ideas, and her institute markets training.

Pim Cuijpers

Pim Cuijpers doesn’t cite himself and doesn’t need to. He is rapidly accumulating a phenomenal record of publications and citations. But he is an advocate for large-scale programs incorporating technology, and notably the Internet, to reduce suicide. His interests are reflected in passages like:

Large-scale trials are also needed. Even if we did all of these things, large-scale research programs with millions of people are required, and technology by itself will not be enough. Although new large trials show that the effects of community programs can be effective,1,6 studies need to be bigger, combining all evidence-based medical and community strategies, using technology effectively to reduce costs of identification and treatment.

And

Help-seeking may well be assisted by using social media. Online social networks such as Facebook can be used to provide peer support and to change community attitudes in the ways already used by marketing industries. We can use the networks of “influencers” to modify attitudes and behavior in specific high-risk groups, such as the military, where suicide rates are high, or “captive audiences” in schools.

Disseminating effective programs is no longer difficult using online mental health programs. Although some early suicide apps and websites have been tested, better online interventions are needed that can respond to temporal fluctuations in suicide risk. The power of short-term prediction tools should be combined with the timely delivery of unobtrusive online or app personalized programs. However, if these development are not supported by government or industry and implemented at a population level, they will remain missed opportunities.

[Image: “suicide is preventable”]
100% PREVENTABLE BY WHOM?

Pim Cuijpers is based in the Netherlands and is writing at a time when the European Research Council’s enthusiasm for funding large-scale suicide prevention programs is waning, especially for expensive ones requiring millions of participants. Such studies have been going on for over a decade, and the yield is not impressive.

The projects on which I consulted adopted the reasonable assumption that, because suicide is a rare event, a population of 500,000 would not be sufficient to detect a statistically significant reduction in suicide rates of less than 30%. Considering all the extraneous events that can impinge on comparisons between intervention and control sites during the period in which the intervention could conceivably be influential, even this is too low an estimate of the sample that would be needed.
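A standard two-proportion power calculation illustrates the scale involved. The baseline rate (about 13 deaths per 100,000 per year) and the 30% target reduction are my own illustrative assumptions, not figures from those projects:

```python
# Standard two-proportion sample-size approximation (normal
# approximation, two-sided alpha = 0.05, power = 0.80). The
# baseline suicide rate and effect size are illustrative.
import math
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05,
              power: float = 0.80) -> int:
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical z for two-sided alpha
    z_b = NormalDist().inv_cdf(power)           # z for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)

baseline = 13 / 100_000      # illustrative annual rate per person
reduced = baseline * 0.70    # a 30% reduction
print(f"{n_per_arm(baseline, reduced):,} per arm")  # ≈ 1.1 million per arm
```

With a single year of follow-up this comes to over a million people per arm; smaller effects or noisier comparisons push the requirement higher still, before any allowance for the extraneous site-level events just mentioned.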

The larger the sample, the greater the likelihood of extraneous influences, the greater the likelihood that the intervention wouldn’t prove effective at key moments when it was needed to avert a death by suicide, and the greater the cost. See more about this here.

Pim Cuijpers has been quite influential in developing and evaluating web-based and app-based interventions. But after the initial enthusiasm, the field is learning that such resources are not effective if left unattended, without users being given a sense that they are in some sort of human relationship within which their consistent use of the technology is monitored and appreciated, as seen in appropriate feedback. Pim Cuijpers has contributed the valuable concept of supportive accountability. I have borrowed it to explain what is missing when primary care physicians simply give depressed patients a password to an Internet program and leave it at that, expecting they will get any benefit.

Evaluations of such technology have been limited to whether it reduces depressive symptoms. There is as much of a leap from evidence of such reductions, when they occur, to claims about preventing suicide as there is from evidence that psychotherapy reduces depressive symptoms to the claim that psychotherapy prevents suicide.

Enlisting users of Facebook to monitor and report expressions of suicidality is not evidence based. It is evaluated by some as a disaster, and a consumer group is circulating a petition demanding that such practices stop. A critical incident was a man getting arrested over a fake suicide message.

Charles F. Reynolds

Charles Reynolds does not discuss his own paper in the text of the editorial, but he nonetheless cites it.

I have critiqued the study elsewhere. It was funded in a special review only because of political pressure from Senator Harry Reid. The senator’s father had died by suicide shortly after a visit to a primary care physician. Harry Reid required that Congress fund a study showing that improving the detection and treatment of suicidality in the elderly by primary care physicians would reduce suicide.

I was called by an NIMH program officer when I failed to submit a letter of intent for that initiative. I told her it was a boondoggle because no one could show a reduction in suicides by targeting physician behavior. She didn’t disagree, but said a project would have to be funded. She ended up a co-author on the PROSPECT paper. You don’t often see program officers getting authorship on papers from projects they fund.

The resulting PROSPECT study involved 20 primary care practices in three regions of the Northeastern United States. In the course of the intervention study, one patient in the intervention group died by suicide and two patients, one in each of the intervention and control groups, made serious attempts. A multimillion dollar study confronted the low incidence of suicide, even among the elderly. Furthermore, the substantial baseline differences among the practices dwarfed any differences in suicidal ideation between the intervention and control groups. And as I have discussed elsewhere [  ], suicidal ideation is a surrogate end point that can be changed by factors that do not alter risk for suicide. No one advocating more money for this kind of study would want to get into the details of this one.

 

So, the editorial acknowledges the difficulties of studying and preventing suicide as a public health issue. It suggests that an unprecedentedly large study costing millions of dollars would be necessary if progress is to be made. There are formidable barriers to implementing an intervention of the complexity the editorial suggests is necessary in a large population. Just look at the problems that PROSPECT encountered.

Who will set the direction of suicide prevention research?

The editorial opens with a citation of a blog by the then Director of NIMH

Insel T. Director’s Blog: Targeting suicide. National Institutes of Health website. Posted April 2, 2015.

The blog calls for a large increase in funding for research concerning suicide and its prevention. The definition of the problem is shaped by politics more than evidence. But at least the blog post is more candid than the editorial in making a passing reference to the leading means of suicide in the United States: firearms.

51 percent of suicide deaths in the U.S. were by firearms. Research has already demonstrated that reducing access to lethal means (including gun locks and barriers on bridges) can reduce death rates.

Great, but surely death by firearms deserves more mention than a passing reference to locks on guns if the Director of NIMH is serious about asking Congress for a massive increase in funding for suicide research. Or is he being smart in avoiding the issue, and even brave in the passing reference that he makes to firearms?

Firearms deserve not only mention, but thoughtful analysis. In the United States, however, such analysis is politically dangerous and could threaten future funding. So we talk about other things.

Banning research on the role of firearms in suicide

For a source that is much more honest, evidence-based, and well argued than this JAMA: Psychiatry editorial, I recommend A Psychiatrist Debunks the Biggest Myths Surrounding Gun Suicides.

In 1996, Congress imposed a ban on research concerning the effects of gun ownership on public health, including suicide.

In the spring of 1996, the National Rifle Association and its allies set their sights on the Centers for Disease Control and Prevention for funding increasingly assertive studies on firearms ownership and the effects on public health. The gun rights advocates claimed the research veered toward advocacy and covered such logical ground as to be effectively useless.

At first, the House tried to close down the CDC’s entire $46 million National Center for Injury Prevention. When that failed, Dickey [Congressman Jay Dickey, for whom the Dickey amendment is named] stepped in with an alternative: strip $2.6 million that the agency had spent on gun studies that year. The money would eventually be re-appropriated for studies unrelated to guns. But the far more damaging inclusion was language that stated, “None of the funds made available for injury prevention and control at the Centers for Disease Control and Prevention may be used to advocate or promote gun control.”

Dickey proclaimed victory — an end, he said at the time, to the CDC’s attempts “to raise emotional sympathy” around gun violence. But the agency spent the subsequent years petrified of doing any research on gun violence, making the costs of the amendment clear even to Dickey himself.

He said the law was over-interpreted. Now, he looks at simple advances in highway safety — safety barriers, for example — and wonders what could have been done for guns.

The Dickey amendment does not specifically ban NIMH from investigating the role of firearms in suicide, but I think Tom Insel and all NIMH directors before and after him get the message.

Recently an effort to repeal the Dickey amendment failed:

Just hours before the mass shooting in San Bernardino on Wednesday, physicians gathered on Capitol Hill to demand an end to the Dickey Amendment restricting federal funding for gun violence research. Members of Doctors for America, the American College of Preventative Medicine, the American Academy of Pediatrics and others presented a petition against the research ban signed by more than 2,000 doctors.

“Gun violence is probably the only thing in this country that kills so many people, injures so many people, that we are not actually doing sufficient research on,” Dr. Alice Chen, the executive director of Doctors for America, told The Huffington Post.

Well over half a million people have died by firearms since 1996, when the ban on gun violence research was enacted, according to a HuffPost calculation of data through 2013 from Centers for Disease Control and Prevention. According to its sponsors, the Dickey Amendment was supposed to tamp down funding for what the National Rifle Association and other critics claimed was anti-gun advocacy research by the CDC’s National Center for Injury Prevention. In effect, it stopped federal gun violence research almost entirely.

So, why didn’t the Associate Editor of JAMA: Psychiatry, Charles Reynolds, exercise his editorial prerogative and support this effort to repeal the Dickey amendment, rather than lining up with his co-authors in a call for more wasteful research that avoids this important issue?

Study: Switching from antidepressants to mindfulness meditation increases relapse

  • A well-designed recent study found that patients with depression in remission who switch from maintenance antidepressants to mindfulness meditation without continuing medication had an increase in relapses.
  • The study is better designed and more transparently reported than a recent British study, but will get none of the British study’s attention.
  • The well-orchestrated promotion of mindfulness raises issues about the lack of checks and balances among investigators’ vested interests, supposedly independent evaluation, and the making of policy.

The study

Huijbers MJ, Spinhoven P, Spijker J, Ruhé HG, van Schaik DJ, van Oppen P, Nolen WA, Ormel J, Kuyken W, van der Wilt GJ, Blom MB. Discontinuation of antidepressant medication after mindfulness-based cognitive therapy for recurrent depression: randomised controlled non-inferiority trial. The British Journal of Psychiatry. 2016 Feb 18:bjp-p.

The study is currently behind a pay wall and does not appear to have a press release. These two factors will not contribute to it getting the attention it deserves.

But the protocol for the study is available here.

Huijbers MJ, Spijker J, Donders AR, van Schaik DJ, van Oppen P, Ruhé HG, Blom MB, Nolen WA, Ormel J, van der Wilt GJ, Kuyken W. Preventing relapse in recurrent depression using mindfulness-based cognitive therapy, antidepressant medication or the combination: trial design and protocol of the MOMENT study. BMC Psychiatry. 2012 Aug 27;12(1):1.

And the trial registration is here

Mindfulness Based Cognitive Therapy and Antidepressant Medication in Recurrent Depression. ClinicalTrials.gov: NCT00928980

The abstract

Background

Mindfulness-based cognitive therapy (MBCT) and maintenance antidepressant medication (mADM) both reduce the risk of relapse in recurrent depression, but their combination has not been studied.

Aims

To investigate whether MBCT with discontinuation of mADM is non-inferior to MBCT+mADM.

Method

A multicentre randomised controlled non-inferiority trial (ClinicalTrials.gov: NCT00928980). Adults with recurrent depression in remission, using mADM for 6 months or longer (n = 249), were randomly allocated to either discontinue (n = 128) or continue (n = 121) mADM after MBCT. The primary outcome was depressive relapse/recurrence within 15 months. A confidence interval approach with a margin of 25% was used to test non-inferiority. Key secondary outcomes were time to relapse/recurrence and depression severity.

Results

The difference in relapse/recurrence rates exceeded the non-inferiority margin and time to relapse/recurrence was significantly shorter after discontinuation of mADM. There were only minor differences in depression severity.

Conclusions

Our findings suggest an increased risk of relapse/recurrence in patients withdrawing from mADM after MBCT.

Translation?


A comment by Deborah Apthorp suggested that the original title, Switching from antidepressants to mindfulness meditation increases relapse, was incorrect. Checking, I realized that the abstract of the article was confusing, but the study did indeed show that mindfulness alone led to more relapses than continued medication plus mindfulness.

Here is what is said in the actual introduction to the article:

The main aim of this multicentre, noninferiority effectiveness trial was to examine whether patients who receive MBCT for recurrent depression in remission could safely withdraw from mADM, i.e. without increased relapse/recurrence risk, compared with the combination of these interventions. Patients were randomly allocated to MBCT followed by discontinuation of mADM or MBCT+mADM. The study had a follow-up of 15 months. Our primary hypothesis was that discontinuing mADM after MBCT would be non-inferior, i.e. would not lead to an unacceptably higher risk of relapse/ recurrence, compared with the combination of MBCT+mADM.

Here is what is said in the discussion:

The findings of this effectiveness study reflect an increased risk of relapse/recurrence for patients withdrawing from mADM after having participated in MBCT for recurrent depression.

So, to be clear, the sequence was that patients were randomized either to MBCT without antidepressants or to MBCT with continuing antidepressants. Patients were then followed for 15 months. Patients who received MBCT without antidepressants had significantly more relapses/recurrences in the follow-up period than those who received MBCT with antidepressants.

The study addresses the question of whether patients with remitted depression on maintenance antidepressants who were randomized to discontinue them after receiving mindfulness-based cognitive therapy (MBCT) have poorer outcomes than those randomized to remain on antidepressants alongside MBCT.

The study found that poorer outcomes – more relapses – were experienced by patients switching to MBCT alone versus those remaining on antidepressants plus MBCT.

Strengths of the study

The patients were carefully assessed with validated semistructured interviews to verify that they had recurrent past depression, were in current remission, and were taking their antidepressants. This assessment is an advantage over past studies that depended on less reliable primary-care physicians’ records to ascertain eligibility. There is ample evidence that primary-care physicians often do not make systematic assessments in deciding whether or not to keep patients on antidepressants.

The control group. The comparison/control group continued on antidepressants after they were assessed by a psychiatrist who made specific recommendations.

 Power analysis. Calculation of sample size for this study was based on a noninferiority design. That meant the investigators wanted to establish whether, within a particular limit (25%), switching to MBCT produced poorer outcomes.

A conventional clinical trial is designed to see if the null hypothesis of no difference between intervention and control groups can be rejected. As a noninferiority trial, this study instead tested whether shifting patients to MBCT alone would result in an unacceptable rise in relapses and recurrences, set at 25% more. Noninferiority trials are explained here.
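The confidence-interval approach described in the abstract can be illustrated with a short sketch. The numbers below are hypothetical, chosen only to show the mechanics of testing against a 25-percentage-point margin; they are not the trial’s data, and the simple normal-approximation CI is just one of several methods such trials may use.

```python
# Illustrative sketch of the CI approach to non-inferiority testing.
# All numbers are hypothetical; this is not the trial's actual analysis.
from math import sqrt

def noninferior(events_new, n_new, events_ref, n_ref, margin, z=1.96):
    """Check non-inferiority of the new arm on relapse rates.

    The new arm is declared non-inferior only if the upper bound of the
    95% CI for (new rate - reference rate) stays below the pre-set margin.
    """
    p_new, p_ref = events_new / n_new, events_ref / n_ref
    diff = p_new - p_ref
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    upper = diff + z * se
    return diff, upper, upper < margin

# Hypothetical arms: 60/120 relapses after discontinuing antidepressants
# vs 44/120 relapses on continued medication, tested against a 25-point margin.
diff, upper, ok = noninferior(60, 120, 44, 120, margin=0.25)
print(f"difference = {diff:.3f}, upper 95% bound = {upper:.3f}, non-inferior: {ok}")
```

With these made-up numbers the upper confidence bound crosses the margin, so non-inferiority is not established – the same logical verdict the trial reported.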

Change in plans for the study

The protocol for the study originally proposed a more complex design. Patients would be randomized to one of three conditions: (1) continuing antidepressants alone; (2) continuing antidepressants, but with MBCT; or (3) MBCT alone. The problem the investigators encountered was that many patients had a strong preference and did not want to be randomized. So, they conducted two separate randomized trials.

This change in plans was appropriately noted in a modification in the trial registration.

The companion study examined whether adding MBCT to maintenance antidepressants reduced relapses. That study was published first:

Huijbers MJ, Spinhoven P, Spijker J, Ruhé HG, van Schaik DJ, van Oppen P, Nolen WA, Ormel J, Kuyken W, van der Wilt GJ, Blom MB. Adding mindfulness-based cognitive therapy to maintenance antidepressant medication for prevention of relapse/recurrence in major depressive disorder: Randomised controlled trial. Journal of Affective Disorders. 2015 Nov 15;187:54-61.

A copy can be obtained from this depository.

It was a smaller study – 35 patients randomized to MBCT alone and 33 patients randomized to a combination of MBCT and continued antidepressants. There were no differences in relapse/recurrence in 15 months.

An important limitation on generalizability

 The patients were recruited from university-based mental health settings. The minority of patients who move from treatment of depression in primary care to specialty mental health settings proportionately include more patients with moderate to severe depression and with a more defined history of past depression. In contrast, patients being treated for depression in primary care include more whose depression is mild to moderate and whose current depression and past history have not been systematically assessed. There is evidence that primary-care physicians do not make diagnoses of depression based on a structured assessment. Many patients deemed depressed and in need of treatment will have milder depression and only meet the vaguer, less validated diagnosis of Depression Not Otherwise Specified.

Declaration of interest

The authors indicated no conflicts of interest to declare for either study.

Added February 29: This may be a true statement for the core Dutch researchers who led in conducting the study. However, it is certainly not true for the British collaborator, who may have served as a consultant and got authorship as a result. He has extensive conflicts of interest and gains a lot personally and professionally from the promotion of mindfulness in the UK. Read on.

The previous British study in The Lancet

Kuyken W, Hayes R, Barrett B, Byng R, Dalgleish T, Kessler D, Lewis G, Watkins E, Brejcha C, Cardy J, Causley A. Effectiveness and cost-effectiveness of mindfulness-based cognitive therapy compared with maintenance antidepressant treatment in the prevention of depressive relapse or recurrence (PREVENT): a randomised controlled trial. The Lancet. 2015 Jul 10;386(9988):63-73.

I provided my extended critique of this study in a previous blog post:

Is mindfulness-based therapy ready for rollout to prevent relapse and recurrence in depression?

The study protocol claimed it was designed as a superiority trial, but the authors did not provide the added sample size needed to demonstrate superiority. And they spun null findings, starting in their abstract:

However, when considered in the context of the totality of randomised controlled data, we found evidence from this trial to support MBCT-TS as an alternative to maintenance antidepressants for prevention of depressive relapse or recurrence at similar costs.

What is wrong here? They are discussing null findings as if they had conducted a noninferiority trial with sufficient power to show that differences of a particular size could be ruled out. Lots of psychotherapy trials are underpowered, but that should not be used to declare that treatments can be substituted for each other.

Contrasting features of the previous study versus the present one

Spinning of null findings. According to the trial registration, the previous study was designed to show that MBCT was superior to maintenance antidepressant treatment in preventing relapse and recurrence. A superiority trial tests the hypothesis that an intervention is better than a control group by a pre-set margin. For a very cool slideshow comparing superiority to noninferiority trials, see here.

Rather than demonstrating that MBCT was superior to routine care with maintenance antidepressant treatment, The Lancet study failed to find significant differences between the two conditions. In an amazing feat of spin, the authors publicized this as a success, claiming that MBCT was equivalent to maintenance antidepressants. Equivalence is a stricter criterion that requires more than null findings – any differences must fall within pre-set (registered) margins. Many null findings reflect low power to find significant differences, not equivalence.
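A toy calculation shows why a null finding cannot be read as equivalence. The numbers are hypothetical, invented only to illustrate the point: with a small sample, the 95% CI for the difference in relapse rates can include zero (“no significant difference”) while also including a 25-point difference, so equivalence within that margin is not demonstrated.

```python
# Sketch: a non-significant difference is not evidence of equivalence.
# Hypothetical numbers; normal-approximation CI for a difference in proportions.
from math import sqrt

def ci_diff(e1, n1, e2, n2, z=1.96):
    """95% CI for the difference in event proportions between two arms."""
    p1, p2 = e1 / n1, e2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Small hypothetical trial: 20/50 vs 16/50 relapses.
lo, hi = ci_diff(20, 50, 16, 50)
print(f"95% CI for the difference: ({lo:.3f}, {hi:.3f})")
# The interval straddles zero (a null finding) yet extends past +0.25,
# so a 25-point equivalence margin cannot be ruled out.
```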

Patient selection. Patients were recruited from primary care on the basis of records indicating they had been prescribed antidepressants two years earlier. There was no ascertainment of whether the patients were currently adhering to the antidepressants or whether they were getting effective monitoring with feedback.

Poorly matched, nonequivalent comparison/control group. The guidelines that patients with recurrent depression should remain on antidepressants for two years were developed based on studies in tertiary care. It is likely that many of these patients were never systematically assessed for the appropriateness of treatment with antidepressants, that follow-up was spotty, and that many patients were not even continuing to take their antidepressants with any regularity.

So, MBCT was being compared to an ill-defined, unknown condition in which some proportion of patients did not need to be taking antidepressants and were not taking them. This routine care also lacked the intensity, positive expectations, attention, and support of the MBCT condition. If an advantage for MBCT had been found – and it was not – it might only have reflected nothing specific about MBCT, but merely the benefits of providing nonspecific elements that were lacking in routine care.

The unknowns. There was no assessment of whether the patients actually practiced MBCT, and so there was further doubt that anything specific to MBCT was relevant. But then again, in the absence of any differences between groups, we may not have anything to explain.

  • Given we don’t know what proportion of patients were taking adequate maintenance doses of antidepressants, we don’t know whether any further treatment was needed for them – or for what proportion.
  • We don’t know whether it would have been more cost-effective simply to have a depression care manager recontact patients and determine whether they were still taking their antidepressants and whether they were interested in a supervised tapering.
  • We are not even told the extent to which primary care patients provided with MBCT actually practiced it.

A well-orchestrated publicity campaign to misrepresent the findings. Rather than offering an independent critical evaluation of The Lancet study, press coverage offered the investigators’ preferred spin. As I noted in a previous blog:

The headline of a Guardian column written by one of the Lancet article’s first author’s colleagues at Oxford misleadingly proclaimed what the study showed.

And that misrepresentation was echoed in the Mental Health Foundation’s call for mindfulness to be offered through the UK National Health Service.

The Mental Health Foundation is offering a 10-session online course for £60 and is undoubtedly prepared for an expanded market.

Declaration of interests

WK [the first author] and AE are co-directors of the Mindfulness Network Community Interest Company and teach nationally and internationally on MBCT. The other authors declare no competing interests.

Like most declarations of conflicts of interest, this one alerts us to something we might be concerned about but does not adequately inform us.

We are not told, for instance, something the authors were likely to know: soon after all the hoopla about the study, the Oxford Mindfulness Centre, which is directed by the first author but not mentioned in the declaration of interest, publicized a massive effort by the Wellcome Trust to roll out its Mindfulness in the Schools project, which provides mindfulness training to children, teachers, and parents.

A recent headline in the Times: US & America says it all.

Confirmation bias in subsequent citing

It is generally understood that much of what we read in the scientific literature is false or exaggerated due to various questionable research practices (QRPs) leading to confirmation bias in what is reported. But there is another kind of confirmation bias, associated with the creation of false authority through citation distortion. It is well documented that proponents of a particular view selectively cite papers according to whether the conclusions support their position. Not only are positive findings from original reports exaggerated as they progress through citations; negative findings receive less attention or are simply lost.

Huijbers et al. transparently reported that switching to MBCT leads to more relapses in patients who have recovered from depression. I confidently predict that these findings will be cited less often than the poorer-quality The Lancet study, which was spun to create the appearance that it showed MBCT had outcomes equivalent to remaining on antidepressants. I also predict that the Huijbers et al. MBCT study will often be misrepresented when it is cited.

Added February 29: For whatever reason, perhaps because he served as a consultant, the author of The Lancet study is also an author on this paper, which describes a study conducted entirely in the Netherlands. Note, however, that when it comes to the British The Lancet study, this article cites it as replicating past work when it was a null trial. This is an example of creating a false authority by distorted citation in action. I can’t judge whether the Dutch authors simply accepted the conclusions offered in the abstract and press coverage of The Lancet study, or whether The Lancet author influenced their interpretation of it.

I would be very curious whether, in his outpouring of subsequent papers on MBCT, the author of The Lancet paper cites this paper and whether he cites it accurately. Skeptics, join me in watching.

What do I think is going on in the study?

I think it is apparent that the authors have selected a group of patients who have remitted from their depression, but who are at risk for relapse and recurrence if they go without treatment. With such chronic, recurring depression, there is evidence that psychotherapy adds little to medication, particularly when patients are showing a clinical response to the antidepressants. However, psychotherapy benefits from antidepressants being added.

But a final point is important – MBCT was never designed as a primary cognitive behavioral therapy for depression. It was intended as a means of patients paying attention to cues suggesting they are sliding back into depression and taking appropriate action. It is unfortunate that it has been oversold as something more than this.

 

Effect of a missing clinical trial on what we think about cognitive behavior therapy

  • Data collection for a large, well-resourced study of cognitive behavior therapy (CBT) for psychosis was completed years ago, but the study remains unpublished.
  • Its results could influence the overall evaluation of CBT versus alternative treatments if integrated with what is already known.
  • Political considerations can determine whether completed psychotherapy studies get published or remain lost.
  • This rich example demonstrates the strong influence of publication bias on how we assess psychotherapies.
  • What can be done to reduce the impact of this particular study having gone missing?

A few years ago Ben Goldacre suggested that we do a study of the registration of clinical trials.


I can’t remember the circumstances, but Goldacre and I did not pursue the idea further. I was already committed to studying psychological interventions, in which Goldacre was much less interested. Having battled to get the American Psychological Association to fully accept and implement CONSORT in its journals, I was well aware how difficult it was to get the professional organizations offering the prime outlets for psychotherapy studies to accept needed reform. I wanted to stay focused on that.

I continue to follow Goldacre’s work closely and cite him often. I also pay particular attention to John Ioannidis’ follow-up of his documentation that much of what we find in the biomedical literature is false or exaggerated, like:

Ioannidis JP. Clinical trials: what a waste. BMJ. 2014 Dec 10;349:g7089

Many trials are entirely lost, as they are not even registered. Substantial diversity probably exists across specialties, countries, and settings. Overall, in a survey conducted in 2012, only 30% of journal editors requested or encouraged trial registration.

In a seemingly parallel world, I keep showing that in psychology the situation is worse. I had a simple explanation that I now recognize was naïve: needed reforms enforced by regulatory bodies like the US Food and Drug Administration (FDA) take longer to influence the psychotherapy literature, where there are no such pressures.

I think we now know that in both biomedicine and psychology, broad declarations by governments, funding bodies, and even journals of a commitment to disclosing conflicts of interest, registering trials, and sharing data are insufficient to ensure that the literature gets cleaned up.

Statements were published across 14 major medical journals endorsing routine data sharing. Editors of some of the top journals immediately took steps to undermine the implementation in their particular journals. Think of the specter of “research parasites” raised by the editors of the New England Journal of Medicine (NEJM).

Another effort at reform

Following each demonstration that reforms are not being implemented, we get more pressures to do better. For instance, the 2015 World Health Organization (WHO) position paper:

Rationale for WHO’s New Position Calling for Prompt Reporting and Public Disclosure of Interventional Clinical Trial Results

WHO’s 2005 statement called for all interventional clinical trials to be registered. Subsequently, there has been an increase in clinical trial registration prior to the start of trials. This has enabled tracking of the completion and timeliness of clinical trial reporting. There is now a strong body of evidence showing failure to comply with results-reporting requirements across intervention classes, even in the case of large, randomised trials [37]. This applies to both industry and investigator-driven trials. In a study that analysed reporting from large clinical trials (over 500 participants) registered on clinicaltrials.gov and completed by 2009, 23% had no results reported even after a median of 60 months following trial completion; unpublished trials included nearly 300,000 participants [3]. Among randomised clinical trials (RCTs) of vaccines against five diseases registered in a variety of databases between 2006–2012, only 29% had been published in a peer-reviewed journal by 24 months following study completion [4]. At 48 months after completion, 18% of trials were not reported at all, which included over 24,000 participants. In another study, among 400 randomly selected clinical trials, nearly 30% did not publish the primary outcomes in a journal or post results to a clinical trial registry within four years of completion [5].

Why is this a problem?

  • It affects understanding of the scientific state of the art.

  • It leads to inefficiencies in resource allocation for both research and development and financing of health interventions.

  • It creates indirect costs for public and private entities, including patients themselves, who pay for suboptimal or harmful treatments.

  • It potentially distorts regulatory and public health decision making.

Furthermore, it is unethical to conduct human research without publication and dissemination of the results of that research. In particular, withholding results may subject future volunteers to unnecessary risk.

How the psychotherapy literature is different from the medical literature

Unfortunately for the trustworthiness of the psychotherapy literature, the WHO statement is limited to medical interventions. We probably won’t see any direct effects on the psychotherapy literature anytime soon.

The psychotherapy literature has all the problems in implementing reforms that we see in biomedicine – and more. Professional organizations like the American Psychological Association and the British Psychological Society, which publish psychotherapy research, have the other important function of ensuring their clinical members’ employment opportunities. More opportunities for employment show that the organizations are meeting their members’ needs, and this results in more dues-paying members.

The organizations don’t want to facilitate third-party payers citing research showing that particular interventions their membership is already practicing are inferior and need to be abandoned. They want the branding of members practicing “evidence-based treatment” but not the burden of members having to make decisions based on what is evidence-based. More basically, psychologists’ professional organizations are cognizant of the need to demonstrate a place in providing services that are reimbursed because they improve mental and physical health. In this respect, they are competing with biomedical interventions for the same pot of money.

So, journals published by psychological organizations have vested interests in not stringently enforcing standards. The well-known questionable research practices of investigators are strengthened by questionable publication practices, like confirmation bias, that are tied to the organizations’ institutional agendas.

And the lower status journals that are not published by professional organizations may compromise their standards for publishing psychotherapy trials because of the status that having these articles confers.

Increasingly, medical journals like The Lancet and The Lancet Psychiatry are seen as more prestigious outlets for psychotherapy trials, but they take less seriously the need to enforce for psychotherapy studies the standards that regulatory agencies require for biomedical interventions. Example: The Lancet violated its own policies and accepted Tony Morrison’s CBT for psychosis study for publication even though it was not registered until after the trial had started. The declared outcomes were vague enough that they could be re-specified after the results were known.

Bottom line: the problem of ensuring that all psychotherapy trials are published consistent with their protocols is taken less seriously than it would be for medical trials.

Overall, there is less requirement that psychotherapy trials be registered, and less attention is paid by editors and reviewers to whether trials were registered and whether outcomes and analytic plans were consistent between the registration and the published study.

In a recent blog post, I identified results of a trial that had been published with switched outcomes and then re-published in another paper with different outcomes, without the registration even being noted.

But for all the same reasons cited by the recent WHO statement, publication of all psychotherapy trials matters.

Recovering an important CBT trial gone missing

I am now going to review the impact of a large, well-resourced study of CBT for psychosis remaining unpublished. I identified the study by a search of the ISRCTN:

The ISRCTN registry is a primary clinical trial registry recognised by WHO and ICMJE that accepts all clinical research studies (whether proposed, ongoing or completed), providing content validation and curation and the unique identification number necessary for publication. All study records in the database are freely accessible and searchable.

I then went back to the literature to see what happened with it. Keep in mind that this step is not even possible for the many psychotherapy trials that are simply not registered at all.

Many trials are not registered because they are considered pilot and feasibility studies and therefore not suitable for entering effect sizes into the literature. Yet, if significant results are found, they will be exaggerated because they come from an underpowered study. And such results become the basis for entering results into the literature as if it were a planned clinical trial, with considerable likelihood of not being able to be replicated.

There are whole classes of clinical and health psychology interventions that are dominated by underpowered, poor-quality studies that should have been flagged as weak evidence or excluded altogether. So, in centering on this trial, I’m picking an important example because it was available to be discovered; there is much out there that is not available to be discovered, because it was never registered.

CBT versus supportive therapy for persistent positive symptoms in psychotic disorders

The trial registration is:

Cognitive behavioural treatment for persistent positive symptoms in psychotic disorders. ISRCTN29242879. DOI 10.1186/ISRCTN29242879

The trial registration indicates that recruitment started on January 1, 2007 and ended on December 31, 2008.

No publications are listed. I and others have sent repeated emails to the principal investigator inquiring about any publications and have failed to get a response. I even sent a German colleague to visit him and all he would say was that results were being written up. That was two years ago.

Google Scholar indicates the principal investigator continues to publish, but not the results of this trial.

A study to die for

The study protocol is available as a PDF

Klingberg S, Wittorf A, Meisner C, Wölwer W, Wiedemann G, Herrlich J, Bechdolf A, Müller BW, Sartory G, Wagner M, Kircher T. Cognitive behavioural therapy versus supportive therapy for persistent positive symptoms in psychotic disorders: The POSITIVE Study, a multicenter, prospective, single-blind, randomised controlled clinical trial. Trials. 2010 Dec 29;11(1):123.

The methods section makes it sound like a dream study with resources beyond what is usually encountered for psychotherapy research. If the protocol is followed, the study would be an innovative, large, methodologically superior study.

Methods/Design: The POSITIVE study is a multicenter, prospective, single-blind, parallel group, randomised clinical trial, comparing CBT and ST with respect to the efficacy in reducing positive symptoms in psychotic disorders. CBT as well as ST consist of 20 sessions altogether, 165 participants receiving CBT and 165 participants receiving ST. Major methodological aspects of the study are systematic recruitment, explicit inclusion criteria, reliability checks of assessments with control for rater shift, analysis by intention to treat, data management using remote data entry, measures of quality assurance (e.g. on-site monitoring with source data verification, regular query process), advanced statistical analysis, manualized treatment, checks of adherence and competence of therapists.

The study was one of the rare ones providing for systematic assessment of adverse events and any harm to patients. Presumably, if CBT is powerful enough to effect positive change, it can have negative effects as well. But these remain entirely a matter of speculation.

Ratings of outcome were blinded and steps were taken to preserve the blinding even if an adverse event occurred. This is important because blinded trials are less susceptible to investigator bias.

Another unusual feature is the use of supportive therapy (ST), a credible but nonspecific condition, as the control/comparison.

ST is thought as an active treatment with respect to the patient-therapist relationship and with respect to therapeutic commitment [21]. In the treatment of patients suffering from psychotic disorders these ingredients are viewed to be essential as it has been shown consistently that the social network of these patients is limited. To have at least one trustworthy person to talk to may be the most important ingredient in any kind of treatment. However, with respect to specific processes related to modification of psychotic beliefs, ST is not an active treatment. Strategies specifically designed to change misperceptions or reasoning biases are not part of ST.

Use of this control condition allows evaluation of the important question of whether any apparent effects of CBT are due to the active ingredients of that approach or to the supportive therapeutic relationship within which the active ingredients are delivered.

Being able to rule out that the effects of CBT are due to nonspecific factors justifies the extra resources needed to provide specialized training in CBT. Conversely, if equivalent effects are obtained in the ST group, it suggests that equivalent outcomes can be achieved simply by providing more support to patients, presumably by less trained and maybe even lay personnel.

It is a notorious feature of studies of CBT for psychosis that they lack comparison/control groups in any way equivalent to CBT in terms of nonspecific intensity, support, encouragement, and positive expectations. Too often, the control group is an ill-defined treatment as usual (TAU) that lacks regular contact and fails to inspire any positive expectations. Basically, CBT is being compared to inadequate treatment, and sometimes no treatment, and so any apparent effects that are observed are due to correcting these inadequacies, not to any active ingredient.

The protocol hints in passing at the investigators’ agenda.

This clinical trial is part of efforts to intensify psychotherapy research in the field of psychosis in Germany, to contribute to the international discussion on psychotherapy in psychotic disorders, and to help implement psychotherapy in routine care.

Here we see an aim to justify implementation of CBT for psychosis in routine care in Germany. We have seen something similar in the repeated efforts of German investigators to demonstrate that long-term psychodynamic psychotherapy is more effective than shorter, less expensive treatments, despite the lack of credible data [ ].

And so, if the results would not contribute to getting psychotherapy implemented in routine care in Germany, do they get buried?

Science & Politics of CBT for Psychosis

The rollout of a CBT for psychosis study published in The Lancet made strong claims in a BBC article and audio promotion.

[Slide: promotional claims for the Morrison et al. Lancet study]

The attention attracted critical scrutiny that these claims couldn’t sustain. After controversy on Twitter, the BBC headline was changed to a more modest claim.

Criticism mounted:

  • The study retained fewer participants receiving CBT at the end of the study than the authors had planned.
  • The comparison treatment was ill-defined, but for some patients meant no treatment because they were kicked out of routine care for refusing medication.
  • A substantial proportion of patients assigned to CBT began taking antipsychotic medication by the end of the study.
  • There was no evidence that the response to CBT was comparable to that achieved with antipsychotic medication alone in clinical trials.
  • No evidence that less intensive, nonspecific supportive therapy would not have achieved the same results as CBT.

And the authors ended up conceding in a letter to the editor that their trial had been registered after data collection had started and it did not produce evidence of equivalence to antipsychotic medication.

In a blog post containing the actual video of his presentation before the British Psychological Society, Keith Laws declares:

Politics have overcome the science in CBT for psychosis

Recently the British Psychological Society invited me to give a public talk entitled CBT: The Science & Politics behind CBT for Psychosis. In this talk, which was filmed…, I highlight the unquestionable bias shown by the National Institute of Clinical Excellence (NICE) committee  (CG178) in their advocacy of CBT for psychosis.

The bias is not concealed, but unashamedly served-up by NICE as a dish that is high in ‘evidence-substitute’, uses data that are past their sell-by-date and is topped-off with some nicely picked cherries. I raise the question of whether committees – with such obvious vested interests – should be advocating on mental health interventions.

I present findings from our own recent meta-analysis (Jauhar et al 2014) showing that three-quarters of all RCTs have failed to find any reduction in the symptoms of psychosis following CBT. I also outline how trials which have used non-blind assessment of outcomes have inflated effect sizes by up to 600%. Finally, I give examples where CBT may have adverse consequences – both for the negative symptoms of psychosis and for relapse rates.

A pair of well-conducted and transparently reported Cochrane reviews suggest there is little evidence for the efficacy of CBT for psychosis (*)

[Slides: summaries of two Cochrane reviews of CBT for psychosis]

 

These and other slides are available in a slideshow presentation of a talk I gave at the Edinburgh Royal Infirmary.

Yet, even after the original claims of the Morrison study had to be tempered in the face of criticism, they get echoed in the antipsychiatry report Understanding Psychosis:

“Other forms of therapy can also be helpful, but so far it is CBTp that has been most intensively researched. There have now been several meta-analyses (studies using a statistical technique that allows findings from various trials to be averaged out) looking at its effectiveness. Although they each yield slightly different estimates, there is general consensus that on average, people gain around as much benefit from CBT as they do from taking psychiatric medication.”

Such misinformation can confuse patients making difficult decisions about whether to accept antipsychotic medication.

If the results from the missing CBT for psychosis study became available…

If the Klingberg study were available and integrated with existing data, it would be one of the largest and highest-quality studies, and it would provide insight into any advantage of CBT for psychosis. For those who can be convinced by data, a null finding from a large study added to the mostly small and methodologically unsophisticated existing studies could be decisive.

A recent meta-analysis of CBT for prevention of psychosis by Hutton and Taylor includes six studies and mentions the trial protocol in passing:

Two recent trials of CBT for established psychosis provide examples of good practice for reporting harms (Klingberg et al. 2010, 2012) and CONSORT (Consolidated Standards of Reporting Trials) provide a sensible set of recommendations (Ioannidis et al. 2004).

Yet it does not indicate why this trial is missing, nor is the trial included in a list of completed but unpublished studies, even though the protocol indicates a study considerably larger than any of the studies that were included.

To communicate a better sense of the potential importance of this missing study and perhaps place more pressures on the investigators to release its results, I would suggest that future meta-analyses state:

The protocol for Klingberg et al. Cognitive behavioural treatment for persistent positive symptoms in psychotic disorders indicates that recruitment was completed in 2008. No publications have resulted. Emails to Professor Klingberg about the status of the study failed to get a response. If the study were completed consistent with its protocol, it would represent one of the largest studies of CBT for psychosis ever and one of the few with a fair comparison between CBT and supportive therapy. Inclusion of the results could potentially substantially modify the conclusions of the current meta-analysis.

 

Stalking a Cheshire cat: Figuring out what happened in a psychotherapy intervention trial

John Ioannidis, the “scourge of sloppy science,” has documented again and again that the safeguards being introduced into the biomedical literature against untrustworthy findings are usually ineffective. In Ioannidis’ most recent report, his group:

…Assessed the current status of reproducibility and transparency addressing these indicators in a random sample of 441 biomedical journal articles published in 2000–2014. Only one study provided a full protocol and none made all raw data directly available.

As reported in a recent post in Retraction Watch, Did a clinical trial proceed as planned? New project finds out, Psychiatrist Ben Goldacre has a new project with

…The relatively straightforward task of comparing reported outcomes from clinical trials to what the researchers said they planned to measure before the trial began. And what they’ve found is a bit sad, albeit not entirely surprising.

Ben Goldacre specifically excludes psychotherapy studies from this project. But there are reasons to believe that the psychotherapy literature is less trustworthy than the biomedical literature because psychotherapy trials are less frequently registered, adherence to CONSORT reporting standards is less strict, and investigators more routinely refuse to share data when requested.

Untrustworthiness of information provided in the psychotherapy literature can have important consequences for patients, clinical practice, and public health and social policy.

The study that I will review twice switched outcomes in its reports, had a poorly chosen comparison/control group and flawed analyses, and its protocol was registered after the study started. Yet the study will likely provide data for decision-making about what to do with primary care patients with a few unexplained medical symptoms. The recommendation of the investigators is to deny these patients medical tests and workups and instead provide them with an unvalidated psychiatric diagnosis and a treatment that encourages them to believe that their concerns are irrational.

In this post I will attempt to track what should have been an orderly progression from (a) registration of a psychotherapy trial to (b) publishing of its protocol to (c) reporting of the trial’s results in the peer-reviewed literature. This exercise will show just how difficult it is to make sense of studies in a poorly documented psychological intervention literature.

  • I find lots of surprises, including outcome switching in both reports of the trial.
  • The second article reporting results of the trial does not acknowledge registration, minimally cites the first report of outcomes, and hides important shortcomings of the trial. But the authors inadvertently expose crucial new shortcomings without comment.
  • Detecting important inconsistencies between registration and protocols and reports in the journals requires an almost forensic attention to detail to assess the trustworthiness of what is reported. Some problems hide in plain sight if one takes the time to look, but others require a certain clinical connoisseurship, a well-developed appreciation of the subtle means by which investigators spin outcomes to get novel and significant findings.
  • Outcome switching and inconsistent cross-referencing of published reports of a clinical trial will bedevil any effort to integrate the results of the trial into the larger literature in a systematic review or meta-analysis.
  • Two journals – Psychosomatic Medicine and particularly the Journal of Psychosomatic Research – failed to provide adequate peer review of articles based on this trial, in terms of trial registration, outcome switching, and allowing multiple reports of what could be construed as primary outcomes from the same trial into the literature.
  • Despite serious problems in their interpretability, results of this study are likely to be cited and influence far-reaching public policies.
  • The generalizability of the results of my exercise is unclear, but my findings encourage skepticism more generally about published reports of results of psychotherapy interventions. It is distressing that more alarm bells have not been sounded about the reports of this particular study.

The publicly accessible registration of the trial is:

Cognitive Behaviour Therapy for Abridged Somatization Disorder (Somatic Symptom Index [SSI] 4,6) patients in primary care. Current controlled trials ISRCTN69944771

The publicly accessible full protocol is:

Magallón R, Gili M, Moreno S, Bauzá N, García-Campayo J, Roca M, Ruiz Y, Andrés E. Cognitive-behaviour therapy for patients with Abridged Somatization Disorder (SSI 4, 6) in primary care: a randomized, controlled study. BMC Psychiatry. 2008 Jun 22;8(1):47.

The second report of treatment outcomes in Journal of Psychosomatic Research

Readers can more fully appreciate the problems that I uncovered if I work backwards from the second published report of outcomes from the trial. Published in the Journal of Psychosomatic Research, the article is behind a paywall, but readers can write to the corresponding author for a PDF: mgili@uib.es. This person is also the corresponding author for the earlier paper in Psychosomatic Medicine, and so readers might want to request both papers.

Gili M, Magallón R, López-Navarro E, Roca M, Moreno S, Bauzá N, García-Cammpayo J. Health related quality of life changes in somatising patients after individual versus group cognitive behavioural therapy: A randomized clinical trial. Journal of Psychosomatic Research. 2014 Feb 28;76(2):89-93.

The title is misleading in its ambiguity because “somatising” does not refer to an established diagnostic category. In this article, it refers to an unvalidated category that encompasses a considerable proportion of primary care patients, usually those with comorbid anxiety or depression. More about that later.

PubMed, which usually reliably attaches a trial registration number to abstracts, doesn’t do so for this article.

The article does not list the registration, and does not provide the citation when indicating that a trial protocol is available. The only subsequent citations of the trial protocol are ambiguous:

More detailed design settings and study sample of this trial have been described elsewhere [14,16], which explain the effectiveness of CBT reducing number and severity of somatic symptoms.

The above quote is also the sole citation of a key previous paper that presents outcomes for the trial. Only an alert and motivated reader would catch this. No opportunity within the article is provided for comparing and contrasting results of the two papers.

The brief introduction displays a decided puffer fish phenomenon, exaggerating the prevalence and clinical significance of the unvalidated “abridged somatization disorder.” Essentially, the authors invoke the  problematic, but accepted psychiatric diagnostic categories somatoform or somatization disorders in claiming validity for a diagnosis with much less stringent criteria. Oddly, the category has different criteria when applied to men and women: men require four unexplained medical symptoms, whereas women require six.

I haven’t previously encountered the term “abridged” in psychiatric diagnosis. Maybe the authors mean “subsyndromal,” as in “subsyndromal depression.” This is dubious labeling because it implies that not all characteristics needed for the diagnosis are present, some of which may be crucial. Think of it: is a persistent cough subsyndromal lung cancer, or maybe emphysema? References to symptoms being “subsyndromal” often occur in contexts where exaggerated claims about prevalence are made, with inappropriate, non-evidence-based inferences about treatment of milder cases from the more severe.

A casual reader might infer that the authors are evaluating a psychiatric treatment with wide applicability to as many as 20% of primary care patients. As we will see, the treatment focuses on discouraging any diagnostic medical tests and trying to convince the patient that their concerns are irrational.

The introduction identifies the primary outcome of the trial:

The aim of our study is to assess the efficacy of a cognitive behavioural intervention program on HRQoL [health-related quality of life] of patients with abridged somatization disorder in primary care.

This primary outcome is inconsistent with what was reported in the registration, the published protocol, and the first article reporting outcomes. The earlier report does not even mention the inclusion of a measure of HRQoL, measured by the SF-36. It is listed in the study protocol as a “secondary variable.”

The opening of the methods section declares that the trial is reported in this paper consistent with the Consolidated Standards of Reporting Trials (CONSORT). This is not true, because the flowchart tracking patients from recruitment to follow-up is missing. We will see that when the flowchart is reported in another paper, it contains some important information.

The methods section reports that only three measures were administered: the Standardized Polyvalent Psychiatric Interview (SPPI), a semistructured interview developed by the authors with minimal validation; a screening measure for somatization administered by primary care physicians to patients whom they deemed appropriate for the trial; and the SF-36.

Crucial details are withheld about the screening and diagnosis of “abridged somatization disorder.” If these details had been presented, a reader would further doubt the validity of this unvalidated and idiosyncratic diagnosis.

Few readers, even primary care physicians or psychiatrists, will know what to make of the Smith’s guidelines (Googling it won’t yield much), which is essentially a matter of simply sending a letter to the referring GP. Sending such a letter is a notoriously ineffective intervention in primary care. It mainly indicates that patients referred to a trial did not get assigned to an active treatment. As I will document later, the authors were well aware that this would be an ineffectual control/comparison intervention, but using it as such guarantees that their preferred intervention would look quite good in terms of effect size.

The two active interventions are individual- and group-administered CBT which is described as:

Experimental or intervention group: implementation of the protocol developed by Escobar [21,22] that includes ten weekly 90-min sessions. Patients were assessed at 4 time points: baseline, post-treatment, 6 and 12 months after finishing the treatment. The CBT intervention mainly consists of two major components: cognitive restructuring, which focuses on reducing pain-specific dysfunctional cognitions, and coping, which focuses on teaching cognitive and behavioural coping strategies. The program is structured as follows. Session 1: the connection between stress and pain. Session 2: identification of automated thoughts. Session 3: evaluation of automated thoughts. Session 4: questioning the automatic thoughts and constructing alternatives. Session 5: nuclear beliefs. Session 6: nuclear beliefs on pain. Session 7: changing coping mechanisms. Session 8: coping with ruminations, obsessions and worrying. Session 9: expressive writing. Session 10: assertive communication.

There is sparse presentation of data from the trial in the results section, but some fascinating details await a skeptical, motivated reader.

Table 1 displays social demographic and clinical variables. Psychiatric comorbidity is highly prevalent. Readers can’t tell exactly what is going on, because the authors’ own interview schedule is used to assess comorbidity. But it appears that all but a small minority of patients diagnosed with “abridged somatization disorder” have substantial anxiety and depression. Whether these symptoms meet formal criteria cannot be determined. There is no mention of physical comorbidities.

But there is something startling awaiting an alert reader in Table 2.

[Table 2: SF-36 scores by treatment group, from Gili et al.]

There is something very odd going on here, very likely a breakdown of randomization. Baseline differences between groups in the key outcome measure, the SF-36, are substantially greater than any within-group change. The treatment-as-usual (TAU) condition has much lower functioning [lower scores mean lower functioning] than the group CBT condition, which in turn is substantially below the individual CBT condition.

If we compare the scores to adult norms, all three groups of patients are poorly functioning, but those “randomized” to TAU are unusually impaired, strikingly more so than the other two groups.

Keep in mind that evaluations of active interventions, in this case CBT, in randomized trials always involve a difference between groups, not just the change observed within a particular group. That’s because a comparison/control group is supposed to be equivalent for nonspecific factors, including natural recovery. This trial is going to be very biased in its evaluation of individual CBT, a group in which patients started much higher in physical functioning and ended up much higher. Statistical controls fail to correct for such baseline differences. We simply do not have an interpretable clinical trial here.

The first report of treatment outcomes in Psychosomatic Medicine

Moreno S, Gili M, Magallón R, Bauzá N, Roca M, del Hoyo YL, Garcia-Campayo J. Effectiveness of group versus individual cognitive-behavioral therapy in patients with abridged somatization disorder: a randomized controlled trial. Psychosomatic Medicine. 2013 Jul 1;75(6):600-8.

The title indicates that the patients are selected on the basis of “abridged somatization disorder.”

The abstract prominently indicates the trial registration number (ISRCTN69944771), which can be plugged into Google to reach the publicly accessible registration.

If a reader is unaware of the lack of validation for “abridged somatization disorder,” they probably won’t infer that from the introduction. The rationale given for the study is that

A recently published meta-analysis (18) has shown that there has been ongoing research on the effectiveness of therapies for abridged somatization disorder in the last decade.

Checking that meta-analysis, it only included a single null trial for treatment of abridged somatization disorder. This seems like a gratuitous, ambiguous citation.

I was surprised to learn that in three of the five provinces in which the study was conducted, patients

…Were not randomized on a one-to-one basis but in blocks of four patients to avoid a long delay between allocation and the onset of treatment in the group CBT arm (where the minimal group size required was eight patients). This has produced, by chance, relatively big differences in the sizes of the three arms.

This departure from one-to-one randomization was not mentioned in the second article reporting results of the study, and it seems an outright contradiction of what is presented there. Nor is it mentioned in the study protocol. This allocation strategy may have been the source of the lack of baseline equivalence between the TAU and the two intervention groups.
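To see how sending whole blocks of four patients to a single arm can produce sizable arm-size imbalances by chance, consider this hypothetical sketch. The arm names, patient count, and block handling are illustrative assumptions, not details from the trial:

```python
import random

def allocate(n_patients, block_size, seed=0):
    """Allocate patients to three arms, with all patients in a block
    going to the same randomly chosen arm (block_size=1 approximates
    ordinary one-to-one randomization)."""
    rng = random.Random(seed)
    arms = {"TAU": 0, "group_CBT": 0, "individual_CBT": 0}
    for _ in range(0, n_patients, block_size):
        arm = rng.choice(list(arms))  # whole block swings to one arm
        arms[arm] += block_size
    return arms

print(allocate(168, 4))  # block-of-four allocation: lumpier arm sizes
print(allocate(168, 1))  # one-to-one allocation for comparison
```

With blocks of four, the allocation takes fewer, larger random steps, so arm sizes drift further from equality, and any systematic differences between recruitment sites or waves travel with the blocks.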

For the vigilant skeptic, the authors’ calculation of sample size is an eye-opener. Sample size estimation was based on the effectiveness of TAU in primary care visits, which has been assumed to be very low (approximately 10%).

Essentially, the authors are justifying a modest sample size because they expect the TAU comparison to be nearly ineffective. How, then, could the authors believe there is equipoise – that the comparison/control and active treatments could be expected to be equally effective? The authors seem to say that they don’t believe this. Yet equipoise is an ethical and practical requirement for a clinical trial for which human subjects are being recruited. In terms of trial design, do the authors really think this poor treatment provides an adequate comparison/control?

In the methods section, the authors also provide a study flowchart, which was required for CONSORT adherence but was missing from the other paper. Note the flow at the end of the study for the TAU comparison/control condition at the far right: there was substantially more dropout in this group. The authors chose to estimate missing scores with the Last Observation Carried Forward (LOCF) method, which assumes the last available observation can be substituted for every subsequent one. This is a discredited technique and particularly inappropriate in this context. Think about it: the TAU condition was expected by the authors to be quite poor care. Not surprisingly, more patients assigned to it dropped out. But they may have dropped out while deteriorating, and so carrying forward their last observation is particularly misleading. Certainly it cannot be assumed that the smaller number of dropouts from the other conditions left for the same reason. We have a methodological and statistical mess on our hands, one that was hidden from us in the second report.

 

[Figure: CONSORT flowchart from Moreno et al.]
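To illustrate why LOCF is so problematic when dropout is related to deterioration, here is a minimal sketch. The patient trajectories and numbers are invented for illustration and are not data from the trial:

```python
import statistics

def locf(series):
    """Last Observation Carried Forward: replace missing follow-up
    values (None) with the most recent available observation."""
    filled, last = [], None
    for x in series:
        last = x if x is not None else last
        filled.append(last)
    return filled

# Hypothetical TAU patients: baseline then follow-up (higher = better).
# The two deteriorating patients drop out, so their decline is lost.
patients = [
    [50, 55],    # completed, improved slightly
    [50, None],  # dropped out while deteriorating (assume true value ~40)
    [50, None],  # dropped out while deteriorating (assume true value ~35)
]
imputed_endpoints = [locf(p)[-1] for p in patients]
print(statistics.mean(imputed_endpoints))
```

Under these assumed values, LOCF reports a mean endpoint above baseline, while the true mean would fall well below it: the imputation erases the deterioration of exactly the patients most likely to leave a poor-care condition.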

Six measures are mentioned: (1) the Othmer-DeSouza screening instrument used by clinicians to select patients; (2) the Screening for Somatoform Disorders (SOMS), a 39-item questionnaire that includes all bodily symptoms and criteria relevant to somatoform disorders according to either DSM-IV or ICD-10; (3) a Visual Analog Scale of somatic symptoms (Severity of Somatic Symptoms scale) that patients use to assess changes in severity in each of 40 symptoms; (4) the authors’ own SPPI semistructured psychiatric interview for diagnosis of psychiatric morbidity in primary care settings; (5) the clinician-administered Hamilton Anxiety Rating Scale; and (6) the Hamilton Depression Rating Scale.

We are never actually told what the primary outcome of the study is, but it can be inferred from the opening of the discussion:

The main finding of the trial is a significant improvement regardless of CBT type compared with no intervention at all. CBT was effective for the relief of somatization, reducing both the number of somatic symptoms (Fig. 2) and their intensity (Fig. 3). CBT was also shown to be effective in reducing symptoms related to anxiety and depression.

But I noticed something else here, after a couple of readings. The instruments used to select patients and identify them with “abridged somatization disorder” reference 39 or 40 symptoms, with men needing only four and women only six symptoms for a diagnosis. That means that most pairs of patients receiving the diagnosis will not have a single symptom in common. Whatever “abridged somatization disorder” means, patients who receive this diagnosis are likely to differ from each other in their somatic symptoms, but probably share other characteristics. They are basically depressed and anxious patients, but these mood problems are not being addressed directly.
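To put a rough number on that claim: if we assume, purely for illustration, that each patient's qualifying symptoms are a random subset of the 40 listed symptoms (four for a man, six for a woman), the probability that two patients share no symptom at all is a simple combinatorial calculation. Real symptoms are of course not uniformly random, so this is only a toy model of the heterogeneity the diagnosis permits.

```python
from math import comb

def prob_no_shared_symptom(k1, k2, total=40):
    """P(two random symptom subsets of sizes k1 and k2 are disjoint)."""
    return comb(total - k1, k2) / comb(total, k2)

print(round(prob_no_shared_symptom(4, 4), 2))  # two men: ~0.64
print(round(prob_no_shared_symptom(4, 6), 2))  # a man and a woman: ~0.51
print(round(prob_no_shared_symptom(6, 6), 2))  # two women: ~0.35
```

Under this toy assumption, two male patients with the same diagnosis would more often than not have no symptom in common, and even two female patients would share at least one symptom only about two-thirds of the time.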

A comparison of this report with the outcomes paper reviewed earlier shows that none of these measures is mentioned there as being assessed, and certainly not as outcomes.

Comparison of this report to the published protocol reveals that number and intensity of somatic symptoms are two of the three main outcomes, but this article makes no mention of the third, utilization of healthcare.

Readers can find something strange in Table 2, which presents what seems to be one of the primary outcomes, severity of symptoms. In this table the order is TAU, group CBT, and individual CBT. Note the large difference at baseline, with the group CBT patients having much more severe symptoms. It is difficult to make sense of the 12-month follow-up because of the differential dropout and the reliance on inappropriate LOCF imputation of missing data. But if we accept the imputation, as the authors did, it appears that there were no differences between TAU and group CBT. That is essentially what the authors reported, using inappropriate analyses of covariance.

[Table: severity of symptoms (Moreno et al.)]

The authors’ cheerful take away message?

This trial, based on a previous successful intervention proposed by Sumathipala et al. (39), presents the effectiveness of CBT applied at individual and group levels for patients with abridged somatization (somatic symptom indexes 4 and 6).

But hold on! In the introduction, the authors’ justification for their trial was:

Evidence for the group versus individual effectiveness of cognitive-behavioral treatment of medically unexplained physical symptoms in the primary care setting is not yet available.

And let’s take a look at Sumathipala et al.

Sumathipala A, Siribaddana S, Hewege S, Sumathipala K, Prince M, Mann A. Understanding the explanatory model of the patient on their medically unexplained symptoms and its implication on treatment development research: a Sri Lanka Study. BMC Psychiatry. 2008 Jul 8;8(1):54.

The article presents speculations based on an observational study, not an intervention study, so there is no successful intervention being reported.

The formal registration 

The registration of psychotherapy trials typically provides sparse details. The curious must consult the more elaborate published protocol. Nonetheless, the registration can often provide grounds for skepticism, particularly when it is compared to any discrepant details in the published protocol, as well as subsequent publications.

The registration declares:

Study hypothesis

Patients randomized to cognitive behavioural therapy significantly improve in measures related to quality of life, somatic symptoms, psychopathology and health services use.

Primary outcome measures

Severity of Clinical Global Impression scale at baseline, 3 and 6 months and 1-year follow-up

Secondary outcome measures

The following will be assessed at baseline, 3 and 6 months and 1-year follow-up:
1. Quality of life: 36-item Short Form health survey (SF-36)
2. Hamilton Depression Scale
3. Hamilton Anxiety Scale
4. Screening for Somatoform Symptoms [SOMS]

Overall trial start date

15/01/2008

Overall trial end date

01/07/2009

The published protocol 

Primary outcome

Main outcome variables:

– SSS (Severity of somatic symptoms scale) [22]: a scale of 40 somatic symptoms assessed by a 7-point visual analogue scale.

– SSQ (Somatic symptoms questionnaire) [22]: a scale made up of 40 items on somatic symptoms and patients’ illness behaviour.

When I searched the published protocol for the Severity of Clinical Global Impression scale, the primary outcome declared in the registration, I could find no reference to it.

The protocol was submitted on May 14, 2008 and published on June 22, 2008. This suggests that the protocol was submitted after the start of the trial (the registered start date was January 15, 2008).

The protocol’s sample-size justification reads:

To calculate the sample size we consider that the effectiveness of usual treatment (Smith’s norms) is rather low, estimated at about 20% in most of the variables [10,11]. We aim to assess whether the new intervention is at least 20% more effective than usual treatment.
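Taking that passage at face value, and noting that "20% more effective" is ambiguous, here is my own reconstruction (not the authors' published calculation) of what a conventional two-proportion calculation would give if 20% versus an absolute 40% response is intended, at 80% power and two-sided alpha of .05:

```python
# Rough two-proportion sample-size calculation (normal approximation).
# This is an illustrative reconstruction, not the trial authors' computation.
from math import ceil
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p1 vs p2 response rates."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    pooled_var = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * pooled_var / (p1 - p2) ** 2)

print(n_per_arm(0.20, 0.40))  # roughly 79 patients per arm
```

If instead "20% more effective" means a relative improvement (20% versus 24%), the required sample size balloons to several hundred per arm, which is one reason vague sample-size statements like this one matter.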

Comparison group

Control group or standardized recommended treatment for somatization disorder in primary care (Smith’s norms) [10,11]: standardized letter to the family doctor with Smith’s norms that includes: 1. Provide brief, regularly scheduled visits. 2. Establish a strong patient-physician relationship. 3. Perform a physical examination of the area of the body where the symptom arises. 4. Search for signs of disease instead of relying of symptoms. 5. Avoid diagnostic tests and laboratory or surgical procedures. 6. Gradually move the patient to being “referral ready”.

Basically, TAU, the comparison/control condition, involved simply sending a letter to the referring physicians encouraging them to meet regularly with their patients but discouraging diagnostic tests and medical procedures. Keep in mind that patients were selected for this study by physicians who found them particularly frustrating to treat. Despite the authors’ repeated claims about the high prevalence of “abridged somatization disorder,” they relied on a large number of general practice settings that each contributed only a few patients. These patients are very heterogeneous in their somatic symptoms, although most share anxiety or depressive symptoms.

There is an uncontrolled selection bias here that makes generalization from the results of this study problematic. Just who are these patients? I wonder whether they bear some similarity to the frustrating GOMERs (“Get Out Of My Emergency Room”) in the classic House of God, a book described by Amazon as “an unvarnished, unglorified, and amazingly forthright portrait revealing the depth of caring, pain, pathos, and tragedy felt by all who spend their lives treating patients and stand at the crossroads between science and humanity.”

Imagine the disappointment of the referring physicians and the patients when consenting to participate in this study simply left the patients back in routine care provided by the same physicians. It is no wonder that these patients deteriorated and that patients assigned to this condition were more likely to drop out.

Whatever active ingredients the individual and group CBT may have, they also include nonspecific factors missing from the TAU comparison group: frequency and intensity of contact, reassurance and support, attentive listening, and positive expectations. These nonspecific factors can readily be confused with active ingredients and may account for any differences between the active treatments and the TAU comparison. What a terrible study.

The two journals publishing reports of this study failed in their responsibility to their readership and to the larger audience seeking clinically and policy-relevant findings. Authors have ample incentive to engage in questionable publication practices, including ignoring or even suppressing registration, switching outcomes, and exaggerating the significance of their results. Journals of necessity must protect authors from their own inclinations, and protect readers and the larger medical community from untrustworthy reports. Psychosomatic Medicine and the Journal of Psychosomatic Research failed miserably in their peer review of these articles. Neither journal is likely to be the first choice for authors seeking to publish findings from well-designed and well-reported trials. Who knows, maybe the journals’ standards are compromised by the need to attract randomized trials for what is construed, at least by the psychiatric community, as a psychosomatic condition.

Regardless, it’s futile to require registration and posting of protocols for psychotherapy trials if editors and reviewers ignore these resources in evaluating articles for publication.

Postscript: imagine what will be done with the results of this study

You can’t fix with a meta-analysis what investigators bungled by design.

In a recent blog post, I examined a registration of a protocol for a systematic review and meta-analysis of interventions to address medically unexplained symptoms. The review protocol was inadequately described, had undisclosed conflicts of interest, and one of the senior investigators had a history of switching outcomes in his own study and refusing to share data for independent analysis. Undoubtedly, the study we have been discussing meets the vague criteria for inclusion in this meta-analysis. But which outcomes will be chosen, particularly when there should be only one outcome per study? And will it be recognized that these two reports describe the same study? Will the key problems with the designation of the TAU control group, and its likely inflation of treatment effects, be recognized when the study is used to calculate effect sizes?

As you can see, it took a lot of effort to compare and contrast documents that should have been in alignment. Do you really expect those who conduct subsequent meta-analyses to make these multiple comparisons, or will they simply extract multiple effect sizes from the two papers so far reporting results?

Obviously, every time we encounter a report of a psychotherapy trial in the literature, we won’t have the time or inclination to undertake such a cross-comparison of articles, registration, and protocol. But maybe we should be skeptical of authors’ conclusions without such checks.

I’m curious what a casual reader would infer after encountering just one of the two reports of this clinical trial in a literature search, but not the other.
