What we can learn from a PLOS Medicine study of antidepressants and violent crime

Update October 1 7:58 PM: I corrected an inaccuracy in response to a comment by DJ Jaffe, for which I am thankful.

An impressively large-scale study published in PLOS Medicine of the association between antidepressants and violent crime is being greeted with strong opinions from those who haven’t read it. But even those who attempt to read the article might miss some of the nuance and ambiguity in its results.

In this issue of Mind the Brain, we will explore some of these nuances, which are fascinating in themselves. But the article also provides excellent opportunities to apply the critical appraisal skills needed for correlational observational studies using administrative data sets.

Any time there is a report of a mass shooting in the media, a motley crew of commentators immediately announces that the shooter is mentally ill and has been taking psychotropic medication. Mental illness and drugs are the problem, not guns, we are told. Sprinkled among the commentators are opponents of gun control, Scientologists, and psychiatrists seeking to make money serving as expert witnesses. They are paid handsomely to argue for the diminished responsibility of the shooter or for product liability suits against Pharma. Rebuttals are offered by often equally biased commentators, some of them receiving funds from Pharma.

[Screenshot of a blog comment beginning “every major shooting…”]
This is not from The Onion, but a comment left at a blog that expresses a commonly held view.


What is generally lost is that most shooters are not mentally ill and are not taking psychotropic medication.

Yet such recurring stories in the media have created a strong impression among the public and even professionals that a large scientific literature exists establishing a tie between antidepressant use and violence.

Even when there has been some exposure to psychotropic medication, its causal role in the shooting cannot be established either from the facts of the case or the scientific literature.

The existing literature is seriously limited in quality and quantity and contradictory in its conclusions. Ecological studies [1, 2] conclude that the availability of antidepressants may reduce violence on a community level. An “expert review” and a review of reports of adverse events conclude there is a link between antidepressants and violence. However, reports of adverse events being submitted to regulatory agencies can be strongly biased, including by recent claims in the media. Reviews of adverse events do not distinguish between correlates of a condition like depression and effects of the drug being used to treat it. Moreover, authors of these particular reviews were serving as expert witnesses in legal proceedings. Authorship adds to their credibility and publicizes their services.

The recent study in PLOS Medicine should command the attention of anyone interested in the link between antidepressants and violent crime. Already there have been many tweets and at least one media story claiming vindication of the Scientologists as having been right all along. I expected the release of the study and its reception in the media would give me another opportunity to call attention to the entrenched opposing sides in the antidepressant wars, who only claim to be driven by strength of evidence and dismiss any evidence contrary to their beliefs, as well as the gullibility of journalists. But the article and its coverage in the media are developing a very different story.

At the outset, I should say I don’t know if evidence can be assembled for an unambiguous case that antidepressants are strongly linked to violent crime. Give up on us ever being able to rely on a randomized trial in which we examine whether participants randomized to receive an antidepressant rather than a placebo are convicted more often of violent crimes. Most persons receiving an antidepressant will not be convicted of a violent crime. The overall base rate of convictions is too low to monitor as an outcome in a randomized trial. We are left having to sort through correlational, observational clinical epidemiological data typically collected for other purposes.
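
To see why, here is a minimal back-of-the-envelope calculation in Python, plugging in the crude conviction rates reported later in this post (1.0% versus 0.6% over four years) purely for illustration:

```python
# Back-of-the-envelope sample size for a two-arm RCT with a rare binary
# outcome. The 1.0% vs 0.6% conviction rates are the study's crude
# four-year figures, used here only for illustration.
from scipy.stats import norm

p1, p2 = 0.010, 0.006              # conviction rate: SSRI vs no SSRI
alpha, power = 0.05, 0.80
z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)

# Standard normal-approximation formula for comparing two proportions
n_per_arm = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
print(round(n_per_arm))            # roughly 7,800 per arm
```

Roughly 15,000 randomized participants followed for four years, all to accrue fewer than a hundred convictions per arm. No placebo-controlled antidepressant trial remotely approaches that.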

I’m skeptical about there being a link strong enough to send a clear signal through all the noise in the data sets that we can assemble to look for it. But the PLOS Medicine article represents a step forward.

Association does not equal causation
From Health News Review

Correlation does not equal causality.

Any conceivable data set in which we can search will pose the challenges of competing explanations from other variables that might explain the association.

  • Most obviously, persons prescribed antidepressants suffer from conditions that may themselves increase the likelihood of violence.
  • The timing of persons seeking treatment with antidepressants may be influenced by circumstances that increase their likelihood of violence.
  • Violent persons are more likely to be under the influence of alcohol and other drugs and to have histories of use of these substances.
  • Persons taking antidepressants and consuming alcohol and other drugs may be prone to adverse effects of the combination.
  • Violent persons have characteristics and may be in circumstances with a host of other influences that may explain their behavior.
  • Violent persons may themselves be facing victimization that increases the likelihood of their committing violence and having a condition warranting treatment with antidepressants.

Etc, etc.

The PLOS Medicine article introduces a number of other interesting possibilities for such confounding.

Statistical controls are never perfect

Studies will always incompletely specify confounds and imperfectly measure them. Keep in mind that completeness of statistical control requires that all possible confounding factors be identified and measured without error. These ideal conditions are not attainable. Any application of statistics to “control” confounds under less than these ideal conditions risks producing a less accurate estimate of effects than simply examining the basic associations. And we already know that these simple associations are not sufficient to indicate causality. A minimal simulation below shows how adjustment for an imperfectly measured confounder removes only part of the bias.
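
Here is that simulation. All quantities are hypothetical; the only point is qualitative – adjusting for a confounder measured with error removes only part of the bias:

```python
# Minimal simulation: adjusting for a noisily measured confounder removes
# only part of the bias. All quantities are hypothetical.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100_000
confounder = rng.normal(size=n)             # e.g., underlying illness severity
exposure = confounder + rng.normal(size=n)  # severity influences treatment
outcome = confounder + rng.normal(size=n)   # severity influences outcome;
                                            # the true effect of exposure is zero
measured = confounder + rng.normal(size=n)  # the confounder as actually recorded

crude = sm.OLS(outcome, sm.add_constant(exposure)).fit()
adjusted = sm.OLS(outcome, sm.add_constant(np.column_stack([exposure, measured]))).fit()

print(crude.params[1])     # ~0.50: an entirely spurious "effect"
print(adjusted.params[1])  # ~0.33: reduced, but still far from the true 0
```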

The PLOS Medicine article doesn’t provide definitive answers, but it presents data with greater sophistication than has previously been available. The article’s careful writing should make it less likely that readers will misinterpret or miss its main points. And one of the authors – Professor Seena Fazel of the Department of Psychiatry, Oxford University – did an exemplary job of delivering careful messages to any journalist who would listen.


Professor Fazel can be found explaining his study in the media at 8:45 in a downloadable BBC World News Health Check mp3.

Delving into the details of the article

The PLOS Medicine article is of course open access and freely available.

Molero, Y., Lichtenstein, P., Zetterqvist, J., Gumpert, C. H., & Fazel, S. (2015). Selective Serotonin Reuptake Inhibitors and Violent Crime: A Cohort Study. PLoS Med, 12(9), e1001875.

Supplementary materials for the study are also available on the web [1, 2, 3], including a completed standardized STROBE checklist of items that should be included in reports of observational studies, additional tables, and details of the variables and how they were obtained.

An incredible sample

Out of Sweden’s total population of 7,917,854 aged 15 and older in 2006, the researchers identified 856,493 individuals who were prescribed a selective serotonin reuptake inhibitor (SSRI) antidepressant from 2006 to 2009 and compared them to the 7,061,361 Swedish individuals who were not prescribed this medication in that four-year period.

SSRIs were chosen for study because they represent the bulk of antidepressants being prescribed and also because SSRIs are the class of antidepressants about which the question of an association with violence is most often raised. Primary hypotheses were about the SSRIs as a group, but secondary analyses focused on individual SSRIs – fluoxetine, citalopram, paroxetine, sertraline, fluvoxamine, and escitalopram. Analyses at the level of individual SSRI drugs were not expected to have sufficient statistical power to detect associations with violent crimes. Data were also collected on non-SSRI antidepressants and other psychotropic medication, and these data were used to adjust for medications taken concurrently with SSRIs.

With these individuals’ unique identification number, the researchers collected information on the particular medications and dates of prescription from the Swedish Prescribed Drug Register. The register provides complete data on all prescribed and dispensed medical drugs from all pharmacies in Sweden since July 2005. The unique identification number also allowed obtaining information concerning hospitalizations and outpatient visits and reasons for visit and diagnoses.

These data were then matched against information on convictions for violent crimes for the same period from the Swedish national crime register.

These individuals were followed from January 1, 2006, to December 31, 2009.

During this period, 1.0% of individuals prescribed an SSRI were convicted of a violent crime, versus 0.6% of those not prescribed an SSRI. The article focused on the extent to which prescription of an SSRI affected the likelihood of committing a violent crime and considered other explanations for any association that was found.

A clever analytic strategy

Epidemiologic studies most commonly compare individuals differing in their exposure to particular conditions in terms of whether they have particular outcomes. Detecting bona fide causal associations can be derailed by other characteristics associated with both antidepressants and violent crimes. A classic example of a spurious relationship is the one between coffee drinking and cardiovascular disease: coffee drinking may be associated with cardiovascular disease, but the association is spurious, due to smokers lighting up when they take coffee breaks. Taking smoking into account eliminates the association of coffee and cardiovascular disease. In practice, it can be difficult to identify such confounds, particularly when they are left unmeasured or imperfectly measured.
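
Here is a toy numerical version of that example, with made-up prevalences. When the confounder (smoking) is perfectly measured, adjustment eliminates the spurious association entirely – contrast that with the imperfect-measurement simulation above:

```python
# Toy version of the coffee/smoking example: smoking causes both coffee
# drinking and cardiovascular disease; coffee itself does nothing.
# All prevalences are made up.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200_000
smoker = rng.binomial(1, 0.3, n)
coffee = rng.binomial(1, 0.3 + 0.4 * smoker)   # smokers drink more coffee
cvd = rng.binomial(1, 0.05 + 0.10 * smoker)    # smoking raises disease risk

crude = sm.Logit(cvd, sm.add_constant(coffee)).fit(disp=0)
adjusted = sm.Logit(cvd, sm.add_constant(np.column_stack([coffee, smoker]))).fit(disp=0)

print(np.exp(crude.params[1]))     # odds ratio well above 1: spurious link
print(np.exp(adjusted.params[1]))  # ~1.0 once smoking is taken into account
```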

So, such between-individual analyses of people taking antidepressants and those who are not are subject to a full range of unmeasured but potentially confounding background variables.

For instance, in an earlier study in the same population, some of these authors found that individuals with a full (adjusted OR 1.5, 95% CI 1.3-1.6) or half (adjusted OR 1.2, 95% CI 1.1-1.4) sibling with depression were themselves more likely to be convicted of violent crime, after controlling for age, sex, low family income and being born abroad. The influence of such familial risk can be misconstrued in a standard between-individual analysis.

This article supplemented between-individual analyses with within-individual stratified Cox proportional hazards regressions. Each individual exposed to antidepressants was considered separately and served as his/her own control. Thus, these within-individual analyses examined differences in violent crimes in the same individuals over time periods differing in whether they had exposure to an antidepressant prescription. Periods of exposure became the unit of analysis, not just individuals.
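
To make the design concrete, here is a sketch of how such person-period data might be analyzed. The article used within-individual stratified Cox regressions; the fixed-effects Poisson model below is a closely related way to get a within-individual rate ratio, and is easier to show compactly. The data are toy values, not the study’s:

```python
# Within-individual logic on toy person-period data: each person's
# follow-up is split into periods on and off an SSRI, and each person
# serves as his or her own control. Person fixed effects (C(id)) absorb
# all stable traits; only within-person contrasts identify the estimate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

periods = pd.DataFrame({
    "id":      ["a", "a", "b", "b", "c", "c"],
    "on_ssri": [0, 1, 0, 1, 0, 1],
    "years":   [1.5, 2.5, 2.0, 2.0, 3.0, 1.0],  # person-time in each period
    "events":  [0, 1, 1, 1, 1, 0],              # violent-crime convictions
})

fit = smf.poisson("events ~ on_ssri + C(id)", data=periods,
                  offset=np.log(periods["years"])).fit(disp=0)
print(np.exp(fit.params["on_ssri"]))  # within-individual rate ratio
```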

The linked Swedish data sets that were used are unusually rich. It would not be feasible to obtain such data in other countries, and certainly not the United States.

The results as summarized in the abstract

Using within-individual models, there was an overall association between SSRIs and violent crime convictions (hazard ratio [HR] = 1.19, 95% CI 1.08–1.32, p < 0.001, absolute risk = 1.0%). With age stratification, there was a significant association between SSRIs and violent crime convictions for individuals aged 15 to 24 y (HR = 1.43, 95% CI 1.19–1.73, p < 0.001, absolute risk = 3.0%). However, there were no significant associations in those aged 25–34 y (HR = 1.20, 95% CI 0.95–1.52, p = 0.125, absolute risk = 1.6%), in those aged 35–44 y (HR = 1.06, 95% CI 0.83–1.35, p = 0.666, absolute risk = 1.2%), or in those aged 45 y or older (HR = 1.07, 95% CI 0.84–1.35, p = 0.594, absolute risk = 0.3%). Associations in those aged 15 to 24 y were also found for violent crime arrests with preliminary investigations (HR = 1.28, 95% CI 1.16–1.41, p < 0.001), non-violent crime convictions (HR = 1.22, 95% CI 1.10–1.34, p < 0.001), non-violent crime arrests (HR = 1.13, 95% CI 1.07–1.20, p < 0.001), non-fatal injuries from accidents (HR = 1.29, 95% CI 1.22–1.36, p < 0.001), and emergency inpatient or outpatient treatment for alcohol intoxication or misuse (HR = 1.98, 95% CI 1.76–2.21, p < 0.001). With age and sex stratification, there was a significant association between SSRIs and violent crime convictions for males aged 15 to 24 y (HR = 1.40, 95% CI 1.13–1.73, p = 0.002) and females aged 15 to 24 y (HR = 1.75, 95% CI 1.08–2.84, p = 0.023). However, there were no significant associations in those aged 25 y or older. One important limitation is that we were unable to fully account for time-varying factors.

Hazard ratios (HRs) are explained here and are not to be confused with odds ratios (ORs) explained here. Absolute risk (AR) is the most intuitive and easy to understand measure of risk and is explained here, along with reasons that hazard ratios don’t tell you anything about absolute risk.
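
To make those distinctions concrete, here is the arithmetic on the crude proportions quoted earlier (1.0% versus 0.6%). This is for illustration only; unlike the study’s hazard ratios, it ignores person-time at risk:

```python
# Arithmetic on the crude conviction proportions (illustration only).
p_ssri, p_no_ssri = 0.010, 0.006

risk_ratio = p_ssri / p_no_ssri                                       # ~1.67
odds_ratio = (p_ssri / (1 - p_ssri)) / (p_no_ssri / (1 - p_no_ssri))  # ~1.67
absolute_risk_difference = p_ssri - p_no_ssri                         # 0.004
number_needed_to_harm = 1 / absolute_risk_difference                  # 250

print(risk_ratio, odds_ratio, absolute_risk_difference, number_needed_to_harm)
```

With an outcome this rare, the odds ratio and risk ratio nearly coincide. The absolute risk difference – four additional convictions per 1,000 people over four years – is the figure most readers actually want.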

Principal findings

  • There was an association between receiving a prescription for antidepressants and violent crime.
  • When age differences were examined, the 15-24 age range was the only one for which the association was significant.
  • No association was found for other age groups.
  • The association held for both males and females analyzed separately in the 15-24 age range. But…

Things not to be missed in the details

Only a small minority of persons prescribed an antidepressant were convicted of a violent crime, but the likelihood of a conviction in persons exposed to antidepressants was increased in this 15 to 24 age range.

There isn’t a dose-response association between SSRI use and convictions for violent crimes. Even in the 15 to 24 age range, periods of moderate or high exposure to SSRIs were not associated with violent crimes any more than no exposure. Rather, the association occurred only in those individuals with low exposure.

A dose-response association would mean that the more exposure to antidepressants an individual had, the greater the likelihood of violent crime. A dose-response relationship is a formal criterion used in judging whether an association provides adequate evidence of a causal relationship between an exposure and a possible consequence. A sketch of how such a trend might be tested follows.
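
With hypothetical counts (not the study’s), a trend test can be as simple as regressing the outcome on an ordered exposure level and examining the slope:

```python
# Sketch of a trend (dose-response) test. Counts are hypothetical,
# chosen to mimic the pattern described above: risk jumps at low
# exposure and then plateaus, instead of climbing level by level.
import numpy as np
import statsmodels.api as sm

level = np.array([0, 1, 2, 3])          # none, low, moderate, high exposure
convicted = np.array([60, 25, 30, 29])
total = np.array([10_000, 2_500, 3_000, 3_000])

trend = sm.GLM(np.column_stack([convicted, total - convicted]),
               sm.add_constant(level),
               family=sm.families.Binomial()).fit()
print(trend.params[1], trend.pvalues[1])  # slope per exposure level
```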

In the age bracket for which this association between antidepressant use and conviction of a violent crime was significant, antidepressant use was also associated with an increased risk of violent crime arrests, non-violent crime convictions, non-violent crime arrests, and use of emergency inpatient or outpatient treatment for alcohol intoxication or misuse.

Major caveats

The use of linked administrative data sets concerning both antidepressant prescription and violent crimes is a special strength of this study. It allows a nuanced look at an important question with evidence that could not otherwise be assembled. But administrative data have well-known limitations.

The data were not originally captured with the research questions in mind, and so key variables, including data concerning potential confounds, were not necessarily collected. The quality control adequate for the administrative purposes for which these data were collected may differ greatly from what is needed in their use as research data. There may be systematic errors, incomplete data, and inaccurate coding, including of the timing of these administrative events.

Administrative data do not always mesh well with the concepts with which we may be most concerned. This study does not directly assess violent behavior, only arrests and convictions. Most violent behavior does not result in an arrest or conviction, and so this is a biased proxy for behavior.

This study also does not directly assess diagnosis of depression, only diagnosis by specialists. We know from other studies that in primary care and specialty medical settings, there may be no systematic effort to assess clinical depression by interview. The diagnoses that are recorded may serve only to justify a clinical decision made on grounds other than a patient meeting research criteria for depression. Table 1 in the article suggests that only about a quarter of the patients exposed to antidepressants actually had a diagnosis of depression. And throughout the article, no distinction was made between unipolar depression and the depressed phase of a bipolar disorder. This distinction may be important, given the small minority of individuals who were convicted of a violent crime while exposed to an SSRI.

Perhaps one of the greatest weaknesses of this data set is its limited assessment of alcohol and substance use and abuse. For alcohol, we are limited to emergency inpatient or outpatient treatment for alcohol intoxication or misuse. For substance abuse, we have only convictions designated as substance-related. These are poor proxies for far more common actual alcohol and substance use, which for a variety of reasons may not show up in these administrative data. Substance-related convictions are simply too infrequent to serve as a suitable control variable or even proxy for substance use. It is telling that in the 15-24 age range, alcohol intoxication or misuse is associated with convictions for violent crimes with a strength (HR = 1.98, 95% CI 1.76–2.21, p < 0.001) greater than that found for SSRIs.

There may be important cultural differences between Sweden and other countries to which we want to generalize in terms of the determinants of arrest and conviction, but also treatment seeking for depression and the pathways for obtaining antidepressant medication. There may also be differences in institutional response to drug and alcohol use and misuse, including individuals’ willingness and ability to access services.

An unusual strength of this study is its use of within-individual analyses to escape some of the problems of more typical between-individual analyses, which cannot adequately control for stable sources of individual differences. But we can’t rely on these analyses to faithfully capture the ordering of crucial events that happen in quick succession. The authors note that they

cannot fully account for time-varying risk factors, such as increased drug or alcohol use during periods of SSRI medication, worsening of symptoms, or a general psychosocial decline.

Findings examining non-fatal injuries from accidents as well as emergency inpatient or outpatient treatment for alcohol intoxication or misuse as time-varying confounders are tantalizing, but we reach the limits of the administrative data in trying to pursue them.

What can we learn from this study?

Readers seeking a definitive answer from the study to the question of whether antidepressants cause violent behavior or even violent crime will be frustrated.

There does not seem to be a risk of violent crime in individuals over 25 taking antidepressants.

The risk confined to individuals aged between 15 and 25 is, according to the authors, modest, but not insignificant. It represents a 20 to 40% increase in the low likelihood of being convicted of a violent crime. But it is not necessarily causal. The provocative finding that the association occurred with low exposure, rather than with no, moderate, or high exposure to antidepressants, should give pause and suggests something more complex than simple causality may be going on.

This is an ambiguous but important point. Low exposure could represent non-adherence, inconsistent adherence, or periods in which there was a sudden stopping of medication, the effects of which might generate an association between the exposure and violent crimes. It could also represent the influence of time-dependent variables such as use of alcohol or substances that escaped control in the within-individual analyses.

There are parallels between the results of the present study and what is observed in other data sets. Most importantly, the data have some consistency with reports of suicidal ideation and deliberate self-harm among children and adolescents exposed to antidepressants. The common factor may be an increased sensitivity of younger persons to antidepressants, and particularly to their initiation and withdrawal or sudden stopping, a sensitivity reflected in impulsive and risk-taking behavior.

The take-away message

Data concerning links between SSRIs and violent crime invite premature and exaggerated declarations of implications for public health and public policy.

At another blog, I’ve suggested that the British Medical Journal requirement that observational studies have a demarcated section addressing these issues encourages authors to go beyond their data in order to increase the likelihood of publication – authors have to make public health and public policy recommendations to show that their data are newsworthy enough for publication. It’s interesting that a media watch group criticized BMJ for using overly strong causal language in covering this observational PLOS Medicine article.

I’m sure that the authors of this article felt pressure to address whether a black box warning inserted into the packaging of SSRIs was warranted by these data. I agree with their not recommending this at this time, given the strength of evidence and the ambiguity in the interpretation of these administrative data. But I also agree that the issue of young people being prescribed SSRIs needs more research, and specifically elucidation of why low exposure increases the likelihood of violence relative to no exposure or medium to high exposure.

The authors do make some clinical recommendations, and their spokesperson Professor Fazel is particularly clear but careful in his interview with BBC World News Health Check. My summary of what was said in the interview and in other media contacts is:

  • Adolescents and young adults should be prescribed SSRIs only on the basis of careful clinical interviews ascertaining a diagnosis consistent with practice guidelines for prescribing these drugs, and the drug should be prescribed at a therapeutic level.
  • These patients should be educated about the necessity of taking these medications consistently and advised against withdrawing from or stopping the medication quickly without the consultation and supervision of a professional.
  • These patients should be advised against taking these medications with alcohol or other drugs, with the explanation that there could be serious adverse reactions.

In general, young persons may be more sensitive to SSRIs, particularly when starting or stopping, and particularly when taken in the presence of alcohol or other drugs.

The importance of more research concerning the nature of this sensitivity is highlighted by the findings of the PLOS Medicine article and the issues these findings point to but do not resolve.

Molero Y, Lichtenstein P, Zetterqvist J, Gumpert CH, Fazel S (2015) Selective Serotonin Reuptake Inhibitors and Violent Crime: A Cohort Study. PLoS Med 12(9): e1001875. doi:10.1371/journal.pmed.1001875

The views expressed in this post represent solely those of its author, and not necessarily those of PLOS or PLOS Medicine.

Delusional? Trial in Lancet Psychiatry claims brief CBT reduces paranoid delusions

In this issue of Mind the Brain, I demonstrate a quick assessment of the conduct and reporting of a clinical trial. The authors claimed in Lancet Psychiatry a “first ever” in targeting “worries” with brief cognitive therapy as a way of reducing persistent persecutory delusions in psychotic persons. A Guardian article written by the first author claims effects were equivalent to what is obtained with antipsychotic medication. Lancet Psychiatry allowed the authors a sidebar to their article presenting glowing testimonials of 3 patients making extraordinary gains. Oxford University lent its branding* to the first author’s workshop, promoted with a video announcing “evidence-based” status for the treatment.

There is much claiming to be new here. Is it a breakthrough in treatment of psychosis and in standards for reporting a clinical trial? Or is what is new not praiseworthy?

I identify the kinds of things that I sought in first evaluating the Lancet Psychiatry article and what additional information needed to be consulted to assess the contribution to the field and relevance to practice.

The article is available open access.

Its publication was coordinated with the first author’s extraordinarily self-promotional article in The Guardian.

The Guardian article makes the claim that

benefits were what scientists call “moderate” – not a magic bullet, but with meaningful effects nonetheless – and are comparable with what’s seen with many anti-psychotic medications.

The advertisement for the workshop is here

 

The Lancet Psychiatry article also cites the author’s self-help book for lay persons. There was no conflict of interest declared.

Probing the article’s Introduction

Reports of clinical trials should be grounded in a systematic review of the existing literature. This allows readers to place the study in the context of existing research and the unsolved clinical and research problems the literature poses. This background prepares the reader to evaluate the contribution the particular trial can make.

Just by examining the references for the introduction, we can find signs of a very skewed presentation.

The introduction cites 13 articles, 10 of which are written by the author and an eleventh is written by a close associate. The remaining 2 citations are more generic, to a book and an article about causality.

Either the author is at the world center of this kind of research or seriously deficient in his attention to the larger body of evidence. At the outset, the author announces a bold reconceptualization of the role of worry in causing psychotic symptoms:

Worry is an expectation of the worst happening. It consists of repeated negative thoughts about potential adverse outcomes, and is a psychological component of anxiety. Worry brings implausible ideas to mind, keeps them there, and increases the level of distress. Therefore we have postulated that worry is a causal factor in the development and maintenance of persecutory delusions, and have tested this theory in several studies.

This is controversial, to say the least. The everyday experience of worrying is being linked to persecutory delusions. A simple continuum seems to be proposed – people can start off with everyday worrying and end up with a psychotic delusion and twenty years of receiving psychiatric services. Isn’t this too simplistic or just plain wrong?

Has no one but the author done relevant work or even reacted to the author’s work? The citations provided in the introduction suggest the author’s work is all we need in order to interpret this study in the larger context of what is known about psychotic persecutory delusions.

Contrast my assessment with the author’s own:

Panel 2: Research in context
Systematic review We searched the ISRCTN trial registry and the PubMed database with the search terms “worry”, “delusions”, “persecutory”, “paranoia”, and “schizophrenia”, without date restrictions, for English-language publications of randomised controlled trials investigating the treatment of worry in patients with persecutory delusions. Other than our pilot investigation12 there were no other such clinical trials in the medical literature. We also examined published meta-analyses on standard cognitive behavioural therapy (CBT) for persistent delusions or hallucinations, or both.

The problem is that “worry” is a nonspecific colloquial term, not a widely used scientific one. For the author to require that studies have “worry” as a keyword in order to be retrieved is a silly restriction.

I welcome readers to redo the PubMed search dropping this term. Next, replace “worry” with “anxiety.” Furthermore, the author makes unsubstantiated assumptions about a causal role for worry/anxiety in the development of delusions. Drop the “randomized controlled trial” restriction from the PubMed search and you find a large relevant literature. Persons with schizophrenia and persecutory delusions are widely acknowledged to be anxious. But you won’t find much suggestion in this literature that the anxiety is causal or that people progress from worrying about something to developing schizophrenia and persecutory delusions. This seems a radical version, gone wild, of the idea that normal and psychotic experiences are on a continuum, concocted with a careful avoidance of contrary evidence.

Critical appraisal of clinical trials often skips examination of whether the background literature cited to justify the study is accurate and balanced. I think this brief foray has demonstrated that it can be important in establishing whether an investigator is claiming false authority for a view with cherry picking and selective attention to the literature.

Basic design of the study

The 150 patients randomized in this study are around 40 years old. Half of the sample has been in psychiatric services for 11 or more years, with 29% of the patients in the intervention group and 19% in the control group receiving services for more than 20 years. The article notes in passing that all patients were prescribed antipsychotic medication at the outset of the study except 1 in the intervention group and 9 in the control group – 1 versus 9? It is puzzling how such differences emerged if randomization was successful in controlling for baseline differences. Maybe it demonstrates the limitations of block randomization.

The intervention is decidedly low intensity for what is presumably a long-standing symptom in a chronically psychotic population.

We aimed to provide the CBT worry-reduction intervention in six sessions over 8 weeks. Each session lasted roughly an hour and took place in NHS clinics or at patients’ homes.

The six sessions were organized around booklets shared by the patient and therapist.

The main techniques were psychoeducation about worry, identification and reviewing of positive and negative beliefs about worry, increasing awareness of the initiation of worry and individual triggers, use of worry periods, planning activity at times of worry (which could include relaxation), and learning to let go of worry.

Patients were expected to practice exercises from the author’s self-help book for lay persons.

The two main practical techniques to reduce worry were then introduced: the use of worry periods (confining worry to about a 20 minute set period each day) and planning of activities at peak worry times. Worry periods were implemented flexibly. For example, most patients set up one worry period a day, but they could choose to have two worry periods a day or, in severe instances, patients instead aimed for a worry-free period. Ideally, the worry period was then substituted with a problem-solving period.

Compared to what?

The treatment of the control group was ill-defined routine care “delivered according to national and local service protocols and guidelines.” Readers are not told how much treatment the patients received or whether their care was actually congruent with these guidelines. Routine care of mental health patients in the community is notoriously deficient. That over half of these patients had been in services for more than a decade suggests that treatment for many of them had tapered off and was being delivered with no expectation of improvement.

To accept this study as an evaluation of the author’s therapy approach, we need to know how much other treatment was received by patients in both the intervention and control groups. Were patients in the routine care condition, as I suspect, largely being ignored? The intervention group got 6 sessions of therapy over 8 weeks. Is that a substantial increase in psychotherapy, or even in time to talk with a professional, over what they would otherwise receive? Did being assigned to the intervention also increase patients’ other contact with mental health services? If the intervention therapists heard that a patient was having problems with medication or serious unmet medical needs, how did they respond?

The authors report collecting data concerning receipt of services with the Client Service Receipt Inventory, but those data are nowhere reported.

Most basically, we don’t know what elements the comparison/control group controlled. We have no reason to presume that the amount of contact time and basic relationship with a treatment provider was controlled.

As I have argued before, it is inappropriate and arguably unethical to use ill-defined routine care or treatment-as-usual in the evaluation of a psychological intervention. We cannot tell if any apparent benefits to patients assigned to the intervention are due to correcting the inadequacies of routine care, including its lack of basic elements of support, attention, and encouragement. We therefore cannot tell if there are effective elements to the intervention other than these nonspecific factors.

We cannot tell whether any positive results of this trial would encourage dissemination and implementation, or only point to improving likely deficiencies in the treatment received by patients in long-term psychiatric care.

In terms of quickly evaluating articles reporting clinical trials, we see that simply asking “compared to what?” and jumping to the comparison/control condition revealed at the outset a lot of limits on what this trial could show.

Measuring outcomes

Two primary outcomes were declared – changes in the Penn State Worry Questionnaire and in the Psychotic Symptoms Rating Scale–Delusion (PSYRATS-delusion) subscale. The authors use multivariate statistical techniques to determine whether patients assigned to the intervention group improved more on either of these measures, and whether reduction in worry specifically caused reductions in persecutory delusions.
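
That second question is a mediation claim. A minimal product-of-coefficients sketch, on simulated data with assumed effects, shows the logic:

```python
# Minimal mediation sketch (product of coefficients): did treatment reduce
# delusions *through* reducing worry? Simulated data with assumed effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 150
treated = rng.binomial(1, 0.5, n)
worry = -2.0 * treated + rng.normal(0, 5, n)   # path a: treatment -> worry
delusion = 0.5 * worry + rng.normal(0, 5, n)   # path b: worry -> delusions
df = pd.DataFrame({"t": treated, "w": worry, "d": delusion})

a = smf.ols("w ~ t", df).fit().params["t"]      # treatment effect on worry
b = smf.ols("d ~ w + t", df).fit().params["w"]  # worry effect, holding treatment
print(a * b)  # indirect (mediated) effect; a CI would require bootstrapping
```

Note the catch: path b is estimated from the correlation between the two outcome measures, so if the scales overlap in content, “mediation” can be manufactured by shared items – a point that becomes important below.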

Understand what is at stake here: the authors are trying to convince us that this is a groundbreaking study that shows that reducing worry with a brief intervention reduces long standing persecutory delusions.

The authors lose substantial credibility if we look closely at their primary measures, including their items, not just the scale names.

The Penn State Worry Questionnaire (PSWQ) is a 16-item questionnaire widely used with college-student, community, and clinical samples. Items include

When I am under pressure I worry a lot.

I am always worrying about something.

And reverse direction items scored so greater endorsement indicates less worrying –

I do not tend to worry about things.

I never worry about anything.

I know, how many times does basically the same question have to be asked?

The questionnaire is meant to be general. It focuses on a single complaint that could be a symptom of anxiety. While the questionnaire could be used to screen for anxiety disorders, it does not provide a diagnosis of a mental disorder, which requires that other symptoms be present. Actually, worry is only one of three components of anxiety. The others are physiological – like racing heart, sweating, or trembling – and behavioral – like avoidance or procrastination.

But “worry” is also a feature of depressed mood. Another literature discusses “worry” as “rumination.” We should not be surprised to find this questionnaire functions reasonably well as a screen for depression.

But past research has shown that even in nonclinical populations, using a cutpoint to designate high versus low worriers results in unstable classification. Without formal intervention, many of those who are “high” become “low” over time. The sketch below shows how easily that happens.
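
This simulation uses a hypothetical reliability and cutpoint; the qualitative result is what matters:

```python
# With imperfect test-retest reliability, many "high worriers" fall below
# the cutpoint at retest with no intervention at all. Reliability and
# cutpoint here are hypothetical.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
true_worry = rng.normal(0, 1, n)
time1 = true_worry + rng.normal(0, 0.7, n)   # test-retest reliability ~0.67
time2 = true_worry + rng.normal(0, 0.7, n)

cut = np.quantile(time1, 0.80)               # top 20% labeled "high worriers"
high_at_t1 = time1 >= cut
print(np.mean(time2[high_at_t1] < cut))      # roughly 40% score "low" at retest
```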

In order to be included in this study, patients had to have a minimum score of 44 on the PSWQ. If we skip to the results of the study, we find that the patients in the intervention group dropped from 64.8 to 56.1 and those receiving only routine care dropped from 64.5 to 59.8. The average patient in either group would still have qualified for inclusion in the study at the end of follow-up.

The second outcome measure, the Psychotic Symptoms Rating Scale–Delusion subscale, has six items: duration and frequency of preoccupation; intensity of distress; amount of distressing content; conviction; and disruption. Each item is scored 0–4, with 0 = no problem and 4 = maximum severity.

The items are so diverse that interpretation of a change in the context of an intervention trial targeting worry becomes difficult. Technically speaking, the lack of comparability among items is so great that the measure cannot be considered an interval scale for which conventional parametric statistics could be used. We cannot reasonably assume that a change in one item is equivalent to a change in another.

It would seem, for instance, that amount of preoccupation with delusions, amount and intensity of distress, and conviction and disruption are very different matters. The intervention group changed from a mean of 18.7, on a scale with a possible score of 24, to 13.6 at 24 weeks; the control group from 18.0 to 16.4. This change could simply represent a reduction in the amount and intensity of distress, not in patients’ preoccupation with the delusions, their conviction that the delusions are true, or the disruption in their lives. Overall, the PSYRATS-delusion subscale is not a satisfactory measure on which to base strong claims that reducing worry reduces delusions. The measure is too contaminated with content similar to the worry questionnaire. We might only be finding that “changes in worries result in changes in worries.”

Checking primary outcomes is important in evaluating a clinical trial, but in this case, it was crucial to examine what the measures assessed at an item content level. Too often reviewers uncritically accept the name of an instrument as indicating what it validly measures when used as an outcome measure.

The fancy multivariate analyses do not advance our understanding of what went on in the study. The complex statistical analyses might simply be demonstrating patients were less worried as seen in questionnaires and interview ratings based on what patients say when asked whether they are distressed.

My summary assessment is that a low intensity intervention is being evaluated against an ill-defined treatment as usual. The outcome measures are too nonspecific and overlapping to be helpful. We may simply be seeing effects of contact and reassurance among patients who are not getting much of either. So what?

Bring on the patient endorsements

Panel 1: Patient comments on the intervention presents glowing endorsements from 3 of the 73 patients assigned to the intervention group. The first patient describes the treatment as “extremely helpful” and as providing a “breakthrough.” The second patient describes starting treatment lost and without self-confidence, but now being relaxed at times of the day that had previously been stressful. The third patient declared

“The therapy was very rewarding. There wasn’t anything I didn’t like. I needed that kind of therapy at the time because if I didn’t have that therapy at that time, I wouldn’t be here.”

Wow, but these dramatic gains seem inconsistent with the modest gains registered on the quantitative primary outcome measures. We are left guessing how these endorsements were elicited – were they obtained in a context where patients were expected to express gratitude for the extra attention they received? – and the criteria by which the particular quotes were selected from what is presumably a larger pool.

Think of the outcry if Lancet Psychiatry extended this innovation in the reporting of clinical trials to evaluations of medications by their developers. If such side panels are going to be retained in future reporting of clinical trials, maybe it would be best that they be marked “advertisement” and accompanied by a declaration of conflict of interest.

A missed opportunity to put the authors’ intervention to a fair test

In the Discussion section the authors state

although we think it highly unlikely that befriending or supportive counselling [sic] would have such persistent effects on worry and delusions, this possibility will have to be tested specifically in this group.

Actually, the authors don’t have much evidence of anything but a weak effect that might well have been achieved with befriending or supportive counseling delivered by persons with less training. We should be careful of accepting claims of any clinically significant effects on delusions. At best, the authors have evidence that distress associated with delusions was reduced, and any correlation in scores between the two measures may simply reflect the confounding of the two outcome measures.

It is a waste of scarce research funds and an unethical waste of patients’ willingness to contribute to science to compare this low-intensity psychotherapy to ill-described, unquantified treatment as usual. Another low-intensity treatment like befriending or supportive counseling might provide sufficient elements of attention, support, and raised expectations to achieve comparable results.

Acknowledging the Supporting Cast

In evaluating reports of clinical trials, it is often informative to look to footnotes and acknowledgments, as well as the main text. This article acknowledges Anthony Morrison as a member of the Trial Steering Committee and Douglas Turkington as a member of the Data Monitoring and Ethics Committee. Readers of Mind the Brain might recognize Morrison as first author of a Lancet trial that I critiqued for exaggerated claims and Turkington as the first author of a trial that became an internet sensation when post-publication reviewers pointed out fundamental problems in the reporting of data.  Turkington and an editor of the journal in which the report of the trial was published counterattacked.

All three of these trials involve exaggerated claims based on a comparison between CBT and ill-defined routine care. Like the present one, Morrison’s trial failed to report collected data concerning receipt of services. And in an interview with Lancet, Morrison admitted to avoiding a comparison between CBT and anything but routine care out of concern that differences might not be found with any treatment providing a supportive relationship, even basic supportive counseling.

A note to funders

This project (09/160/06) was awarded by the Efficacy and Mechanism Evaluation (EME) Programme, and is funded by the UK Medical Research Council (MRC) and managed by the UK NHS National Institute for Health Research (NIHR) on behalf of the MRC-NIHR partnership.

Really, UK MRC, you are squandering scarce funds on methodologically poor, often small trials for which investigators make extravagant claims and that don’t include a comparison group allowing control for nonspecific effects. You really ought to insist on better attention to the existing literature in justifying another trial and adequate controls for amount of contact time, attention and support.

Don’t you see the strong influence of investigator allegiance dictating reporting of results consistent with the advancement of the investigators’ product?

I don’t understand why you allowed the investigator group to justify the study with such idiosyncratic, highly selective review of the literature driven by substituting a colloquial term “worry” for more commonly used search terms.

Do you have independent review of grants by persons who are more accepting of the usual conventions of conducting and reporting trials? Or are you faced with the problem of a small group of reviewers giving out money to like-minded friends and family? Note that the German Federal Ministry of Education and Research (BMBF) has effectively dealt with inbred old boy networks by excluding Germans from the panels of experts reviewing German grants. Might you consider the same strategy in getting more serious about funding projects with some potential for improving patient care? Get with it: insist on rigor and reproducibility in what you fund.

*We should not make too much of Oxford lending its branding to this workshop. Look at the workshops to which Harvard Medical School lends its label.

Sordid tale of a study of cognitive behavioral therapy for schizophrenia gone bad

What motivates someone to publish that paper without checking it? Laziness? Naivety? Greed? Now that’s one to ponder. – Neuroskeptic, Science needs vigilantes.

We need to

  • Make the world safe for post-publication peer review (PPR) commentary.
  • Ensure appropriate rewards for those who do it.
  • Take action against those who try to make life unpleasant for those who toil hard for a scientific literature that is more trustworthy.

In this issue of Mind the Brain, I set the stage for my teaming up with Magneto to bring some bullies to justice.

The background tale of a modest study of cognitive behavior therapy (CBT) for patients with schizophrenia has been told in bits and pieces elsewhere.

The story at first looked like it was heading for a positive outcome more worthy of a blog post than the shortcomings of a study in an obscure journal. The tale would go

A group organized on the internet called attention to serious flaws in the reporting of a study. We then witnessed the self-correcting of science in action.

If only this story were complete and accurately described scientific publishing today…

Daniel Lakens’ blog post, How a Twitter HIBAR [Had I Been A Reviewer] ends up as a published letter to the editor recounts the story beginning with expressions of puzzlement and skepticism on Twitter.

Gross errors were made in a table and a figure. These were bad enough in themselves, but they seemed to point to reported results not supporting the claims made in the article.

A Swedish lecturer blogged Through the looking glass into an oddly analyzed clinical paper.

Some of those involved in the Twitter exchange banded together in writing a letter to the editor.

Smits, T., Lakens, D., Ritchie, S. J., & Laws, K. R. (2014). Statistical errors and omissions in a trial of cognitive behavior techniques for psychosis: commentary on Turkington et al. The Journal of Nervous and Mental Disease, 202(7), 566.

Lakens explained in his blog

Now I understand that getting criticism on your work is never fun. In my personal experience, it very often takes a dinner conversation with my wife before I’m convinced that if people took the effort to criticize my work, there must be something that can be improved. What I like about this commentary is that it shows how Twitter is making post-publication reviews possible. It’s easy to get in contact with other researchers to discuss any concerns you might have (as Keith did in his first Tweet). Note that I have never met any of my co-authors in real life, demonstrating how Twitter can greatly extend your network and allows you to meet interesting and smart people who share your interests. Twitter provides a first test bed for your criticisms to see if they hold up (or if the problem lies in your own interpretation), and if a criticism is widely shared, can make it fun to actually take the effort to do something about a paper that contains errors.

Furthermore,

It might be slightly weird that Tim, Stuart, and myself publish a comment in the Journal of Nervous and Mental Disease, a journal I guess none of us has ever read before. It also shows how Twitter extends the boundaries between scientific disciplines. This can bring new insights about reporting standards  from one discipline to the next. Perhaps our comment has made researchers, reviewers, and editors who do research on cognitive behavioral therapy aware of the need to make sure they raise the bar on how they report statistics (if only so pesky researchers on Twitter leave you alone!). I think this would be great, and I can’t wait until researchers from another discipline point out statistical errors in my own articles that I and my closer peers did not recognize, because anything that improves the way we do science (such as Twitter!) is a good thing.

Hindsight: If the internet group had been the original reviewers of the article…

The letter was low key and calmly pointed out obvious errors. You can see it here. Tim Smits’ blog Don’t get all psychotic on this paper: Had I (or we) Been A Reviewer (HIBAR) describes what had to be left out to keep within the word limit.

Table 2 had lots of problems:

  • The confidence intervals were suspiciously wide.
  • The effect sizes seemed too large for what the modest sample size should yield (see the sketch after this list for a simple consistency check).
  • The table was inconsistent with information in the abstract.
  • Neither the table nor the accompanying text had any test of significance or any report of means and standard deviations.
  • Confidence intervals for two different outcomes were identical, yet one had the same value for its effect size as its lower bound.
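
One simple check the group could apply, and readers can reuse: for a standardized mean difference, the width of the 95% confidence interval is pinned down by the group sizes. This sketch uses the standard large-sample formula with illustrative values, not the actual numbers from Turkington et al.:

```python
# Consistency check for a reported Cohen's d: the 95% CI width is pinned
# down by the group sizes (large-sample SE from Hedges & Olkin). Values
# here are illustrative only.
import numpy as np

def d_ci(d, n1, n2):
    se = np.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - 1.96 * se, d + 1.96 * se

print(d_ci(0.5, 30, 30))    # about (-0.01, 1.01): wide, as small trials are
print(d_ci(0.5, 200, 200))  # about (0.30, 0.70): narrower with more subjects
```

A reported interval much wider, narrower, or more asymmetric than the sample sizes allow is a red flag that something has gone wrong in the analysis or the reporting.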

Figure 5

Figure 5 was missing labels and definitions on both axes, rendering it uninterpretable. Duh?

The authors of the letter were behaving like a blue helmeted international peacekeeping force, not warriors attacking bad science.

But you don’t send peacekeeping troops into an active war zone.

In making recommendations, the Internet group did politely introduce the R word:

We believe the above concerns mandate either an extensive correction, or perhaps a retraction, of the article by Turkington et al. (2014). At the very least, the authors should reanalyze their data and report the findings in a transparent and accurate manner.

Fair enough, but I doubt the authors of the letter appreciated how upsetting this reasonable advice was or anticipated what reaction would be coming.

A response from an author of the article and a late night challenge to debate

The first author of the article published a reply

Turkington, D. (2014). The reporting of confidence intervals in exploratory clinical trials and professional insecurity: a response to Ritchie et al. The Journal of Nervous and Mental Disease, 202(7), 567.

He seemed to claim to have re-examined the study data and to have found that:

  • The findings were accurately reported.
  • A table of means and standard deviations was unnecessary because of the comprehensive reporting of confidence intervals and p-values in the article.
  • The missing details from the figure were self-evident.

The group who had assembled on the internet was not satisfied. An email exchange with Turkington and the editor of the journal confirmed that Turkington had not actually re-examined the raw data, but only a summary with statistical tables.

The group requested the raw data. In a subsequent letter to the editor, they would describe Turkington as providing the data in a timely fashion, but the exchange between them was anything but cordial. Turkington at first balked, saying that the data were not readily available because the statistician had retired. He nonetheless eventually provided the data, but not before first sending off a snotty email –

[Screenshot of the email, including an invitation to come to Newcastle University to be “slaughtered” in a debate.]

Tim Smit declined:

Dear Douglas,

Thanks for providing the available data as quick as possible. Based on this and the tables in the article, we will try to reconstruct the analysis and evaluate our concerns with it.

With regard to your recent invitation to “slaughter” me at Newcastle University, I politely want to decline that invitation. I did not have any personal issue in mind when initiating the comment on your article, so a personal attack is the least of my priorities. It is just from a scientific perspective (but an outsider to the research topic) that I was very confused/astonished about the lack of reporting precision and what appears to be statistical errors. So, if our re-analysis confirms that first perception, then I am of course willing to accept your invitation at Newcastle university to elaborate on proper methodology in intervention studies, since science ranks among the highest of my priorities.

Best regards,

Tim Smits

When I later learned of this email exchange, I wrote to Turkington and offered to go to Newcastle to debate either as Tim Smits’ second or to come alone. Turkington asked me to submit my CV to show that I wasn’t a crank. I complied, but he has yet to accept my offer.

A reanalysis of the data and a new table

Smits, T., Lakens, D., Ritchie, S. J., & Laws, K. R. (2015). Correcting Errors in Turkington et al. (2014): Taking Criticism Seriously. The Journal of Nervous and Mental Disease, 203(4), 302-303.

The group reanalyzed the data and the title of their report leaked some frustration.

We confirmed that all the errors identified by Smits et al. (2014) were indeed errors. In addition, we observed that the reported effect sizes in Turkington et al. (2014) were incorrect by a considerable margin. To correct these errors, Table 2 and all the figures in Turkington et al. (2014) need to be changed.

The sentence in the Abstract where effect sizes are specified needs to be rewritten.

A revised table based on their reanalyses was included:

Given that the recommendation of their first letter was apparently dismissed –

To conclude, our recommendation for the Journal and the authors would now be to acknowledge that there are clear errors in the original Turkington et al. (2014) article and either accept our corrections or publish their own corrigendum. Moreover, we urge authors, editors, and reviewers to be rigorous in their research and reviewing, while at the same time being eager to reflect on and scrutinize their own research when colleagues point out potential errors. It is clear that the authors and editors should have taken more care when checking the validity of our criticisms. The fact that a rejoinder with the title “A Response to Ritchie et al. [sic]” was accepted for publication in reply to a letter by Smits et al. (2014) gives the impression that our commentary did not receive the attention it deserved. If we want science to be self-correcting, it is important that we follow ethical guidelines when substantial errors in the published literature are identified.

Sound and fury signifying nothing

Publication of their letter was accompanied by a blustery commentary from the statistical editor for the journal full of innuendo and pomposity.


Cicchetti, D. V. (2015). Cognitive Behavioral Techniques for Psychosis: A Biostatistician’s Perspective. The Journal of Nervous and Mental Disease, 203(4), 304-305.

He suggested that the team assembled on the internet

reanalyzed the data of Turkington et al. on the basis that it contained some serious errors that needed to be corrected. They also reported that the statistic that Turkington et al. had used to assess effect sizes (ESs) was an inappropriate metric.

Well, did Turkington’s table contain errors, and was the metric inappropriate? If so, was a formal correction or even retraction needed? Cicchetti reproduced the internet group’s table, but did not immediately offer his opinion. So the uncorrected article stands as published. Interested persons downloading it from behind the journal’s paywall won’t be alerted to the controversy.

Instead of dealing with the issues at hand, Cicchetti launched into an irrelevant lecture about Jacob Cohen’s arbitrary designation of effect sizes as small, medium, or large. Anything he said had already appeared, clearer and more accurate, in an article by Daniel Lakens, one of the internet group’s authors. Cicchetti cited that article, but only as a basis for libeling the open access journal in which it appeared.

To be perfectly candid, the reader needs to be informed that the journal that published the Lakens (2013) article, Frontiers in Psychology, is one of an increasing number of journals that charge exorbitant publication fees in exchange for free open access to published articles. Some of the author costs are used to pay reviewers, causing one to question whether the process is always unbiased, as is the desideratum. For further information, the reader is referred to the following Web site: http://www.frontiersin.org/Psychology/fees.

Cicchetti further chastised the internet group for disrespecting the saints of power analysis.

As an additional comment, the stellar contributions of Helena Kraemer and Sue Thiemann (1987) were noticeable by their very absence in the Smits et al. critique. The authors, although genuinely acknowledging the lasting contributions of Jacob Cohen to our understanding of ES and power analysis, sought to simplify the entire enterprise

Jacob Cohen is dead and cannot speak. But good Queen Mother Helena is very much alive and would surely object to being drawn into this nonsense. I encourage Cicchetti to ask what she thinks.

Ah, but what about the table based on the re-analyses of the internet group that Cicchetti had reproduced?

The reader should also be advised that this comment rests upon the assumption that the revised data analyses are indeed accurate because I was not privy to the original data.

Actually, when Turkington sent the internet group the study data, he included Cicchetti in the email.

The internet group experienced one more indignity from the journal that they had politely tried to correct. They had reproduced Turkington’s original table in their letter. The journal sent them an invoice for 106 euros because the table was copyrighted. It took a long email exchange before this billing was rescinded.

Science Needs Vigilantes

Imagine a world where we no longer depend on a few cronies of an editor to decide once and forever the value of a paper. This would replace the present order in which much of the scientific literature is untrustworthy, where novelty and sheer outrageousness of claims are valued over robustness.

Imagine we have constructed a world where post-publication commentary is welcomed and valued. Data are freely available for reanalysis, and rewards are there for performing those re-analyses.

We clearly are not there yet, and certainly not with this flawed article. The sequence of events that I have described has so far not produced a correction of the paper. As it stands, the paper concludes that nurses can and should be given brief training that will allow them to effectively treat patients with severe and chronic mental disorder. This paper encourages actions that may put such patients and society at risk because of ineffectual and neglectful treatment.

The authors of the original paper and the editor responded with dismissal of the criticisms, with ridicule, and, in the editor’s case at least, with libel of open access journals. Obviously, we have not reached the point at which those willing to re-examine and, if necessary, re-analyze data are appropriately respected and protected from unfair criticism. The current system of publishing gives authors whose work has been questioned, and editors who are defensive of their journals, the last word, no matter how incompetent and inept that work may be. But there is always the force of social media – tweets and blogs.

The critics were actually much too kind and restrained in a critique narrowly based on re-analyses. They ignored so much about

  • The target paper as an underpowered feasibility study being passed off as a source of estimates of what a sufficiently sized randomized trial would yield.
  • The continuity between the mischief done in this article and the tricks and spin in the past work of the author Turkington.
  • The laughably inaccurate lecture of the editor.
  • The lowlife journal in which the article was published.

These problems deserve a more unrestrained and thorough trashing. Journals may not yet be self-correcting, but blogs can do a reasonable job of exposing bad science.

Science needs vigilantes, because of the intransigence of those pumping crap into the literature.

Coming up next

In my next issue of Mind the Brain I’m going to team up with Magneto. You may recall I previously collaborated with him and Neurocritic to scrutinize some junk science that Jim Coan and Susan Johnson had published in PLOS One. Their article crassly promoted to clinicians what they claimed was a brain-soothing couples therapy. We obtained an apology and a correction in the journal for undeclared conflict of interest.

But that incident left Magneto upset with me. He felt I did not give sufficient attention to the continuity between how Coan had slipped post hoc statistical manipulations into the PLOS article to get positive results and what he had done in a past paper with Richard Davidson. Worse, I had tipped off Jim Coan about our checking his work. Coan launched a pre-emptive tirade against post-publication scrutiny, his now infamous Negative Psychology rant. He focused his rage on Neuroskeptic, not Neurocritic or me, but the timing was not a coincidence. He then followed up by denouncing me on Facebook as the Chopra Deepak of skepticism.

I still have not unpacked that oxymoronic statement and decided if it was a compliment.

OK, Magneto, I will be less naïve and more thorough this round. I will pass on whatever you uncover.

Check back if you want to augment your critical appraisal skills with some unconventional ones, or if you just enjoy a spectacle. If you want to arrive at your own opinions ahead of time, email Douglas Turkington at douglas.turkington@ntw.nhs.uk and ask for a PDF of his paywalled article. Tell him I said hello. The offer of a debate still stands.