A “Kindling” Model of the Development of Addiction

Sometimes, while daydreaming in the shower or in the car, an insight hits you out of the blue. That happened last week. It occurred to me that the best neurobiological model of addiction has a serious missing link. Addiction develops. It grows. A successful theory of addiction needs to be a developmental theory – a theory of neural development.

In my view, the neural basis of addiction is best captured by Berridge and Robinson’s model of incentive sensitization. In a nutshell, these researchers show that “wanting” and “liking” are quite independent, subserved by different neurochemicals, and addiction is characterized by “wanting,” not “liking.” That helps explain why addicts keep on craving – and obtaining – their substance of choice long after it stops being pleasurable. The research supporting the model shows that, contrary to a tenacious myth, dopamine does not cause pleasure. Rather, dopamine is critical for the pursuit of goals, including the behaviors required to reach them and – most important – the powerful motivations needed to execute those behaviors. According to Berridge and colleagues, dopamine gets released from the midbrain in buckets when addicts are presented with cues associated with their substance of choice. A sight, sound, or memory reminiscent of that stuff (e.g., a cramp in the gut, the fleeting glimpse of someone who looks like a drug buddy, a scrap of paper dotted with a few flecks of white powder) will activate dopamine release and send it straight to the nucleus accumbens (NAcc; a major component of the ventral striatum) where it induces goal-oriented behavior (when the stuff is available) or craving (when it’s not). But the power of cues to elicit the addictive impulse must take time to develop. It’s not present the first time you try drugs or booze, or even necessarily the 20th time. This process is therefore called incentive sensitization, because the cues that trigger drug seeking become sensitized over time.

The trouble is, Berridge and colleagues don’t explain how this sensitization takes hold. The cues must be processed somewhere in the back half of your brain (where “perception” first arises) and they must activate the amygdala, the famous limbic structure that produces emotional feelings on the basis of perception. But how do these perceptual and evaluative processes come to trigger the urge, the thrust, the powerful desire that is the essence of approach motivation? That would have to take place in the frontal cortex and its master motivator, the striatum.

I’d been reading a recent chapter by Berridge and Robinson (2011), in preparation for a class I’d soon be teaching on the neuroscience of addiction. But the paper seemed to be missing something – a mechanism. This bothered me for a few days, and then, out of the blue, I remembered the idea of “kindling” – something I’d read about years before. Kindling in neuropsychology initially meant the tendency for animals to get seizures more and more predictably in response to less and less of the seizure-inducing stimulus. So rats might go into a seizure when exposed to an electric shock, but the amount of electricity needed to evoke the seizure would diminish, session after session, and seizures would eventually occur spontaneously, without any shock.

The kindling model was used at first to understand epileptic seizures, which occur more frequently, with less to trigger them, as people age. But in 1992, Robert Post published a kindling model of depression.  According to Post, depressive episodes follow the same developmental trajectory as epilepsy. An initial depressive episode is triggered by a major stressor, but subsequent bouts of depression are triggered by less and less adversity. That’s why adult depressives get, um, depressed….so easily. In his excellent book, Listening to Prozac, Peter Kramer fleshes out the kindling model of depression. He tells of patients who become more vulnerable to depression with age, despite increasingly minor triggers, as exemplified by a concentration camp survivor whose depression, initially elicited by a horrible experience in his youth, came back to haunt him later in life. More recent work uses the kindling concept to describe the development of bipolar disorder and PTSD. What all these accounts have in common is the notion that sensitivity to certain cues – cues that elicit negative effects or negative affects – increases with development.

Could that be what’s going on in addiction? Could kindling explain incentive sensitization? It could, but so far kindling models have only been applied to the elicitation of negative thoughts, moods, and emotions. Can kindling also apply to the elicitation of actions? Impulsive or compulsive actions like acquiring drugs, gambling away the down payment on your house, or drinking yourself out of your marriage or your job? Instead of taking place in the back half of the brain or the amygdala, kindling would have to arise in the frontal brain, the goal-seeking part, and in particular the ventral striatum or NAcc, the seat of motivated action.

That was the thought that came unbidden last week. Hadn’t I read about kindling in the striatum – somewhere? I went to my computer, and the first paper I looked up was a little-known chapter by Don Tucker — Tucker, D. M. (2001). Motivated anatomy: a core-and-shell model of corticolimbic architecture. Handbook of Neuropsychology, 2nd  Edition, Volume 5, Gainotti (Ed.). Elsevier. All I remembered was that the chapter had knocked me out when I first read it, and Tucker has long been one of my favorite brain theorists. I hit the jackpot! In previous work, Tucker had explained how depression could indeed be kindled in the limbic system – most probably in the amygdala. After all, it’s the amygdala’s job to get sensitized, so that stimuli evoke immediate emotional responses based on previous associations. But in this chapter, Tucker went on to describe kindling in the striatum, the source of voluntary behavior.

So, here’s how it might work: The amygdala, which lies at the gateway of perception, gets sensitized by stimuli. As an addict, you get more and more moved by drug-related cues – the empty pill bottle, the email address of your dealer, the buzzing neon sign in front of the liquor store. And the NAcc gets more and more sensitized to a specific goal – getting the substance or doing the activity — and to the series of moves you can make, you must make, to achieve that goal. Sure enough, there’s a dense pathway of axons leading from the amygdala to the striatum, a one-way street from biased perception to biased action.  So incentive sensitization can take place within and between these two systems, both of which are highly programmable, plastic, modifiable, and loaded with synapses that get molded by experience. Both the amygdala and the striatum serve as hubs at the center of extensive cortical systems. Together they form a macrosystem sometimes referred to as the extended amygdala.  The shaping of synaptic networks within this region may be the mechanism by which meaningful perceptions and meaningful actions converge over development, thereby colonizing one’s emotional memory and cementing one’s emotional habits.

I see addiction as developing in exactly that way. The difference from “normal” development is that the kindling that leads to addiction is focused on goals that are so desirable, so very attractive, that they set in motion a feedback loop between “wanting” and doing. Then, other goals become less and less salient, and that’s a serious problem.

“Strong evidence” for a treatment evaporates with a closer look: Many psychotherapies are similarly vulnerable.

Note: BMC Medicine subsequently invited a submission based on this blog post.

Coyne, J. C., & Kwakkenbos, L. (2013). Triple P-Positive Parenting programs: the folly of basing social policy on underpowered flawed studies. BMC Medicine, 11(1), 11.

It is now available here:

Promoters of Triple P parenting enjoy opportunities that developers and marketers of other “evidence-supported” psychosocial interventions and psychotherapies only dream of. With a previously uncontested designation as strongly supported by evidence, Triple P is being rolled out by municipalities, governmental agencies, charities, and community-based programs worldwide. These efforts generate lots of cash from royalties and license fees, training, workshops, and training materials, in addition to the prestige of being able to claim that an intervention has navigated the treacherous path from RCT to implementation in the community.

With hundreds of articles extolling its virtues, dozens of randomized trials, and consistently positive systematic reviews, the status of the Triple P parenting intervention as evidence supported would seem beyond being unsettled by yet another review. Some of the RCTs are quite small, but there are public health level interventions, including one involving 7000 children from child protective services. Could this be an instance in which it should be declared “no further research necessary”? Granting agencies have decided not to fund further evaluation of interventions on the basis of a much smaller volume of seemingly less unanimous data.

But the weaknesses revealed in a recent systematic review and meta-analysis of Triple P by Philip Wilson and his Scottish colleagues show how apparently strong evidence can evaporate when it is given a closer look. Other apparently secure “evidence supported” treatments undoubtedly share these weaknesses, and the review provides a model of where to look. But when I took a careful look, I discovered that Wilson and colleagues glossed over a very important weakness in the body of evidence for Triple P. They noted it, but didn’t dwell on it. So, the weakness in the body of evidence for Triple P is much greater than a reader might conclude from Wilson and colleagues’ review.

 WARNING! Spoiler Ahead. At this point, readers might want to download the article and form their own impressions, before reading on and discovering what I found. If so, they can click on this link and access the freely available, open access article.

Wikipedia describes Triple P as

a multilevel parenting intervention with the main goal of increasing the knowledge, skills, and confidence of parents at the population level and, as a result, reduce the prevalence of mental health, emotional, and behavioral problems in children and adolescents. The program is a universal preventive intervention (all members of the given population participate) with selective interventions specifically tailored for at risk children and parents.

A Triple P website for parents advertises

the international award winning Triple P – Positive Parenting Program®, backed by over 25 years of clinically proven, world wide research, has the answers to your parenting questions and needs. How do we know? Because we’ve listened to and worked with thousands of parents and professionals across the world. We have the knowledge and evidence to prove that Triple P works for many different families, in many different circumstances, with many different problems, in many different places!

The Triple P website for practitioners declares

As an individual practitioner or a practitioner working within an organisation you need to be sure that the programs you implement, the consultations you provide, the courses you undertake and the resources you buy actually work.

Triple P is one of the only evidence-based parenting programs available worldwide, founded on over 30 years of clinical and empirical research.

Disappearing positive evidence

In taking stock of Triple P, Wilson and colleagues applied objective criteria in a way that readily allows independent evaluation of their results.

They identified 33 eligible studies, almost all of them positive in indicating that Triple P has positive effects on child adjustment.

  • Most of the 33 studies involved media-recruited families, so participants in the trials were self-selected and more motivated than clients referred from community services or those involuntarily getting treatment mandated by child protection agencies.
  • 31/33 studies compared Triple P interventions with waiting list or no-treatment comparison groups. This suggests that Triple P may be better than doing nothing with these self-referred families, but doesn’t control for simply providing attention, support, and feedback. The better outcomes for families getting Triple P versus wait list or no treatment may reflect families assigned to these control conditions registering their disappointment at not getting what they had sought in answering the media ads.
  • In contrast, the two studies involving an active control group showed no differences between groups.
  • The trials evaluating Triple P typically administered a battery of potential outcomes, and there is no evidence for any trial that particular measures were chosen ahead of time as the primary outcomes. There was considerable inconsistency among studies using the same instruments in decisions about which subscales were reported and emphasized. Not declaring outcomes ahead of time provides a strong temptation for selective reporting of outcomes. Investigators analyze the data, decide which measures put Triple P in the most favorable light, and declare those outcomes as primary post hoc.
  • Selective reporting of outcomes occurred in the abstracts of these studies. Only 4/33 abstracts report any negative findings and 32/33 abstracts were judged to give a more favorable picture of the effects of Triple P.
  • Most papers only reported maternal assessments of child behavior and the small number of studies that obtained assessments from fathers did not find positive treatment effects from the father’s perspective. This may simply indicate the detachment and obliviousness of the fathers, but can also point to a bias in the reports of mothers who had made more of an investment in getting treatment.
  • Comparisons of intervention and control groups beyond the duration of the intervention were only possible in five studies. So, positive results may be short-lived.
  • Of the three trials that tested population level effects of Triple P, two were not randomized trials, but had quasi-experimental designs with significant intervention and control group differences at baseline. A third trial reported a reduction in child maltreatment, but examination of the results indicates that this was due to an unexplained increase in child maltreatment in the control area, not a decrease in the intervention area.
  • Thirty-two of the 33 eligible studies were authored by Triple-P affiliated personnel, but only two had a conflict of interest statement. Not only is there a strong possibility of investigator allegiance exerting an effect on the reported outcome of trials, there are undeclared conflicts of interest.

The dominance of small, underpowered, poor quality studies

Wilson and colleagues noted a number of times in their review that many of the trials are small, but they do not dwell on how many, how small, or with what implications. My colleagues have adopted the lower limit of 35 participants in the smallest group for inclusion of trials in meta-analyses. The rationale is that any trial that is smaller than this does not have a 50% probability of detecting a moderate sized effect, even if it is present. Small trials are subject to publication bias: if results are not claimed to be statistically significant, they will not get published, on the grounds that the trial was insufficiently powered to obtain a significant effect. On the other hand, when significant results are obtained, they are greeted with great enthusiasm precisely because the trials are so small. Small trials, when combined with flexible rules for deciding when to stop a trial (often based on a peek at the data), failure to specify primary outcomes ahead of time, and flexible rules for analyses, can usually be made to appear to yield positive findings, but findings that will not be replicated. Small studies are vulnerable to outliers and sampling error, and randomization does not necessarily equalize group differences that can prove crucial in determining results. Combining published small trials in a meta-analysis does not address these problems, because of publication bias and because all or many of the trials share methodological problems.
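To make the power argument concrete, here is a rough sketch of how the chance of detecting an effect changes with group size. It is not the calculation behind the 35-participant rule. It assumes that a "moderate" effect means a standardized mean difference of 0.5, a two-sided test at alpha = .05, equal group sizes, and a normal approximation to the t-test. The function name and defaults are my own illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size=0.5, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means.

    Assumptions (illustrative, not taken from the review): equal group
    sizes, a standardized effect size (Cohen's d), and a normal
    approximation to the two-sample t-test.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)          # critical value, two-sided
    shift = effect_size * sqrt(n_per_group / 2)   # expected test statistic
    # Probability of landing beyond either critical value.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Group sizes: the 9-18 range of the smallest trials, the 35 cutoff,
# and a larger trial for comparison.
for n in (9, 18, 35, 64):
    print(f"n per group = {n:3d}: power ~ {approx_power(n):.2f}")
```

Under these assumptions, groups of 9 to 18 have well under a 50% chance of detecting a moderate effect, and groups of around 35 sit only slightly above even odds. A different definition of "moderate" would shift the exact numbers.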

What happens when we apply the exclusion criterion to Triple P trials of <35 participants in the smallest group? Looking at table 2 in Wilson and colleagues’ review, we see that 20/23 of the individual papers included in the meta-analyses are excluded. Many of the trials are quite small, with eight trials having fewer than 20 participants (9-18) in the smallest group. Such trials should be statistically quite unlikely to detect even a moderate sized effect, and that so many nonetheless get significant findings attests to a publication bias. Think of it: with such small cell sizes, arbitrary addition or subtraction of a single participant can alter results. Figure 2 in the review provides the forest plot of effect sizes for two of the key outcome measures reported in Triple P trials. Small trials account for the outlier strongest finding, but also the weakest finding, underscoring sampling error. Meta-analyses attempt to control for the influence of small trials by introducing weights, but this strategy fails when the bulk of the trials are small. Again examining figure 2, we see that even with the weights, small trials still add up to over 83% of the contribution to the overall effect size. Of the three trials that are not underpowered, two have nonsignificant effects entered into the meta-analysis. The confidence interval for the one moderate size trial that is positive barely excludes zero (.06).
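A small sketch shows why weighting cannot rescue a meta-analysis dominated by small trials. The group sizes below are hypothetical, chosen only to mimic 20 small trials alongside 3 larger ones; they are not the figures from table 2 or figure 2 of the review. The sketch also assumes fixed-effect inverse-variance weights and the same standardized effect in every trial, which may differ from the weighting used in the review.

```python
def d_variance(n_per_group, d=0.5):
    """Approximate sampling variance of a standardized mean difference
    for two equal groups of size n_per_group."""
    return 2 / n_per_group + d ** 2 / (4 * n_per_group)


def small_trial_share(group_sizes, cutoff=35, d=0.5):
    """Fraction of total inverse-variance weight carried by trials whose
    group size falls below the cutoff."""
    weights = [(n, 1 / d_variance(n, d)) for n in group_sizes]
    total = sum(w for _, w in weights)
    small = sum(w for n, w in weights if n < cutoff)
    return small / total


# Hypothetical mix (NOT the review's data): 20 trials with 15 per group
# and 3 trials with 50 per group.
hypothetical_sizes = [15] * 20 + [50] * 3
print(f"Share of weight from small trials: {small_trial_share(hypothetical_sizes):.0%}")
```

In this made-up mix, each small trial counts for less than each larger one, yet the small trials together still carry about two thirds of the total weight, simply because there are so many of them.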

Wilson and colleagues pointed to serious deficiencies in the body of evidence supporting the efficacy of Triple P parenting programs, but once we exclude underpowered trials, there is little evidence left.

Are Triple P parenting programs ready for widespread dissemination and implementation?

Rollouts of the kind that Triple P is now undergoing are expensive and consume resources that will not be available for alternatives. Yet, critical examination of the available evidence suggests little basis for assuming that Triple P parenting programs will have benefits commensurate with their cost.

In contrast to the self-referring families studied in randomized trials, the families in the community are likely to be more socially disadvantaged, often single parent, and often coming to treatment only because of pressure and even mandated attendance. Convenience samples of self-referred participants are acceptable in the early stages of evaluation of an intervention, but ultimately the most compelling evidence must come from participants more representative of the population who will be treated in the community.

Would other evidence supported interventions survive this kind of scrutiny?

Triple P parenting interventions have the apparent support of a large literature that is unmatched in size by most treatments claiming to be evidence supported. In a number of articles and blog posts, I have shown that other treatments claimed to be evidence supported often have only weak evidence. Similar to Triple P, other treatments are largely evaluated by investigators who have vested financial and professional interests in demonstrating their efficacy, in studies that are underpowered, and with a high risk of bias, notably in the failure to specify which of the many outcomes assessed are primary. Similar to Triple P, psychotherapies routinely get labeled as having strong evidence based solely on studies that involve comparisons with no treatment or waitlist controls. Effect sizes exaggerate the advantage of these therapies over patients simply getting nonspecific, structured opportunities for attention, support, and feedback under conditions of positive expectations. And, finally, similar to what Wilson and colleagues found for Triple P, there are often large gaps between the way findings are depicted in abstracts for reports of RCTs and what can be learned from the results sections of the actual articles.

In a recent blog post, I also showed that American Psychological Association Division 12 Clinical Psychology had designated Acceptance and Commitment Therapy (ACT) as having strong evidence for efficacy in hospitalized psychotic patients, only to have that designation removed when I demonstrated that the basis for this judgment was two small, flawed trials with null results. Was that shocking or even surprising? Stay tuned.

In coming blog posts, I will demonstrate problems with claims of other treatments being evidence-based, but hopefully this blog provides readers with tools to investigate for themselves.

Failure to replicate as an opportunity for learning

I am currently reading Kuhn’s “The Structure of Scientific Revolutions” and often find myself travelling back in time to a family dinner at home.

Many years ago, while still an undergrad in Argentina, I returned home from a tiring day in the lab complaining that the experiment ‘hadn’t worked’. My dad looked at me dismissively and said:

“The experiment worked, you just don’t know what variables you didn’t control”.

My dad is not a scientist, but an avid reader of popular science books. He had attended a couple of years of chemistry before leaving University to start his own business, so it was hard not to attack him with my mashed potatoes. But this was probably the most important lesson I learned in my entire science career:

Failure to replicate exposes those unknown/unthought-of variables that determine the result of a given experiment.

As a PhD student later on, I was expected to replicate previous findings before moving on with my own work that built on those findings. In many cases I replicated successfully, in other cases I didn’t. I even had to replicate work from within the lab. When I failed, we uncovered nuances about how each of us was actually ‘doing’ the work. In some cases, replicability came down to those nuances that were not written down in the lab protocols and recipes. But in all cases we learned something from those failures.

I expect the same from myself and my students, though they (and many of my colleagues) find that re-doing what has been done is a waste of time. I don’t. And here is why:

Let’s say that someone described the expression pattern of a protein in the brain of species A and I want to see if the expression pattern is the same in the brain of species B. I follow the same protocol, and find a difference between the two species. Now, how do I decide whether that difference is a species-specific difference or something else that I am doing that is different from what the original authors did and that I did not account for? Well, the only way of knowing is by trying to replicate the original findings in species A. If I can replicate, then I can more confidently argue that it is a species-specific difference (at least with respect to that specific protocol). If I can’t then I have further defined the boundaries within which those original findings are valid. Win-Win.

by Sam UL cc-by-nc-sa on flickr

This brings up another reaction to the results of experiments: How hard do we work at trying to get an experiment to ‘work’ when we expect it won’t? For example: if I expect (based on the published literature) that a protein is not being expressed in a particular brain region, I may quite quickly accept a negative result. But if I did not have this pre-knowledge, I might go through a lot of different attempts before I am convinced it is not there. So the readiness with which we accept (or not) a negative or positive result is influenced by that pre-knowledge. But how deeply do we go into that published literature to examine how well justified that “pre-knowledge” is? As I go through the literature I often find manuscripts that make claims where I would like to see how the inter-lab or inter-individual variability has been accounted for, or at least considered, or what it took to accept a positive or negative result.

Every now and then I find someone in my lab who can’t replicate my own results. I welcome that. In the majority of cases, we can easily identify what the variable is – in others we uncover something that we had not thought of that may be influencing our results. After all, one can only control the variables that one thinks of a priori. How is one to control for variables one does not think of? Well, those will become obvious when someone fails to replicate.

So why are scientists so reactive to these failures to replicate? After all, it is quite likely that the group failing to replicate also did not think of those variables until they got their results. A few months ago PLOS, FigShare and Science Exchange launched the Reproducibility Initiative that, as they say, will help correct the literature out there, but that I think will also better define the conditions that make an experiment work one way or another.

So, back to my dad. All experiments work, even those that give us an unexpected result. What I learned from dad is that being a good scientist is not about dismissing “bad experiments” and discarding the results, but more about looking deeper into what variables might have led to a different result. In many cases, it might be a bad chemical batch – in others it might uncover a crucial variable that defines the boundaries of validity of a result.

I call that progress.

Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: The University of Chicago Press.
*** Update 16/11/12: I noticed that I had mistakenly used an image with full copyright, despite having narrowed my original search to CC-licenced content. I apologise for this oversight. I have now removed the original image and replaced it with one with a suitable licence.


Why Addiction is NOT a Brain Disease

Addiction to substances (e.g., booze, drugs, cigarettes) and behaviors (e.g., eating, sex, gambling) is an enormous problem, seriously affecting something like 40% of individuals in the Western world. Attempts to define addiction in concrete scientific terms have been highly controversial and are becoming increasingly politicized. What IS addiction? We as scientists need to know what it is, if we are to have any hope of helping to alleviate it.

There are three main definitional categories for addiction: a disease, a matter of choice, and self-medication. There is some overlap among these meta-models, but each has unique implications for treatment, from the level of government policy to that of available options for individual sufferers.

The dominant party line in the U.S. and Canada is that addiction is a brain disease. For example, according to the National Institute on Drug Abuse (NIDA), “Addiction is defined as a chronic, relapsing brain disease that is characterized by compulsive drug seeking and use, despite harmful consequences.” In this post, I want to challenge that idea based on our knowledge of normal brain change and development.

Why many professionals define addiction as a disease.

The idea that addiction is a type of disease or disorder has a lot of adherents. This should not be surprising, as the loudest and strongest voices in the definitional wars come from the medical community. Doctors rely on categories to understand people’s problems, even problems of the mind. Every mental and emotional problem fits a medical label, from borderline personality disorder to autism to depression to addiction. These conditions are described as tightly as possible, and listed in the DSM (Diagnostic and Statistical Manual of Mental Disorders) and the ICD (International Classification of Diseases) for anyone to read.

I won’t try to summarize all the terms and concepts used to define addiction as a disease, but Steven Hyman, M.D., previous director of NIMH and Provost of Harvard University, does a good job of it. His argument, which reflects the view of the medical community more generally (e.g., NIMH, NIDA, the American Medical Association), is that addiction is a condition that changes the way the brain works, just like diabetes changes the way the pancreas works. Nora Volkow M.D. (the director of NIDA) agrees. Going back to the NIDA site, “Brain-imaging studies from drug-addicted individuals show physical changes in areas of the brain that are critical for judgment, decisionmaking, learning and memory, and behavior control.” Specifically, the dopamine system is altered so that only the substance of choice is capable of triggering dopamine release to the nucleus accumbens (NAC), also referred to as the ventral striatum, while other potential rewards do so less and less. The NAC is responsible for goal-directed behaviour and for the motivation to pursue goals.

Different theories propose different roles for dopamine in the NAC. For some, dopamine means pleasure. If only drugs or alcohol can give you pleasure, then of course you will continue to take them. For others, dopamine means attraction. Berridge’s theory (which has a great deal of empirical support) claims that cues related to the object of addiction become “sensitized,” so they greatly increase dopamine and therefore attraction — which turns to craving when the goal is not immediately available. But pretty much all the major theories agree that dopamine metabolism is altered by addiction, and that’s why it counts as a disease. The brain is part of the body, after all.

What’s wrong with this definition?

It’s accurate in some ways. It accounts for the neurobiology of addiction better than the “choice” model and other contenders. It explains the helplessness addicts feel: they are in the grip of a disease, and so they can’t get better by themselves. It also helps alleviate guilt, shame, and blame, and it gets people on track to seek treatment. Moreover, addiction is indeed like a disease, and a good metaphor and a good model may not be so different.

What it doesn’t explain is spontaneous recovery. True, you get spontaneous recovery with medical diseases…but not very often, especially with serious ones. Yet many if not most addicts get better by themselves, without medically prescribed treatment, without going to AA or NA, and often after leaving inadequate treatment programs and getting more creative with their personal issues. For example, alcoholics (which can be defined in various ways) recover “naturally” (independent of treatment) at a rate of 50-80% depending on your choice of statistics (but see this link for a good example). For many of these individuals, recovery is best described as a developmental process — a change in their motivation to obtain the substance of choice, a change in their capacity to control their thoughts and feelings, and/or a change in contextual (e.g., social, economic) factors that get them to work hard at overcoming their addiction. In fact, most people beat addiction by working really hard at it. If only we could say the same about medical diseases!

The problem with the disease model from a brain’s-eye view.

According to a standard undergraduate text: “Although we tend to think of regions of the brain as having fixed functions, the brain is plastic: neural tissue has the capacity to adapt to the world by changing how its functions are organized…the connections among neurons in a given functional system are constantly changing in response to experience” (Kolb, B., & Whishaw, I.Q. [2011] An introduction to brain and behaviour. New York: Worth). To get a bit more specific, every experience that has potent emotional content changes the NAC and its uptake of dopamine. Yet we wouldn’t want to call the excitement you get from the love of your life, or your fifth visit to Paris, a disease. The NAC is highly plastic. It has to be, so that we can pursue different rewards as we develop, right through childhood to the rest of the lifespan. In fact, each highly rewarding experience builds its own network of synapses in and around the NAC, and that network sends a signal to the midbrain: I’m anticipating x, so send up some dopamine, right now! That’s the case with romantic love, Paris, and heroin. During and after each of these experiences, that network of synapses gets strengthened: so the “specialization” of dopamine uptake is further increased. London just doesn’t do it for you anymore. It’s got to be Paris. Pot, wine, music…they don’t turn your crank so much; but cocaine sure does. Physical changes in the brain are its only way to learn, to remember, and to develop. But we wouldn’t want to call learning a disease.

So how well does the disease model fit the phenomenon of addiction? How do we know which urges, attractions, and desires are to be labeled “disease” and which are to be considered aspects of normal brain functioning? There would have to be a line in the sand somewhere. Not just the amount of dopamine released, not just the degree of specificity in what you find rewarding: these are continuous variables. They don’t lend themselves to two (qualitatively) different states: disease and non-disease.

In my view, addiction (whether to drugs, food, gambling, or whatever) doesn’t fit a specific physiological category. Rather, I see addiction as an extreme form of normality, if one can say such a thing. Perhaps more precisely: an extreme form of learning. No doubt addiction is a frightening, often horrible, state to endure, whether in oneself or in one’s loved ones. But that doesn’t make it a disease.

The Complexities of Diagnosing Posttraumatic Stress Disorder (PTSD)

When I was in medical school, senior physicians would frequently usher a group of us students into a patient’s room so we might hear the patient tell the story of their illness. It seemed that the more classic the story was for a particular illness, the more intense was their ushering. We would huddle around the patient’s bed, all of us transfixed by the doctor interviewing the patient. I remember hanging on the patient’s every last word and, simultaneously, sifting through the textbook data stored in my brain in search of a diagnostic match. When done, the senior doctor would turn around and challenge us to diagnose what ailed the patient, and we would respond with a flurry of answers. I still remember the thrill of solving the puzzle, of making a “textbook diagnosis”.


These days, almost 20 years later, it seems I rarely meet a patient with a “text book diagnosis”, and the patients I care for in real-life clinical practice are more complex than those described in the pages of thick medical texts. Perhaps nowhere does this complexity become more apparent than when I meet patients who have experienced a severe psychological trauma.

In my work as a psychiatrist, the go-to “text book” is the DSM IV, the Diagnostic and Statistical Manual of Mental Disorders, which is currently in its fourth edition. This is the standard diagnostic manual used by psychiatrists and psychologists all over the USA.

In this 943-page book, under chapter 7, titled Anxiety Disorders, one can find several pages devoted to Posttraumatic Stress Disorder (PTSD). Page after page documents all one could possibly need to know about diagnosing PTSD: the core clinical features, associated features and disorders, specific cultural and age features, prevalence of PTSD, clinical course of PTSD, familial patterns, and differential diagnoses (i.e., other disorders that look like PTSD but are not).

Yet, as valuable as these pages are, this diagnosis of PTSD still appears dissatisfying to many.

In her 1992 landmark text, Trauma and Recovery, Judith Herman M.D., a Harvard psychiatrist, argued that “the diagnosis of posttraumatic stress disorder as it is presently defined does not fit accurately enough the complicated symptoms seen in survivors of prolonged repeated trauma”.  She proposed that the syndrome that follows upon exposure to prolonged repeated trauma needs its own name and offered the new term, “complex PTSD”.

I find myself thinking of Dr. Herman’s complex PTSD diagnosis often these days—I think complex PTSD better explains some of the symptoms I see in my patients who have experienced severe trauma. In such cases I find the DSM IV wanting and instead find that the complex PTSD diagnosis holds more real life value or clinical utility.

The DSM IV is currently undergoing a revision, with the latest version, the DSM 5, slated to come out in May of 2013. This has raised the possibility that complex PTSD would be included as a separate diagnostic entity in the DSM 5. But it is not so easy to get into the DSM. For a new disorder to be considered for entry, a strict set of criteria needs to be met: Is there a clear definition of the disorder? Are there reliable methods to diagnose the disorder? In the case of complex PTSD, is it truly distinct from PTSD or just a different, perhaps more severe, type of PTSD? What is the value of adding a new diagnosis, and how will it change the way we care for those living with PTSD?

In fact, a vigorous discussion of this very question was recently published in the Journal of Traumatic Stress, an academic journal published by the International Society for Traumatic Stress Studies. Leaders and experts in the field of traumatic stress articulately state their arguments for and against the inclusion of complex PTSD in the DSM 5.

One issue fundamental to my specialty that is no doubt fueling this controversy is the lack of objective biomarkers available to mental health professionals to diagnose mental disorders such as PTSD.  A limitation of much of our diagnosis in psychiatry is that we base our diagnosis on the self report of our patient and have limited blood tests or scans at our disposal to make an “objective” diagnosis.

On a positive note, we can be reassured that psychiatry is in the midst of a biological revolution, hurtling toward a time when it will be able to diagnose with blood tests and brain scans and offer tailored treatments to patients. Still, this does not absolve me of my duty to heal the pain of those suffering today, and though I work with a diagnostic system that is imperfect, I know that this does not make such a system invalid when used properly.

The diagnostic status of complex PTSD is controversial and not likely to be resolved soon. In the meantime, I will have to get used to living in a world where patients with “text book diagnoses” appear to be scarce and, instead, venture into more ambiguous territory. Textbooks aside, I try to make sense of the mental dysfunction I am witnessing in the hope that it offers some meaning to the person seeking help from me and, through this validation, perhaps an improved sense of their overall well-being.

The views expressed are those of the author and do not necessarily reflect the official policy or position of the Department of Veterans Affairs or the United States Government.

I am holding my revised manuscript hostage until the editor forwards my complaint to a rogue reviewer.

This blog post started as a reply to an editor who had rendered a revise and resubmit decision on my invited article based on a biased review. I realized the dilemma I faced was a common one, but unlike many authors, I am sufficiently advanced in my career to take the risk of responding publicly, rather than just simply cursing to myself and making the changes requested by the rogue reviewer. Many readers will resonate with the issues I identify, even if they do not yet feel safe enough making such a fuss. Readers who are interested in the politics and professional intrigue of promoting screening cancer patients for distress might also like reading my specific responses to the reviewer. I end with an interesting analogy, which is probably the best part of the blog.

Dear Editor,

I appreciate the opportunity to receive reviews and revise and resubmit my manuscript. However, my manuscript is now being held hostage in a safe place. It will be released to you when you assure me that my complaint has been forwarded to a misbehaving reviewer with a request for a response.

Unscrupulous reviewers commonly abuse their anonymity and gatekeeping function by unfairly controlling what appears in peer reviewed journals. They do so to further their own selfish and professional guild interests.

They usually succeed in imposing their views, coercing authors to bring their manuscripts into compliance with their wishes or risk rejection. Effects on the literature include gratuitous citations of the reviewer’s work, authors distorting the reporting of findings in ways that flatter the reviewer’s work, and the suppression of negative findings.  More fundamentally, however, such unscrupulous tactics corrupt what science does best: confronting ideas with evidence and allowing the larger scientific community to decide how and in what form, if any, the idea survives the confrontation.

With their identities masked, unscrupulous reviewers bludgeon authors in unlit alleyways and slip away.  Victimized authors are reluctant to complain because they do not want to threaten future prospects with the journal and so simply give in.

This time, however, I am announcing the crime in the bright daylight of the Internet. I have not yet unmasked the reviewer with 100% certainty (although I can say he has a bit of an accent that is different from that of the other members of his department), but I can ask you to forward my communication to him and extend an offer to debate me at a symposium or other professional gathering.

My manuscript was invited as one of two sides of a conference debate concerning screening cancer patients for distress. Although held at 8:30 AM on the last day of the conference, the debate was packed, and as one of the organizers of the conference said afterwards, we woke the crowd up. The other speaker and I had substantial disagreements, but we found common ground in engaging each other in good humor. Some people that I talked with afterwards said that I had persuaded them, but more importantly, others said that I had forced them to think.

Discussions of whether we should screen cancer patients for distress have rapidly moved from considering the evidence to agitation for international recommendations and even mandating of screening. Pharma has poured millions of dollars into the development of quality indicators that can be used to monitor whether oncologists ask patients about psychological distress, and if the patients indicate that they are experiencing distress, the indicators record what action was taken. Mandating screening is of great benefit to Pharma because these quality indicators can be met by oncologists casually offering antidepressants to distressed patients, without formal diagnosis or follow-up.

As I’ve shown in my research, breast cancer patients are already receiving antidepressant prescriptions at an extraordinary rate, often without ever having had a two-week mood disturbance in their lives. Receiving a prescription for an antidepressant has become an alternative to allowing patients unhurried time with cancer care professionals to discuss why they are distressed and the preferred way of addressing their distress.

This reviewer’s comments are just another effort at suppressing discussion of the lack of evidence that screening cancer patients for distress will improve their outcomes. You’re well aware of other such efforts. Numerous professional advocacy groups have gained privileged access to ostensibly peer-reviewed journals for articles promoting screening, with the argument that it would take too long to accumulate evidence on whether screening really benefits patients. The flabbiness of their arguments and the poor quality of some of these papers attest to their not having been adequately tempered by peer review.

A phony consensus group has been organized and claims to have done a systematic review of the evidence concerning screening. When I contacted the authors, they conceded that there was no formal process for organizing the group or arriving at consensus. Rather, it was a convenience group of persons already known to have strong positive opinions of screening, and there were strict controls on what would go into the paper. I’ve taken a close look at that paper and found serious flaws in the identification, classification, and integration of studies. The paper would be ripe for one of the withering point-by-point deconstructions that my colleagues and I are notorious for. Unfortunately, the paper is published in a journal that does not allow post-publication commentary and so, at least in the journal in which it was published, it will evade critique.

This reviewer abused both the role of gatekeeper for my manuscript and the anonymity of reviewers in demanding that I make changes in the manuscript that were not based on the weight of evidence, but rather on an insistence that I fall in line with the dictates of party lines and professional politics requiring the promotion of screening of cancer patients for distress, despite the utter lack of evidence. This reviewer insists that I not call attention to the lack of evidence that screening benefits patients and instead praise screening for its benefits to professionals.

Below I italicized some of the reviewer’s comments, with my responses interspersed:

My review is in many ways unusual and for the sake of clarity and fairness requires a substantial preamble.
The manuscript represents a transcript of one speaker’s portion (Coyne) of a 2-sided debate and both contributions are meant to be published side by side…
I will try to provide this review in an unbiased fashion but that will be a mighty challenge because I was never a swing voter, I had a position prior to this debate and this position is leaning ‘pro’-screening, as long as some key foundational conditions are in place.

Okay, this reviewer declared loyalties ahead of time and provides a strong warning of bias. But forewarning does not excuse the bias; the reviewer should not have taken on my manuscript.

I see an urgent need to remove the untenable categorical opinion (i.e., claim that there is no supporting research on screening (see opening line in abstract and page 10)) when the other paper clearly shows the (imperfect) opposite based on his systematic review.

Why the urgency? I make the argument that before we implement routine screening of cancer patients for distress, we need evidence that it will lead to improved patient outcomes. In that sense, screening for distress is no different than any other change in clinical procedures that is potentially harmful or costly and disruptive of existing efforts to meet patient needs.

Evidence would consist of a demonstration in a randomized trial that screening for distress and feedback to clinicians and patients leads to better patient outcomes than simply giving patients opportunities to talk to clinicians without regard to their scores on screening instruments and giving them the same access to services that screened patients have. The other side in the live debate conceded in the debate that there was as yet no such evidence, although I do not get the sense that the reviewer attended the debate.

The author needs to tone down what comes across as an almost personal attack of psycho-oncology researchers from the Calgary group, and needs to remove polemic language around the ‘6th Vital Sign’ (“sloganeering..”) ; 6th Vital sign is a concept created as a marketing strategy rather than a substantive issue.

I appreciate that the reviewer at least concedes that calling distress the “sixth vital sign” is a marketing strategy, but the phrase has increasingly made it into the titles of peer-reviewed articles and is offered as a rationale for recommending and even mandating screening in the absence of data. And let’s look at the vacuousness of this “marketing strategy.” It capitalizes on the four well-established vital signs: temperature, pulse or heart rate, blood pressure, and respiratory rate. These are all objective measures that do not depend on patient self-report. Pain has been proposed as the fifth vital sign, although it is controversial, in part because it is not objective. The “Calgary group,” as the reviewer refers to them, has championed making distress the sixth vital sign, but distress is neither objective nor vital, and assessment depends on self-report. Temperature is measured with a thermometer, and distress is measured with a pencil and paper or touchscreen thermometer. But there the analogy ceases.

Coyne raises a number of tangential issues that don’t belong here; I think they merely distract:
[a] do we really need new pejorative lingo: “Anglo-American Linguistic Imperialism”  ?? I think not,  because the real point is that some terms translate better than others and ‘distress’ does not translate well.

How are these issues tangential? Proponents of screening call for international guidelines mandating routine screening, but distress is not a word that translates into many languages. I attended a symposium recently in which a French presenter described the bewilderment of cancer patients when they were asked to mark their level of distress on a picture of a thermometer. In many languages, it is not a matter of finding a direct translation of “distress”, because no direct translation exists and there is no unitary corresponding concept. And the linguistic problems are compounded when advocates stretch distress to include every psychological discomfort, spiritual issue, and side effect of cancer. One word cannot serve so many functions in other languages.

So, I think it is a big issue to impose this Anglo-American term on other cultures and to insist that patients respond even when there is no coherent concept being assessed in their language. For what purpose, international solidarity? The reviewer defends the “sixth vital sign” from the “Calgary group,” which I don’t think we need, but he disallows me my “Anglo-American linguistic imperialism,” for which I provide an adequate rationale.

Another [partially] straw-man argument is that routine use of screening and follow through are expensive.  There are numerous settings where screening is done via touch-screen computer that autoscore and spit out summary sheets with ‘red-flagged’ results.  This is cheap and I don’t see how anyone could argue otherwise.

Screening is much more than getting patients to tap a touchscreen if the intention is to improve their well-being. Unfortunately, in some American settings that have implemented screening, patients tap a touchscreen thermometer to indicate their level of distress and results are whisked to an electronic medical record where the information is ignored. Is that what the reviewer wants?

Results of screening, particularly with a distress thermometer, are highly ambiguous and need to be followed up with an interview by a professional. I cited research in my manuscript showing that most distressed patients are not seeking a referral, variously because their needs are already addressed elsewhere, they don’t see the cancer care setting as an appropriate place to get services for their needs, they want services that are not available at the cancer center, or they are simply not convinced that they need services.

Many screening instruments have items referring to “being better off dead” or other indications of suicidal ideation or intention to self-harm. Although cancer patients endorsing such items have a small likelihood of attempting suicide, the issue needs to be addressed in an interview with a trained professional. In some cancer care settings, this could cost a patient $200, and most endorsements of such items turn out to be false positives. To not do follow-up assessments is unconscionable, unethical and could be the basis of a malpractice suit in many settings. To adopt a clinical policy of “don’t ask, don’t tell,” is equally unconscionable, unethical, and could conceivably be the basis of a malpractice suit.

Coyne reports (based on three studies) that almost half of the samples identified as depressed/anxious were already in psychological/psychiatric treatment when diagnosed with cancer.  While Coyne’s numbers were derived from good quality studies these numbers don’t jibe with population estimates.

If the reviewer doesn’t like my “good quality studies,” he should propose some others. As for the wild estimates of half or more of all the people in the community walking around with untreated mental illness, I don’t think we can take seriously, as estimates of unmet needs for psychiatric services, the results of studies based on lay interviewers administering structured interview schedules to community-residing persons.

Coyne posits that screening should improve patient outcomes and offers a detailed section showing that we don’t yet have convincing evidence that it does; this is where the debate between [the opposing side] and Coyne is particularly interesting and valuable.  However, for reasons that are not explicated, Coyne and a number of individuals with whom he shares the ‘anti’ position never allow the argument that systematic screening has two other valuable functions, namely to offer a degree of social justice inherent in equal access to care, and it helps psychological service providers to use clinical population-derived data for clarifying resource needs and tracking system efficiency. I do wonder whether or not these latter two issues are affected by context, namely that they might be more naturally attractive to Psycho-Oncology clinicians in countries with universal health care.

What is this focusing first on the “Calgary group” and now on “Coyne and a number of individuals”? Is the reviewer talking about rival gangs or ideas?

I fail to see how screening can “offer a degree of social justice inherent in equal access to care” if it does not improve patient outcomes. We know from lots of studies of screening for depression that persons with low income and other social disparities have a difficult time completing referrals, even when some of the obvious barriers like costs are removed. Studies find that persons with social disparities may need 25 efforts at contact by telephone, with up to eight completed, in order to get them to the first session of mental health treatment. Many of them will not return.

So, where is the social justice in referring low income and other disadvantaged patients to services they won’t get to, and especially when there is no assurance that the services are effective? The reviewer should visit an American community mental health setting where Medicaid patients are sent because psychiatrists prefer to treat patients who pay out of pocket. Or visit the bewildered primary care physicians who get sent cancer patients from Danish or Dutch cancer centers screening for distress.

Routine screening risks compounding  social disparities in receipt of services. Persons with higher income or other social resources are much more likely to complete the referrals that are offered. Even when services are free, people with social disparities are much less likely to show up than people who have the resources to get there.

I don’t know what the reviewer intends by saying screening should be implemented because it gives providers clinical population-derived data for clarifying resource needs. I see this argument as a transparent effort to exploit patients who are not getting any benefit from screening to bolster support for hiring professionals to provide services. Think of it: would we provide mammograms to women simply to document a need for more oncologists, if the women do not get any benefit from the mammograms?

Imagine this scenario: attorneys push for screening the general population for unmet legal needs. With short checklists, pissed-off thermometers, and web-based surveys, they identify people having unresolved disputes with their relatives and neighbors that the attorneys could help them settle by suing each other. They thereby uncover what they consider unmet need for litigation. Now, some people may have misgivings about suing family and supposed friends. The attorneys could then argue that this is just due to a sense of stigma and launch anti-stigma campaigns to break down their resistance to accepting services.

The attorneys’ denial that their primary interest was to generate business for themselves would be more easily dismissed than that of mental health professionals calling for screening of cancer patients for distress. But the conflict of interest is just as great.


Troubles in the Branding of Psychotherapies as “Evidence Supported”

Is advertising a psychotherapy as “evidence supported” any less vacuous than “Pepsi’s the one”? A lot of us would hope so, having campaigned for rigorous scientific evaluation of psychotherapies in randomized controlled trials (RCTs), just as is routinely done with drugs and medical devices in Evidence-based Medicine (EBM). We have also insisted on valid procedures for generating, integrating, and evaluating evidence and have exposed efforts that fall short. We have been fully expecting that some therapies would emerge as strongly supported by evidence, while others would be found less so, and some even harmful.

Some of us now despair about the value of this labeling or worry that the process of identifying therapies as evidence supported has been subverted into something very different than we envisioned.  Disappointments and embarrassments in the branding of psychotherapies as evidence supported are mounting. A pair of what could be construed as embarrassments will be discussed in this blog.

Websites such as those of American Psychological Association Division 12 (Clinical Psychology) and SAMHSA’s National Registry of Evidence-based Programs and Practices offer labeling of specific psychotherapies as evidence supported. These websites are careful to indicate that a listing does not constitute an endorsement. For instance, the APA Division 12 website declares:

This website is for informational and educational purposes. It does not represent the official policy of Division 12 or the American Psychological Association, nor does it render individual professional advice or endorse any particular treatment.

Readers can be forgiven for thinking otherwise, particularly when such websites provide links to commercial sites that unabashedly promote the therapies with commercial products such as books, training videos, and workshops. There is lots of money to be made, and the appearance of an endorsement is coveted. Proponents of particular therapies are quick to send studies claiming positive findings to the committees deciding on listings with the intent of getting them acknowledged on these websites.

But now may be the time to begin some overdue reflection on how the label of evidence supported practice gets applied and whether there is something fundamentally wrong with the criteria.

Now you see it, now you don’t: “Strong evidence” for the efficacy of acceptance and commitment therapy for psychosis

On September 3, 2012 the APA Division 12 website announced a rating of “strong evidence” for the efficacy of acceptance and commitment therapy for psychosis. I was quite skeptical. I posted links on Facebook and Twitter to a series of blog posts (1, 2, 3) in which I had previously debunked the study claiming to demonstrate that a few sessions of ACT significantly reduced rehospitalization of psychotic patients.

David Klonsky, a friend on FB who maintains the Division 12 treatment website, quickly contacted me and indicated that he would reevaluate the listing after reading my blog posts and that he had already contacted the section editor to get her evaluation. Within a day, the labeling was changed to “designation under re-review as of 9/3/12” and it is now (10/16/12) “modest research support.”

David Klonsky is a serious, thoughtful guy with an unenviable job: keeping the Division 12 list of evidence supported treatments updated. This designation is no less important than it once was, but it is increasingly difficult to engage burned out committee members to evaluate the flood of new studies that proponents of particular therapies relentlessly send in. As we will see with this incident, the reports of studies that are considered are not necessarily reliable indicators of the efficacy of particular treatments, even when they come from prestigious, high impact journals.

The initial designation of ACT as having “strong evidence” for psychosis was mainly based on a single, well promoted study, claims for which made it all the way to Time magazine when it was first published.

Bach, P., & Hayes, S.C. (2002). The use of acceptance and commitment therapy to prevent the rehospitalization of psychotic patients: A randomized controlled trial. Journal of Consulting and Clinical Psychology, 70, 1129-1139.

Of course, the designation of strong evidence requires support of two randomized trials, but the second trial was a modest attempt at replication of this study and was explicitly labeled as a pilot study.

The Bach and Hayes  article has been cited 175 times as of 10/21/12 according to ISI Web of Science, mainly  for claims that appear in its abstract: patients receiving up to four sessions of an ACT intervention had “a rate of rehospitalization half that of TAU [treatment as usual] participants over a four-month follow-up [italics added].” This would truly be a powerful intervention, if these claims are true. And my check of the literature suggests that these claims are almost universally accepted. I’ve never seen any skepticism expressed in peer reviewed journals about the extraordinary claim of cutting rehospitalization in half.

Before reading further, you might want to examine the abstract and, even better, read the article for yourself and decide whether you are persuaded. You can even go to my first blog post on this study, where I identify some of the things to look for in evaluating the claims. If these are your intentions, you might want to stop reading here and resume after considering these materials.

Warning! Here comes the spoiler.

  • It is not clear that rehospitalization was originally set as the primary outcome, and so there is a possible issue of a shifting primary outcome, a common tactic in repackaging a null trial as positive. Many biomedical journals require that investigators publish their protocols with a designated primary outcome before they enter the first patient into a trial. That is a strictly enforced requirement for later publication of the results of the trial. But that is not yet usually done for RCTs testing psychotherapies. The article is based on a dissertation. I retrieved a copy and found that its title seemed to suggest that symptoms, not rehospitalization, were the primary outcome: Acceptance and Commitment Therapy in the Treatment of Symptoms of Psychosis.
  • Although 40 patients were assigned to each group, analyses only involved 35 per group. The investigators simply dropped patients from the analyses with negative outcomes that are arguably at least equivalent to rehospitalization in their seriousness: committing suicide or going to jail. Think about it: what should we make of a therapy that prevented rehospitalization but led to jailing and suicides of mental patients? This is not only a departure from intention-to-treat analyses, but the loss of patients is nonrandom and potentially quite relevant to the evaluation of the trial. Exclusion of these patients has a substantial impact on the interpretation of results: the 5 patients missing from the ACT group represent 71% of the reported rehospitalizations in that group, and the 5 patients missing from the TAU group represent 36% of the reported rehospitalizations in that group.
  • Rehospitalization is not a typical primary outcome for a psychotherapy study. But if we suspend judgment for a moment as to whether it was the primary outcome for this study, ignore the lack of intention-to-treat analyses, and accept 35 patients per group, there is still not a simple, significant difference between groups for rehospitalization. The claim of “half” is based on voodoo statistics.
  • The trial did assess the frequency of psychotic symptoms, an outcome that is closer to what one would rely on to compare this trial with the results of other interventions. Yet oddly, patients receiving the ACT intervention actually reported twice the frequency of symptoms compared to patients in TAU. The study also assessed how distressing hallucinations or delusions were to patients, what would be considered a patient-oriented outcome, but there were no differences on this variable. One would think that these outcomes would be very important to clinical and policy decision-making, and these results are not encouraging.
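For readers who want to check the arithmetic behind the rehospitalization points above, here is a minimal sketch in Python. It rests on assumptions that go beyond the text: the counts of 7 (ACT) and 14 (TAU) rehospitalizations are back-calculated from the 71% and 36% figures (5/0.71 ≈ 7, 5/0.36 ≈ 14) rather than taken from the article, a two-sided Fisher exact test is my choice of a "simple" between-group comparison, and the second analysis treats every dropped patient as a bad outcome, which is only one possible sensitivity check. The function and variable names are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k events in the first row,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))


# ASSUMPTION: counts back-calculated from the percentages quoted above.
dropped_per_group = 5
act_rehosp = round(dropped_per_group / 0.71)
tau_rehosp = round(dropped_per_group / 0.36)

# Completers-only comparison: 35 analysed patients per group.
analysed = 35
p_completers = fisher_exact_two_sided(
    act_rehosp, analysed - act_rehosp,
    tau_rehosp, analysed - tau_rehosp,
)

# ASSUMPTION: sensitivity check counting each dropped patient
# (suicide or jail) as a bad outcome, with all 40 randomized per group.
randomized = 40
act_bad = act_rehosp + dropped_per_group
tau_bad = tau_rehosp + dropped_per_group
p_worst_case = fisher_exact_two_sided(
    act_bad, randomized - act_bad,
    tau_bad, randomized - tau_bad,
)

print(f"ACT {act_rehosp}/{analysed} vs TAU {tau_rehosp}/{analysed}: "
      f"p = {p_completers:.3f}")
print(f"ACT {act_bad}/{randomized} vs TAU {tau_bad}/{randomized}: "
      f"p = {p_worst_case:.3f}")
```

Under these assumed counts, the rate in the ACT group is indeed half that in the TAU group, yet neither comparison falls below the conventional 0.05 threshold. A different test or different handling of the dropped patients would give different numbers, so treat this as an illustration of the reasoning, not a reanalysis of the trial.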

This study, which has been cited 64 times according to ISI Web of Science, rounded out the pair needed for a designation of strong support:

Gaudiano, B.A., & Herbert, J.D. (2006). Acute treatment of inpatients with psychotic symptoms using acceptance and commitment therapy: Pilot results. Behaviour Research and Therapy, 44, 415-437.

Appropriately framed as a pilot study, this study started with 40 patients and only delivered three sessions of ACT. The comparison condition was enhanced treatment as usual consisting of psychopharmacology, case management, and psychotherapy, as well as milieu therapy. Follow-up data were available for all but 2 patients. But this study is hardly the basis for rounding out a judgment of ACT as efficacious for psychosis.

  • There were assessments with multiple conventional psychotic symptom and functioning measures, as well as ACT specific measures. The only conventional measure to achieve significance was distress related to hallucinations and there were no differences in ACT specific measures. There were no significant differences in rehospitalization.
  • The abstract puts a positive spin on these findings: “At discharge from the hospital, results suggest short-term advantages in affective symptoms, overall improvement, social impairment, and distress associated with hallucinations. In addition, more participants in the ACT condition reached clinically significant symptom improvement at discharge. Although four-month rehospitalization rates were lower in the ACT group, these differences did not reach statistical significance.”

The provisional designation of ACT as having strong evidence of efficacy for psychosis could have had important consequences. Clinicians and policymakers could decide that merely providing three sessions of ACT is a sufficient and empirically validated approach to keep chronic mental patients from returning to the hospital and maybe even make discharge decisions based on whether patients had received ACT. But the evidence just isn’t there that ACT prevents rehospitalization, and when the claim is evaluated against what is known about the efficacy of psychotherapy for psychotics, it appears to be an unreasonable claim bordering on the absurd.

The redesignation of ACT as having modest support was based on additional consideration of a follow-up study of the Bach and Hayes trial, plus an additional feasibility study that involved 27 patients randomized to either treatment as usual or 10 sessions of ACT plus treatment as usual. Its stated goal was to investigate the feasibility of using ACT to facilitate emotional recovery following psychosis, but as a feasibility study, it included a full range of outcomes with the intention of deciding which would be important for assessing the impact of ACT in this population. The scales included the two subscales of the Hospital Anxiety and Depression Scale (HADS), the Positive and Negative Syndrome Scale, an ACT specific scale, and a measure of the therapeutic alliance. Three of the patients assigned to treatment as usual alone dropped out, and so intent to treat analyses were not conducted. With such a small sample, it is not surprising that there were no differences on most measures. The investigators noted that the patients receiving ACT had fewer crisis contacts over the duration of the trial, but it is not clear whether this is simply due to the treatment as usual group not having regular treatment and therefore having to resort to crisis contacts.

The abstract of the study states “ACT appears to offer promise in reducing negative symptoms, depression and crisis contacts in psychosis”, which is probably a bit premature. Note also that across these three trials, there is a shift in the outcome to which the investigators point as evidence for the efficacy of ACT for psychosis. The assumption seems to be that any positive result can be claimed to represent a replication, even if other variables were cited for this purpose among the other studies.

Overall, this trial would also be rated as having high risk of bias because of the lack of intent to treat analyses and the failure to specify a primary outcome among the battery that was administered, but more importantly, it would simply be excluded from meta-analyses with which I have been associated because of too few patients in it. A high risk of bias plus too few patients discourages any confidence in these results.

Is treating PTSD with acupoint stimulation supported by evidence?

Whether or not ACT is more efficacious than other therapies, as its proponents sometimes claim, or whether it is efficacious for psychosis, is debatable, but probably no one would consider ACT anything other than a bona fide therapy. The same does not hold for Emotional Freedom Techniques (EFT) and its key component, acupoint stimulation. I’m sure there was much consternation at APA and Division 12 when stories circulated on the Internet that APA had declared EFT to be evidence supported.

Wikipedia offers the following definition of EFT:

Emotional Freedom Techniques (EFT) is a form of counseling intervention that draws on various theories of alternative medicine including acupuncture, neuro-linguistic programming, energy medicine, and Thought Field Therapy. During an EFT session, the client will focus on a specific issue while tapping on so-called “end points of the body’s energy meridians.”

Writing in The Skeptical Inquirer, Brandon Gaudiano and James Herbert argued that there is no plausible mechanism to explain how the specifics of EFT could add to its effectiveness, and these specifics have been described as unfalsifiable and therefore pseudoscientific. EFT is widely dismissed by skeptics, along with its predecessor, Thought Field Therapy, and has been described in the mainstream press as “probably nonsense.” Evidence has not been found for the existence of acupuncture points, meridians or other concepts involved in traditional Chinese medicine.

The scathing Gaudiano and Herbert critique is worth a read and calls attention to claims of EFT by proxy: patients improve when therapists tap themselves rather than the patients! My imagination runs wild: how about televised sessions in which therapists tap themselves and liberate thousands of patients around the world from their PTSD?

According to David Feinstein, a proponent of EFT, when Corsini (2001) included a chapter on Thought Field Therapy in an anthology of innovative psychotherapies, he acknowledged that it was “either one of the greatest advances in psychotherapy or it is a hoax.”

Claims have been made for acupoint stimulation that even proponents of EFT consider “provocative,” “extraordinary,” and “too good to be true.” An article published in Journal of Clinical Psychology (not an APA journal) reported that 105 people were treated in Kosovo for severe emotional reactions to past torture, rape, and witnessing loved ones being burned or raped. Strong improvement was observed in 103 of these patients, despite an average of only three sessions. For comparison purposes, exposure therapy involves at least 15 sessions, and the literature claims nowhere near this efficacy. However, even more extraordinary results were claimed for the combined sample of 337 patients treated in visits to Kosovo, Rwanda, the Congo, and South Africa. The 337 individuals expressed 1016 traumatic memories, of which 1013 were successfully resolved, resulting in substantial improvement in 334 patients. Unfortunately the details of this study remain unpublished, but claims of these results appear in a forthcoming article in the APA journal Review of General Psychology.

Reports circulating on the Internet that APA had declared EFT to be an evidence supported approach stemmed from a press release by the EFT Universe that cited a statement from the same Review of General Psychology article:

A literature search identified 50 peer-reviewed papers that report or investigate clinical outcomes following the tapping of acupuncture points to address psychological issues. The 17 randomized controlled trials in this sample were critically evaluated for design quality, leading to the conclusion that they consistently demonstrated strong effect sizes and other positive statistical results that far exceed chance after relatively few treatment sessions. Criteria for evidence-based treatments proposed by Division 12 of the American Psychological Association were also applied and found to be met for a number of conditions, including PTSD (Feinstein, 2012).

Feinstein had been developing his claims about energy therapies such as EFT meeting the Division 12 criteria for a while. In a 2008 article in the APA journal Psychotherapy: Theory, Research, Practice, Training, he declared:

although the evidence is still preliminary, energy psychology has reached the minimum threshold for being designated as an evidence-based treatment, with one form having met the APA division 12 criteria as a “probably efficacious” treatment for specific phobias; another for maintaining weight loss.

In this 2008 article, Feinstein also cited a review in the online book review journal of APA in which Ilene Serlin, Past President of APA’s Division of Humanistic Psychology, praised Feinstein’s book for its “valuable expansion of the traditional biopsychosocial model of psychology to include the dimension of energy” and energy psychology as representing “a new discipline that has been receiving attention due to its speed and effectiveness with difficult cases.”

The reports that EFT had been designated as an evidence supported treatment made the rounds for a few months, sometimes with the clarification that EFT met the criteria, but had not yet been labeled as evidence supported by Division 12. In some communities, stories about EFT, or tapping therapy as it was called, made the local TV news. KABC news Los Angeles titled a story “‘Tapping’ therapy can relieve anxiety, stress, researchers say” and got an APA spokesperson to provide a muted comment:

 “Has this tapping therapy been proven effective? We don’t think so at this point,” said Rhea Farberman, Executive Director for Public and Member Communications at the APA.

The comment went on to say that APA viewed stress and anxiety as serious but treatable issues for some persons and recommended cognitive behavior therapy, but not tapping therapy.

What do these incidents say about branding of psychotherapies as evidence supported?

I will explore this issue in greater depth in a future blog post, but for now we are left with some questions.

The first incident involved designation of a psychotherapy as having strong evidence of efficacy for psychosis, but the designation was quickly changed, first to under review and then to modest support. The precipitant for this downgrading seems to be blog posts that revealed the abstract of the key study to be misleading. Designation of a therapy as having strong evidence for its efficacy requires two positive randomized controlled trials. The second trial was described as a pilot study explicitly aimed at replicating the first one. Like the first one, its abstract declared positive findings. However, this study failed to replicate the first study’s claimed reduction in hospitalization, and a cursory examination of the results section revealed that this study, like the study that it attempted to replicate, was basically a null trial.

  • Do the current criteria employed by Division 12 (only 2 positive trials and no attention to size or quality) set too low a bar for a therapy to receive the seemingly important branding of having strong evidence?
  • The revised status of ACT for psychosis is that it has modest support. But how do two null trials published with confirmatory bias constitute modest support?
  • Are there pitfalls in uncritically accepting claims in the abstracts of articles appearing in prestigious journals like JCCP?
  • More generally, to what extent do the shortcomings of articles appearing in prestigious journals like JCCP warrant skepticism, not only by reviewers for Division 12, but consumers more generally?
  • Should we expect prestigious journals like JCCP to encourage and make a place for post publication peer review of the articles that have appeared there?
  • Should revised criteria for evidence supported therapies not just count whether there are two or only one positive trial, but incorporate formal quality ratings of trials for overall quality and risk of bias?

The second incident involves rumors of APA having designated as evidence supported a bizarre therapy with extravagant claims of efficacy. The rumor was based on a forthcoming review in an APA journal that indicated that EFT had a sufficient number of positive randomized trials to meet APA Division 12 criteria for evidence supported. It was left to a media person from APA to clarify that APA did not endorse this therapy, but it was unclear on what basis this declaration was made.

  • If ACT for psychosis has modest support, where does EFT stand when evaluated by the same criteria?
  • Can sources other than APA Division 12 apply the criteria to psychotherapies and declare the therapies as warranting evidence-based status? If not, why not?
  • Do consumers, as well as proponents of innovative and even strange therapies, deserve evaluation with formal criteria by APA Division 12 and designation of the therapies not only as warranting a designation of “strong evidence” if they meet these criteria, but alternatively as having demonstrated a failure to accumulate evidence of efficacy, and even as having demonstrated possible harm?
  • If APA Division 12 takes on the task of publicizing the evidence based status of psychotherapies, does it thereby assume a responsibility to alert policy makers and consumers of therapies that fail to meet these criteria?
  • If application of the existing Division 12 criteria warrants EFT as having strong evidence of efficacy, what does that say about the adequacy of these criteria?

To be continued……