Why the scientific community needs the PACE trial data to be released

University and clinical trial investigators must release data to a citizen-scientist patient, according to a landmark decision in the UK. But the decision could still be overturned if the university and investigators appeal. The scientific community needs the decision to be upheld. I’ll argue that it would be unwise for any appeal to be made. The reasons given for withholding the data in the first place were archaic. Overturning the decision would set a bad precedent and would remove another tooth from already nearly toothless requirements for data sharing.

We didn’t need Francis Collins, Director of the National Institutes of Health, to tell us what we already knew: the scientific and biomedical literature is untrustworthy.

And there is the new report from the UK Academy of Medical Sciences, Reproducibility and reliability of biomedical research: improving research practice.

There has been a growing unease about the reproducibility of much biomedical research, with failures to replicate findings noted in high-profile scientific journals, as well as in the general and scientific media. Lack of reproducibility hinders scientific progress and translation, and threatens the reputation of biomedical science.

Among the report’s recommendations:

  • Journals mandating that the data underlying findings are made available in a timely manner. This is already required by certain publishers such as the Public Library of Science (PLOS), and it was agreed by many participants that it should become more common practice.
  • Funders requiring that data be released in a timely fashion. Many funding agencies require that data generated with their funding be made available to the scientific community in a timely and responsible manner.

A consensus has been reached: the crisis in the trustworthiness of science can be overcome only if scientific data are routinely available for reanalysis. Independent replication of socially significant findings is often infeasible, and it is unnecessary if the original data are fully available for inspection.

Numerous governmental funding agencies and regulatory bodies are endorsing routine data sharing.

The UK Medical Research Council (MRC) 2011 policy on data sharing and preservation has endorsed principles laid out by Research Councils UK, including:

Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner.

To enable research data to be discoverable and effectively re-used by others, sufficient metadata should be recorded and made openly available to enable other researchers to understand the research and re-use potential of the data. Published results should always include information on how to access the supporting data.

The Wellcome Trust Policy On Data Management and Sharing opens with

The Wellcome Trust is committed to ensuring that the outputs of the research it funds, including research data, are managed and used in ways that maximise public benefit. Making research data widely available to the research community in a timely and responsible manner ensures that these data can be verified, built upon and used to advance knowledge and its application to generate improvements in health.

The Cochrane Collaboration has weighed in that there should be ready access to all clinical trial data

Summary results for all protocol-specified outcomes, with analyses based on all participants, to become publicly available free of charge and in easily accessible electronic formats within 12 months after completion of planned collection of trial data;

Raw, anonymised, individual participant data to be made available free of charge; with appropriate safeguards to ensure ethical and scientific integrity and standards, and to protect participant privacy (for example through a central repository, and accompanied by suitably detailed explanation).

Many similar statements can be found on the web. I’m unaware of credible counterarguments gaining wide acceptance.

Yet, endorsements of routine sharing of data are only a promissory reform and depend on enforcement that has been spotty, at best. Those of us who request data from previously published clinical trials quickly realize that requirements for sharing data have no teeth. In light of that, scientists need to watch closely whether a landmark decision concerning sharing of data from a publicly funded trial is appealed and overturned.

The Decision requiring release of the PACE data

The UK’s Information Commissioner’s Office (ICO) ordered Queen Mary University of London (QMUL) on October 27, 2015 to release anonymized data from the PACE chronic fatigue syndrome trial to an unnamed complainant. QMUL has 28 days to appeal.

Even if scientists don’t know enough to care about Chronic Fatigue Syndrome/Myalgic Encephalomyelitis, they should be concerned about the reasons that were given in a previous refusal to release the data.

I took a critical look at the long-term follow-up results for the PACE trial in a previous Mind the Brain blog post and found fatal flaws in the authors’ self-congratulatory interpretation of results. Despite the authors’ claims to the contrary and their extraordinary efforts to encourage patients to report that the intervention was helpful, there were simply no differences between groups at follow-up.

Background on the request for release of PACE data

  • A complainant requested release of specific PACE data from QMUL under the Freedom of Information Act.
  • QMUL refused the request.
  • The complainant requested an internal review but QMUL maintained its decision to withhold the data.
  • The complainant contacted the ICO with concerns about how the request had been handled.
  • On October 27, 2015, the ICO sided with the complainant and ordered the release of the data.

A report outlines Queen Mary’s arguments for refusing to release the data and the Commissioner’s justification for siding with the patient requesting the data be released.

Reasons the request for release of the data was initially refused

The QMUL PACE investigators claimed

  • They were entitled to withhold data prior to publication of planned papers.
  • They were exempt from having to share the data because it contained sensitive medical information from which it was possible to identify the trial participants.
  • Release of the data might harm their ability to recruit patients for research studies in the future.

The QMUL PACE researchers specifically raised concerns about a motivated intruder being able to facilitate re-identification of participants:

In relation to a motivated intruder being able to facilitate re-identification of participants, the University argued that:

“The PACE trial has been subject to extreme scrutiny and opponents have been against it for several years. There has been a concerted effort by a vocal minority whose views as to the causes and treatment of CFS/ME do not comport with the PACE trial and who, it is QMUL’s belief, are trying to discredit the trial. Indeed, as noted by the editor of the Lancet, after the 2011 paper’s publication, the nature of this comprised not a ‘scientific debate’ but an “orchestrated response trying to undermine the credibility of the study from patient groups [and]… also the credibility of the investigators and that’s what I think is one of the other alarming aspects of this. This isn’t a purely scientific debate; this is going to the heart of the integrity of the scientists who conducted this study.”

Bizarre. This is obviously a talented masked motivated intruder. Do they have evidence that Magneto is at it again? Mostly he is now working with the good guys, as seen in the help he gave Neurocritic and me.

Let’s think about this novel argument. I checked with University of Pennsylvania bioethicist Jon Merz, an expert who has worked internationally to train researchers and establish committees for the protection of human subjects. His opinion was clear:

The litany of excuses – not reasons – offered by the researchers and Queen Mary University is a bald attempt to avoid transparency and accountability, hiding behind legal walls instead of meeting their critics on a level playing field.  They should be willing to provide the data for independent analyses in pursuit of the truth.  They of course could do this willingly, in a way that would let them contractually ensure that data would be protected and that no attempts to identify individual subjects would be made (and it is completely unclear why anyone would care to undertake such an effort), or they can lose this case and essentially lose any hope for controlling distribution.

The ‘orchestrated response to undermine the credibility of the study’ claimed by QMUL and the PACE investigators, as well as the issue raised about the “integrity of the scientists who conducted the study,” sound all too familiar. It’s the kind of defense heard from scientists under the scrutiny of the likes of Open Science Collaborations, as in psychology and cancer. Reactionaries resisting post-publication peer review say we must be worried about harassment from

“replication police” “shameless little bullies,” “self-righteous, self-appointed sheriffs” engaged in a process “clearly not designed to find truth,” “second stringers” who were incapable of making novel contributions of their own to the literature, and—most succinctly—“assholes.”

Far-fetched? Compare this to the quote QMUL drew from an April 18, 2011 ABC Radio National (Australian Broadcasting Corporation) interview of Richard Horton and PACE investigator Michael Sharpe, in which the Lancet Editor condemned:

A fairly small, but highly organised, very vocal and very damaging group of individuals who have…hijacked this agenda and distorted the debate…

‘Distorted the debate’? Was someone so impertinent as to challenge the investigators’ claims about their findings? Sounds like PubPeer. We have seen what they can do.

Alas, all scientific findings should be scrutinized, and all data relevant to the claims that are made should be available for reanalysis. Investigators just need to live with the possibility that their claims will be proven wrong or exaggerated. This is all the more true for claims that have substantial impact on public policy and clinical services, and ultimately, patient welfare.

[It is fascinating to note that Richard Horton spoke at the meeting that produced the UK Academy of Medical Sciences report to which I provided a link above. Horton covered the meeting in a Lancet editorial in which he amplified the sentiment of the meeting: “The apparent endemicity of bad research behaviour is alarming. In their quest for telling a compelling story, scientists too often sculpt data to fit their preferred theory of the world.” His editorial echoed a number of recommendations of the meeting report, but curiously omitted any mention of data sharing.]

Fortunately, the ICO has rejected the arguments of QMUL and the PACE investigators. The Commissioner found that QMUL and the PACE investigators incorrectly interpreted the regulations in withholding the data and should provide the complainant with the data or risk being viewed as in contempt of court.

The 30-page decision is a fascinating read, but here’s an accurate summary from elsewhere:

In his decision, the Commissioner found that QMUL failed to provide any plausible mechanism through which patients could be identified, even in the case of a “motivated intruder.” He was also not convinced that there is sufficient evidence to determine that releasing the data would result in the mass exodus of a significant number of the trial’s 640 participants nor that it would deter significant numbers of participants from volunteering to take part in future research.

Requirements for data sharing in the United States have no teeth, and the situation would be worsened by a reversal of the ICO decision

Like the UK, the United States supposedly has requirements for sharing of data from publicly funded trials. But good luck getting support for obtaining data from the regulatory agencies associated with funding sources. Here’s my recent story, still unfolding – or maybe, sadly, over, at least for now.

For a long time I’ve fought my own battles over researchers making unwarranted claims that psychotherapy extends the lives of cancer patients. Research simply does not support the claim. The belief that psychological factors have such influence on the course and outcome of cancer sets up cancer patients to be blamed, and to blame themselves, when they don’t overcome their disease by some sort of mind control. Our systematic review concluded:

“No randomized trial designed with survival as a primary endpoint and in which psychotherapy was not confounded with medical care has yielded a positive effect.”

Investigators who conducted some of the most ambitious and well-designed trials testing the efficacy of psychological interventions in cancer, but obtained null results, echoed our assessment. The commentaries were entitled “Letting Go of Hope” and “Time to Move on.”

I provided an extensive review of the literature concerning whether psychotherapy and support groups increased survival time in an earlier blog post. Hasn’t the issue of mind-over-cancer been laid to rest? I was recently contacted by a science journalist interested in writing an article about this controversy. After a long discussion, he concluded that the issue was settled — no effect had been found — and he could not succeed in pitching his idea for an article to a quality magazine.

But as detailed here, one investigator has persisted in claims that a combination of relaxation exercises, stress reduction, and nutritional counseling increases survival time. My colleagues and I gave this 2008 study a careful look. We ran chi-square analyses of basic data presented in the paper’s tables, but none of our analyses of the effect of group assignment on mortality or disease recurrence was significant. The investigators’ claim of an effect depended on dubious multivariate analyses with covariates that could not be independently evaluated without a look at the data.
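For readers who want to see how little is required to check such a bivariate claim, here is a minimal sketch of the kind of chi-square analysis described above. The counts are invented placeholders, not the published figures:

```python
# A minimal sketch of this kind of bivariate check: a chi-square test of group
# assignment against mortality. The counts below are hypothetical placeholders,
# NOT the published data.
from scipy.stats import chi2_contingency

# Rows: intervention vs. control; columns: died vs. survived (invented counts)
table = [[24, 88],
         [30, 85]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```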

The investigator group initially attempted to block publication of a letter to the editor, citing a policy of the journal Cancer that critical letters could not be published unless the investigators agreed to respond, and they were refusing to respond. We appealed, and the journal changed its policy and allowed us additional length for our letter.

We then requested from the investigator’s University Research Integrity Officer the specific data needed to replicate the multivariate analyses in which the investigators claimed an effect on survival. The request was denied:

The data, if disclosed, would reveal pending research ideas and techniques. Consequently, the release of such information would put those using such data for research purposes in a substantial competitive disadvantage as competitors and researchers would have access to the unpublished intellectual property of the University and its faculty and students.

Recall that we were requesting in 2014 specific data needed to evaluate analyses published in 2008.

I checked with statistician Andrew Gelman whether my objections to the multivariate analyses were well-founded and he agreed they were.

Since then, another eminent statistician, Helena Kraemer, has published an incisive critique of relying on multivariate analyses in a randomized controlled trial when simple bivariate analyses do not support the efficacy of interventions. She labeled adjustments with covariates a “source of false-positive findings.”

We appealed to the US Health and Human Services Office of Research Integrity (ORI), but it indicated it had no ability to enforce data sharing.

Meanwhile, the principal investigator who claimed an effect on survival accompanied National Cancer Institute program officers to conferences in Europe and the United States where she promoted her intervention as effective. I complained to Robert Croyle, Director of the NCI Division of Cancer Control and Population Sciences, who has twice been one of the program officers co-presenting with her. Ironically, in his capacity as director he is supposedly facilitating data sharing for the division. Professionals were being misled to believe that this intervention would extend the lives of cancer patients, and the claim seemingly had the endorsement of NCI.

I told Robert Croyle that if only the data for the specific analyses were released, it could be demonstrated that the claims were false. Croyle did not disagree, but indicated that there was no way to compel release of the data.

The National Cancer Institute recently offered to pay the conference fees for the International Psycho-Oncology Congress in Washington, DC for any professionals willing to sign up for free training in this intervention.

I don’t think I could get any qualified professional, including Croyle, to debate me publicly as to whether psychotherapy increases the survival of cancer patients. Yet the promotion of the idea persists because it is consistent with the power of mind over body and disease, an attractive talking point.

I have not given up in my efforts to get the data to demonstrate that this trial did not show that psychotherapy extends the survival of cancer patients, but I am blocked by the unwillingness of authorities to enforce data sharing rules that they espouse.

There are obvious parallels between the politics behind persistence of the claim in the US for psychotherapy increasing survival time for cancer patients and those in the UK about cognitive behavior therapy being sufficient treatment for schizophrenia in the absence of medication or producing recovery from the debilitating medical condition, Chronic Fatigue Syndrome/Myalgic Encephalomyelitis. There are also parallels to investigators making controversial claims based on multivariate analyses, but not allowing access to data to independently evaluate the analyses. In both cases, patient well-being suffers.

If the ICO decision requiring release of the PACE trial data is upheld in the UK, it will put pressure on the US NIH to stop hypocritically endorsing data sharing while rewarding investigators whose credibility depends on not sharing their data.

As seen in a PLOS One study, unwillingness to share data in response to formal requests is

associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.

Why the PACE investigators should not appeal

In the past, PACE investigators have been quite dismissive of criticism, appearing to have assumed that being afflicted with Chronic Fatigue Syndrome/Myalgic Encephalomyelitis precludes a critic being taken seriously, even when the criticism is otherwise valid. However, with the publication of the long-term follow-up data in Lancet Psychiatry, they are now contending with accomplished academics whose criticisms cannot be so easily brushed aside. Yes, the credibility of the investigators’ interpretations of their data is being challenged. And even if they do not believe they need to be responsive to patients, they need to be responsive to colleagues. Releasing the data is the only acceptable response, and not doing so risks damage to their reputations.

QMUL, Professors White and Sharpe, let the People’s data go.

 

Uninterpretable: Fatal flaws in PACE Chronic Fatigue Syndrome follow-up study

Earlier decisions by the investigator group preclude valid long-term follow-up evaluation of CBT for chronic fatigue syndrome (CFS).

At the outset, let me say that I’m skeptical whether we can hold the PACE investigators responsible for the outrageous headlines that have been slapped on their follow-up study and on the comments they have made in interviews.

The Telegraph screamed

Chronic Fatigue Syndrome sufferers ‘can overcome symptoms of ME with positive thinking and exercise’

Oxford University has found ME is not actually a chronic illness

My own experience critiquing media interpretation of scientific studies suggests that neither researchers nor even journalists necessarily control shockingly inaccurate headlines placed on otherwise unexceptional media coverage. On the other hand, much distorted and exaggerated media coverage starts with statements made by researchers and by press releases from their institutions.

The one specific quote attributed to a PACE investigator is unfortunate because of its potential to be misinterpreted by professionals, persons who suffer from chronic fatigue syndrome, and the people around them affected by their functioning.

“It’s wrong to say people don’t want to get better, but they get locked into a pattern and their life constricts around what they can do. If you live within your limits that becomes a self-fulfilling prophesy.”

It suggests that willfulness causes CFS sufferers’ impaired functioning. This is as ridiculous as the application of the discredited concept of fighting spirit to cancer patients’ failure to triumph over their life-altering and life-threatening condition. Let’s practice the principle of charity and assume this is not the intention of the PACE investigator, particularly when there is so much more for which we should give them responsibility.

Go here for a fuller evaluation, which I endorse, of the Telegraph coverage of the PACE follow-up study.

Having read the PACE follow-up study carefully, my assessment is that the data presented are uninterpretable. We can temporarily suspend critical thinking and some basic rules for conducting randomized controlled trials (RCTs), follow-up studies, and analyses of the subsequent data. Even if we do, we should reject some of the interpretations offered by the PACE investigators as unfairly spun to fit what was already a distorted positive interpretation of the results.

It is important to note that the PACE follow-up study can only be as good as the original data it’s based on. And in the case of the PACE study itself, a recent longread critique by UC Berkeley journalism and public health lecturer David Tuller has arguably exposed such indefensible flaws that any follow-up is essentially meaningless. See it for yourself [1, 2, 3].

This week’s report of the PACE long-term follow-up study and a commentary are available free at the Lancet Psychiatry website after free registration. I encourage everyone to download a copy before reading further. Unfortunately, some crucial details of the article are highly technical, and some details crucial to interpreting the results are not presented.

I will provide practical interpretations of the most crucial technical details so that they are more understandable to the nonspecialist. Let me know where I fail.

To encourage proceeding with this longread, but to satisfy those who are unwilling or unable to proceed, I’ll reveal my main points:

  • The PACE investigators sacrificed any possibility of meaningful long-term follow-up by breaking protocol and issuing patient testimonials about CBT before accrual was even completed.
  • This already fatal flaw was compounded by a loose recommendation for treatment after the intervention phase of the trial ended. The investigators provide poor documentation of which treatment was taken up by which patients and whether there was crossover in the treatment received during follow-up.
  • The investigators’ attempts to correct methodological issues with statistical strategies lapse into voodoo statistics.
  • The primary outcome self-report variables are susceptible to manipulation, investigator preferences for particular treatments, peer pressure, and confounding with mental health variables.
  • The PACE investigators exploited ambiguities in the design and execution of their trial with self-congratulatory, confirmatory bias.

The Lancet Psychiatry summary/abstract of the article

Background. The PACE trial found that, when added to specialist medical care (SMC), cognitive behavioural therapy (CBT), or graded exercise therapy (GET) were superior to adaptive pacing therapy (APT) or SMC alone in improving fatigue and physical functioning in people with chronic fatigue syndrome 1 year after randomisation. In this pre-specified follow-up study, we aimed to assess additional treatments received after the trial and investigate long-term outcomes (at least 2 years after randomisation) within and between original treatment groups in those originally included in the PACE trial.

Findings Between May 8, 2008, and April 26, 2011, 481 (75%) participants from the PACE trial returned questionnaires. Median time from randomisation to return of long-term follow-up assessment was 31 months (IQR 30–32; range 24–53). 210 (44%) participants received additional treatment (mostly CBT or GET) after the trial; with participants originally assigned to SMC alone (73 [63%] of 115) or APT (60 [50%] of 119) more likely to seek treatment than those originally assigned to GET (41 [32%] of 127) or CBT (36 [31%] of 118; p<0·0001). Improvements in fatigue and physical functioning reported by participants originally assigned to CBT and GET were maintained (within-group comparison of fatigue and physical functioning, respectively, at long-term follow-up as compared with 1 year: CBT –2·2 [95% CI –3·7 to –0·6], 3·3 [0·02 to 6·7]; GET –1·3 [–2·7 to 0·1], 0·5 [–2·7 to 3·6]). Participants allocated to APT and to SMC alone in the trial improved over the follow-up period compared with 1 year (fatigue and physical functioning, respectively: APT –3·0 [–4·4 to –1·6], 8·5 [4·5 to 12·5]; SMC –3·9 [–5·3 to –2·6], 7·1 [4·0 to 10·3]). There was little evidence of differences in outcomes between the randomised treatment groups at long-term follow-up.

Interpretation The beneficial effects of CBT and GET seen at 1 year were maintained at long-term follow-up a median of 2·5 years after randomisation. Outcomes with SMC alone or APT improved from the 1 year outcome and were similar to CBT and GET at long-term follow-up, but these data should be interpreted in the context of additional therapies having being given according to physician choice and patient preference after the 1 year trial final assessment. Future research should identify predictors of response to CBT and GET and also develop better treatments for those who respond to neither.

Note the contradiction here, which will persist throughout the paper, the official Oxford University press release, quotes from the PACE investigators to the media, and media coverage. On the one hand we are told:

Improvements in fatigue and physical functioning reported by participants originally assigned to CBT and GET were maintained…

Yet we are also told:

There was little evidence of differences in outcomes between the randomised treatment groups at long-term follow-up.

Which statement is to be given precedence? To the extent that features of a randomized trial have been preserved in the follow-up (which, as we will see, is not actually the case), a lack of between-group differences at follow-up should be given precedence over any persistence of change within groups from baseline. That is not a controversial point for interpreting clinical trials.

A statement about group differences at follow-up should precede and qualify any statement about within-group change at follow-up. Otherwise, why bother with an RCT in the first place?
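To make the point concrete, here is a toy simulation (invented numbers, not PACE data) in which both arms show highly "significant" within-group improvement from baseline, yet the randomized between-group comparison at follow-up shows no difference:

```python
# Toy illustration: within-group change in both arms, no between-group difference.
# All numbers are simulated; nothing here comes from the PACE trial.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 120  # participants per arm (arbitrary)

baseline_a = rng.normal(25, 4, n)
baseline_b = rng.normal(25, 4, n)
followup_a = baseline_a - 5 + rng.normal(0, 4, n)  # arm A improves ~5 points on average
followup_b = baseline_b - 5 + rng.normal(0, 4, n)  # arm B improves ~5 points too

print("Within-group change, arm A: p =", stats.ttest_rel(baseline_a, followup_a).pvalue)
print("Within-group change, arm B: p =", stats.ttest_rel(baseline_b, followup_b).pvalue)
print("Between-group difference at follow-up: p =",
      stats.ttest_ind(followup_a, followup_b).pvalue)
```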

The statement in the Interpretation section of the summary/abstract has an unsubstantiated spin in favor of the investigators’ preferred intervention.

Outcomes with SMC alone or APT improved from the 1 year outcome and were similar to CBT and GET at long-term follow-up, but these data should be interpreted in the context of additional therapies having being given according to physician choice and patient preference after the 1 year trial final assessment.

If we’re going to be cautious and qualified in our statements, there are lots of other explanations for similar outcomes in the intervention and control groups that are more plausible. Simply put and without unsubstantiated assumptions, any group differences observed earlier have dissipated. Poof! Any advantages of CBT and GET are not sustained.

How the PACE investigators destroyed the possibility of an interpretable follow-up study

Neither the Lancet Psychiatry article nor any recent statements by the PACE investigators acknowledged how these investigators destroyed any possibility of analyses of meaningful follow-up data.

Before the intervention phase of the trial was even completed, even before accrual of patients was complete, the investigators published a newsletter in December 2008 directed at trial participants. An article appropriately reminds participants of the upcoming two-and-a-half-year follow-up. It then acknowledges difficulty accruing patients, but notes that additional funding has been received from the MRC to extend recruitment. And then glowing testimonials about the effects of the interventions appear on p. 3 of the newsletter.

“Being included in this trial has helped me tremendously. (The treatment) is now a way of life for me, I can’t imagine functioning fully without it. I have nothing but praise and thanks for everyone involved in this trial.”

“I really enjoyed being a part of the PACE Trial. It helped me to learn more about myself, especially (treatment), and control factors in my life that were damaging. It is difficult for me to gauge just how effective the treatment was because 2007 was a particularly strained, strange and difficult year for me but I feel I survived and that the trial armed me with the necessary aids to get me through. It was also hugely beneficial being part of something where people understand the symptoms and illness and I really enjoyed this aspect.”

These testimonials are a horrible breach of protocol. Taken together with the acknowledgment of the difficulty accruing patients, the testimonials solicit expressions of gratitude and apply pressure on participants to endorse the trial by providing a positive account of their outcome. Some minimal effort is made to disguise the conditions from which the testimonials come. However, references to a therapist and, in the final quote above, to “control factors in my life that were damaging” leave no doubt that the CBT and GET favored by the investigators are having positive results.

Probably more than in most chronic illnesses, CFS sufferers turn to each other for support in the face of bewildering and often stigmatizing responses from the medical community. These testimonials represent a form of peer pressure for positive evaluations of the trial.

Any investigator group that would deliberately violate protocol in this manner deserves further scrutiny for other violations and threats to the validity of their results. I challenge defenders of the PACE study to cite other precedents for this kind of manipulation of clinical trials participants. What would they have thought if a drug company had done this for the evaluation of their medication?

The breakdown of randomization as further destruction of the interpretability of follow-up results

Returning to the Lancet Psychiatry article itself, note the following:

After completing their final trial outcome assessment, trial participants were offered an additional PACE therapy if they were still unwell, they wanted more treatment, and their PACE trial doctor agreed this was appropriate. The choice of treatment offered (APT, CBT, or GET) was made by the patient’s doctor, taking into account both the patient’s preference and their own opinion of which would be most beneficial. These choices were made with knowledge of the individual patient’s treatment allocation and outcome, but before the overall trial findings were known. Interventions were based on the trial manuals, but could be adapted to the patient’s needs.

Readers who are methodologically inclined might be interested in a paper in which I discuss incorporating patient preference in randomized trials, as well as another paper describing a clinical trial conducted with German colleagues in which we incorporated patient preference into the evaluation of antidepressants and psychotherapy for depression in primary care. Patient preference can certainly be accommodated in a clinical trial in ways that preserve the benefits of randomization, but not as the PACE investigators have done.

Following completion of the treatment to which particular patients were randomly assigned, the PACE trial offered a complex negotiation between patient and trial physician about further treatment. This represents a thorough breakdown of the benefits of a controlled randomized trial for the evaluation of treatments. Any focus on the long-term effects of initial randomization is sacrificed by what could be substantial departures from that randomization. Any attempts at statistical corrections will fail.

Of course, investigators cannot ethically prevent research participants from seeking additional treatment. But in the case of PACE, the investigators encouraged departures from the randomized treatment yet did not adequately take into account the decisions that were made. An alternative would have been to continue with the randomized treatment, taking into account and quantifying any cross over into another treatment arm.

Voodoo statistics in dealing with incomplete follow-up data

Between May 8, 2008, and April 26, 2011, 481 (75%) participants from the PACE trial returned questionnaires.

This is a very good rate of retention of participants for follow-up. The serious problem is that none of the following is random:

  • loss to follow-up,
  • whether there was further treatment, or
  • whether there was crossover between the treatment received during follow-up and the treatment received in the actual trial.

Furthermore, any follow-up data is biased by the exhortation of the newsletter.

No statistical controls can restore the quality of the follow-up data to what would’ve been obtained with preservation of the initial randomization. Nothing can correct for the exhortation.

Nonetheless, the investigators tried to correct for loss of participants to follow-up and subsequent treatment. They described their effort in a technically complex passage, which I will subsequently interpret:

We assessed the differences in the measured outcomes between the original randomised treatment groups with linear mixed-effects regression models with the 12, 24, and 52 week, and long-term follow-up measures of outcomes as dependent variables and random intercepts and slopes over time to account for repeated measures.

We included the following covariates in the models: treatment group, trial stratification variables (trial centre and whether participants met the international chronic fatigue syndrome criteria, London myalgic encephalomyelitis criteria, and DSM IV depressive disorder criteria), time from original trial randomisation, time by treatment group interaction term, long-term follow-up data by treatment group interaction term, baseline values of the outcome, and missing data predictors (sex, education level, body-mass index, and patient self-help organisation membership), so the differences between groups obtained were adjusted for these variables.

Nearly half (44%; 210 of 479) of all the follow-up study participants reported receiving additional trial treatments after their final 1 year outcome assessment (table 2; appendix p 2). The number of participants who received additional therapy differed between the original treatment groups, with more participants who were originally assigned to SMC alone (73 [63%] of 115) or to APT (60 [50%] of 119) receiving additional therapy than those assigned to GET (41 [32%] of 127) or CBT (36 [31%] of 118; p<0·0001).

In the trial analysis plan we defined an adequate number of therapy sessions as ten of a maximum possible of 15. Although many participants in the follow-up study had received additional treatment, few reported receiving this amount (table 2). Most of the additional treatment that was delivered to this level was either CBT or GET.

The “linear mixed-effects regression models” are rather standard techniques for compensating for missing data by using all of the available data to estimate what is missing. The problem is that this approach assumes the missing data are missing at random, an untested assumption that is unlikely to be true in this study.
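For nonspecialists, here is a minimal sketch, on simulated placeholder data, of the kind of mixed-effects model the quoted passage describes: random intercepts and slopes over time, with fixed effects for time, group, and their interaction. The column names and numbers are invented for illustration, not taken from PACE, and the key caveat stands: such models handle missing assessments only under the missing-at-random assumption.

```python
# Minimal sketch of a linear mixed-effects model with random intercepts and slopes
# over time, fit to simulated long-format data (one row per participant per assessment).
# All variable names and values are invented placeholders, not PACE data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
weeks = [12, 24, 52, 130]  # assessment times; 130 stands in for long-term follow-up

rows = []
for sid in range(60):
    group = sid % 2                      # 0 = control, 1 = treatment (placeholder coding)
    intercept = rng.normal(25, 3)        # participant-specific starting score
    slope = rng.normal(-0.05, 0.02)      # participant-specific change per week
    for t in weeks:
        if rng.random() < 0.15:          # drop some assessments (here, completely at random)
            continue
        score = intercept + slope * t - 0.5 * group + rng.normal(0, 1)
        rows.append({"subject_id": sid, "group": group, "time": t, "fatigue": score})

df = pd.DataFrame(rows)

# The estimates use all available rows, which is valid only if the missing
# assessments are missing at random -- the untested assumption at issue here.
model = smf.mixedlm("fatigue ~ time * group", df,
                    groups=df["subject_id"], re_formula="~time")
print(model.fit().summary())
```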

The inclusion of “covariates” is an effort to control for possible threats to the validity of the overall analyses by taking into account what is known about participants. There are numerous problems here. We can’t be assured that the results are any more robust and reliable than what would be obtained without these efforts at statistical control. The best publishing practice is to make the unadjusted outcome variables available and let readers decide. Greatest confidence in results is obtained when there is no difference between the results in the adjusted and unadjusted analyses.

Methodologically inclined readers should consult an excellent recent article by clinical trial expert Helena Kraemer, “A Source of False Findings in Published Research Studies: Adjusting for Covariates.”

The effectiveness of statistical controls depends on certain assumptions being met about patterns of variation within the control variables. There is no indication that any diagnostic analyses were done to determine whether possible candidate control variables should be eliminated in order to avoid a violation of assumptions about the multivariate distribution of covariates. With so many control variables, spurious results are likely. Apparent results could change radically with the arbitrary addition or subtraction of control variables. See here for a further explanation of this problem.

We don’t even know how this set of covariate/control variables, rather than some other set, was established. Notoriously, investigators often try out various combinations of control variables and present only those that make their trial look best. Readers are protected from this questionable research practice only by pre-specification of analyses before investigators know their results; in an unblinded trial, researchers often know the result trends long before they see the actual numbers.
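A toy simulation illustrates the hazard: if analysts are free to scan many candidate sets of control variables and keep whichever makes the treatment effect look best, the nominal 5% false-positive rate no longer holds, even when the treatment does nothing at all. Everything below is invented for illustration:

```python
# Toy demonstration of covariate-set shopping: simulate null trials, try many
# adjustment sets, and keep the smallest treatment p-value. The fraction of
# "significant" results rises above the nominal 5%. All data are simulated.
import itertools
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_cov, n_sims = 200, 8, 200
hits = 0

for _ in range(n_sims):
    treat = rng.integers(0, 2, n)        # randomized group with NO true effect
    X = rng.normal(size=(n, n_cov))      # candidate covariates, unrelated to anything
    y = rng.normal(size=n)               # outcome is pure noise

    best_p = 1.0
    # Try every adjustment set of up to three covariates; keep the best-looking p-value.
    for k in range(4):
        for cols in itertools.combinations(range(n_cov), k):
            design = sm.add_constant(np.column_stack([treat] + [X[:, c] for c in cols]))
            p = sm.OLS(y, design).fit().pvalues[1]   # p-value for the treatment term
            best_p = min(best_p, p)
    hits += best_p < 0.05

print(f"'Significant' treatment effect found in {hits / n_sims:.0%} of null trials")
```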

See JP Simmons’ hilarious demonstration that briefly listening to the Beatles’ “When I’m 64” can leave research participants a year and a half younger than listening to “Kalimba” – at least when investigators have free rein to manipulate the results they want in a study without pre-registration of analytic plans.

Finally, the efficacy of complex statistical controls is widely overestimated and depends on unrealistic assumptions. First, it is assumed that all relevant variables that need to be controlled have been identified. Second, even when this unrealistic assumption has been met, it is assumed that all statistical control variables have been measured without error. When that is not the case, results can appear significant when they actually are not. See a classic paper by Andrew Phillips and George Davey Smith for further explanation of the problem of measurement error producing spurious findings.
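A small simulation makes that last point concrete: when the adjusted-for variable is a noisy measurement of the true confounder, adjustment removes only part of the confounding, and an exposure with no real effect can still come out looking "significant." The data below are simulated, not drawn from any actual study:

```python
# Toy illustration of residual confounding from measurement error in a control variable.
# The exposure has NO true effect on the outcome; both are driven by a shared confounder.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2000

confounder = rng.normal(size=n)                        # the true confounder
exposure = confounder + rng.normal(size=n)             # exposure partly driven by confounder
outcome = confounder + rng.normal(size=n)              # outcome driven by confounder only
measured = confounder + rng.normal(scale=1.5, size=n)  # confounder measured with error

# "Adjusted" analysis controls for the error-laden measurement, not the true confounder.
fit = sm.OLS(outcome, sm.add_constant(np.column_stack([exposure, measured]))).fit()
print(f"Adjusted exposure coefficient: {fit.params[1]:.2f} (true value is 0), "
      f"p = {fit.pvalues[1]:.1e}")
# Adjusting for the true confounder instead would drive the coefficient toward 0.
```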

What the investigators claim the study shows

In an intact clinical trial, investigators can analyze outcome data with and without adjustments and readers can decide which to emphasize. However, this is far from an intact clinical trial and these results are not interpretable.

The investigators nonetheless make the following claims in addition to what was said in the summary/abstract.

In the results the investigators state

The improvements in fatigue and physical functioning reported by participants allocated to CBT or GET at their 1 year trial outcome assessment were sustained.

This was followed by

The improvements in impairment in daily activities and in perceived change in overall health seen at 1 year with these treatments were also sustained for those who received GET and CBT (appendix p 4). Participants originally allocated to APT reported further improvements in fatigue, physical functioning, and impairment in daily activities from the 1 year trial outcome assessment to long-term follow-up, as did those allocated to SMC alone (who also reported further improvements in perceived change in overall health; figure 2; table 3; appendix p 4).

If the investigators are taking their RCT design seriously, they should give precedence to the null findings for group differences at follow-up. They should not be emphasizing the sustaining of benefits within the GET and CBT groups.

The investigators increase their positive spin on the trial in the opening sentence of the Discussion

The main finding of this long-term follow-up study of the PACE trial participants is that the beneficial effects of the rehabilitative CBT and GET therapies on fatigue and physical functioning observed at the final 1 year outcome of the trial were maintained at long-term follow-up 2·5 years from randomisation.

This is incorrect. The main finding is that any reported advantages of CBT and GET at the end of the trial were lost by long-term follow-up. Because an RCT is designed to focus on between-group differences, the statement about sustaining of benefits is post-hoc.

The Discussion further states

In so far as the need to seek additional treatment is a marker of continuing illness, these findings support the superiority of CBT and GET as treatments for chronic fatigue syndrome.

This makes the unwarranted and self-serving assumption that treatment choice was mainly driven by the need for further treatment, when decision-making was contaminated by investigator preference, as conveyed in the newsletter. Note also that CBT is a novel treatment for research participants and more likely to be chosen on the basis of novelty alone, in the face of overall modest improvement rates for the trial and the lack of improvements in objective measures. Whether or not the investigators designate a limited range of self-report measures as primary, participant decision-making may be driven by other, more objective measures.

Regardless, investigators have yet to present any data concerning how decisions for further treatment were made, if such data exist.

The investigators further congratulate themselves with

There was some evidence from an exploratory analysis that improvement after the 1 year trial final outcome was not associated with receipt of additional treatment with CBT or GET, given according to need. However this finding must be interpreted with caution because it was a post-hoc subgroup analysis that does not allow the separation of patient and treatment factors that random allocation provides.

However, why is this analysis singled out as exploratory and to be interpreted with caution because it is a post-hoc subgroup analysis, when similarly post-hoc subgroup analyses are presented without such caution?

The investigators finally get around to depicting what should be their primary finding, but do so in a dismissive fashion.

Between the original groups, few differences in outcomes were seen at long-term follow-up. This convergence in outcomes reflects the observed improvement in those originally allocated to SMC and APT, the possible reasons for which are listed above.

The discussion then discloses a limitation of the study that should have informed earlier presentation and discussion of results

First, participant response was incomplete; some outcome data were missing. If these data were not missing at random it could have led to either overestimates or underestimates of the actual differences between the groups.

This minimizes both the implausibility of the assumption that the data are missing at random and the problems introduced by the complex attempts to control confounds statistically.

And then there is an unsubstantiated statement that is sure to upset persons who suffer from CFS and those who care for them.

the outcomes were all self-rated, although these are arguably the most pertinent measures in a condition that is defined by symptoms.

I could double the length of this already lengthy blog post if I fully discussed this. But let me raise a few issues.

  1. The self-report measures do not necessarily capture subjective experience, only forced-choice responses to a limited set of statements.
  2. One of the two outcome measures, the physical health scale of the SF-36, requires forced-choice responses to a limited set of statements selected for general utility across all mental and physical conditions. Despite its wide use, the SF-36 suffers from problems of internal consistency and confounding with mental health variables. Anyone inclined to get excited about it should examine its items and response options closely. Ask yourself: do differences in scores reliably capture clinically and personally significant changes in the experience and functioning associated with the full range of symptoms of CFS?
  3. The validity of the other primary outcome measure, the Chalder Fatigue Scale, depends heavily on research conducted by this investigator group, and the scale has inadequate validation of its sensitivity to change in objective measures of functioning.
  4. Such self-report measures are inexorably confounded with morale and nonspecific mental health symptoms, with a large, unwanted correlation with the tendency to endorse negative self-statements that is not necessarily correlated with objective measures.

Although it was a long time ago, I recall well my first meeting with Professor Simon Wessely. It was at a closed retreat sponsored by NIH to develop a consensus about the assessment of fatigue by self-report questionnaire. I listened to a lot of nonsense that was not well thought out. Then, I presented slides demonstrating a history of failed attempts to distinguish somatic complaints from mental health symptoms by self-report. Much later, this would become my “Stalking bears, finding bear scat in the woods” slide show.

But then Professor Wessely arrived at the meeting late, claiming to be grumbly because of jet lag and flight delays. Without slides and with devastating humor, he upstaged me in completing the demolition of any illusions that we could create more refined self-report measures of fatigue.

I wonder what he would say now.

But alas, people who suffer from CFS have to contend with a lot more than fatigue. Just ask them.

[To be continued later if there is interest in my doing so. If there is, I will discuss the disappearance of objective measures of functioning from the PACE study and you will find out why you should find some 3-D glasses if you are going to search for reports of these outcomes.]