Critical analysis of a meta-analysis of a treatment by authors with financial interests at stake


Update June 2, 2014. This blog post has been updated to respond to information provided in comments on the blog, as well as my examination of the membership of the International Scientific Advisory Committee for Triple P Parenting that these comments prompted.

The case now appears stronger than I first thought that this is a thoroughly flawed meta-analysis conducted by persons with substantial but undeclared financial interests in portraying Triple P Parenting as “evidence-supported” and effective. Moreover, they rely heavily on their own studies and on those conducted by others with undeclared conflicts of interest. The risk of bias is high as a result of including nonrandomized trials in the meta-analysis, likely selective exclusion of contrary data, likely spinning of the data that are presented, and reliance on small, methodologically flawed studies whose authors have undeclared conflicts of interest. But don’t take my word for it. Read and see if you are persuaded also.

I recommend clinicians, policymakers, and patients not make decisions about Triple P Parenting on the basis of results of this meta-analysis.

This meta-analysis should serve as a wake-up call for greater enforcement of existing safeguards on generating and integrating evidence concerning the efficacy of psychological interventions, as well as a heightened suspicion that what is observed here is more widespread in the literature and true of evaluations of other treatments.

Do you have the time and expertise to interpret a meta-analysis of a psychological treatment? Suppose you don’t, but you are a policymaker, researcher, clinician, or potential patient. You have to decide whether the treatment is worthwhile. Can you rely on the abstract in a prestigious clinical psychology journal?

What if the authors had substantial financial gains to be made from advertising their treatment as “evidence-supported”?

In this blog post I am going to provide a critical analysis of a meta-analysis of Triple P Parenting Programs (3P) published by its promoters. The authors have a lucrative arrangement with their university [updated] for sharing profits from dissemination and implementation of 3P products. As is typical for articles authored by those benefiting financially from 3P, these authors declare no conflict of interest.

You can read this post with a number of different aims.

(1) You can adjust your level of suspicion when you encounter an abstract of a meta-analysis by authors with a conflict of interest. Maybe be even more suspicious when you aren’t informed by the authors and find out on your own.

(2) You can decide how much credence to give to claims in meta-analyses simply because they appear in prestigious journals.

(3) You can decide whether you have adequate skills to independently evaluate claims in a meta-analysis.

(4) Or you can simply read the post to pick up some tips and tricks for scrutinizing a meta-analysis.

You will see how much work I had to do. Decide whether it would have been worth you doing it. This article appears in a prestigious, peer-reviewed journal. You’d think “Surely the reviewers would have caught obvious misstatements or simple hanky-panky and either rejected the manuscript or demanded revision.” But, if you have been reading my posts, you probably are no longer surprised by gross lapses in the oversight provided by peer review.

This post will identify serious shortcomings in a particular meta-analysis. Yet this article made it through peer review. After my analysis, I will encourage you to consider how this article reflects some serious problems in the literature concerning psychological treatments. Maybe there is even a bigger message here.

The Clinical Psychology Review article I will be critiquing is behind a pay wall. You request a PDF from the senior author at . You can consult an earlier version of this meta-analysis placed on the web with a label indicating it was under review at another journal. The manuscript was longer than the published article, but most of the differences between the two other than shortening are cosmetic. Robin Kok has now highlighted the manuscript to indicate overlap with the article. Thanks, Robin.

Does this abstract fairly represent the conduct and conclusions of the meta-analysis?

This systematic review and meta-analysis examined the effects of the multilevel Triple P-Positive Parenting Program system on a broad range of child, parent and family outcomes. Multiple search strategies identified 116 eligible studies conducted over a 33-year period, with 101 studies comprising 16,099 families analyzed quantitatively. Moderator analyses were conducted using structural equation modeling. Risk of bias within and across studies was assessed. Significant short-term effects were found for: children’s social, emotional and behavioral outcomes (d = 0.473); parenting practices (d =0.578); parenting satisfaction and efficacy (d = 0.519), parental adjustment (d = 0.340); parental relationship (d = 0.225) and child observational data (d = 0.501). Significant effects were found for all outcomes at long-term including parent observational data (d = 0.249). Moderator analyses found that study approach, study power, Triple P level, and severity of initial child problems produced significant effects in multiple moderator models when controlling for other significant moderators. Several putative moderators did not have significant effects after controlling for other significant moderators. The positive results for each level of the Triple P system provide empirical support for a blending of universal and targeted parenting interventions to promote child, parent and family wellbeing.

On the face of it, wow! This abstract seems to provide a solid endorsement of 3P based on a well done meta-analysis. Note the impressive terms: “multiple search strategies,”… “moderator analyses”… “Structural equation modeling,”… “risk of bias.”

I have to admit that I was almost taken. But then I noticed a few things.

The effect sizes were unusually high for psychological interventions, particularly treatments delivered

  • in the community with low intensity.
  • by professionals with typically low levels of training.
  • to populations that are mandated treatment and often socially disadvantaged and unprepared to do what is required to benefit from treatment.

The abstract indicates moderator analyses were done. Most people familiar with moderator analyses would be puzzled why results for the individual moderators were not reported except where they turned up in a multivariate analysis. That is not relevant to the question of whether there was a moderator effect for a particular variable.

The abstract mentions attention to “risk of bias,” but does not report what was found. It also leaves you wondering how promoters can assess the risk of bias in studies evaluating a psychological treatment when they are the investigators in a substantial number of the trials. Of course, they would say they conduct and report their intervention studies very well. Might they have a risk of bias in giving themselves high marks?

So, as I started to read the meta-analysis, there was a penalty flag on the playing field. Actually, a lot of them.

I had to read this article a number of times. I had to make a number of forays into the original literature to evaluate the claims that were being made. Read on and discover what I found but here are some teasers:

Two of the largest trials of 3P ever done were excluded, despite claims of comprehensiveness. Both have been interpreted as negative trials.

One of the largest trials ever done involved one of the authors of the meta-analysis. A description of the trial was published while the trial was ongoing. Comparison of this description to the final report suggests highly selective and distorted reporting of basic outcomes and misrepresentation of the basic features of the design. Ouch, a credibility problem!

Very few of the other evaluation studies were pre-registered, so we do not know whether they have such a high risk of bias. But we now have evidence that it is unwise to assume they don’t.

The meta-analysis itself was pre-registered in PROSPERO, CRD42012003402. Preregistration is supposed to commit authors to what analyses will and will not be reported in the published meta-analysis. Yet, in this meta-analysis, the authors failed to report what was promised and, in answering key questions, provided analyses other than those that were promised. This raises concerns about hypothesizing after the results are known (HARKing). Ouch!

The authors claim impressive effects for a variety of outcomes. But these supposedly different outcomes are actually highly intercorrelated. These variables could not have all been the primary outcomes in a given trial. People who do clinical trials and meta-analyses worry about these sorts of selective reporting of outcomes from a larger pool of variables.

There is a high likelihood that the original trials that went into the meta-analysis selectively reported positive outcomes from a larger pool of candidate variables. Because these trials do not typically have registered protocols, we do not know what the designated primary outcome was, and so we have to be particularly careful in accepting multiple outcomes from the same trial. Yet the authors of this meta-analysis are basically bragging about doing multiple, non-independent re-analyses of data from the same trial. This is not cool. In fact, very fishy.

The abstract implies a single meta-analysis was done. Actually, the big population-level studies were analyzed separately. That is contrary to what was promised in the protocol. We should have been told at the outset that this would be done, especially because the authors also claim that they were testing for any bias from including overly small studies (at least one with only 6 patients) with larger ones, and these studies are the largest. So, the authors cannot fully evaluate a bias that they themselves have indicated is a threat to the validity of the findings.

The meta-analyses included lower quality, nonrandomized trials, lumping them together with the higher quality evidence of randomized trials. Bad move. Adding poor quality data does not strengthen conclusions from stronger studies.

Inclusion of nonrandomized trials also negates any risk of bias assessment of the randomized trials. Judged by the same standards applied to the randomized trials, the nonrandomized trials all have a high risk of bias. It is no wonder that the authors did not disclose the results of the risk of bias assessment noted in the abstract. Reporting them would make little sense when they have included nonrandomized trials that would score poorly, and it makes no sense to examine the risk of bias in the randomized trials without applying the same standards to the nonrandomized trials with which they are being integrated.

Combining effect sizes from nonrandomized trials with those of randomized trials involves some voodoo statistics that were consistently biased in the direction of producing better outcomes for 3P. Basically, the strategies used to construct effect sizes from nonrandomized trials will exaggerate the effects of 3P.

The bottom line is that “Hey, Houston, we’ve got a problem.” Or two or three.

Introduction: is this meta-analysis the biggest and best ever?

In the introduction, the authors are not shy in their praise for their 3P. They generously cite themselves as the authority for some of the claims. Think about how it would look if they had used first person instead of citing themselves, simply said “we think” and “we conclude” rather than using a citation that usually implies the support of someone else thinking or finding this. Most importantly, they claim that this meta-analysis of 3P is the most comprehensive one ever and they criticize past meta-analyses for being incomplete.

a compilation of systematic review and meta-analysis, based on more than double the number of studies included any prior meta-analysis of triple P or other parenting interventions, provides a timely opportunity to examine the impact of a single, theoretically-integrated system of parenting support on the full range of child, parent and family outcome variables

But it is not a single meta-analysis. It is a couple of meta-analyses with some inexplicable and arbitrary lumping and splitting of studies, some of them of such poor quality that past reviewers would have left them out.

This meta-analysis includes more studies than recent past meta-analyses because it accepts low-quality data from nonrandomized trials, as well as combining data from huge trials with studies with as few as six patients. It may be bigger, but is unlikely to give a better answer as to whether 3P works because the authors have done unusual things to accommodate poor quality data.

Authorities about how to do a meta-analysis agree it should represent a synthesis of the evidence with careful attention to the quality of that evidence. That involves distinguishing between conclusions based on best evidence, and those dependent on introducing poor quality evidence.

The introduction misrepresents other authors as agreeing with the strategy used in this meta-analysis. In particular, if I were Helena Kraemer, I would be very upset that the authors suggest that I endorsed a strategy of ‘the more studies, the better, even if it means integrating data from poor quality studies,’ or any of a number of other things with which they imply Helena Kraemer would agree:

all possible evaluation designs were included to provide the most comprehensive review of triple P evidence, and to avoid exclusion and publication bias (Sica, 2006; Kraemer et al. 1998)…. To provide the most comprehensive meta-analytic assessment of triple P studies, inclusion-based approach was adopted (Kraemer et al. 1998).

It would help if readers knew that this meta-analysis was published in response to another meta-analysis that disputes the level and quality of evidence that 3P is effective. In a story I have recounted elsewhere, promoters (or someone in direct contact with them) of 3P blocked publication of this other meta-analysis in Clinical Psychology Review and tried to intimidate the author, Philip Wilson, M.D. PhD. His work nonetheless got published elsewhere and, amplified by my commentary on it, set off a lot of reevaluation of the validity of the evidence for 3P. It stimulated discussion about undisclosed conflicts of interest and caused at least one journal editor to announce greater vigilance about nondisclosure.

It is interesting to compare what Wilson says about 3P with how the authors portray his work in this paper. If you did not know better, you would conclude that Wilson’s most important finding was a moderate effect for 3P, but that he was limited by considering too few studies and too narrow a range of outcomes.

Actually, Wilson made well-reasoned and well-described decisions to focus on primary outcomes, not secondary ones, and only from randomized trials. He concluded:

In volunteer populations over the short term, mothers generally report that Triple P group interventions are better than no intervention, but there is concern about these results given the high risk of bias, poor reporting and potential conflicts of interest. We found no convincing evidence that Triple P interventions work across the whole population or that any benefits are long-term. Given the substantial cost implications, commissioners should apply to parenting programs the standards used in assessing pharmaceutical interventions.

In the manuscript that was posted on the web, these authors cite my paper without indicating its findings and they cite my criticism of them without fairly portraying what I actually said. In the published meta-analysis, they dropped the citation. They instead cite my earlier paper, in which they say I “claimed” that clinical trials with fewer than 35 participants per condition have less than a 50% probability of detecting a moderate effect even if one was present.

No, if these authors had bothered to consult power analysis tables, they would see that this is not merely my “claim,” it is what every power analysis table shows. And my point was that if most small published trials are positive, as they usually are, there must be a publication bias, with negative trials missing.
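The point about small trials can be checked with a rough calculation. The sketch below is my illustration, not something taken from either paper: it assumes a two-sided test at alpha = .05, two groups of equal size, a moderate effect of d = 0.5, and a normal approximation (an exact t-test calculation gives slightly lower power, especially in small groups). The function name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided comparison of two group means.

    Normal approximation only; exact t-test power is slightly lower.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value, about 1.96
    noncentrality = d * sqrt(n_per_group / 2)  # expected z for a true effect d
    # Probability of landing in either rejection region when the effect is real.
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


# Assumed group sizes, chosen to span tiny trials up to a conventionally powered one.
for n in (6, 10, 20, 35, 64):
    print(f"n per group = {n:3d}   power for d = 0.5: {approx_power(0.5, n):.2f}")
```

Under these assumptions, power sits near one half at around 30 to 35 participants per group and falls off quickly below that, so a literature of small trials that is nonetheless mostly positive is hard to explain without missing negative trials.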

What I say in the abstract of my critique of 3P is

Applying this [at least 35 participants in the smallest cell] criterion, 19 of the 23 trials identified by Wilson et al. were eliminated. A number of these trials were so small that it would be statistically improbable that they would detect an effect even if it were present. We argued that clinicians and policymakers implementing Triple P programs incorporate evaluations to ensure that goals are being met and resources are not being squandered.

So, most of the available evidence for 3P from randomized trials is from small trials with a high risk of bias, including trials having been conducted by people who have financial interests at stake. The 3P promoters are trying to refute this finding by introducing nonrandomized trials, largely done with their own involvement.

Promoters of 3P have also come under withering criticism from Professor Manuel Eisner of Cambridge University. Reading this article, you would think his main gripe was that developers of 3P are sometimes involved in studies and that needs to be taken into account as a moderator variable.

I invite you to compare the summary in the article to what Professor Eisner says here and here.

The authors set about to counter Professor Eisner by introducing 3P “developer involvement” as a moderator variable. The problem is that Eisner is complaining about undisclosed conflicts of interest. We know that there is a lot of developer involvement by consulting the authorship of papers, but we do not know to what extent conflict of interest exists beyond that, i.e., persons doing trials getting financial benefit from promoting 3P products. But conflicts of interest statements almost never accompany trials of 3P, even when developers are involved.

I sent a couple of emails to Nina Heinrichs, a German investigator who has been involved in a number of studies, conducted a key meta-analysis, and spent considerable time with 3P developers in Australia. I asked her directly if she met criteria for needing to disclose a conflict of interest. She has not yet replied (I will correct this statement if she responds and indicates she does meet criteria for conflict of interest). [Update: Dr. Heinrichs is listed as a member of the Triple P Parenting International Scientific Advisory Committee, as are a number of other authors of 3P intervention trials who fail to disclose conflicts of interest. This meets criteria for having a conflict to declare. But these trials were not coded as positive for developer involvement in this meta-analysis, and so this designation is not a proxy for conflict of interest.] So, we have a moderator variable that cannot be independently checked, but about which there is considerable suspicion. And it does not correspond to the undisclosed conflicts of interest about which critics of 3P complain.

In the introduction we see the authors using a recognizable tactic of misciting key sources and ignoring others in order to suggest there is some consensus about their assessment of 3P and their decisions in conducting a meta-analysis. Basically, they are claiming unfounded authority from the literature by selective and distorted citation and by ignoring other work, attributing to sources agreement that is not there and misrepresenting what actually is there. Steven Greenberg has provided an excellent set of tools for detecting when this is being done. I encourage you to study his work and acquire these tools.

We will see more distorted citation in this meta-analysis article as we proceed into the method section. Note, however, that use of this powerful technique and having access to the information it provides requires going back to original sources and looking for differences between what is said in them and what is said about them in the article. This takes time that you might not want to commit, but it is vital to get at some issues.

Methods: Explaining what was done and why

I will try not to bore you with technical details nor get you lost as I almost did in the authors’ complex and contradictory description of how they decided to do the meta-analysis. But I will highlight a few things so that you can get a flavor. I am sure that if you go back you can find more such things.

The methods section starts with reassurance that the protocol for the meta-analysis is preregistered, suggesting that we should relax and assume that the protocol was followed. Actually, the published paper deviates from the protocol in treating randomized trials with active control groups as nonrandomized and by keeping the population-level intervention separate.

Those of you who have been following my blog posts may recall my saying again and again that interventions do not have effect sizes, only comparisons do. When these authors take the data for the intervention condition out of randomized trials, they destroy the benefit of the design and use statistics that exaggerate the effects of the treatment.

The authors make a spirited defense of including nonrandomized trials as being better. They justify the voodoo statistics that they used to calculate effect sizes from these trials without control groups. They cite some classic work by Scott Morris and Richard DeShon as the basis for what they do, but I am quite confident that these authorities would be annoyed being invoked as justification.

Basically, if you use data for intervention groups without a comparison, you overestimate the effect of the intervention because you take all the change that occurs and attribute it to the intervention, not to the passage of time or some nonspecific factors. If there is any hope of getting around this, you have to have an estimate of what change would be experienced in the absence of intervention and test your assumptions in making this estimate.

With the population served by 3P, this is particularly a problem because many of these families come into treatment at a time of crisis and it is only with a control group that you can calculate how much they will decline anyway in their problems without getting treatment. Similarly, if you take an intervention group and drop its comparator, you do not get to see what is specific to the intervention.
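A small worked example shows the size of the problem. The numbers below are hypothetical, invented purely for illustration (they come from no 3P trial), and the controlled estimate uses one common approach: subtracting the standardized change in the control group from the standardized change in the treated group.

```python
def pre_post_d(pre_mean: float, post_mean: float, pre_sd: float) -> float:
    """Standardized pre-to-post improvement within a single group.

    Scores are problem scores, so a drop from pre to post is improvement.
    """
    return (pre_mean - post_mean) / pre_sd


# Hypothetical child behavior problem scores (higher = worse).
treated = {"pre": 30.0, "post": 24.0, "sd": 10.0}
control = {"pre": 30.0, "post": 27.0, "sd": 10.0}  # families also improve untreated

# Effect size built from the intervention arm alone: all change is credited to treatment.
uncontrolled = pre_post_d(treated["pre"], treated["post"], treated["sd"])

# Change that would have happened anyway (time, crisis resolving, nonspecific factors).
change_without_treatment = pre_post_d(control["pre"], control["post"], control["sd"])

# Effect size for the comparison, which is what a randomized trial actually estimates.
controlled = uncontrolled - change_without_treatment

print(f"intervention arm alone: d = {uncontrolled:.2f}")
print(f"change in controls:     d = {change_without_treatment:.2f}")
print(f"treatment vs control:   d = {controlled:.2f}")
```

In this made-up case, half of the apparent effect disappears once the improvement in untreated families is taken into account. Dropping the comparator does not remove that improvement, it just hides it inside the effect claimed for the treatment.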

We already know that much of the evidence for the efficacy of psychological interventions involves comparing them to inactive control groups, like waiting lists or remaining in treatment as usual that is either no care or thoroughly inadequate care. The authors of this meta-analysis suppressed information that would have allowed us to examine whether that was the case for 3P.

This is all very bad form and serves to inflate the effect sizes obtained by the authors.

The methods section seems to give an impressive account of thoughtful selection criteria and thorough searches. It indicates that published as well as unpublished papers will be included in the meta-analysis.

The preregistration does not indicate that population-level intervention trials will be kept separate, but that is what is done. If you do not look carefully, you will not find the two brief disclosures of this in the method section.

The methods section indicates that seven different outcomes will be considered. This may sound comprehensive, but it presents a number of problems. Clinical trials have only one or two declared primary outcomes and need to be evaluated on the basis of whether there are changes in those outcomes. Many evaluations of psychological treatments, and especially of 3P, administer a battery of potential outcome measures and leave the choice of which ones to report until after the outcomes have been analyzed. That is why there is such a push to preregister protocols for trials and commit investigators to one or two outcomes. The image below depicts the child outcomes assessed in trials of 3P. Despite many of these trials being conducted by the same investigators, note the wide variety of outcomes. Without preregistration, we do not know if all of the outcomes are reported in a particular trial or if the one that is emphasized was originally primary.

[Image: child outcome measures assessed in trials of 3P]

Seven outcome categories are a lot, especially when five depend on parent self-report. So, we are dealing with multiple outcomes from the same trial, often from the same respondent, often highly intercorrelated, with no independent validation as to whether these outcomes were designated primary or picked from a larger array after analyses had begun. There is a high risk of confirmatory bias.
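To see why picking among correlated outcomes matters, here is a small simulation. Everything in it is an assumption made for illustration: trials in which the treatment has no effect at all, seven outcomes per trial, and a correlation of 0.6 between any two outcomes. It compares the result for a single outcome fixed in advance with the best-looking outcome chosen after the fact.

```python
import random
from math import sqrt
from statistics import mean


def simulate_null_trials(n_trials=5000, n_outcomes=7, rho=0.6, seed=1):
    """Simulate z statistics for trials of a treatment with no true effect.

    Each trial yields n_outcomes z scores with pairwise correlation rho,
    built from one shared component plus outcome-specific noise.
    Returns (mean z of a prespecified outcome, mean z of the best outcome).
    """
    rng = random.Random(seed)
    prespecified, best = [], []
    for _ in range(n_trials):
        shared = rng.gauss(0, 1)
        z = [sqrt(rho) * shared + sqrt(1 - rho) * rng.gauss(0, 1)
             for _ in range(n_outcomes)]
        prespecified.append(z[0])  # outcome named before seeing the data
        best.append(max(z))        # outcome chosen because it looks best
    return mean(prespecified), mean(best)


prespecified_mean, best_mean = simulate_null_trials()
print(f"mean z, prespecified outcome: {prespecified_mean:.2f}")
print(f"mean z, best of seven:        {best_mean:.2f}")
```

Even though the outcomes overlap heavily, selecting the most favorable one shifts the average result well above zero when nothing is going on. A meta-analysis that pools such selected outcomes inherits the shift.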

Preferable would be the strategy adopted by Wilson and others of centering on one outcome, child adjustment, and evaluating 3P on that basis. Even here, though, we run into a problem. Child outcomes can be social, emotional, or behavioral, and we do not know which was the primary one.

You may not be bothered by all of this, but consider the possibilities of picking the outcomes that make 3P look the best and using that as a basis for committing millions of dollars to funding it, when it may not be effective.

Art Garfunkel’s Mr Shuck ‘N Jive

Results: shuckin and jivin’

One of the first things you should do when you get the results of a meta-analysis is check the analysis of heterogeneity to determine whether this group of studies can be integrated to produce a valid summary effect size. The authors report two measures, I² and Q, both of which indicate considerable heterogeneity, particularly for the very important child outcomes. This should be like a warning light flashing on your dashboard: you should pull over and see what is wrong. All that the authors do is recalculate these measures separately for the individual levels 1–5 of the intervention. Because a number of these levels have only a few trials, the measures of heterogeneity look better but are useless because of low power. The problem is considered solved, except that the authors have missed an important opportunity to throw the hood up on their meta-analysis and see what is going wrong.
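For readers who have not seen these measures computed, here is a minimal sketch using the standard definitions of Cochran's Q and I². The effect sizes and variances are hypothetical, chosen only to make the arithmetic easy to follow, and the critical values are the usual chi-square cutoffs at the .05 level for 1 and 4 degrees of freedom.

```python
def heterogeneity(effects, variances):
    """Return (Q, df, I2) for a set of study effect sizes.

    Q is the weighted sum of squared deviations from the pooled effect,
    with inverse-variance weights; I2 = max(0, (Q - df) / Q).
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Chi-square critical values at alpha = .05 for the degrees of freedom used here.
CHI2_CRIT_05 = {1: 3.841, 4: 9.488}

# Hypothetical effect sizes from five trials, each with sampling variance 0.04.
all_effects = [0.1, 0.3, 0.5, 0.9, 1.2]
q_all, df_all, i2_all = heterogeneity(all_effects, [0.04] * 5)

# A hypothetical subgroup containing just two of those trials.
q_sub, df_sub, i2_sub = heterogeneity([0.5, 0.9], [0.04] * 2)

for label, q, df, i2 in (("all five trials", q_all, df_all, i2_all),
                         ("two-trial subgroup", q_sub, df_sub, i2_sub)):
    verdict = "significant" if q > CHI2_CRIT_05[df] else "not significant"
    print(f"{label}: Q = {q:.1f} on {df} df ({verdict}), I2 = {i2:.0%}")
```

In this example the two trials in the subgroup still differ by 0.4 standard deviations, yet the test no longer flags heterogeneity because two trials give it almost nothing to work with. Splitting a heterogeneous set of studies into small groups makes the warning light go dark without fixing anything.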

The authors also conduct moderator analyses and find that study design (nonrandom versus random), investigator involvement, and having a small number of participants all strongly contribute to more positive findings. Again, the authors should have been greatly troubled by this. Instead, they do two things to get rid of this potential embarrassment. First, they emphasize not whether individual moderator variables have a significant impact, but whether any impact survives inclusion in a multiple regression equation with the other moderator variables. Second, they emphasize that even when they look at conditions in which one of these moderator variables puts 3P at a disadvantage, effects are still significant.

These tactics suggest they are really interested in making the case that 3P is effective, not in examining all the relevant data and potentially rethinking some of their decisions that artificially improve the appearance of 3P being effective.

The authors separate out what they identify as the three large-scale population trials of 3P. These are extremely expensive trials that adopt a public health approach to reducing child behavior problems. All three were done by the authors and all are appraised in glowing terms.

The problem is that we have an independent check on the accuracy of what is reported in the one American trial. As I have detailed elsewhere, an earlier paper published while the trial was ongoing gives specifics of its design, including primary outcomes. It does not agree with what was reported in the article providing data for the meta-analysis.

Prinz, R. J., Sanders, M. R., Shapiro, C. J., Whitaker, D. J., & Lutzker, J. R. (2009). Population-based prevention of child maltreatment: The US Triple P system population trial. Prevention Science, 10(1), 1-12.

Then there is the problem of the two large unreported trials, both of which have been interpreted as being negative. One of them is published and you can find it here.

Little, M., Berry, V., Morpeth, L., Blower, S., Axford, N., Taylor, R., … & Tobin, K. (2012). The impact of three evidence-based programmes delivered in public systems in Birmingham, UK. International Journal of Conflict and Violence, 6(2), 260-272.

The other is unpublished. It was conducted by investigators who have published other trials, and it could be readily identified either by contacting them or from the considerable press coverage that it received. [Update: One of the authors is a member of the International Scientific Advisory Committee for Triple P Parenting]

Schönenberger, M., Schmid, H., Fäh, B., Bodenmann, G., Lattmann, U. P., Cina, A., et al. (2006). Projektbericht “Eltern und Schule stärken Kinder” (ESSKI); Ein Projekt zur Förderung der Gesundheit bei Lehrpersonen, Kindern und Eltern und zur Prävention von Stress, Aggression und Sucht – Ergebnisse eines mehrdimensionalen Forschungs- und Entwicklungsprojekts im Bereich psychosoziale Gesundheit in Schule und Elternhaus

Both the media release and the research report can be downloaded from the website of the ESSKI project. Subsequent analyses have cast doubt [Updated] on whether there were positive findings, and it has remained unpublished.

Taken together, we have

  • Serious doubts about the validity of reports of the only US study.
  • No way of independently checking the validity of the two other studies conducted by the authors of the meta-analysis.
  • The unexplained absence of two negative trials.

Update 6/2/2014 7:51 am: The authors bolster the case for the strength of their findings by bringing in the so-called failsafe N to argue that hundreds of unpublished studies would have to be sitting out there in desk drawers to reverse their conclusions.

Orwin’s failsafe N was as follows for each outcome: child SEB outcomes = 246, parenting practices = 332, parenting satisfaction and efficacy = 285, parental adjustment = 174, parental relationship = 79, child observations = 76. It is highly unlikely that such large numbers of studies with null results exist, indicating the robustness of the findings to publication bias. For parent observations, Orwin’s failsafe N could not be computed as the overall effect size was below 0.10, the smallest meaningful effect size.

Bringing in failsafe N in an evaluation of an effect size is still a common tactic in psychology, but the practice is widely condemned and the Cochrane Collaboration specifically recommends against it as producing unreliable and invalid results.

I have discussed this in other blog posts, but let me point to some key objections. The first and most devastating is that their analyses have not provided an estimate of effect sizes in well-done, adequately sized trials. Rather, they produced a biased estimate based on some poor quality studies that should not have been included, including nonrandomized studies. Second, they have not dispensed with the heterogeneity that they found in the published studies, and so cannot generalize to whatever studies remain unpublished. Third, they assume that the unpublished studies have only null findings, whereas some of them could have actually demonstrated that 3P has a negative effect, particularly in comparison to active control groups, which were suppressed in this meta-analysis.

I am confident that clinical epidemiologists and those doing meta-analyses with biomedical data would reject out of hand this argument from failsafe N for the strength of 3P effects.
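How fragile the failsafe N is can be seen from Orwin's formula itself, N = k(d − d_c)/(d_c − d_missing), where k is the number of included studies, d the pooled effect, d_c the smallest effect considered meaningful, and d_missing the assumed average effect in the unpublished studies. In the sketch below, the pooled effect of 0.473 and the criterion of 0.10 are taken from the article as quoted above; k = 66, the alternative pooled effect of 0.25, and the negative value for missing studies are my assumptions, chosen only for illustration.

```python
def orwin_failsafe_n(k: int, mean_d: float, criterion_d: float,
                     missing_d: float = 0.0) -> float:
    """Orwin's failsafe N: how many missing studies averaging missing_d
    would pull the pooled effect down to criterion_d."""
    if criterion_d <= missing_d:
        raise ValueError("criterion_d must exceed missing_d")
    return k * (mean_d - criterion_d) / (criterion_d - missing_d)


K = 66  # assumed number of studies, NOT a figure reported in the article

# Reported pooled effect, missing studies assumed to average exactly zero.
as_reported = orwin_failsafe_n(K, mean_d=0.473, criterion_d=0.10)

# Same pooled effect, but missing studies assumed to show slight harm.
missing_negative = orwin_failsafe_n(K, mean_d=0.473, criterion_d=0.10,
                                    missing_d=-0.20)

# Pooled effect assumed to be inflated; a hypothetical less biased value of 0.25.
deflated_effect = orwin_failsafe_n(K, mean_d=0.25, criterion_d=0.10)

print(f"reported effect, null missing studies:     {as_reported:.0f}")
print(f"reported effect, negative missing studies: {missing_negative:.0f}")
print(f"smaller pooled effect, null missing:       {deflated_effect:.0f}")
```

The answer is driven almost entirely by the inputs. If the pooled effect is inflated by poor quality studies, or if the missing studies are not merely null, the reassuringly large number shrinks to a fraction of itself.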


Despite all these objections, the article ends with a glowing conclusion, suitable for dissemination to funding agencies and posting on the websites promoting 3P.

The evolution of a blended system of parenting support involving both universal and targeted elements has been built on a solid foundation of ongoing research and development, and the testing of individual components comprising the intervention. The present findings highlight the value of an integrated multilevel system of evidence-based parenting programs and raise the real prospect that a substantially greater number of children and parents can grow up in nurturing family environments that promote children’s development capabilities throughout their lives.

I suggest that you now go back to the abstract and reevaluate whether you accept what it says.

The problems with this meta-analysis are serious but reflect larger problems in the evaluation of psychological treatments

I have identified serious problems with this meta-analysis, but we need to keep in mind that it got published in a respectable journal. That could only have happened if some reviewers let it through. Our finding it in the literature points to some more pervasive problems in the research evaluating psychological treatments. Just as this meta-analysis does not provide valid estimates of the effectiveness of 3P, other meta-analyses are done by persons with conflicts of interest, and many of the studies providing evidence entered into meta-analyses are conducted by persons with conflicts of interest and have selective reporting of data.

Journals have policies requiring disclosure of conflicts of interest under these circumstances, but at least in the psychological literature, you can find only a few examples of these policies being enforced.

The authors registered their protocol for conducting this meta-analysis, but then did not adhere to the protocol in important ways. Obviously, reviewers did not pay any attention. It is actually uncommon for meta-analyses of psychological treatments to be preregistered, but any benefit of recommending or even requiring preregistration is lost if reviewers do not pay attention to whether authors delivered on what they promised.

A considerable proportion of the clinical trials included in the meta-analysis had either one of the authors involved or another person with financial interests at stake. I did not do a thorough check, but in reviewing this literature I found no examples of conflicts of interest being declared. Again, journals have policies requiring disclosures, but these policies are worthless without enforcement.

Funding agencies increasingly require pre-registration of the protocols for clinical trials before the first patient is enrolled, including their key hypotheses and primary outcomes. Yet, other than a few like PLOS One, most journals do not require that protocols be available or alert reviewers to the need to consult protocols. Authors are free to assess many candidates and to decide which will be the primary outcome only after examining their data. The bottom line is that we cannot determine whether there is selective reporting of outcomes with a strong confirmatory bias. Meta-analyses of the existing literature concerning psychological treatments may offer exaggerated estimates of their efficacy because they integrate selectively reported outcomes.

The authors of this meta-analysis need to be called out on their failure to declare conflicts of interest, and the claims that they make for the efficacy of 3P should be dismissed, particularly when they are at odds with other assessments not done by people with financial interests at stake. However, we should take this blatant failure in peer review and editorial oversight as an opportunity to demand reform. Until those reforms are achieved, some of which involve simply enforcing existing rules, the literature evaluating psychological treatments is suspect, and especially meta-analyses.