Evolution and engineering of the megajournal – Interview with Pete Binfield

Image courtesy of PeerJ

Peter Binfield wrote a nice analysis of megajournals over at Creative Commons Aotearoa New Zealand, an organisation on which I serve. Megajournals are a recent phenomenon that has changed the face of scientific publishing.

I am an academic editor at PeerJ, as well as at PLOS ONE (PONE), “the” megajournal of the Public Library of Science. I came to PONE first as an author (I submitted before PONE had begun publishing), and then joined as an Academic Editor under the rule of Pete Binfield. I saw PONE grow into the publishing giant it is today, feeling proud of being a small part of it. Not long ago, I saw Pete leave to join Jason Hoyt (formerly of Mendeley, another venture I had signed up to in its very early days) in search of a new adventure that would eventually become PeerJ. It wouldn’t be long before I would become an academic editor and find myself, again, under Pete’s rule. It has been about a year since that invitation, and Open Access Week gave me an opportunity to reflect on my experience.

Who is Pete Binfield?

PB: Before PLOS ONE I spent about 14 years in the subscription publishing world. I worked for Institute of Physics Publishing (doing books), then moved to Holland to work for Kluwer Academic Publishers for 8 years (Kluwer then merged with Springer), and finally I moved to the US to work for SAGE Publications (the largest social science publisher). It was during my time at Kluwer and then SAGE that the Open Access movement was really taking off, and it quickly became apparent to me that this was the way the industry was (or at least should be!) going. I wanted to be at the leading edge of this movement, not looking in at it from outside, trying to play catch-up, so when the opportunity came up to move to PLOS and run PLOS ONE, I jumped at it.

I am a biology teacher (broadly speaking), mainly in the medical school. As such, I can’t escape talking about evolved and engineered systems. Animals’ bodies are evolved – the changes in structure and function happen against a backdrop of conserved structures. You can’t really understand “why” an organ looks the way it looks and works the way it does without thinking about what building blocks were available to start with. Engineers have it easier in a sense. They don’t have a preset structure they need to hack to get the best they can; they can start from scratch. Building an artificial kidney that works on dry land has fewer constraints than evolving one from that of a water-dwelling ancestor. So if you are a journal, how do you go from print to online?

Building a journal from scratch, too, is not the same as evolving one. When PLOS came to life over a decade ago, they were able to invent their journals from scratch. And boy, did they do that well (and still do). They changed the nature of formal scientific communication and sent traditional publishers chasing their tails. Traditional publishers have been slow to adapt – trying to hack the 17th-century publishing model. When PLOS ONE was born it was unique, exploiting what PLOS had achieved so well as an Open Access online publication, but also seeking to change the rules of how papers were to be accepted. This, in the whole evolution analogy, was a structural change with a very large downstream effect.

PB: I think some of my prior colleagues might have thought that it was a strange transition – at SAGE I had been responsible for over 200 journal titles in a vibrant program, and now I was moving to PLOS to run a single title (PLOS ONE) in an organization that only had 7 titles. However, even at that time I could see the tremendous potential that PLOS ONE had and how it could bring about rapid change. It was the unique editorial criteria (peer-reviewing only for scientific validity); the innovative functionality; the potential for limitless growth; and the backing of a ‘mover and shaker’ organization which really excited me. I joined PLOS with the hope that we could make PLOS ONE the largest journal in the world, and to use that position to bring about real change in the industry – I think most people would agree we achieved that.

Until last year, you could pretty much put journals into two broad bags: those that were evolving from “print” standards and those that were evolving from “online” standards, which also included the ‘megajournals’ like PLOS ONE. Yet over 10 years after the launch of PLOS, and given the accelerated changes in “online” media, there was an opportunity for a fresh engineering approach.

PB: When I left, the journal was receiving about 3,000 submissions a month, and publishing around 2,000 – so to change anything about PLOS ONE was like trying to change the engines of a jet in mid-flight. We had an amazingly successful and innovative product (and, to be clear, it still is), but it was increasingly difficult to introduce significant new innovations (such as new business models, new software, a new mindset).

In addition, Jason and I wanted to attempt an entirely new business model which would make the act of publishing significantly cheaper for the author. I think it would have been very hard for PLOS to attempt this within the PLOS ONE structure which, in many ways, was already supporting a lot of legacy systems and financial commitments.

When Jason approached me with the original idea for PeerJ it quickly became clear that by partnering together we would be able to do things that we wouldn’t have been able to achieve in our previous roles (he at Mendeley, and me at PLOS). By breaking out and starting something new, from scratch, it was possible to try to take the lessons we had both learned and move everything one or two steps forwards with an entirely new mindset and product suite. That is an exciting challenge of course, but already I think you can see that we are succeeding!

PeerJ had from the start a lot of what we (authors) were looking for. We had all been struggling for a while with the knowledge that the imperative to publish in Open Access was growing, either from personal motivation (as in my case) or because of funders’ or institutional mandates. We were also struggling with the perceived cost of Open Access, especially within the traditional journals. There is too much at stake in individuals’ careers not to choose carefully how to “brand” our articles, because we know too well that at some point or another someone will value our work more on the brand than on the quality, and that someone has the power to decide whether we get hired, promoted, or granted tenure. PLOS ONE had two things in its favour: it was part of the already respected PLOS brand, and it was significantly cheaper than the other PLOS journals. Then, over a year ago, Pete and Jason went public with one of the best catch-phrases I’ve seen:

If we can set a goal to sequence the Human Genome for $99, then why shouldn’t we demand the same goal for the publication of research?

They had a full package: Pete’s credibility in the publishing industry, Jason’s insights on how to help readers and papers connect, and a cheap price, not just affordable, cheap. I bought my full membership out of my own pocket as soon as I could. I gave them my money because I had met and learned to trust both Pete’s and Jason’s insights and abilities.

PB: [The process from development to launch day] was very exciting, although clearly nail-biting! One of the things which was very important to us was to build our own submission, peer review and publication software entirely from scratch – something which many people thought would not be possible in a reasonable time frame. And yet our engineering team, recruited and led by Jason, was able to complete the entire product suite in just 6 months of development time. First we built the submission and peer review system, and as soon as submissions started moving through that system we switched to building the publication platform. Everything is hosted on the cloud and implemented using GitHub, and so we were able to keep our development infrastructure extremely ‘light’ and flexible.

But even that does not guarantee buy-in. Truth be told, even if PeerJ had turned out to be nothing more than an interesting experiment, I think mine was money well spent. (All in the name of progress.) What tipped the balance for me was the addition of Tim O’Reilly to the mix. Here is someone who understands the web (heck, he popularised that famous Web 2.0 meme), publishing, and innovation. O’Reilly brought in what, from my point of view, was missing in the original mix and was crucial to attract authors: a sense of sustainability.

by @McDawg on twitter

PeerJ looked different to me in a unique way – while other journals screamed out “brand” or “papers”, PeerJ was screaming out “authors”. Whether this is a bias of mine because of my perception of the founders, or because of the life-membership model, to me this was a different kind of journal. It wouldn’t be long until I got invited to join the editorial board, and then got to see who my partners in crime would be.

PB: Simultaneously, we were building up the ‘editorial’ side of the journal. We started with a journal with no reputation, brand, or recognized name and managed to recruit an Editorial Board of over 800 world class academics (including yourself, and 5 Nobel Laureates); we created the editorial criteria and detailed author guidelines; we defined a comprehensive subject taxonomy; we established ourselves with all the third party services which support this infrastructure (such as CrossRef, CLOCKSS, COPE, OASPA etc); we contracted with a production vendor and so on.

Everything was completed in perfect time, and worked flawlessly from the very start – it really is a testament to the talented staff we have and I think we have proven to other players that this approach is more than possible.

But to launch a journal you need articles, and you also need to make sure your system does not crash. Academic Editors were invited to submit manuscripts free of charge in exchange for participating in the beta testing. I had an article that was ready to submit, and since by now I had pretty much no funding, the free deal was worth any bug-reporting nuisance. I had been producing digital files for submission for ages, and doing submissions online for long enough that I set a full day aside to go through the process (especially since this was a bug-reporting exercise). And then came the surprise. Yes, there were a few bugs, as expected, but the submission system was easier and more user friendly than I had anticipated. (Remember when I said above that PeerJ screamed “authors”?) For the first time I experienced a submission system that was truly “user friendly”.

PB: I am constantly amazed that you can start from nothing, and provided you have staff who know what they are doing, and that you have a model which people can get behind, then it is entirely possible to build a world-class publishing operation from a standing start and create something which can compete with, and beat out, the more established players. As a testament to this, we have been named one of the Top 10 “Educational Technology Innovators of 2013” by the Chronicle of Higher Education; and as the “Publishing Innovation of 2013” by the Association of Learned and Professional Scholarly Publishers.

Then came the reviews of the paper – and that is when I found the benefit of knowing who the reviewers were. Many times I encounter odd reviewers’ comments that I read, puzzled, and go “huh?”. In this case, because I knew who the reviewer was, I could understand where they were coming from. It made the whole process a lot easier. Apparently, the myth that people won’t review papers if their names are revealed is, well, a myth.

PB: One particularly pleasant surprise has been the community reaction to our ‘optional open peer review’. At the time of writing, pretty much 100% of our authors are choosing to reproduce their peer-review history alongside their published articles (for example, every paper we are publishing in OA week is taking this option). We believe that making the peer review process as open as possible is one of the most important things that anyone can do to preserve the valuable comments of their peer-reviewers (time-consuming comments which are normally lost to the world) and to prove the rigour of their published work.

I am not alone in being satisfied as an author. Not too long ago, PeerJ did their first author survey. Even as an editor I was biting my nails to see the results; I can only imagine the stress and anticipation at PeerJ headquarters.

PB: Yes, we conducted our first author survey earlier this year and we were extremely pleased to learn, for example, that 92% of responding authors rated their overall PeerJ experience as either “one of the best publishing experiences I have ever had” (42%) or “a good experience” (49%). In addition, 86% of our authors reported that their time to first decision was either “extremely fast” (29%) or “fast” (57%). Any publisher, no matter how well resourced or established, would be proud to be able to report results like these!

Perhaps the biggest surprise was how engaged our authors were, and how much feedback they were willing to provide. We quite literally received reams of free text feedback which we are still going through – so be careful what you ask for!

I am not surprised at this – I myself provided quite a bit of feedback. Perhaps seeing these comments from Pete emphasises the sense of community that some of us feel is the point of difference with PeerJ.

PB: We are creating a publishing operation, not a ‘facebook for scientists’, however with that said our membership model does mean that we tend to develop functionality which supports and engages our members at every touch point. So although it is early days, I think a real community is already starting to form and as a result you can start to see how our broader vision is taking shape.

Unlike most publishers (who have a very ‘article centric’ mentality), our membership model means that we are quite ‘person centric’. Where a typical publisher might not know (or care) who the co-authors are on a paper, for us they are all Members, and need to be treated well or they will not come back or recommend us to their peers. With this mindset, you can see that we have an intimate knowledge of all the interactions (and who performed them) that happen on a paper. Therefore when you come to our site you can navigate through the contributions of an individual (for example, see the links that are building up at this profile) and see exactly how everyone has contributed to the community (through our system of ‘Academic Contribution’ points).

Another example of our tendency towards ‘community building’ is our newly launched Q&A functionality. With this functionality, anyone can ask a question (on a specific part of a specific article; on an entire article; or on any aspect of science that we cover) and anyone in the community can answer that question. People who ask or answer questions can be ‘voted’ up or down, and as a result we hope to build up a system of ‘reputation recognition’ in any given field. Again – this is a great way to build communities of practice, and the barrier to entry is very low.

Image courtesy of PeerJ

It is early days – this is new functionality and it will be some time before we can see if it takes off. PLOS ONE also offers commenting, but that seems to be a feature that is under-used. I can’t help but wonder whether the experience at PeerJ might be different because the relationship with authors and editors is also different. Will feeling that we, the authors (and not our articles), are the centre of attention make a difference?

PB: This is extremely important to us, so thank you for noticing! One of the mistakes that subscription publishers are making is that they have historically focussed on the librarian as the customer (causing them to develop features and functionalities focussed on those people) when in an Open Access world, the customer is the academic (in their roles as author, editor and reviewer). Open Access publishers are obviously much more attuned to the principle of the ‘academic as customer’, but even they are not as focussed on this aspect as we (with our Membership model) are.

It is very important that authors feel loved; that people receive prompt and effective responses to their queries; that we listen to complaints and react rapidly and so on. One way we are going to scale this is with more automation – for example, if we proactively inform people of the status of their manuscript then they don’t need to email us. On another level, publishing is still a ‘human’ business based on networks of interaction and trust, and so we need to remember that when we resource our organisation going forwards.

This is what I find exciting about PeerJ – there is a new attitude, if not a new concept, that seems to come through. I will not even try to count the number of email and twitter exchanges that I have had with Pete and PeerJ staff (I would not be surprised if eyes roll at the other end as they see the “from” field in their email inbox). But they have always responded, with graceful and helpful emails. Whether they “love” me or not (as Pete says above) is irrelevant when one is treated with respect and due diligence. I can see similar interactions at least on twitter – PeerJ responsive to suggestions and requests, and, at least from where I am standing, seemingly having innovation at the top of the list.

PB: I think that everyone at PeerJ came here (myself and Jason included) because we enjoy innovating and we aren’t afraid to try new things. Innovation is quite literally written into our corporate beliefs (“#1. Keep Innovating – We are developing a scholarly communication venue for the 21st Century. We are committed to improving scholarly communications in every way possible”) and so yes, it is part of our DNA and a core part of our competitive advantage.

I must admit, it wasn’t necessarily our intention to use twitter as our bug tracker (!), but it is definitely a very good way to get real time feedback on new features or functionality. Because of our flexible architecture, and ‘can do’ attitude, we can often fix or improve functionality in hours or days (compared to months or years at most other publishers who do not control their own software). For an example of this in action, check out this blog post from a satisfied ‘feature requestor’.

I want PeerJ to succeed not only because I like and admire the people involved with it but because it offers something different, including the PrePrint service to which I hope to contribute soon. So I had to ask Pete: how is the journal doing?

PB: Extremely well! But don’t forget that we are more than just a journal, we are actually a publishing ecosystem that aims to support authors throughout their publication cycles. PeerJ, the peer-reviewed journal has published 200 articles now, but we also have PeerJ PrePrints (our pre-print server) which has published over 80 articles. Considering we have only been publishing since February, this is a very strong output (90% of established journals don’t publish at this level). Meanwhile, our brand new Q&A functionality is already generating great engagement between readers and authors.

We have published a ton of great science, some of which has received over 20,000 views (!) already. We are getting first decisions back to authors in a median of 24 days, and we are going from submission to final publication (including revisions and production time) in just 51 days. Our institutional members, such as UC Berkeley, University of Cambridge, and Trinity, as well as our Editorial Board of >800 and our Advisory Board of 20, have kicked the tires and clearly support the model. We have saved the academic community almost $1m already, and we now have a significant cadre of members who are able to publish freely, for life, for no additional cost. Ever.

by @stephenjjohnson on twitter

I was thrilled when I got the invitation to become an academic editor at PeerJ, as I was when the offer came from PLOS ONE. I blog in this space primarily because it is part of PLOS; I am not sure I would have added that kind of stress for any other brand. PLOS has been and continues to be a key player in the Open Access movement, and I am proud to be one of their editors.

What the future of PeerJ might be, who knows. I will continue to support the venture because I believe it offers something of real value to science that is somewhat different from what we’ve had so far. Can’t wait to see what else they will pull out of the hat.


Join PubMed’s Revolution in Post Publication Peer Review

At 11 AM on October 22, 2013, the embargo was lifted and so now it can be said: PubMed Commons has been implemented on a trial basis. It could change the process of peer assessment of scientific articles forever.

Some researchers can now comment on any article indexed at PubMed and read the comments of others. It is a closed and closely watched pilot test. Bugs may become apparent that will need to be fixed. And NIH could always pull the plug. But so many people have invested so much at this point, and spent so much time thinking through all the pros and cons, that this is hopefully unlikely.

The implementation could prove truly revolutionary. PubMed Commons is effectively taking post-publication peer review out of the hands of editors and putting control firmly in the hands of the consumers of the scientific literature—where it belongs.

PubMed Commons allows us to abandon a thoroughly antiquated and inadequate reliance on letters to the editor as a means of addressing the many shortcomings of pre-publication peer review.

PubMed Commons is

  • A forum for open and constructive criticism and discussion of scientific issues.
  • One that will thrive with high-quality interchange from the scientific community.

You can read more about the fascinating history of PubMed here. PubMed is a free database of references and abstracts from life sciences and biomedical journals. It primarily draws on the MEDLINE database and is maintained by the US National Library of Medicine (NLM). For 16 years ending in 1997, MEDLINE had to be accessed primarily through institutional facilities like university libraries. That excluded many who draw on PubMed today from using it.

But then in 1997, in a revolutionary move similar to the launching of PubMed Commons, PubMed made its electronic bibliographic resources free to the public. Everyone was quite nervous at the time about being shut down. Lawyers of the for-profit publishers predictably descended on NIH to try to block the free access to abstracts, arguing, among a number of other things, copyright infringement that cut into their ability to make money. But fortunately NIH held its ground, and Vice President Al Gore demonstrated PubMed’s capacity in a public ceremony.

So, in the first revolutionary move, the for-profit journals lost their control over access to abstracts. In the second move, they are losing control over post-publication commentary on articles – unless they succeed in quashing PubMed Commons.

Who can participate in PubMed Commons at this time?

  • Recipients of NIH (US) or Wellcome Trust (UK) grants can go to the NCBI website and register. You need a MyNCBI account, but they are available to the general public.
  • If you are not an NIH or Wellcome Trust grant recipient, you are still eligible to participate if you are listed as an author on any publication listed in PubMed, even a letter to the editor. But you will need to be invited by somebody already signed up for participation in PubMed Commons. So, if you have a qualifying publication, you can simply get a colleague with such a grant to sign up and then invite you.

Inadequacies of letters to the editor as post-publication commentary

Up until now, the main option for post publication commentary has been in later reviews of the literature, although there was a misplaced confidence in the more immediate letters to the editor.

I regret my blog post last year recommending writing conventional letters to the editor. Letters remain better than journal clubs for junior investigators eager to develop critical appraisal skills. But it could be a waste of time to send the letters off, because letters are simply not effective contributions to post-publication commentary. Letters never worked reliably well, and for a number of reasons, they are now obsolete.

In the not-so-good-old days of exclusively print journals, there was a rationale for these journals putting limits on letters to the editor.

  • With delays in availability due to the scheduling of print journals, letters to the editor were seldom available in a timely fashion. Readers usually would have long forgotten the article being critiqued when the letter finally came out.
  • With limits on the number of pages allowed per issue, letters to the editor consumed a scarce resource without contributing to the impact factor of the journal. So, journals typically had strict restrictions on the length of letters (usually 400 to 800 words), and a tight deadline to submit a letter after publication of the print article, usually three months or less.

Editorial review of letters to the editor has seldom been fair.

  • There is a prejudice against accepting anything but the most vitally relevant commentary. Yet editors are averse to accepting critical letters that reflect badly on their own review processes. Get past the significance criterion and you still risk offending the editor’s sense of privilege.
  • While letters to the editor are subject to peer review, responses from authors generally are not. Authors are free to dismiss or distort any criticism of their work, sometimes with the most absurd of statements going unchallenged.
  • Electronic bibliographic resources have become the principal means of accessing articles, but links are often not provided between a letter to the editor and the target article. So, even if the credibility of the published article is thoroughly demolished in a letter to the editor, readers accessing that article through an electronic bibliographic source are not informed.
  • Many journals allowed authors to veto publication of any criticism of their work, but the journals do not state this in their instructions to authors. You can submit a letter to the editor, only to have it rejected because the author objects to what you said. But you are told nothing except that your letter is rejected.
  • Many journals allow authors of the target articles the last word in responding to critical letters. Publishing a single letter and a response typically completes discussion of a target article. And the letter writer never gets to see the author’s response until after it is published. So, you can put incredible effort into carefully expressing your points within the limits of 400 to 800 words, only to be made to look ridiculous with mischaracterizations you can do nothing about.

Letters to the editor are thus usually untimely, overly short, and inadequately referenced. And they elicit inadequate and even hostile responses from authors, but are generally ignored by everybody else.

Letters to the editor are seldom cited and this is just one reflection of their failing to play a major role in moving the scientific discussion forward.

The advent of web-based publishing made restrictions on letters to the editor less justifiable. Once a basic structure for processing and posting letters to the editor is set up, processing and posting cost little.

Print journals can reduce costs by maintaining a separate web-based place for letters to the editor, but restrictions on length and time to respond have nonetheless continued, even though their economic justification has been lost.

BMJ Rapid Responses provides an exceptional model for post-publication peer commentary. BMJ accepts electronic responses that can be accessed by readers within 72 hours, as long as the responses are not grossly irrelevant or libelous. Readers can register “likes” of Rapid Responses, and threads of comments often develop. Then, a few comments are selected each week for editing and publication in the print edition. Unfortunately, the rapid responses that remain only electronic are not indexed at PubMed and can only be found by going to the BMJ website, which is behind a paywall for most articles.

Other journals are scrambling to copy and improve upon the BMJ model, but it can take some serious modification of software, and that takes time. “Like turning the Titanic around,” an editor of one of the largest open access journals told me.

Until such options become widely available, a reluctance to write letters to the editor remains thoroughly justifiable. Few letters will be submitted, and fewer will be published or result in a genuine scientific exchange. And the goal of readily accessible, indexed, citable letters to the editor and comments for which writers can gain academic credit remains elusive.

PLOS cofounder Michael Eisen Photo by Andy Reynolds from Mother Jones

That is, unless PubMed Commons catches on. It provides the potential of realizing PLOS co-founder and disruptive innovator Michael Eisen’s goal of continuous peer assessment and reassessment, not stopping with two or three people making an unreliable, but largely irreversible, judgment that something should have been published and should eternally be accepted as peer-reviewed.

PubMed Commons is only a rung on the ladder towards overthrowing the now firm line between publication and peer assessment. It’s not a place to stop, but an important step. Please join in and help make it work. If you’ve ever published an article listed in PubMed, find a way to get invited. If you’re not ready to post your own comments, lurk, offer others encouragement with “likes”, and then, when the spirit moves you, jump in!

I expect that someday soon you’ll be able to say to more junior colleagues, “I was active in the field when authors could prevent you from commenting on their work and editors could prevent you from embarrassing them with demonstrations of the stupidity of some of their decisions.” And your junior colleagues can respond, “Wow, was that in the days before email? Just how did you participate in the dialogue that is at the heart of scientific communication back then? Did you have to get up and challenge speakers at conferences?”


Why I Accepted a PLOS ONE Article about Homeopathy for Depression

PLOS ONE recently published Homeopathy for Depression: A Randomized, Partially Double-Blind, Placebo-Controlled, Four-Armed Study (DEP-HOM). I’m proud to have been the Academic Editor who accepted this paper.

I wrote this blog post as an independent blogger to present my personal views about my decision to recommend acceptance, in the context of some larger issues.

The same day, PLOS Medicine published a paper evaluating acupuncture for depression in primary care. I tweeted (@coyneoftherealm) that if PLOS Medicine was going to keep publishing clinical trials of acupuncture without suitable sham acupuncture controls, I might have to resign as an Academic Editor at PLOS ONE.

Besides blogging at PLOS Mind the Brain, I’m an occasional blogger at Science Based Medicine. The heavily accessed blog site is well known for not mincing words in expressing contempt for complementary and alternative medicine (CAM) approaches, often referred to as SCAM in its blog posts. A couple of my posts (1,2) there were scathing criticisms of a PLOS Medicine article claiming acupuncture had effects equivalent to antidepressants and psychotherapy for depression.

I have not asked them, but I doubt many of my Science Based Medicine colleagues would approve of my accepting the homeopathy paper, especially if they were unaware of my rationale.

And then there is the inconsistency. Why, if I accepted a homeopathy paper, did I so strongly object to the publishing of an acupuncture paper? My recovery from a momentary lapse of reason?

I will explain, but offer no apologies.

Thanks to the open access afforded by PLOS One, you can get the article here.

A Clinical Trial of Homeopathy for Depression in PLOS One

The article describes an attempt to recruit patients into a four-armed randomized trial. A homeopathic remedy was compared to placebo. A homeopathic interview, which involves a lot of history taking to personalize the choice of medication, was compared to a more conventional, shorter interview. Thus, the trial had a 2 x 2 placebo-controlled design, with practitioners and patients blind to whether the remedy or placebo was being administered.

The investigators intended that 224 patients would be randomized. However, despite extensive efforts, they were only able to recruit 44 patients. They abandoned their efforts and wrote up the study.

The investigators acknowledged that there was a lack of scientific rationale for homeopathic medicine. They reported finding only one previous study in which its efficacy for major depression had been examined. There are actually more studies, but they are of poor quality.

Their interest in conducting a trial was pragmatic.

Many depressed persons in the community are drawn to this treatment because of their belief that it is effective and lacks the side effects of conventional medication or the extensive time commitment required by psychotherapy.

Homeopathy has been recommended by both Prince Charles and Mother Teresa. A controversial Swiss Health Technology Assessment concluded that homeopathy was safe and effective and resulted in continued reimbursement for treatment by Swiss insurance companies.

The German government funded this trial, presumably assuming that results of a well-designed clinical trial could settle the issue of efficacy in a way that could be persuasively communicated to the lay public and professional community.

Based on this inability to recruit patients, the investigators concluded

Although our results are inconclusive, given that recruitment into this trial was very difficult and we had to terminate early, we cannot recommend undertaking a further trial addressing this question in a similar setting.

They went on to explain why a further trial was not recommended.

How Is Homeopathy Supposed to Work?

Practitioners claim homeopathic medicine works by stimulating a self-healing mechanism in the body's defenses. This is accomplished by administering a substance that would cause the symptoms, except that it is provided in very diluted form.

In the case of this clinical trial, the remedy was diluted to a standard quinquagintamillesimal (Q or LM) potency, a 1:50,000 dilution at each step. But it's important that homeopathic preparations not simply be diluted; they must be violently shaken between dilutions. Dilution alone just reduces potency, but dilution plus shaking, or succussion as it is called, is believed to increase potency.

It is possible that not even a single molecule of the original substance is left in the final, diluted remedy. So homeopathic remedies may consist of nothing but water. That does not bother homeopaths, because they believe that, thanks to dilution and succussion, the original compound leaves an "imprint" in the water that no longer depends on the substance still being physically present.
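The arithmetic behind that claim is easy to check. The sketch below is a back-of-the-envelope illustration (mine, not from the paper): it assumes you start with a full mole of the original substance and apply successive 1:50,000 (Q/LM) dilutions, ignoring succussion, and counts the expected number of original molecules remaining.

```python
# Back-of-the-envelope: expected molecules of the original substance
# left after repeated 1:50,000 (Q/LM potency) serial dilutions.
# Starting quantity of one mole is an illustrative assumption.

AVOGADRO = 6.022e23  # molecules per mole

def molecules_remaining(steps, moles_start=1.0, dilution_factor=50_000):
    """Expected count of original molecules after `steps` serial dilutions."""
    return moles_start * AVOGADRO / dilution_factor ** steps

for k in range(1, 7):
    print(f"after {k} dilution step(s): {molecules_remaining(k):.3g} molecules")
```

By the fifth 1:50,000 step the expected count is down to a couple of molecules, and by the sixth it is effectively zero, which is why a remedy at these potencies can contain nothing of the starting substance at all.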

Saving Thousands of Lives With Homeopathy

A spoof video posted on YouTube by Myles Power announced that he was going to save thousands of lives by dumping small quantities of homeopathic remedies into Scottish streams that flowed into the North Sea. He obtained the remedies from a homeopathic first aid kit advertised on Amazon that promised cures for stroke, heart attacks, poisoning, and drowning. The remedies would be appropriately diluted, and people all over Europe, and perhaps eventually the rest of the world, would obtain the protection of homeopathy against these conditions.

The problem with this spoof is that the joke had impeccable logic from a homeopathic perspective, except maybe that being rolled around in North Sea storms did not provide sufficient succussion.

Many users of homeopathy are drawn to claims that it is safe and draws upon the body's natural healing potential. I doubt many users understand the dilution. Dana Ullman notes that Alexa Ray Joel, daughter of Billy Joel and model Christie Brinkley, attempted suicide by taking an "overdose" of her homeopathic medicine.

Homeopathy as Evidence-Based Medicine

A 2010 Dana Ullman article in the Huffington Post, Homeopathy: A Healthier Way to Treat Depression, drew over 500 "likes." Ullman bills himself as an evidence-based homeopath.

Another Ullman article, the 2012 The Homeopathic Alternative to Antidepressants, is a spirited defense of the advantages of homeopathy over conventional antidepressants. Ullman is obviously aware of the scientific literature and draws freely, even if selectively, on articles that have appeared in the New England Journal of Medicine and (ugh) PLOS Medicine to argue that antidepressants are no more effective than a placebo. Ullman also argues that even if antidepressants are effective in relieving the symptoms of depression, their effectiveness comes at the cost of frustrating the body's natural reactions to depression, and so any improvement obtained with them cannot be expected to continue after stopping antidepressants.

A United Kingdom National Health Service (NHS) webpage denounces the lack of a scientific basis for homeopathy and cites the authoritative 2010 UK House of Commons Science and Technology Committee Report on Homeopathy to argue that

The ideas that underpin homeopathy are not accepted by mainstream science, and are not consistent with long-accepted principles on the way that the physical world works.

As for the succussion process, the NHS further quotes the 2010 report

We consider the notion that ultra-dilutions can maintain an imprint of substances previously dissolved in them to be scientifically implausible.

However, the NHS article then wimped out, indicating that the NHS does not take a stand against homeopathic medicine, and it offers web links for referrals.

Why I Liked the PLOS Article

The article had a number of strengths in terms of trial design and a transparent reporting of what actually happened. No confirmatory bias here—or Barnum conclusion that further research is needed.

  • The protocol for the study had been pre-registered and was publicly available.
  • The patients and the whole study team remained blinded to the identity of the four treatment groups until the end of the study.
  • The use of both a placebo control for the medication and a more conventional, shorter interview as a control for the longer homeopathic interview.

This latter feature allowed for some control of the ritual, attention, and support with which homeopathic medications are delivered. Without the interview, homeopathic practitioners could argue that the medication was administered without appropriate personalization. Yet, knowing that depression is responsive to support and attention delivered with positive expectations, it was imperative to control for these elements of the treatment.

  • The write up of the trial complied with CONSORT in its transparent report of rationale, methods, and results.
  • The frank admission that the investigators failed in their effort to recruit sufficient numbers of patients, and that this failure suggests another attempt might not be warranted.

PLOS One is Not Just Any Journal

The PLOS One website notes

PLOS ONE will rigorously peer-review your submissions and publish all papers that are judged to be technically sound. Judgments about the importance of any particular paper are then made after publication by the readership (who are the most qualified to determine what is of interest to them).

PLOS ONE publication criteria are

  1. The study presents the results of primary scientific research.
  2. Results reported have not been published elsewhere.
  3. Experiments, statistics, and other analyses are performed to a high technical standard and are described in sufficient detail.
  4. Conclusions are presented in an appropriate fashion and are supported by the data.
  5. The article is presented in an intelligible fashion and is written in standard English.
  6. The research meets all applicable standards for the ethics of experimentation and research integrity.
  7. The article adheres to appropriate reporting guidelines and community standards for data availability.

Science-Based Medicine Instead of Evidence-Based Medicine?

Paul Ingraham, an editor at Science Based Medicine, asked

Why “Science”-Based Instead of “Evidence”-Based?

And summarized a recurring theme going all the way back to the first post at the blog

The idea of emphasizing science in general instead of evidence in particular was first publicly proposed by Yale neurologist Dr. Steven Novella and infamous medical blogger and surgical oncologist Dr. David Gorski in early 2008, along with several other physician co-authors:

EBM is a vital and positive influence on the practice of medicine, but it has its limitations. Most relevant to this blog is the focus on evidence to the exclusion of scientific plausibility. The focus on evidence has its utility, but fails to properly deal with medical modalities that lie outside the scientific paradigm, or for which the scientific plausibility ranges from very little to nonexistent.


It is not that we are opposed to EBM, nor is it that we believe EBM and SBM to be mutually exclusive. On the contrary: EBM is currently a subset of SBM, because EBM by itself is incomplete. We eagerly await the time that EBM considers all the evidence and will have finally earned its name. When that happens, the two terms will be interchangeable.

A comment left on the inaugural SBM blog post

Why is homeopathy implausible? Among other matters, its signature proposition is implausible mainly because never in the whole of human experience or research has dilution of solutions been found to enhance their intrinsic physical, chemical or biological properties (Hormesis is a property of a few biological systems, not the consistent behavior of solutions that homeopathy requires). Thus, dilution doesn’t make our coffee taste stronger and we don’t expect otherwise no matter how much we shake or stir it.

You can find lots of posts at Science Based Medicine concerning homeopathy, including Harriet Hall’s fine discussion of homeopathy first aid kits and Steven Novella’s expression of upset over the Swiss endorsement of homeopathy.

What if

  • …I had rejected the article?

The authors could have gone elsewhere and presented the results with a more confirmatory spin and a call for further research.

Maybe they wouldn’t get published anywhere that would attract attention from anybody but homeopaths. But if so,  maybe the German government would be tempted to finance another trial that was less responsibly conducted and well reported.

  • …The trial had recruited a sufficient number of patients and found a significant effect favoring homeopathic medication when it was administered based on the extensive interview?

I might still have accepted the article, but I would not have been persuaded of the efficacy of homeopathic medication. I'm enough of a Bayesian to be unshaken by one trial in my disbelief that a scientifically absurd mechanism could produce effects. I would require the authors to acknowledge the lack of a scientific basis for homeopathy's efficacy for depression and to propose other mechanisms, perhaps the greater ritual and positive expectations in the longer interview.

One trial does not undo 200 years of claims that are scientific nonsense.

  • … I had been in a position to participate in the grant review that resulted in funding of the study?

I would say there is not a sufficient scientific basis for homeopathy to justify the resources required for a well-designed study.

Just because I would accept this article doesn't mean that I would approve funding of the study, or that I should be construed as collaborating in it.

So why was I indignant that PLOS Medicine published a clinical trial comparing acupuncture to antidepressants?

Acupuncture similarly lacks a credible scientific explanation for its effects beyond the rituals in which it is administered. Appeals to ancient Chinese medicine are not scientific.

I would expect a sham treatment having the same ritual with provider and patient blinded would produce the same effect, unless some risk of bias had been introduced.

I think the lack of evidence for the mechanisms proposed by practitioners of acupuncture is sufficient to require that the role of rituals be tested with an appropriate control group, such as sham acupuncture delivered by someone blind to the purpose and hypotheses of the study.

The PLOS Medicine article in question did not have an appropriate comparison group controlling for ritual. The authors were allowed to interpret the results with a confirmatory bias.

The article should not have been published in PLOS Medicine because it was scientifically flawed and the authors did not acknowledge the flaws. It greatly embarrasses me that this article got published, and it should embarrass the editor who accepted it.

My reaction to publication of the article is to make a determined effort to educate PLOS editors about the necessity of insisting on appropriate control groups, and about the need to protect the journal from those who would exploit its interest in a broader range of articles to promote fake treatments based on bad science. I am also going to seek some sort of general recommendation from PLOS management to prevent this from happening in the future.




“Strong evidence” for a treatment evaporates with a closer look: Many psychotherapies are similarly vulnerable.

Note: BMC Medicine subsequently invited a submission based on this blog post.

Coyne, J. C., & Kwakkenbos, L. (2013). Triple P-Positive Parenting programs: the folly of basing social policy on underpowered flawed studies. BMC Medicine, 11(1), 11.

It is now available here:

Promoters of Triple P parenting enjoy opportunities that developers and marketers of other “evidence-supported” psychosocial interventions and psychotherapies only dream of. With a previously uncontested designation as strongly supported by evidence, Triple P is being rolled out by municipalities, governmental agencies, charities, and community-based programs worldwide. These efforts generate lots of cash from royalties and license fees, training, workshops, and training materials, in addition to the prestige of being able to claim that an intervention has navigated the treacherous path from RCT to implementation in the community.

With hundreds of articles extolling its virtues, dozens of randomized trials, and consistently positive systematic reviews, the status of the Triple P parenting intervention as evidence supported would seem beyond being unsettled by yet another review. Some of the RCTs are quite small, but there are public health level interventions, including one involving 7000 children from child protective services. Could this be an instance in which it should be declared “no further research necessary”? Granting agencies have decided not to fund further evaluation of interventions on the basis of a much smaller volume of seemingly less unanimous data.

But the weaknesses revealed in a recent systematic review and meta-analysis of Triple P by Philip Wilson and his Scottish colleagues show how apparently strong evidence can evaporate when it is given a closer look. Other apparently secure "evidence supported" treatments undoubtedly share these weaknesses, and the review provides a model of where to look. But when I took a careful look, I discovered that Wilson and colleagues glossed over a very important weakness in the body of evidence for Triple P. They noted it, but didn't dwell on it. So the weakness in the body of evidence for Triple P is much greater than a reader might conclude from Wilson and colleagues' review.

WARNING! Spoiler Ahead. At this point, readers might want to download the article and form their own impressions before reading on and discovering what I found. If so, they can click on this link and access the freely available, open access article.

Wikipedia describes Triple P as

a multilevel parenting intervention with the main goal of increasing the knowledge, skills, and confidence of parents at the population level and, as a result, reduce the prevalence of mental health, emotional, and behavioral problems in children and adolescents. The program is a universal preventive intervention (all members of the given population participate) with selective interventions specifically tailored for at risk children and parents.

A Triple P website for parents advertises

the international award winning Triple P – Positive Parenting Program®, backed by over 25 years of clinically proven, world wide research, has the answers to your parenting questions and needs. How do we know? Because we’ve listened to and worked with thousands of parents and professionals across the world. We have the knowledge and evidence to prove that Triple P works for many different families, in many different circumstances, with many different problems, in many different places!

The Triple P website for practitioners declares

As an individual practitioner or a practitioner working within an organisation you need to be sure that the programs you implement, the consultations you provide, the courses you undertake and the resources you buy actually work.

Triple P is one of the only evidence-based parenting programs available worldwide, founded on over 30 years of clinical and empirical research.

Disappearing positive evidence

In taking stock of Triple P, Wilson and colleagues applied objective criteria in a way that readily allows independent evaluation of their results.

They identified 33 eligible studies, almost all of them positive in indicating that Triple P has positive effects on child adjustment.

  • Most of the 33 studies involved media-recruited families, so that participants in the trials were self-selected and more motivated than if they had been clients referred from community services or involuntarily receiving treatment mandated by child protection agencies.
  • 31/33 studies compared Triple P interventions with waiting list or no-treatment comparison groups. This suggests that Triple P may be better than doing nothing with these self-referred families, but it doesn't control for simply providing attention, support, and feedback. The better outcomes for families getting Triple P versus wait list or no treatment may reflect families assigned to these control conditions registering their disappointment at not getting what they had sought in answering the media ads.
  • In contrast, the two studies involving an active control group showed no differences between groups.
  • The trials evaluating Triple P typically administered a battery of potential outcomes, and there is no evidence in any trial that particular measures were chosen ahead of time as the primary outcomes. There was considerable inconsistency among studies using the same instruments in decisions about which subscales were reported and emphasized. Not declaring outcomes ahead of time provides a strong temptation for selective reporting: investigators analyze the data, decide which measures put Triple P in the most favorable light, and declare post hoc that those outcomes are primary.
  • Selective reporting of outcomes occurred in the abstracts of these studies. Only 4/33 abstracts reported any negative findings, and 32/33 abstracts were judged to give a more favorable picture of the effects of Triple P.
  • Most papers reported only maternal assessments of child behavior, and the small number of studies that obtained assessments from fathers did not find positive treatment effects from the fathers' perspective. This may simply indicate the detachment and obliviousness of the fathers, but it can also point to a bias in the reports of mothers, who had made more of an investment in getting treatment.
  • Comparisons of intervention and control groups beyond the duration of the intervention were only possible in five studies. So, positive results may be short-lived.
  • Of the three trials that tested population-level effects of Triple P, two were not randomized trials but had quasi-experimental designs with significant intervention and control group differences at baseline. The third trial reported a reduction in child maltreatment, but examination of the results indicates that this was due to an unexplained increase in child maltreatment in the control area, not a decrease in the intervention area.
  • Thirty-two of the 33 eligible studies were authored by Triple P-affiliated personnel, but only two had a conflict of interest statement. Not only is there a strong possibility of investigator allegiance exerting an effect on the reported outcomes of trials; there are also undeclared conflicts of interest.

The dominance of small, underpowered studies

Wilson and colleagues noted a number of times in their review that many of the trials are small, but they did not dwell on how many, how small, or with what implications. My colleagues and I have adopted a lower limit of 35 participants in the smallest group for inclusion of trials in meta-analyses. The rationale is that any trial smaller than this does not have a 50% probability of detecting a moderate-sized effect, even if one is present. Small trials are subject to publication bias: if results are not statistically significant, they tend not to get published, because the trial was insufficiently powered to obtain a significant effect. On the other hand, when significant results are obtained, they are greeted with great enthusiasm precisely because the trials are so small. Small trials, when combined with flexible rules for deciding when to stop a trial (often based on a peek at the data), failure to specify primary outcomes ahead of time, and flexible rules for analyses, can usually be made to appear to yield positive findings, but findings that will not be replicated. Small studies are vulnerable to outliers and sampling error, and randomization does not necessarily equalize group differences that can prove crucial in determining results. Combining published small trials in a meta-analysis does not address these problems, because of publication bias and because all or many of the trials share methodological problems.
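The 35-participant rule of thumb can be illustrated with a quick power calculation. The sketch below is my own illustration, not from the review: it uses a normal approximation to the power of a two-sided, two-sample t-test for a moderate standardized effect (d = 0.5), showing that per-group sizes in the low tens leave power well below 50%, with the 50% mark reached only around the mid-30s.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample(n_per_group, effect_size=0.5, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test
    (normal approximation) for a given per-group sample size."""
    z_crit = 1.959964  # two-sided critical value at alpha = 0.05
    noncentrality = effect_size * math.sqrt(n_per_group / 2.0)
    return 1.0 - normal_cdf(z_crit - noncentrality)

# Power for the smallest Triple P cells versus the proposed cutoff
for n in (9, 20, 35):
    print(f"n = {n} per group: power ~ {power_two_sample(n):.2f}")
```

For cells of 9 to 18 per group, power for a moderate effect is roughly 0.2 to 0.3, so a run of significant results from such trials is itself a red flag for selective publication. (The normal approximation is slightly optimistic relative to the exact t-test, which only strengthens the point.)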

What happens when we apply the exclusion criterion of <35 participants in the smallest group to the Triple P trials? Looking at table 2 in Wilson and colleagues' review, we see that 20/23 of the individual papers included in the meta-analyses are excluded. Many of the trials are quite small, with eight trials having fewer than 20 participants (9 to 18) in the smallest group. Such trials should be statistically quite unlikely to detect even a moderate-sized effect, and that so many nonetheless got significant findings attests to a publication bias. Think of it: with such small cell sizes, the arbitrary addition or subtraction of a single participant can alter the results. Figure 2 in the review provides the forest plot of effect sizes for two of the key outcome measures reported in Triple P trials. Small trials account for the outlier strongest finding, but also the weakest, underscoring sampling error. Meta-analyses attempt to control for the influence of small trials by introducing weights, but this strategy fails when the bulk of the trials are small. Again examining figure 2, we see that even with the weights, small trials still account for over 83% of the contribution to the overall effect size. Of the three trials that are not underpowered, two have nonsignificant effects entered into the meta-analysis. The confidence interval for the one moderate-sized trial that is positive barely excludes zero (0.06).

Wilson and colleagues pointed to serious deficiencies in the body of evidence supporting the efficacy of Triple P parenting programs, but once we exclude underpowered trials, there is little evidence left.

Are Triple P parenting programs ready for widespread dissemination and implementation?

Rollouts of the kind that Triple P is now undergoing are expensive and consume resources that will not be available for alternatives. Yet, critical examination of the available evidence suggests little basis for assuming that Triple P parenting programs will have benefits commensurate with their cost.

In contrast to the self-referred families studied in randomized trials, the families in the community are likely to be more socially disadvantaged, often single-parent, and often coming to treatment only because of pressure or even mandated attendance. Convenience samples of self-referred participants are acceptable in the early stages of evaluation of an intervention, but ultimately the most compelling evidence must come from participants more representative of the population who will be treated in the community.

Would other evidence supported interventions survive this kind of scrutiny?

Triple P parenting interventions have the apparent support of a large literature that is unmatched in size by most treatments claiming to be evidence supported. In a number of articles and blog posts, I have shown that other treatments claimed to be evidence supported often have only weak evidence. Similar to Triple P, other treatments are largely evaluated by investigators who have vested financial and professional interests in demonstrating their efficacy, in studies that are underpowered and have a high risk of bias, notably in the failure to specify which of the many outcomes assessed are primary. Similar to Triple P, psychotherapies routinely get labeled as having strong evidence based solely on studies that involve comparisons with no-treatment or waitlist controls. Effect sizes exaggerate the advantage of these therapies over patients simply getting nonspecific, structured opportunities for attention, support, and feedback under conditions of positive expectations. And, finally, similar to what Wilson and colleagues found for Triple P, there are often large gaps between the way findings are depicted in the abstracts of reports of RCTs and what can be learned from the results sections of the actual articles.

In a recent blog post, I also showed that American Psychological Association Division 12 Clinical Psychology had designated Acceptance and Commitment Therapy (ACT) as having strong evidence for efficacy in hospitalized psychotic patients, only to have that designation removed when I demonstrated that the basis for this judgment was two small, flawed null trials. Was that shocking, or even surprising? Stay tuned.

In coming blog posts, I will demonstrate problems with claims of other treatments being evidence-based, but hopefully this blog provides readers with tools to investigate for themselves.

What I learned as an Academic Editor for PLOS ONE

Open access week is just around the corner, and I thought I’d take the opportunity to share my experience as an Academic Editor for PLOS ONE.

I was invited to join the team following a conversation at Science Online 2010 with, I think, Steve Koch, who recommended me to PLOS ONE, and before I knew it I was receiving lots of emails asking me to handle manuscripts.

The nice thing about PLOS ONE is that I get to choose which articles I handle, and I am very picky. I think that my role is not just to 'handle' the manuscript but also to make sure that the review process is fair. To do this, I need to understand the manuscript myself. I read every article that I take on and write a 'mini-review' of it for myself. When I get the external peer reviews, I go through every comment against the submitted version, compare the different reviews, and revisit my first impression of the manuscript. I have learned a lot from the reviewers; they see things I have missed, and they miss things I have detected. It has been a great insight into the peer review process. And I love not having to pull out my crystal ball to determine whether the article is 'important,' but just having to decide whether it is scientifically solid.

Image by Wiertz Sébastien on Flickr, licenced under CC-BY

If the science is fundamentally good, the article is sent back to the authors for either minor or major changes, and then it falls back into my inbox. I have found it really interesting to see how authors deal with the reviewers' comments. The re-submission is also a lot of work. I need to compare the original and new versions, make sure that the authors have done what they say they have done, and make sure that all the reviewers' comments have been addressed. And then I decide whether or not to send it back for re-review. One thing that I found interesting in this second phase is when authors respond to the reviewers' comments in the letter but do not incorporate the changes into the article. It is almost as if the responses are for my and the reviewers' benefit only. So back it goes, asking them to incorporate that rationale into the actual manuscript. Oh well. That means another round. Luckily this does not happen that often.

And then it is time to 'accept' the paper – and so back to the manuscript, where I go through commas, colons, paragraphs, spelling mistakes, in-text citations, reference lists, formatting, image quality, figure legends, etc. This I normally send to the authors together with their acceptance letter, but I don't ask for the article to be re-submitted.

The main challenge I find with the process is time management.

When I get the request to handle an article, I accept or not based on how much time I have to process it. That is all good, except that I cannot predict when the reviews, resubmissions, etc. will eventually happen – and many times these articles 'ready for decision' show up in my inbox at a time when I cannot give them the full attention they deserve. Let alone being able to predict when the revised version will be submitted! I find it impossible to plan ahead for this, especially since I have very little control over a lot of my time commitments (like the days I need to lecture, submit exam questions, or mark exams). So if an article arrives while I am at a conference with limited internet connection… how can I plan for this?

Finding reviewers is another challenge. Sometimes they are hard to find. Nothing is as discouraging as finding "reviewer declined…" emails in my inbox, indicating that it is back to the system to redo something I thought was done and dusted. The other day someone asked what a reasonable amount of reviewing to do in a year is. My answer was that one should probably, at minimum, return the number of reviews provided for one's own articles. Say I publish 3 articles a year, each with 3 reviews; then I should not start complaining about reviewing until I have reviewed at least 9 articles. (Of course, one can factor in rejection rate, number of authors, etc.) But a tit-for-tat trade-off seems like a fair expectation. So then why is it so hard to find reviewers? Come on people – if it was your paper getting delayed, you'd be sending letters to the journal asking how come the article shows as still sitting with the Editor!

And that is the other thing I learned. Editors don’t just sit on papers because they are lazy. There are many reasons why handling an article may take more or less time. In some cases, after receiving the reviews I feel that something has been raised that needs a specialist to look at a specific aspect of the paper. Sometimes I need a second opinion because there is too little agreement between reviewers. Sometimes the reviewers don’t submit in the agreed time. There are many reasons why an article can be delayed, and so what I learned is to be patient with the editors when I send my papers for publication.

But despite the headaches, the stress and the struggle of being an Academic Editor, it is also an extremely rewarding experience. I keep learning more about science because I see a range of articles before they take their final shape, because I get to look into the discussion of what is good and what is weak. And I get to be part of what makes science great: trying to put out the best we can produce.

It is unfortunate that this process is locked up. I think that there is a lot to learn from it. I think that students and early career scientists would really benefit from seeing the process in articles that are not their own: how variable the quality of the reviews is, and what dealing well with reviewers' comments and suggestions looks like. And the public, too, would benefit from seeing what this peer review is all about – what the strengths and weaknesses of the process are and what having been peer reviewed really means.

So, back to Open Access week. Access to the final product is really good. Access to the process of peer review can make understanding the literature even better, because it exposes a part of the process of science that is also worth sharing.