ASAP awards – Interview with Mark Costello

Mark Costello, a researcher at the Institute of Marine Science and Leigh Marine Laboratory (University of Auckland in New Zealand), was nominated for his work with WoRMS (the World Register of Marine Species), of which he was founding chair. The site provides a database of scientific names for all marine species. Species are sometimes described under different scientific names, and the site helps disambiguate these names and also provides or links to information about each species.

Q: How did the project come about?

MC: When I was in Ireland in the 1990s I was involved in workshops developing policies for biodiversity – the main barrier was lack of coordination of species names. This meant we couldn’t merge datasets easily enough. In 1996 I put in a proposal to create an inventory of scientific species names, which was funded by the European Commission. Since 2004, the Flemish Government has funded the hosting of the database. Once the infrastructure was secure and professionally managed, getting the info into it became possible. People were motivated because this was a permanent website with permanent support from the Flanders Marine Institute (VLIZ). It started as a clean-up exercise.

Q: What is special about the site?

MC: By providing naming information about species, it helps people navigate the scientific literature where alternative names may be used, but it also links to information about the species.
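
To make the idea concrete, here is a minimal sketch in Python of how a list of synonyms lets two datasets be merged. It is only an illustration under stated assumptions: the species names are invented placeholders, the lookup table is hand-written, and the function names are hypothetical. Nothing here reflects how WoRMS itself is implemented or queried.

```python
from collections import defaultdict

# Hypothetical synonym table: every known name -> its accepted name.
# All names are invented placeholders, not real taxa or WoRMS records.
ACCEPTED_NAME = {
    "Examplea prima": "Examplea prima",     # accepted name maps to itself
    "Examplea antiqua": "Examplea prima",   # older synonym of the same species
    "Altera secunda": "Altera secunda",
}


def resolve(name):
    """Return the accepted name, or None if the name is not in the table."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return ACCEPTED_NAME.get(cleaned)


def merge_counts(*datasets):
    """Sum counts per accepted name; collect names that cannot be resolved."""
    merged = defaultdict(int)
    unresolved = []
    for dataset in datasets:
        for name, count in dataset:
            accepted = resolve(name)
            if accepted is None:
                unresolved.append(name)
            else:
                merged[accepted] += count
    return dict(merged), unresolved


# Two surveys that recorded the same species under different names.
survey_a = [("Examplea prima", 4), ("Altera secunda", 1)]
survey_b = [("Examplea antiqua", 3), ("Ignota tertia", 2)]

merged, unresolved = merge_counts(survey_a, survey_b)
print(merged)
print(unresolved)
```

Without the synonym table the two surveys would appear to describe different species. With it, their counts are combined under one accepted name, and any name the table does not know is set aside for a specialist to check rather than silently dropped.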

Q: What did you learn from working on WoRMS?

MC: There were unexpected patterns that were discovered from the data. We discovered that the number of species being described over time has been increasing at a linear rate. When you look at the authors there are now about 3-5 times more people discovering species than ever before – so taxonomists are not really disappearing as many people have said. The number of species discovered per author is, however, declining. That it is taking more people to discover species than it did before suggests that we have discovered most species on Earth (at least half, perhaps 2/3), not only a small fraction as some have speculated. We also found that science is doing better and that conservation is working.

Q: What was people’s response?

MC: Word of mouth helped – there was an element of trust. We only know the people we know – but when you look globally you start to get a different picture than when you look at your own community. The taxonomy is curated by specialists, and people are now more trusting about online collaboration than when we started. But it was important to have a long-term commitment to supporting these databases to make the system sustainable so that the databases are shareable.

According to their stats page, in 2007 the site received 37,221 unique visitors, and by 2012 this number had risen to 817,335 unique visitors and 30,423,583 page views. The material is provided under a CC-BY licence, although permission needs to be sought to redistribute the entire database and, it seems, to download it in full as well. I asked Mark about that.

MC: I don’t think that the CC-BY is a hindrance for sharing the data or reusing. We provide a clear citation for the data. We want the source to be cited because we consider it a scholarly publication. And users concerned about quality assurance of their sources can then cite it as an ‘authoritative’ rather than anonymous resource. When you combine the data into a new set, people that want to use this new group or want to replicate need to know where the original data came from. Otherwise they would be having to start from scratch. The citation solves this problem.

MC: The request was put there originally because databases change over time and we were worried that there would be multiple copies, which could create confusion as to which is the best source. It was also a way of not having to deal with data flow issues if too many people were downloading the entire database at the same time. We also needed to safeguard against attacks that send constant queries to the database. But it is also a good way of knowing and tracking who your users are, so we can provide the list of organisations that use the database when we are out looking for funding and support.

Q: What would you like to see next?

MC: I would love to have all species on Earth in a quality approved database and see what we could then discover about the species. We learned a lot from querying this database, and we could learn a lot more if we had all species in there.

Even if you are not interested in digging into the data, the site is a great place to get to know our underwater neighbours. I encourage you to visit the site.

11 thoughts on “ASAP awards – Interview with Mark Costello”

  1. I wish to try to give a more balanced view of WoRMS, rather than to simply “plug it”, as MC is doing above. I am the major contributor to Wikispecies, a Wikimedia site which aims to do the same thing as WoRMS, only for all organisms (including fossil species), not just extant marine ones. While all such initiatives (and there are many) have their pros and cons, the major cons of WoRMS are that it relies heavily on “expert authority”, which in the real world is often wrong, as opposed to the Wikispecies approach which is to facilitate the user to check the data for themselves (i.e. putting the power in the hands of the user, not the data provider). “Experts” are often in a hurry, and don’t have time to check everything. WoRMS provides little or no incentive to report errors, and has difficulty keeping itself updated.


    1. Don’t you think there is room for both models to operate simultaneously? I am not sure whether I necessarily agree that expert authority is often wrong in the real world, or that experts are often in a hurry and unable to do the curating. Is there any data showing that such is the case? I would love to see some comparative analysis of the two approaches to such databases if it hasn’t been done already.


      1. Here is one of many examples of the difference in approach: a small genus of sea slugs called Ancula. Both WoRMS and Wikispecies should have all Ancula related names listed as either valid or synonym. But see my comparisons with WoRMS at the bottom of this Wikispecies page: WoRMS is missing any record of 3 names in this small genus. By clicking on the names on the Wikispecies page, you get literature references to verify the existence of the name.


      2. Also, WoRMS refuses to post links to the corresponding Wikispecies pages, even when the latter has better data (probably particularly when the latter has better data!)


  2. Another “issue” is this: despite the best efforts of Costello, Zhang, etc. to convince people otherwise, compilations of species names are not like “scholarly publications”. They are simply compilations of data from scholarly publications, which have already been through peer review and already been published. Abstracting agencies basically do this sort of thing all the time, but they don’t claim to be authors of new “scholarly publications” that they can use to boost their publication record without having to do original research. I’m sure MC would love to take this a step further and claim co-authorship of countless published bits taken directly from WoRMS, and republished in scientific journals! I would really hate to see public research money, intended for original research, wasted on publication of compilations of such secondary data, particularly if open access fees have to be paid to the journals.


    1. “I’m sure MC would love to take this a step further and claim co-authorship of countless published bits taken directly from WoRMS, and republished in scientific journals! ”

      I do not know this to be the case. That was not the impression I got from him when we talked about it. Is there anything in particular that makes you think this is his intention or that this is how he has operated?


  3. Costello, M.J. et al. 2013: Biodiversity data should be published, cited, and peer reviewed. Trends in Ecology & Evolution, 28(8): 454-461. doi: 10.1016/j.tree.2013.05.002

    Reading the above paper, and knowing the people involved gives me that impression …


    1. Interesting – that is not my take on it. But indeed, it might be interesting to look at the relationship between data re-use, citations and authorship in both types of models, and see if a more curated model leads (inadvertently or not) to a bias or difference in behaviour around those.


  4. Basically, what is going on is this: currently, there are lots of biodiversity databases, like WoRMS, Wikispecies, etc., etc., and they tend to contradict one another. WoRMS (and others) try to throw off the contradiction by saying something like “we are the experts, and we say it is thus”. Wikispecies says something like “here are our sources, go check for yourself”. Anyway, GBIF has money for data, but wants clean data, whereas WoRMS etc. are currently too dirty. So, Costello et al. are lobbying for money from GBIF to publish the data in “peer reviewed” journals, where it can be cited and, as a desirable (to some) side effect, be claimable as a publication just like any other. In practice though, it will just be publishing data that is already public (but in not easily citable form) on websites like WoRMS, with little or no gain in data quality. Implicitly, it is an admission that websites like WoRMS aren’t really that reliable (which is why they tend to make liberal use of disclaimers). However, the Wikispecies approach is probably best (i.e. make it all user verifiable).


    1. I don’t think I can comment on that, since I do not have information about the issues you raise nor what the motivation might be. As I said, that is not the impression I got from chatting with Mark, quite the opposite. But thanks for bringing in a different point of view to the discussion. By the way, Daniel Mietchen (also a finalist) had pointed me to WikiSpecies a while back – loved the site. Somehow it dropped off my radar.


  5. It boils down to this: a war between those who insist “we are the experts, and we say it is so”, and those, like Wikispecies, who make no such claim, but instead explicitly cite all sources and say “if you don’t believe what I’ve written, go check for yourself”. The typical user will not want to bother to check for themselves, and will want to be spoon fed reliable data. So, some databases go all out to play the “we are the experts, you can trust us” card, but in reality you can’t …

