Thursday 16 February 2012

An online repository for sharing experiments?

Have you ever read a psychology/neuroscience journal article and wondered if the information the authors had given you in the methods section was really sufficient for you to replicate the study?

Have you ever wanted to start a study with a new piece of software or something outside your normal method, and wished there was some existing experiment code that you could adapt for your needs?

A couple of people on the PsychoPy users list have suggested that it would be good to have a place to upload experimental code and materials to share.

It would serve a few purposes:
  • makes a study genuinely replicable, because you would be able to fetch the actual experiment exactly as the authors ran it
  • publicises an experiment that you've run, because people could browse the repository looking for experiments they find interesting
  • provides a starting point for new users of a piece of software to build an experiment
The first goal can actually also be met by uploading your experiment to your own lab web pages, but that solution doesn't address the second and third points.

The repository would be agnostic to the subject of the study, and to the software used to run it. You would upload all the materials needed to run it (code, image files etc.), tag which software package it was written for (PsychoPy, E-Prime, Presentation, Psychtoolbox etc.), and provide a summary of what results should be expected and a reference to the paper reporting the original study (if published). You would then provide keywords about the topic that the experiment addresses, so that other users can search by topic, keyword or software package to find experiments to learn from or replicate.
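
To make that concrete, here is a minimal sketch in Python of what a repository entry and a search over entries might look like. Nothing here is an existing site or API: the class `ExperimentEntry`, its field names, the `search` function and the two sample entries are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExperimentEntry:
    """One uploaded experiment. All field names are hypothetical."""
    title: str
    software: str                    # package it was written for, e.g. "PsychoPy"
    keywords: List[str]              # topics the experiment addresses
    files: List[str]                 # code, image files etc. needed to run it
    expected_results: str            # summary of what results to expect
    reference: Optional[str] = None  # original paper, if published


def search(entries, keyword=None, software=None):
    """Return entries matching an optional keyword and/or software package."""
    results = []
    for entry in entries:
        # Skip entries written for a different package (case-insensitive)
        if software is not None and entry.software.lower() != software.lower():
            continue
        # Skip entries that do not carry the requested keyword
        if keyword is not None and keyword.lower() not in [k.lower() for k in entry.keywords]:
            continue
        results.append(entry)
    return results


# Two made-up entries, for illustration only
repository = [
    ExperimentEntry(
        title="Stroop task",
        software="PsychoPy",
        keywords=["attention", "stroop"],
        files=["stroop.psyexp", "conditions.csv"],
        expected_results="Slower responses on incongruent trials.",
    ),
    ExperimentEntry(
        title="Posner cueing",
        software="E-Prime",
        keywords=["attention", "cueing"],
        files=["posner.es2"],
        expected_results="Faster responses at validly cued locations.",
    ),
]

for entry in search(repository, keyword="attention", software="PsychoPy"):
    print(entry.title)
```

A real site would need much more than this (file storage, user accounts, versioning), but the metadata itself could stay about this simple.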

Potential issues

A few people have raised concerns about the idea:

  • Will it lead people to run studies that they don't actually understand? For example, see this post on eagle-eyed-autism describing a study going badly wrong because the authors had borrowed code and hadn't really understood it. Is the answer to make sure it's very difficult to run studies, so that scientists have to really know what they're doing in order to manage? That seems more than a little arrogant.
  • Will errors in studies propagate more? If a study has an error, another lab writing it from scratch will probably not make the same error, but if they borrow and tweak the code the bug could propagate. I think this is outweighed by the benefit that more eyes potentially examine the experiment, which should reduce the propagation of bugs.
  • Why should someone else simply take the experiment that I spent hours writing? To me this one just seems blatantly at odds with the aims and philosophy of science. But I guess some people will feel territorial like that.
  • People would never use such a site (unless forced) because they will be too embarrassed by the quality of their code, which was, after all, designed to work without necessarily being elegant. I'm fairly sympathetic to this (although I've obviously shared many thousands of lines of my own code). But some people will be brave enough to expose their work fully, especially if it was generated by something like E-Prime or PsychoPy Builder, where the need to actually write code is reduced.

The idea is definitely growing on me, although I don't currently have the time to build the site, nor the funding to pay someone to build it.

I'm keen to hear more views, so feel free to comment below. Hopefully the idea will also be discussed as part of a satellite event on open science at the Vision Sciences Society conference this May.

6 comments:

  1. Neat idea. You may be interested in seeing how economists have implemented just such a site at http://www.runmycode.org. It seems like people are actually using it.

    Of course a much lower bar is simply to provide the code in the first place (e.g. see the http://sciencecodemanifesto.org/, & various articles in Nature & Science that have called for this, somewhat in vain). Github seems like a convenient place to start with this.

    Replies
    1. Thanks for the links to those sites. They're both very interesting.

  2. This comment has been removed by a blog administrator.

  3. The use of Github / Bitbucket seems useful to me for this. Lately, I've started keeping all my analysis code (and manuscripts, as .tex files) under version control using private repositories on Bitbucket (who allow free private repositories for a limited number of collaborators, unlike Github who charge by the number of private repositories).

    My eventual plan is to simply toggle the switch from "private" to "public" once the paper is published, whereupon all the code that produced the paper is available along with a version history.

    You can see an early version of this here: https://github.com/tomwallis/microperimetry_faces. I only have analysis scripts up at the moment as the paper is currently submitted -- reviewers will be able to look at my analysis and the data if they like, as part of the paper review.

  4. Update: there's now a dedicated site called openscienceframework.org for uploading materials from any scientific project for free. It allows private and public projects, and private projects can still be shared with collaborators.
    In addition it has facilities for things like pre-registration of a hypothesis. (Technically I believe it is essentially a web front-end to a git repository server, so it does actually store all versions of the work, but it means people don't have to know anything about version control systems to use it.)
    I intend to use it for all new projects from here onwards, and my first paper to be published with access to all code and data can be found there right now (the manuscript will be published any day now):
    http://openscienceframework.org/project/nqWSs/

  5. Hello Jon,

    Very fair review, thanks for that! Adding strings has worked in MATLAB since the introduction of string arrays in release R2016b:

    >> str1 = "Hello";
    >> str2 = " World";
    >> str1+str2

    ans =

    "Hello World"

    Kind regards, Martin
