Note: This is a reblog from the OKFN Science Blog. As part of my duties as a Panton Fellow, I will be regularly blogging there about my activities concerning open data and open science.
Peer review is one of the oldest and most respected instruments of quality control in science and research. Peer review means that a paper is evaluated by a number of experts on the topic of the article (the peers). The criteria may vary, but most of the time they include methodological and technical soundness, scientific relevance, and presentation.
“Peer-reviewed” is a widely accepted sign of quality of a scientific paper. Peer review has its problems, but you won’t find many researchers who favour a non peer-reviewed paper over a peer-reviewed one. As a result, if you want your paper to be scientifically acknowledged, you most likely have to submit it to a peer-reviewed journal, even though it will take more time and effort to get it published than in a non peer-reviewed publication outlet.
Peer review helps to weed out bad science and pseudo-science, but it also has serious limitations. One of these limitations is that the primary data and other supplementary material, such as documentation and source code, are usually not available. The results of the paper are thus not reproducible. When I review such a paper, I usually have to trust the authors on a number of issues: that they have described the process of achieving the results as accurately as possible, that they have not left out any crucial pre-processing steps, and so on. When I suspect a certain bias in a survey, for example, I can only note that in the review, but I cannot test for that bias in the data myself. When the results of an experiment seem to be too good to be true, I cannot inspect the data pre-processing to see if the authors left out any important steps.
As a result, later attempts to reproduce research results can have devastating outcomes. Wang et al. (2010), for example, found that they could reproduce almost none of the literature on a certain topic in computer science.
“Reproducible”: a new quality criterion
Needless to say, this is not a very desirable state. Therefore, I argue that we should start promoting a new quality criterion: “reproducible”. Reproducible means that the results achieved in the paper can be reproduced by anyone because all of the necessary supplementary resources have been openly provided along with the paper.
It is easy to see why a peer-reviewed and reproducible paper is of higher quality than just a peer-reviewed one. You do not have to take the researchers’ word for how they calculated their results: you can reconstruct them yourself. As a welcome side-effect, this would make more datasets and source code openly available. Thus, we could start building on each other’s work and aggregate data from different sources to gain new insights.
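To make "reconstruct them yourself" a bit more concrete, here is a minimal sketch in Python of what a reader could do with a reproducible paper. It assumes the authors published their raw data together with a checksum and a reported statistic. The function names, the tiny dataset, and the reported mean are all hypothetical and only serve as an illustration; a real paper would of course involve a far more elaborate analysis.

```python
import hashlib
import statistics


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw data."""
    return hashlib.sha256(data).hexdigest()


def reproduce_mean(raw_csv: bytes, expected_sha256: str,
                   reported_mean: float, tolerance: float = 1e-9) -> bool:
    """Check that the data is the published data, then recompute the mean.

    Hypothetical helper: assumes a single-column CSV with one header line.
    """
    # Step 1: make sure we are working on exactly the data the authors used.
    if sha256_of(raw_csv) != expected_sha256:
        raise ValueError("data does not match the published checksum")

    # Step 2: redo the (here trivial) analysis from the raw data.
    lines = raw_csv.decode("utf-8").splitlines()[1:]  # skip the header
    values = [float(line) for line in lines if line.strip()]

    # Step 3: compare our result with the number reported in the paper.
    return abs(statistics.mean(values) - reported_mean) <= tolerance


# Made-up stand-ins for the material a paper would provide.
published_data = b"score\n4.0\n5.0\n6.0\n"
published_checksum = sha256_of(published_data)

print(reproduce_mean(published_data, published_checksum, reported_mean=5.0))
```

If the recomputed value differs from the reported one, or the data does not match the checksum, a reviewer has something specific to ask the authors about instead of having to take the result on trust.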
In my opinion, reproducible papers could be published alongside non-reproducible papers, just like peer-reviewed articles are usually published alongside editorials, letters, and other non peer-reviewed content. I would think, however, that over time, reproducible would become the overall quality standard of choice – just like peer-reviewed is the preferred standard right now. To help this process, journals and conferences could designate a certain share of their space to reproducible papers. I would imagine that they would not have to do that for too long though. Researchers will aim for a higher quality standard, even if it takes more time and effort.
I do not claim that reproducibility solves all of the problems that we see in science and research right now. For example, it will still be possible to manipulate the data to a certain degree. I do, however, believe that reproducibility as an additional quality criterion would be an important step for open and reproducible science and research.
So that you can say to your colleague one day: “Let’s go with the method described in this paper. It’s not only peer-reviewed, it’s reproducible!”