
Should I cite?

In the old days things were simple – if you borrowed data, an idea, a method, or any specific piece of information, you knew you needed to cite the source of such wisdom. With the rise of online communication these lines have become more blurred, especially in the domain of research software.

Although we use a wide variety of software to conduct our research, it is not immediately obvious which tools deserve a formal citation, which should merely be mentioned, and which can be left out completely. Imagine three researchers doing exactly the same piece of data analysis: the first uses Excel, the second SPSS, and the third codes it up in R. The chances are that the Excel scholar won’t disclose which particular tool allowed them to calculate the p-values; the SPSS user will probably mention what they used, including the version of the software and the particular function employed; and the R wizard is quite likely to actually cite R in the same way as they would cite a journal paper.
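
Getting that citation out of R takes a single line, which may explain why R users cite so readily. A minimal sketch (the ggplot2 call is just an example and assumes that package happens to be installed):

```r
# Ask R how it wants to be cited; this prints a ready-made reference
# for the exact version you are running.
citation()

# The same works for any installed package (ggplot2 is only an example):
citation("ggplot2")

# Export the reference as BibTeX for your reference manager:
toBibtex(citation())
```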

You may think this is not a big deal and we are talking about the fringes of science, but in fact it is. As everyone who has ever tried to replicate (or even just run) someone else’s simulation will tell you, without detailed information on the software that was used, replication ranges from very difficult to virtually impossible. But apart from the reproducibility of research there is also the issue of credit. Some (probably most) of the software tools we are using were developed by people in research positions – while their colleagues were producing papers, they spent their time developing code. In the world of publish or perish they may be severely disadvantaged if their effort is not credited in the same way as their colleagues’ publications. Spending two years developing a software tool that is used by hundreds of other researchers and not getting a job because the other candidate had published three conference papers in the meantime sounds like a rough deal.

To make it easier to navigate this particular corner of academia, we teamed up with research software users and developers during the Software Sustainability Institute Hackday and created a simple chart and a website to help you decide when (and when not) to cite research software.

[Comic: the “should I cite?” decision chart for research software]

If you’re still unsure, check out the website we put together for more information about research software credit, including a short guide on how to get people to cite YOUR software. Also, keep in mind that any model uploaded to OpenABM gets a citation and a DOI, making it easy to cite.
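
If your software happens to be an R package, one easy way to nudge users in the right direction is to ship a CITATION file with it, so that citation("yourpackage") prints exactly the reference you want used. Below is a minimal sketch; every name, title, and URL in it is made up for illustration:

```r
# inst/CITATION -- picked up by citation("mymodel"); all details hypothetical.
citHeader("To cite the mymodel package in publications, please use:")

bibentry(
  bibtype = "Manual",
  title   = "mymodel: An Agent-Based Model of Something Interesting",
  author  = person("Jane", "Doe"),
  year    = "2016",
  note    = "R package version 0.1.0",
  url     = "https://github.com/janedoe/mymodel"
)
```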

New tool for reproducible research – The ReScience Journal

An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. – Buckheit and Donoho 1995

In 2003 Bruce Edmonds and David Hales called their paper ‘Replication, Replication and Replication: Some Hard Lessons from Model Alignment’, expressing both the necessity of replicating computational models and the little-appreciated but significant effort that goes into such studies.

In our field, replication usually amounts to re-writing the simulation’s code. It is not an easy task, because algorithms and implementation details are particularly difficult to communicate, and even if the code is made available, simply copying it would be pointless. Equally, publishing one’s replication is not straightforward as, again, the language of communication is primarily the code.

The ReScience Journal is a brand new (just over one month old) journal dedicated to publishing replication studies. What sets it apart is that it is GitHub-based! Yes, you read that right – it is a journal that is (almost) entirely a code repository. This simplifies the whole process and helps with the issue of ‘failed replications’ (when the replication, rather than the original study, has a bug). You upload your replication code and other researchers can simply fork your implementation. How come nobody thought of this earlier?