On the importance of tools in software engineering research
On the first weekend of June, I was at the Mining Software Repositories (MSR) 2012 conference. For those not familiar with MSR, it is a venue where software engineering meets information extraction and data mining. Researchers present the tools and methods that they apply to software repositories (source code repositories, but also bug databases, mailing lists and wikis) to understand how software is written and how its quality is affected by certain events in the project’s history. Due to its wide scope, MSR is always a bit uneven with respect to the quality of the papers presented. This year, however, there were some really great submissions.
One of the most interesting talks was Dongmei Zhang’s keynote address on the first day. Dongmei is a senior researcher at Microsoft Research Asia, where she leads the development analytics project. During her presentation, she told some great tales from the research vs practice battlefield. One of them concerned a code clone detection tool that successfully graduated from Microsoft Research to internal Microsoft teams and finally to a Visual Studio 2012 plug-in. Dongmei explained that the most important reason for this tool’s success was not the research upon which it was based, but the fact that it was a TOOL. Imperfect in the beginning, its speed and accuracy improved once suggestions from users started pouring in. What she learned from this experience was that producing reusable tools out of the research mattered more than doing the research itself. ‘Make tools. “It works on my computer” is no longer enough’, as she put it.
I was curious whether the above applied to the papers presented at the very same conference, on the day of Dongmei’s keynote and the next. To find out, I went through each paper and looked for pointers to the tools or datasets used. I also Googled the paper titles, hoping that the authors had put together a page containing the paper’s data or tools, as is often the case.
The following table summarizes what I have found:
Paper | Data | Tools | Documentation | Comment |
Towards Improving Bug Tracking Systems with Game Mechanisms | Partial | No | No | |
GHTorrent: Github's Data from a Firehose | Yes | Yes | Partial | |
MIC Check: A Correlation Tactic for ESE Data | No | No | No | |
An Empirical Study of Supplementary Bug Fixes | No | No | No | |
Incorporating Version Histories in Information Retrieval Based Bug Localization | Yes | No | Yes | Uses existing documented dataset |
Think Locally, Act Globally: Improving Defect and Effort Prediction Models | No | No | No | Promise to upload data |
Green Mining: A Methodology of Relating Software Change to Power Consumption | No | No | No | Best paper award |
Analysis of Customer Satisfaction Survey Data | No | No | No | Not based on open data |
Mining Usage Data and Development Artifacts | No | No | No | |
Why Do Software Packages Conflict? | No | No | No | Original data in Debian repository |
Discovering Complete API Rules with Mutation Testing | Yes | Yes | Yes | Not open source |
Inferring Semantically Related Words from Software Context | No | No | No | |
Do Faster Releases Improve Software Quality? An Empirical Case Study of Mozilla Firefox | No | No | No | |
Explaining Software Defects Using Topic Models | No | No | No | |
A Qualitative Study on Performance Bugs | No | No | No | |
Can We Predict Types of Code Changes? An Empirical Analysis | No | Yes (most) | No | |
An Empirical Investigation of Changes in Some Software Properties Over Time | Yes | No | Yes | Uses existing dataset |
Who? Where? What? Examining Distributed Development in Two Large Open Source Projects | Yes (partially) | No | No | Paper mentions that the data is in the PROMISE repository, but it could not be retrieved at the date of the conference. |
As you can see, the results are not particularly encouraging. At one of the most prominent empirical software engineering conferences, only two out of 18 papers make their tools fully available (and I have not even investigated how reusable those tools are).
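For readers who want to check these counts, here is a minimal sketch in Python that tallies the table above. The one-letter encoding (`Y`, `P`, `N`) and the names `ROWS` and `tally` are my own shorthand for this illustration, not something taken from the papers; every partial answer in the table ("Partial", "Yes (most)", partially available data) is folded into `P`.

```python
from collections import Counter

# One entry per paper, in the same order as the table above.
# Each entry encodes (Data, Tools, Documentation) with an assumed shorthand:
#   "Y" = Yes, "P" = any partial answer, "N" = No.
ROWS = [
    "PNN", "YYP", "NNN", "NNN", "YNY", "NNN",
    "NNN", "NNN", "NNN", "NNN", "YYY", "NNN",
    "NNN", "NNN", "NNN", "NPN", "YNY", "PNN",
]

COLUMNS = ("data", "tools", "documentation")


def tally(rows):
    """Count Y/P/N answers separately for each availability column."""
    counts = {column: Counter() for column in COLUMNS}
    for row in rows:
        for column, answer in zip(COLUMNS, row):
            counts[column][answer] += 1
    return counts


if __name__ == "__main__":
    result = tally(ROWS)
    for column in COLUMNS:
        c = result[column]
        print(f"{column}: {c['Y']} yes, {c['P']} partial, {c['N']} no "
              f"(of {len(ROWS)} papers)")
```

With the table as transcribed here, the tools column comes out as 2 yes, 1 partial and 15 no, which matches the two-out-of-18 figure.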
In my opinion, what applies in practice should also apply in research. As researchers, we are often hesitant to provide reusable tools. Many times, this is because going the extra mile to convert our ‘works on my computer’ scripts into tools is very time consuming and lacks any direct scientific value (i.e. it does not lead to papers). Some of us might even be afraid of competing teams: a published tool might allow others to find flaws in our research, or let a more resourceful team leap ahead of us using our effort.
Publishing a tool along with a paper has several advantages for research as a whole:
- It enables research to become repeatable, facilitating both horizontal (more hypotheses) and vertical (more data) scaling of research efforts.
- It enables research to become reproducible, leading to more credible results.
- It enables people to become creative with someone else’s effort. This is precisely what made open source software successful, and it applies to research tools too (see for example LLVM or JikesRVM).
I believe that publishing reusable tools (plus data and documentation) should be a prerequisite to publishing papers, especially in empirical venues. I therefore hope that efforts such as the RESER workshop will raise awareness of the importance of tools in software engineering research.
Why do you think that people are not investing time to create tools?