Untangling Fine-Grained Code Changes

by Dias, Martin and Bacchelli, Alberto and Gousios, Georgios and Cassou, Damien and Ducasse, Stephane

You can get a pre-print version from here.
You can view the publisher's page here.

This paper received a nomination for the "SANER best paper" award.


After working for some time, developers commit their code changes to a version control system. When doing so, research shows that they often bundle unrelated changes (e.g., bug fix and refactoring) in a single commit, thus creating a so-called tangled commit. Sharing tangled commits is problematic because it makes review, reversion, and integration of these commits harder and historical analyses of the project less reliable. Researchers have worked at untangling existing commits, i.e., finding which part of a commit relates to which task. In this paper, we contribute to this line of work in two ways: (1) a publicly available dataset of untangled code changes, created with the help of two developers who accurately split their code changes into self-contained tasks over a period of four months; (2) based on this dataset, we devise and assess EpiceaUntangler, an approach to help developers share untangled commits (a.k.a. atomic commits) by using fine-grained code change information. We further evaluate EpiceaUntangler by deploying it to 7 developers, who used it for 2 weeks. We recorded a median success rate of 91% and an average success rate of 75% in automatically creating clusters of untangled fine-grained code changes.

Bibtex record

  author = {Dias, Martin and Bacchelli, Alberto and Gousios, Georgios and Cassou, Damien and Ducasse, Stephane},
  title = {Untangling Fine-Grained Code Changes},
  booktitle = {Proceedings of the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering},
  series = {SANER 2015},
  year = {2015},
  location = {Montreal, Canada},
  pages = {341--350},
  doi = {10.1109/SANER.2015.7081844},
  url = {/pub/fine-untangling.pdf},
  nomination = {SANER best paper}

The paper