Software analytics is a modern term for the use of empirical (mostly quantitative) research methods on software data.
In this lecture, we will:
Quantitative software engineering is a subset of empirical software engineering, a discipline that applies empirical research methods to the study of software engineering.
D: Can you identify potential applications of quantitative Software Engineering?
Empiricism is a philosophical theory that states that true knowledge can only arise by systematically observing the world.
Types of empirical research:
Empirical research requires the collection of data to answer research questions (RQs).
Qualitative research methods collect non-numerical data
Quantitative methods use mathematical, statistical or numerical techniques to process numerical data:
A hypothesis proposes an explanation for a phenomenon
Hypotheses are defined in pairs: a null hypothesis (\(H_0\)) and an alternative hypothesis (\(H_1\))
A good hypothesis is readily falsifiable.
Most statistical tests return a probability \(p\): the likelihood of observing data at least as extreme as the measured data, assuming \(H_0\) is true.
To interpret a test, we set a threshold (usually 0.05) for \(p\)
If \(p <\) threshold, the null hypothesis is rejected and the alternative one is accepted
We need to know beforehand what each statistical test assumes and what it measures
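As a minimal illustration of this workflow (the data, metric and threshold below are made up for the example, not taken from the lecture), here is a non-parametric test in Python with scipy:

```python
# Null-hypothesis testing sketch with an off-the-shelf statistical test.
# H0: both samples come from the same distribution.
# H1: they do not (e.g., code review changes the defect count per file).
from scipy.stats import mannwhitneyu

# Hypothetical data: defects per file, with and without code review.
reviewed     = [0, 1, 0, 2, 1, 0, 0, 1, 3, 0]
not_reviewed = [2, 3, 1, 4, 2, 5, 1, 3, 2, 4]

# Mann-Whitney U is non-parametric: it does not assume normally
# distributed data, which is one reason we must know what each test assumes.
statistic, p = mannwhitneyu(reviewed, not_reviewed, alternative="two-sided")

alpha = 0.05  # the conventional threshold
if p < alpha:
    print(f"p = {p:.4f} < {alpha}: reject H0, accept H1")
else:
    print(f"p = {p:.4f} >= {alpha}: H0 cannot be rejected")
```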
A theory is a proposed explanation for an observed phenomenon. It (usually) specifies entities and prescribes their interactions. Using a theoretical model, we can explain and predict observed behaviour.
Q: How can we build or dismantle a theory?
Theories are built by generalizing over consecutive research results.
A single contradicting data point is enough to reject a theory.
Extract samples of data from a running process. Data types:
McCabe’s complexity [1]: An attempt to quantify complexity at the function level by counting the number of branch points (see the sketch below).
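A rough, illustrative approximation of this idea in Python, using the standard ast module (this is a simplified branch count, not McCabe’s original control-flow-graph definition):

```python
# Approximate cyclomatic complexity by counting decision points
# (+1 for the single entry path) in each function of a Python file.
import ast
import sys

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(func: ast.FunctionDef) -> int:
    complexity = 1  # one linear path through the function
    for node in ast.walk(func):
        if isinstance(node, DECISION_NODES):
            complexity += 1
    return complexity

if __name__ == "__main__":
    tree = ast.parse(open(sys.argv[1]).read())
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            print(node.name, cyclomatic_complexity(node))
```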
Halstead’s software science [2]: An attempt to derive empirical laws relating measurable properties of programs (volume, difficulty, effort) from counts of operators and operands; the core definitions are given below.
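For reference, the best-known Halstead measures are defined from the numbers of distinct operators \(\eta_1\) and operands \(\eta_2\), and their total occurrences \(N_1\) and \(N_2\):

\[
\eta = \eta_1 + \eta_2, \qquad N = N_1 + N_2, \qquad V = N \log_2 \eta
\]

where \(\eta\) is the vocabulary, \(N\) the program length and \(V\) the program volume.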
Curtis et al. [3] found that: “All three metrics (Halstead volume, McCabe complexity, LoCs) correlated with both the accuracy of the modification and the time to completion.”
they just work!
Boehm [4] defined the COCOMO model, an effort to quantify and predict software cost:
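The basic COCOMO equations have the following form (a sketch of the standard formulation; the exact constants depend on the project class):

\[
\text{Effort} = a \cdot (\text{KLOC})^{b} \ \text{person-months}, \qquad
\text{Time} = c \cdot (\text{Effort})^{d} \ \text{months}
\]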
The constants \(a, b, c\) and \(d\) were estimated from case studies.
Both COCOMO and function points are widely used today for cost estimation.
Manny Lehman [5] defined a set of laws that characterise how software evolves (and ultimately predict its demise)
Using metrics to define product and process quality
The ISO 9126 standard
Basili [6], [7]: The Goal-Question-Metric approach:
A goal is stated as follows:
What | Example |
---|---|
Object of study | A tool or a practice |
Purpose | Characterize, improve, predict, etc. |
Focus | The perspective from which to study the problem |
Stakeholder | Who is concerned with the result? |
Context | Confounding factors (e.g. company, environment) |
The GQM approach is another way of describing the scientific method.
Mockus et al.: “Two case studies of open source software development: Apache and Mozilla” [9]
Not the first to use OSS data, but:
von Krogh et al.: “Community, joining, and specialization in open source software innovation: a case study” [10]
Defined what is now the standard vocabulary of OSS research:
Herbsleb and Mockus: “An empirical study of speed and communication in globally distributed software development” [11]
Zimmermann et al.: “Mining Version Histories to Guide Software Changes” [12]
Very important work because:
Nagappan et al.: “Mining Metrics to Predict Component Failures” [13]
Heitlager et al.: “A Practical Model for Measuring Maintainability” [14]
Noteworthy findings (at the file level):
Predicting component failures: Hassan [15] found a connection between process metrics (the complexity of code changes) and bugs
Distributed software development: Bird et al. [16] found that software quality is not affected by distance
No model to rule them all: Zimmermann et al. [17] established that software projects differ, and therefore prediction models need to be localised and specialised.
Naturalness: Hindle et al. [18] found that “code is very repetitive, and in fact even more so than natural languages”
In the early 2010s, the velocity of software production increased at a breakneck rate
GitHub revolutionized OSS by centralizing it. Anyone can contribute (and contribute they do!).
AppStores made discoverability and distribution to the end user trivial.
The cloud transformed hardware into software.
Software analytics was coined as a term to help teams improve their performance
Big Software: GHTorrent (Gousios [19]) made TBs of GitHub data available to researchers. Inspired TravisTorrent [20] and SOTorrent [21]
Big testing: Herzig et al. [22] developed “a cost model, which dynamically skips tests when the expected cost of running the test exceeds the expected cost of removing it. ”
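A deliberately simplified sketch of such a cost-based skipping rule (the cost figures, fields and values below are illustrative assumptions, not the actual model of [22]):

```python
# Simplified cost-based test selection: skip a test when the expected
# cost of running it exceeds the expected cost of the defects it would
# have caught (i.e., the cost of removing it from the test runs).
from dataclasses import dataclass

@dataclass
class TestStats:
    name: str
    exec_cost: float     # cost of one run (e.g., machine minutes)
    failure_rate: float  # fraction of runs in which the test finds a real defect
    escape_cost: float   # cost incurred if such a defect slips through

def should_skip(t: TestStats) -> bool:
    expected_cost_of_running = t.exec_cost
    expected_cost_of_skipping = t.failure_rate * t.escape_cost
    return expected_cost_of_running > expected_cost_of_skipping

tests = [
    TestStats("fast_unit_test", exec_cost=0.1, failure_rate=0.02, escape_cost=60),
    TestStats("slow_integration_test", exec_cost=45, failure_rate=0.001, escape_cost=120),
]
for t in tests:
    print(t.name, "skip" if should_skip(t) else "run")
```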
Big security: Gorla et al. [23] “after clustering Android apps by their description topics, (we) identified outliers in each cluster with respect to their API usage.”
Code summarization: Allamanis et al. [24] use a convolutional attention network to automatically name methods based on their contents
Code search: Gu et al. [25] retrieve code snippets using natural language queries
PR Duplicates: Nijessen [26] used deep learning to find duplicate PRs
An overview can be seen in this taxonomy.
In this course, we will focus on state-of-the-art research in the areas of:
Modern software development
Ref | Who? | Definition |
---|---|---|
[27] | Hassan | [Software Intelligence] offers software practitioners (not just developers) up-to-date and pertinent information to support their daily decision-making processes. |
[28] | Buse | The idea of analytics is to leverage potentially large amounts of data into real and actionable insights. |
[29] | Zhang | Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services. |
[30] | Menzies | Software analytics is analytics on software data for managers and software engineers with the aim of empowering software development individuals and teams to gain and share insight from their data to make better decisions. |
D: So what is software analytics?
The broader goal of software analytics is to extract value from the data traces residing in software repositories, in order to help developers write better software.
The software analytics feedback loop
[1] T. J. McCabe, “A complexity measure,” IEEE Transactions on Software Engineering, vol. SE-2, no. 4, pp. 308–320, Dec. 1976.
[2] M. H. Halstead and others, Elements of software science (operating and programming systems series). Elsevier Science Inc., New York, NY, 1977.
[3] B. Curtis, S. B. Sheppard, P. Milliman, M. A. Borst, and T. Love, “Measuring the psychological complexity of software maintenance tasks with the Halstead and McCabe metrics,” IEEE Transactions on Software Engineering, vol. SE-5, no. 2, pp. 96–104, 1979.
[4] B. W. Boehm and others, Software engineering economics, vol. 197. Prentice-Hall, Englewood Cliffs, NJ, 1981.
[5] M. M. Lehman, “Programs, life cycles, and laws of software evolution,” Proceedings of the IEEE, vol. 68, no. 9, pp. 1060–1076, 1980.
[6] V. R. Basili, R. W. Selby, and D. H. Hutchens, “Experimentation in software engineering,” IEEE Transactions on Software Engineering, no. 7, pp. 733–743, 1986.
[7] V. R. Basili, “Software modeling and measurement: The goal/question/metric paradigm,” 1992.
[8] S. R. Chidamber and C. F. Kemerer, “A metrics suite for object oriented design,” IEEE Transactions on Software Engineering, vol. 20, no. 6, pp. 476–493, 1994.
[9] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: Apache and Mozilla,” ACM Trans. Softw. Eng. Methodol., vol. 11, no. 3, pp. 309–346, Jul. 2002.
[10] G. von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: A case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.
[11] J. D. Herbsleb and A. Mockus, “An empirical study of speed and communication in globally distributed software development,” IEEE Transactions on Software Engineering, vol. 29, no. 6, pp. 481–494, 2003.
[12] T. Zimmermann, P. Weisgerber, S. Diehl, and A. Zeller, “Mining version histories to guide software changes,” in Proceedings of the 26th international conference on software engineering, 2004, pp. 563–572.
[13] N. Nagappan, T. Ball, and A. Zeller, “Mining metrics to predict component failures,” in Proceedings of the 28th international conference on software engineering, 2006, pp. 452–461.
[14] I. Heitlager, T. Kuipers, and J. Visser, “A practical model for measuring maintainability,” in 6th international conference on the quality of information and communications technology (QUATIC 2007), 2007, pp. 30–39.
[15] A. E. Hassan, “Predicting faults using the complexity of code changes,” in Proceedings of the 31st international conference on software engineering, 2009, pp. 78–88.
[16] C. Bird, N. Nagappan, P. Devanbu, H. Gall, and B. Murphy, “Does distributed development affect software quality? An empirical case study of Windows Vista,” in Proceedings of the 31st international conference on software engineering, 2009, pp. 518–528.
[17] T. Zimmermann, N. Nagappan, H. Gall, E. Giger, and B. Murphy, “Cross-project defect prediction: A large scale experiment on data vs. domain vs. process,” in Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on the foundations of software engineering, 2009, pp. 91–100.
[18] A. Hindle, E. T. Barr, Z. Su, M. Gabel, and P. Devanbu, “On the naturalness of software,” in Proceedings of the 34th international conference on software engineering (ICSE), 2012, pp. 837–847.
[19] G. Gousios, “The GHTorrent dataset and tool suite,” in Proceedings of the 10th working conference on mining software repositories, 2013, pp. 233–236.
[20] M. Beller, G. Gousios, and A. Zaidman, “TravisTorrent: Synthesizing Travis CI and GitHub for full-stack research on continuous integration,” in Proceedings of the 14th working conference on mining software repositories, 2017.
[21] S. Baltes, L. Dumani, C. Treude, and S. Diehl, “SOTorrent: Reconstructing and analyzing the evolution of stack overflow posts,” in Proceedings of the 15th international conference on mining software repositories, 2018, pp. 319–330.
[22] K. Herzig, M. Greiler, J. Czerwonka, and B. Murphy, “The art of testing less without sacrificing quality,” in Proceedings of the 37th international conference on software engineering - volume 1, 2015, pp. 483–493.
[23] A. Gorla, I. Tavecchia, F. Gross, and A. Zeller, “Checking app behavior against app descriptions,” in Proceedings of the 36th international conference on software engineering, 2014, pp. 1025–1035.
[24] M. Allamanis, H. Peng, and C. Sutton, “A convolutional attention network for extreme summarization of source code,” in International conference on machine learning, 2016, pp. 2091–2100.
[25] X. Gu, H. Zhang, and S. Kim, “Deep code search,” in Proceedings of the 40th international conference on software engineering, 2018, pp. 933–944.
[26] R. Nijessen, “A case for deep learning in mining software repositories,” TU Delft, Delft, NL, Nov. 2017.
[27] A. E. Hassan and T. Xie, “Software intelligence: The future of mining software engineering data,” in Proceedings of the FSE/SDP workshop on future of software engineering research, 2010, pp. 161–166.
[28] R. P. L. Buse and T. Zimmermann, “Analytics for software development,” in Proceedings of the FSE/SDP workshop on future of software engineering research, 2010, pp. 77–80.
[29] D. Zhang, Y. Dang, J.-G. Lou, S. Han, H. Zhang, and T. Xie, “Software analytics as a learning case in practice: Approaches and experiences,” in Proceedings of the international workshop on machine learning technologies in software engineering, 2011, pp. 55–58.
[30] T. Menzies and T. Zimmermann, “Software analytics: So what?” IEEE Software, vol. 30, no. 4, pp. 31–37, 2013.
The course contents are copyrighted (c) 2018 - onwards by TU Delft and their respective authors and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.