Introduction

Logging is a popular practice in software development, and engineers rely on it for several purposes (e.g., debugging, performance monitoring, and root cause analysis). Unfortunately, logging is challenging. Previous research highlights that, in practice, developers often spend time implementing and maintaining log statements on a trial-and-error basis [1], [2], [3]. Furthermore, improper logging leads to several problems that hinder the use of log data (e.g., an overwhelming volume of data and missing information) [4]. How can one determine the utility of a log statement at development time?

Researchers have proposed techniques to help developers make informed logging decisions (e.g., where to log [5], [6], [7], which variables to log [8], and which log level to use [9]). Machine learning is a good fit in this domain because it can learn patterns of logging code that are too complex or infeasible to encode in a traditional static analyzer.

Papers

In this session, we will focus on the log placement problem, i.e., “where to log”. Concretely, we will discuss a state-of-the-art technique based on deep learning [7]. This paper is a step forward from LogAdvisor [6], an earlier technique based on traditional machine learning and feature engineering. You are welcome to check the LogAdvisor paper for more historical background (but keep in mind that it is not required for this session).

Bibliography

[1]
D. Yuan, S. Park, and Y. Zhou, “Characterizing logging practices in open-source software,” in Proceedings of the 34th International Conference on Software Engineering, 2012, pp. 102–112.
[2]
B. Chen and Z. M. (Jack) Jiang, “Characterizing logging practices in Java-based open source software projects – a replication study in Apache Software Foundation,” Empirical Software Engineering, vol. 22, no. 1, pp. 330–374, Feb. 2017.
[3]
S. Kabinna, C.-P. Bezemer, W. Shang, M. D. Syer, and A. E. Hassan, “Examining the stability of logging statements,” Empirical Software Engineering, vol. 23, no. 1, pp. 290–333, Feb. 2018.
[4]
M. Hassani, W. Shang, E. Shihab, and N. Tsantalis, “Studying and detecting log-related issues,” Empirical Software Engineering, vol. 23, no. 6, pp. 3248–3280, Dec. 2018.
[5]
H. Li, T.-H. (Peter) Chen, W. Shang, and A. E. Hassan, “Studying software logging using topic models,” Empirical Software Engineering, vol. 23, no. 5, pp. 2655–2694, Oct. 2018.
[6]
J. Zhu, P. He, Q. Fu, H. Zhang, M. R. Lyu, and D. Zhang, “Learning to Log: Helping Developers Make Informed Logging Decisions,” in 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, 2015, vol. 1, pp. 415–425.
[7]
Z. Li, T.-H. (Peter) Chen, and W. Shang, “Where Shall We Log? Studying and Suggesting Logging Locations in Code Blocks,” in 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2020.
[8]
Z. Liu, X. Xia, D. Lo, Z. Xing, A. E. Hassan, and S. Li, “Which Variables Should I Log?” IEEE Transactions on Software Engineering, 2019.
[9]
H. Anu, J. Chen, W. Shi, J. Hou, B. Liang, and B. Qin, “An Approach to Recommendation of Verbosity Log Levels Based on Logging Intention,” in 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2019, pp. 125–134.