by Ernst Mulders
Moderator of the discussion is Ayushi Rastogi.
The discussion session starts with an analogy. Ayushi mentions the delays one can experience at an airport; the similarity with software is that a single bottleneck can cause delays for the entire project. Furthermore, the scenarios one plans for rest on assumptions. In the airport example a student notes that one might have assumed check-in would take 20 minutes, while in reality it takes longer, causing delays.
The moderator specifically mentions the three parts of the paper: processes, mining, and software repositories.
The students are asked what the keywords / important items of the paper are. The following replies are given:
Ayushi mentions that process mining can be described as extracting data that help you discover the underlying process. So the paper is about how you can find these processes. Why is this relevant in software? Because the amount of data is too large to analyze manually.
“What repositories is the paper talking about?” the moderator asks. After a slight delay a student answers that the paper takes a broader view of the term repository: not only software repositories (commit history) but also bug trackers and e-mail logs. Ayushi adds that these repositories capture who did what, and when.
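The who/what/when view can be made concrete with a small sketch: each artifact (commit history, bug tracker, e-mail log) yields events with an actor, an action, and a timestamp, and merging them chronologically produces the kind of event log that process-mining techniques take as input. The event fields and data below are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Event:
    who: str        # actor: committer, bug reporter, e-mail sender
    what: str       # action: "commit", "bug-report", "email", ...
    when: datetime  # timestamp of the action

def merge_event_logs(*logs):
    """Combine events from several repositories into one
    chronologically ordered log, the input for process mining."""
    return sorted((e for log in logs for e in log), key=lambda e: e.when)

# Hypothetical data from two different repositories.
commits = [Event("alice", "commit", datetime(2020, 1, 2, 10, 0))]
bugs = [Event("bob", "bug-report", datetime(2020, 1, 1, 9, 30))]
log = merge_event_logs(commits, bugs)
# The merged log starts with bob's bug report, then alice's commit.
```

The point of the sketch is only that heterogeneous repositories reduce to one shared schema (who, what, when) before any mining happens.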
The discussion continues about the experiments conducted in the paper. The first experiment is summarized by a student and can be described as “matching characters.” The following elements are mentioned:
The explanation of the second experiment is given by a student after rereading that part of the paper. The following points come forward in the discussion:
When asked by Ayushi why the paper is useful, the following answers are given:
Everybody agrees that the paper isn’t a good paper. Some reasons are given:
How do anchoring effects affect estimations?
3 experiments (randomized control trials):
To summarize, our findings indicate on the one hand that anchors have strong effects on estimates of software project effort, and on the other hand, that the anchoring effect is not moderated by numerical preciseness or source credibility.
by Ernst Mulders
The discussion started with the moderator asking for a summary of the paper. Two summaries were given. From these summaries it emerged that an earlier paper on anchoring already existed and that this paper adds extra dimensions and input. Furthermore, it was noted that the motivation for the paper came from questions not asked in other papers on anchoring.
From this conversation the topic immediately turns to biases.
The moderator mentions and recommends the book Thinking, Fast and Slow by Daniel Kahneman, an important figure in the field of behavioral economics. The book explains how we make decisions that we believe are rational but are in fact biased. Until the sixties it was thought that people could think purely rationally.
After the introduction of the book, the moderator hands out the Cognitive Bias Codex, which shows the influences on decision making. In a clockwise manner the group goes through the items on the codex. The items on the main circle are briefly read aloud by the moderator. It is noted that the items in the inner circle are the corresponding research fields.
The bias discussion continues with the moderator asking the question “Can you think of places where biases can play a role in SE?” Several students reply:
When the group diversity argument is mentioned, the moderator turns to Ayushi Rastogi, who has done research on biases in accepting pull requests. Her research found that reviewers of a pull request are 20% more likely to accept the request when the committer is from the same country as the reviewer. Furthermore, reviewers often say they aren’t biased, but committers mention they do feel the bias.
The moderator continues the discussion by asking for examples of biases in research. The examples given by students are:
As an example, the moderator draws an exponential decay graph on the board, with the number of pull requests on the Y-axis and the number of projects on the X-axis. The solution to getting a good sample is stratified sampling, which a fellow student explains as taking x samples from every part of the distribution.
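The stratified-sampling idea from the board can be sketched as follows: bucket projects into strata by their pull-request count and draw the same number of samples from every stratum, so that the few very active projects are represented alongside the many small ones. The bucket boundaries and project data are made up for illustration.

```python
import random

def stratified_sample(projects, boundaries, per_stratum, seed=0):
    """projects: list of (name, pr_count) pairs.
    boundaries: ascending upper bounds defining the strata,
    e.g. [10, 100] gives strata <=10, 11-100, and >100.
    Returns up to per_stratum random picks from each stratum."""
    rng = random.Random(seed)
    strata = [[] for _ in range(len(boundaries) + 1)]
    for name, prs in projects:
        # Place the project in the first stratum whose bound covers it.
        idx = next((i for i, b in enumerate(boundaries) if prs <= b),
                   len(boundaries))
        strata[idx].append((name, prs))
    sample = []
    for stratum in strata:
        k = min(per_stratum, len(stratum))
        sample.extend(rng.sample(stratum, k))
    return sample

# A skewed, exponential-decay-like distribution: many small projects,
# few very large ones (hypothetical numbers).
projects = [(f"p{i}", prs) for i, prs in
            enumerate([1, 2, 3, 5, 8, 12, 40, 90, 400, 1500])]
picked = stratified_sample(projects, boundaries=[10, 100], per_stratum=2)
# picked holds 2 projects from each of the three strata.
```

A plain random sample over this distribution would be dominated by low-activity projects; fixing the per-stratum count is what guarantees coverage of every part of the distribution.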