Table Of Links
2.1 Code Review As Communication Network
2.2 Code Review Networks
2.3 Measuring Information Diffusion in Code Review
3.1 Hypotheses
3.2 Measurement model
3.3 Measuring system
ACKNOWLEDGMENTS AND REFERENCES
ABSTRACT
Background: As a core practice in software engineering, code review has frequently been the subject of research. Prior exploratory studies found that code review, the discussion around a code change among humans, forms a communication network that enables its participants to exchange and spread information. Although this theory is popular in software engineering, no confirmatory research corroborates it, and the actual extent of information diffusion in code review is not well understood.
Objective: In this registered report, we propose an observational study to measure information diffusion in code review to test the theory of code review as communication network.
Method: We approximate the information diffusion in code review through the frequency and the similarity between (1) human participants, (2) affected components, and (3) involved teams of linked code reviews. The measurements approximating the information diffusion in code review serve as a foundation for falsifying the theory of code review as communication network.
1 INTRODUCTION
The theory is compelling: Modern software systems are often too large and too complex, and evolve too fast, for an individual developer to oversee all parts of the software and, thus, to understand all implications of a change. Therefore, most software projects rely on code review to foster discussions on changes and their impacts before they are merged into the code base. During those discussions, the participants exchange information and, when needed and deemed relevant, pass that information on in subsequent code reviews. Thereby, the information diffuses through the communication network that emerges from code review.
This theory is based on solid and thorough exploratory research that identified information exchange as a key expectation of code review [2, 3, 5, 6, 19], including beyond team and architectural boundaries [2, 3, 4], which makes code review a communication network.
While this theory is plausible, exploratory research alone is not sufficient: it also requires its confirmatory counterpart, which is currently missing. Exploratory research begins with specific observations, distills patterns in those observations, and derives theories from the observed patterns using inductive reasoning. Because its conclusions are drawn from specific cases, exploratory research has limited generalizability. It is also more susceptible to researcher bias due to the absence of a predefined theory.
Confirmatory research, in contrast, is deductive: it starts with a general theory, makes predictions (often in the form of hypotheses), and evaluates whether those predictions hold in empirical observations. In research, we need both exploratory and confirmatory research to minimize bias and maximize the validity and reliability of theories efficiently. Figure 1 shows the empirical research cycle involving both exploratory (theory-generating) and confirmatory (theory-testing) research.
In the proposed study, we aim to fill that gap: The objective is to test the theory of code review as communication network. Instead of using classical statistical tests for the hypothesis testing, we quantify the extent of information diffusion in the code review system at Spotify, which may or may not contradict the underlying theory of code review as communication network or its universality. A single empirical code review system with no or only marginal information diffusion cannot be reconciled with the existing theory of code review as communication network in general; in that case, further constraints, context, or limitations must be considered. Therefore, we measure information diffusion in code review at Spotify across social, organizational, and architectural boundaries.
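To make the idea of similarity between linked code reviews concrete, the following minimal Python sketch compares two linked reviews along the three dimensions named in the abstract (participants, components, teams). It is an illustration under stated assumptions, not the study's measuring system: the `Review` record, its field names, the example data, and the choice of Jaccard similarity are all hypothetical, since the actual measurement model is defined in Section 3.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """Hypothetical record of one code review (field names are assumptions)."""
    participants: frozenset
    components: frozenset
    teams: frozenset


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined here as 1.0 for two empty sets."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def link_similarity(src: Review, dst: Review) -> dict:
    """Similarity of two linked reviews per dimension (assumed measure, for illustration)."""
    return {
        "participants": jaccard(src.participants, dst.participants),
        "components": jaccard(src.components, dst.components),
        "teams": jaccard(src.teams, dst.teams),
    }


# Invented example data: review_b links back to review_a.
review_a = Review(
    participants=frozenset({"alice", "bob"}),
    components=frozenset({"player"}),
    teams=frozenset({"audio"}),
)
review_b = Review(
    participants=frozenset({"bob", "carol"}),
    components=frozenset({"player", "search"}),
    teams=frozenset({"audio", "discovery"}),
)

similarity = link_similarity(review_a, review_b)
```

Under this assumed measure, a low similarity between linked reviews would suggest that information may have crossed a social, architectural, or organizational boundary, while a similarity of 1.0 would suggest it stayed within the same group.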
This paper is