This story on HackerNoon has a decentralized backup on Sia.
Transaction ID: 6i8dWcM2O9C-3WGqi5wuVOuz1r0yOmwAGiQX2wYBMSU

The Challenges of Data Collection for Software Engineering Research

Written by @bayesianinference | Published on 2025/8/27

TL;DR
Discusses the challenges of obtaining real-world data and details the two case studies used for the model's validation: the Repo Margining System and the Abrahamsson Case Study.

Abstract and 1. Introduction

  2. Background and 2.1. Related Work

    2.2. The Impact of XP Practices on Software Productivity and Quality

    2.3. Bayesian Network Modelling

  3. Model Design

    3.1. Model Overview

    3.2. Team Velocity Model

    3.3. Defected Story Points Model

  4. Model Validation

    4.1. Experiments Setup

    4.2. Results and Discussion

  5. Conclusions and References

4.1. Experiments Setup

Collecting data from real projects to validate our model was difficult for several reasons. Because XP values simplicity, it is hard to find a company that records information about its activities and practices. Moreover, most real XP projects are developed by private companies that restrict publication of their internal development processes. In addition, there is no guarantee that the data that is available is sufficient for model validation.

Two XP projects provided enough data to test our model. The first is the Repo Margining System project [4]. The second is a controlled case study reported by Pekka Abrahamsson [17], which we refer to as the Abrahamsson Case Study in the rest of this paper. The model input data for the two projects are shown in Tables 2 and 3. The model's internal parameters are summarized in Table 4.

Table 2 Repo Margining System input data

Table 3 Abrahamsson Case Study input data

Table 4 Model internal parameters (U(a,b) refers to uniform distribution from a to b, while N(µ,σ) refers to normal distribution with mean µ and standard deviation σ)
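
To make the notation in the Table 4 caption concrete, the following is a minimal Python sketch of drawing samples from U(a,b) and N(µ,σ). The function names, the seed, and the numeric arguments are placeholders chosen for illustration; they are not values from Table 4 and this is not the authors' implementation. Note that σ here is a standard deviation, as the caption states, not a variance.

```python
import random
import statistics


def sample_uniform(rng, a, b, n):
    """Draw n samples from U(a, b), the uniform distribution from a to b."""
    return [rng.uniform(a, b) for _ in range(n)]


def sample_normal(rng, mu, sigma, n):
    """Draw n samples from N(mu, sigma), where sigma is the standard deviation."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Fixed seed so the sketch is reproducible.
rng = random.Random(0)

# Placeholder parameters for illustration only (NOT taken from Table 4).
uniform_draws = sample_uniform(rng, 0.0, 1.0, 10_000)
normal_draws = sample_normal(rng, 5.0, 2.0, 10_000)

print("U(0, 1): min =", min(uniform_draws), "max =", max(uniform_draws))
print("N(5, 2): mean =", statistics.mean(normal_draws),
      "stdev =", statistics.stdev(normal_draws))
```

With enough samples, the uniform draws stay within the interval from a to b, and the sample mean and standard deviation of the normal draws approach µ and σ.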

Authors:

(1) Mohamed Abouelelam, Software System Engineering, University of Regina, Regina, Canada;

(2) Luigi Benedicenti, Software System Engineering, University of Regina, Regina, Canada.


This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed license.
