
3 Original Study: Validity Threats

Based on the checklist provided by Wohlin et al. [52], the threats relevant to our study are described next.

3.1 Conclusion Validity

1. Random heterogeneity of participants. The use of a within-subjects experimental design rules out the risk that the variation due to individual differences among participants is larger than the variation due to the treatment.

3.2 Internal Validity

History and maturation:

– Since participants apply different techniques to different artefacts, learning effects should not be much of a concern.
– Experimental sessions take place on different days. Because grades are tied to performance in the experiment, we expect students to try to do better on each following day, which would make the technique applied on the last day appear more effective. To avoid this, different participants apply the techniques in different orders. This cancels out the threat due to order of application, so that no single technique benefits from the maturation effect. In any case, an analysis of the chosen techniques per day is performed to study the maturation effect.
Interactions with selection. Different behaviours in different technique application groups are ruled out by randomly assigning participants to groups. However, we will check this by analysing the behaviour of the groups.
Hypothesis guessing. Before filling in the questionnaire, participants in the study were only partially informed about the goal of the study. We told them that we wanted to know their preferences and opinions, but they were not aware of our research questions. In any case, if this threat did occur, it would mean that our results for perceptions are the best possible ones and would therefore set an upper bound.
Mortality. The fact that several participants did not give consent to participate in the study has affected the balance of the experiment.
Order of Training. Techniques are presented in the following order: CR, BT and EP. If this threat had taken place, then CR would be the most effective (or their favourite).
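
The counterbalancing and random assignment described in this section can be sketched in code. The following is a minimal illustration, not the authors' actual procedure: it assumes that all six possible orders of CR, BT and EP are used and that participants are dealt to those orders at random. The function names `assign_orders` and `technique_counts_per_day` are hypothetical, and the assignment of programs to sessions is ignored.

```python
import itertools
import random

TECHNIQUES = ("CR", "BT", "EP")


def assign_orders(participants, seed=None):
    """Randomly assign each participant an order in which to apply the techniques.

    Assumption: all 3! = 6 orders are used, and group sizes are kept
    as balanced as possible (they differ by at most one participant).
    """
    rng = random.Random(seed)
    orders = list(itertools.permutations(TECHNIQUES))
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random assignment guards against selection effects
    # Deal the shuffled participants round-robin across the six orders.
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


def technique_counts_per_day(assignment):
    """Count how many participants apply each technique on each day."""
    counts = [dict.fromkeys(TECHNIQUES, 0) for _ in TECHNIQUES]
    for order in assignment.values():
        for day, technique in enumerate(order):
            counts[day][technique] += 1
    return counts


if __name__ == "__main__":
    participants = [f"P{i}" for i in range(12)]
    assignment = assign_orders(participants, seed=1)
    for day, counts in enumerate(technique_counts_per_day(assignment), start=1):
        print(f"Day {day}: {counts}")
```

With 12 participants, each technique is applied by four participants on each day, so no technique systematically benefits from being applied last. When participants drop out (the mortality threat above), the per-day counts become unequal, which is what a loss of balance means in this design.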

3.3 Construct Validity

Inadequate preoperational explanation of cause constructs. Cause constructs are clearly defined thanks to the extensive training received by participants on the study techniques.
Inadequate preoperational explanation of effect constructs. The question being asked is totally clear and should not be subject to possible misinterpretations. However, since the perception is subjective, there exists the possibility that the question asked is interpreted differently by different participants, and hence, perceptions are related to how the question is interpreted. This issue should be further investigated in future studies.

3.4 External Validity

Interaction of setting and treatment. We tried to make the faults seeded in the programs as representative as possible of reality.
Generalisation to other subject types. As we have already mentioned, the type of subjects our sample represents are developers with little or no previous experience in testing techniques and junior programmers. The extent to which the results obtained in this study can be generalised to other subject types needs to be investigated. Of all the threats listed, the only one that could affect the validity of the results of this study in an industrial context is the one related to generalisation to other subject types.

Authors:

Sira Vegas
Patricia Riofrío
Esperanza Marcos
Natalia Juristo

This paper is available on arXiv under a CC BY-NC-ND 4.0 license.

Assessing Validity Threats in Controlled Software Engineering Experiments
