Table Of Links
2 Original Study: Research Questions and Methodology
3 Original Study: Validity Threats
5 Replicated Study: Research Questions and Methodology
6 Replicated Study: Validity Threats
8 Discussion
Next, we summarize the findings of this study and analyse their implications. Note that the results of the study are restricted to junior programmers with little testing experience and to defect detection techniques.
8.1 Answers to Research Questions
– RQ1.1: What are participants’ perceptions of their testing effectiveness? The number of participants perceiving a particular technique/program as the most effective does not differ across the three techniques/programs.
– RQ1.2: Do participants’ perceptions predict their testing effectiveness? Our data do not support that participants correctly perceive the most effective technique for them. Additionally, no bias has been found towards a given technique. However, they tend to correctly perceive the program in which they detected the most defects.
– RQ1.3: Do participants find a similar amount of defects for all techniques? Participants do not obtain similar effectiveness values when applying the different techniques.
– RQ1.4: What is the cost of any mismatch? Mismatch cost is not negligible (mean 31 pp), and it is not related to the technique perceived as most effective.
– RQ1.5: What is expected project loss? Expected project loss is 15 pp, and it is not related to the technique perceived as most effective.
– RQ1.6: Are participants’ perceptions related to the number of defects reported by participants? The results are inconclusive. Although our data do not support that participants correctly perceive the most effective technique for them, a relationship should not be ruled out. Further research is needed.
Therefore, the answer to RQ1: Should participants’ perceptions be used as predictors of testing effectiveness? is that participants should not base their decisions on their own perceptions, as they are not reliable and have an associated cost.
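As a rough illustration of the quantities behind RQ1.4 and RQ1.5, the following sketch uses made-up effectiveness values (not the study's data) and a paraphrase of the paper's notions: mismatch cost as the percentage points lost by applying the technique a participant perceives as best instead of their actually best technique, and expected project loss as the average of that cost over participants.

```python
# Hypothetical illustration (invented numbers, paraphrased definitions):
# effectiveness = percentage of defects a participant detects per technique.
effectiveness = {"EP": 60.0, "BT": 85.0, "CR": 70.0}  # assumed values
perceived_best = "EP"  # technique this participant believes is most effective

# Mismatch cost: percentage points lost by choosing the perceived-best
# technique instead of the actually best one.
best = max(effectiveness.values())
mismatch_cost = best - effectiveness[perceived_best]  # 85.0 - 60.0 = 25.0 pp

# Expected project loss: mean mismatch cost over all participants;
# those who perceive correctly contribute 0 pp.
per_participant_costs = [25.0, 0.0, 20.0, 0.0]  # assumed per-participant costs
expected_loss = sum(per_participant_costs) / len(per_participant_costs)  # 11.25 pp
```

Under these assumed numbers, one in two participants misperceives, and the average loss is diluted by the correct perceivers, which is why the expected project loss (15 pp in the study) is lower than the mean mismatch cost (31 pp).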
– RQ2.1: What are participants’ opinions about techniques and programs? Most people like EP best, followed by both BT and CR (which merit the same opinion). There is no difference in opinion as regards programs.
– RQ2.2: Do participants’ opinions predict their effectiveness? They are not good predictors of technique effectiveness. A bias has been found towards EP.
Therefore, the answer to RQ2: Can participants’ opinions be used as predictors for testing effectiveness? is that participants should not use their opinions, as they are not reliable and are biased.
– RQ3.1: Is there a relationship between participants’ perceptions and opinions? Participants’ perceptions of technique effectiveness are related to how well they think they applied the techniques. We have not been able to find a relationship between the technique they like best and find easiest to apply, and perceived effectiveness. Participants do not associate the simplest program with the program in which they detected the most defects.
– RQ3.2: Is there a relationship between participants’ opinions? Yes. Opinions are consistent with each other.
Therefore, the answer to RQ3: Is there a relationship between participants’ perceptions and opinions? is that some, but not all, of them are related.
8.2 About Perceptions
Participants’ perceptions about the effectiveness of techniques are incorrect (50% get it wrong). However, this is not due to some sort of bias in favour of any of the three techniques under review. These misperceptions should not be overlooked, as they affect software quality. We cannot accurately estimate the cost, as it depends on what faults there are in the software. However, our data suggest a loss of 25 pp to 31 pp. Perceptions about programs appear to be correct, although this does not offset the mismatch cost.
Our findings confirm that:
– Testing technique effectiveness depends on the software faults.
Additionally, they warn developers that:
– They should not rely on their perceptions when rating a defect detection technique or how well they have tested a program.
Finally, they suggest the need for the following actions:
– Develop tools to inform developers about how effective the techniques they applied are and how effective their testing was.
– Develop instruments to give developers access to experimental results.
– Conduct further empirical studies to learn what technique or combination of techniques should be applied under which circumstances to maximize effectiveness.
8.3 About Opinions
Participants prefer EP to BT and CR (they like it better, think they applied it better and find it easier to apply). Opinions do not predict real effectiveness. This failure to predict reality is partly related to the fact that a lot of people prefer EP but are really more effective using BT or CR. Opinions do not predict real effectiveness with respect to programs either.
These findings warn developers that:
– They should not be led by their opinions on techniques when rating their effectiveness.
Finally, they suggest the need for the action:
– Further research should be conducted into what is behind developers’ opinions.
8.4 About Perceptions and Opinions
The technique that participants believe to be the most effective is the one that they applied best. However, they are capable of separating their opinions about technique complexity and preferences from their perceptions, as the technique that they think is most effective is not the one that they find easiest to apply or like best.
Our findings challenge that:
– Perceptions of technique effectiveness are based on participants’ preferences.
They also warn developers that:
– Maximum effectiveness is not necessarily achieved when a technique is properly applied.
Finally, they suggest the need for the following actions:
– Determine the best combination of techniques to apply that is at the same time easily applicable and effective.
– Continue to look for possible drivers to determine what could be causing developers’ misperceptions.
Authors:
- Sira Vegas
- Patricia Riofrío
- Esperanza Marcos
- Natalia Juristo