VI. CONCLUSION

We conducted a large-scale study of crypto issues discussed on Stack Overflow to identify the challenges users commonly face across various areas of cryptography. The findings suggest that developers still have a distinct lack of knowledge of fundamentals such as OpenSSL, asymmetric cryptography, and password hashing, and that the complexity of crypto libraries hampers developers' ability to implement a crypto scenario correctly. We call for dedicated studies to investigate the usability of crypto APIs. We are currently surveying users who have actively helped the Stack Overflow community in this domain to understand potential remedies.

This paper is available on arXiv under a CC BY 4.0 license.

Authors:

(1) Mohammadreza Hazhirpasand, Oscar Nierstrasz, University of Bern, Bern, Switzerland;

(2) Mohammadhossein Shabani, Azad University, Rasht, Iran;

(3) Mohammad Ghafari, School of Computer Science, University of Auckland, Auckland, New Zealand.