4.2. Implications for the Copilot Team
Provide more customization options to allow users to tailor the behavior of Copilot to align with their own workflow. Among the 115 FUNCTION REQUESTS, we identified 52 requests (approximately 45%) to customize the behavior of Copilot in various aspects. Common requests include specifying the file types or workspaces in which Copilot automatically runs (11), modifying the shortcut keys for accepting suggestions (10), accepting code suggestions line-by-line or word-by-word (9), preventing Copilot from generating certain types of suggestions (e.g., file paths, comments) (3), and configuring text color and fonts (3). Our previous study on the expected features of Copilot also indicates that the functionalities provided by Copilot do not yet fully meet users' requirements for flexible use of the tool (Zhang et al., 2023). Additionally, some of the identified instances of POOR FUNCTIONALITY EXPERIENCE (e.g., perceiving the auto-suggestions of Copilot as distracting, which is also mentioned in the study of Bird et al. (2023)) reveal a demand for customizing the behavior of Copilot. We believe that how well the behavior of Copilot adapts to users' coding habits is a vital factor in their decision to use it. Therefore, providing flexible and user-friendly customization options would be highly beneficial.
Provide more ways to control the content generated by Copilot. According to Table 5, there are only a few solutions for Suggestion Content Issues. For the 59 Suggestion Content Issues, only five solutions were identified, which indicates that users may find it challenging to resolve problems with the content suggested by Copilot. The output of Copilot is inherently unpredictable, and users have limited ways to control the generated code other than modifying the code context or code comments. Based on our observations, features such as allowing users to define code styles and conventions, and to choose multiple files as the context for code suggestions, are worth trying.
Simplify the configuration of Copilot and provide support for more IDEs and code editors. According to the results of RQ1 and RQ2, Compatibility Issue is the second-largest category, and Editor/IDE Compatibility Issue is one of the main causes of Operation Issues. From the perspective of users, we also observed many discussions related to the configuration and settings of Copilot, which makes Modify Configuration/Setting the second most frequently employed solution. Additionally, Improper Configuration/Setting is the fifth most common cause of problems related to Copilot usage. Based on these findings, we believe that simplifying the configuration process of Copilot can significantly improve the user experience. For example, the Copilot team may offer more detailed installation and configuration guidelines and provide user-friendly configuration options. Moreover, we identified 72 INTEGRATION REQUESTS, indicating that a significant number of users want Copilot to be compatible with a broader range of IDEs and code editors.
Increase the diversity of the code suggested by Copilot while improving its quality. Among Suggestion Content Issues, the predominant types are LOW QUALITY SUGGESTION and NONSENSICAL SUGGESTION. Bird et al. (2023) also observed that, according to Copilot users, Copilot occasionally offers peculiar and nonsensical code suggestions. Other prior studies (e.g., Vaithilingam et al. (2022), Pearce et al. (2022)) also pointed out code quality problems in the suggestions provided by Copilot. However, on public Q&A platforms, we found that users are more inclined to report situations where Copilot fails to provide valuable code suggestions than to discuss issues with the specific code content it suggests. This tendency reflects one of the primary purposes for which many users adopt Copilot: they hope that Copilot will inspire their ideas in coding. Some instances of User Experience Issue show how disappointed users can be when all 10 code suggestions provided by Copilot lack diversity. In addition to improved code quality, diverse code suggestions are therefore essential for satisfying users' need for more inspirational code.
Consider intellectual property and copyright when gathering training data for Copilot and giving code suggestions. The number of Copyright and Policy Issues is higher than we expected, and we observed many related concerns from both users and repository owners during the data extraction process. Bird et al. (2023) also noticed some discussions about how copyright applies to code suggestions generated by Copilot. Meanwhile, in Feature Request, we noticed that some users are calling for Copilot to introduce new features that prevent CODE TELEMETRY ISSUE. The goal of our research is not to evaluate such problems or the non-open source nature of Copilot. However, we contend that Copilot should take measures to address these problems, providing stable and high-quality code generation services while protecting user privacy and intellectual property.
Authors:
(1) Xiyu Zhou, School of Computer Science, Wuhan University, Wuhan, China;
(2) Peng Liang (Corresponding Author), School of Computer Science, Wuhan University, Wuhan, China;
(3) Beiqi Zhang, School of Computer Science, Wuhan University, Wuhan, China;
(4) Zengyang Li, School of Computer Science, Central China Normal University, Wuhan, China;
(5) Aakash Ahmad, School of Computing and Communications, Lancaster University Leipzig, Leipzig, Germany;
(6) Mojtaba Shahin, School of Computing Technologies, RMIT University, Melbourne, Australia;
(7) Muhammad Waseem, Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland.
This paper is