Abstract and 1. Introduction

2. Methodology and 2.1. Research Questions

2.2. Data Collection

2.3. Data Labelling

2.4. Data Extraction

2.5. Data Analysis

3. Results and Interpretation and 3.1. Type of Problems (RQ1)

3.2. Type of Causes (RQ2)

3.3. Type of Solutions (RQ3)

4. Implications

4.1. Implications for the Copilot Users

4.2. Implications for the Copilot Team

4.3. Implications for Researchers

5. Threats to Validity

6. Related Work

6.1. Evaluating the Quality of Code Generated by Copilot

6.2. Copilot’s Impact on Practical Development and 6.3. Conclusive Summary

7. Conclusions, Data availability, Acknowledgments, CRediT authorship contribution statement and References

3.2. Type of Causes (RQ2)

3.2.1. Results

As mentioned in Section 2.4.2, not all problems have corresponding causes that can be extracted. As a result, we identified a total of 391 causes, which were collected from 28.9% of all problems related to Copilot usage and categorized into 16 types. The results indicate that the most frequent causes are Copilot Internal Error (19.4%) and Network Connection Error (13.6%), with Editor/IDE Compatibility Issue (12.8%) and Unsupported Platform (8.2%) also commonly reported. The example, count, and proportion of each type of cause are presented in Table 2. It is worth noting that certain types of problems can themselves be the causes of other problems. For example, EDITOR/IDE COMPATIBILITY ISSUE is a type of Compatibility Issue, but it is also a cause of other types of problems such as INSTALLATION ISSUE and STARTUP ISSUE.

• Copilot Internal Error (CIE) refers to problems on the Copilot server side that affect its usage. These may include errors with the Codex model and with the services provided by the server side, which are not visible to users. For example, the Copilot team once reported that “there has been an outage of one of our models which has caused the others to have to handle a higher traffic load”, which consequently resulted in temporary issues (Discussion #14370).

• Network Connection Error (NCE) refers to the disruptions in the network communication between the user side and the Copilot server side, resulting in an inability for users to utilize the code generation service of Copilot. For example, a user repeatedly encountered issues with Copilot not functioning properly because his “company’s network settings blocked the connection to copilot” (Discussion #36152).

• Editor/IDE Compatibility Issue (EICI) refers to situations where code editors and IDEs are not compatible with Copilot, resulting in various anomalies when using Copilot. For example, a user was “unable to install GitHub Copilot extension in Visual Studio 2022 Enterprise” because the version of his IDE was outdated and therefore incompatible with Copilot (SO #71702171).

• Unsupported Platform (UP) refers to situations where users try to use Copilot on development platforms that lack official support for Copilot, which may result in unpredictable problems. For example, the Copilot team stated that “Code OSS and VSCodium” were not supported by Copilot, which explains why some users failed to install Copilot in these two IDEs (Discussion #8015).

• Improper Configuration/Setting (ICS) refers to situations where Copilot operates abnormally or offers a suboptimal user experience due to the settings that are not configured appropriately. For example, a user found that “the setting.json file had disabled inline suggestions”, which resulted in Copilot not providing code suggestions in VSCode (SO #76257401).

• Poor Functionality Experience (PFE) refers to the negative experiences users encounter when coding with Copilot, which lead them to request enhancements to existing features. For example, a user wanted Copilot to provide code suggestions via a shortcut, because “feature auto suggestion is something annoying” (Discussion #7172).

• For Coding Habit (FCH) refers to the wishes of some users for Copilot to offer new features and ensure compatibility with their preferred development platforms to accommodate their coding habits. For example, a user asked “if there was an option for changing the keybinding for accepting the suggestions”, because he was accustomed to “using the tab binding for opening the completion menu” (Discussion #6919).

• User Unauthorized (UU) refers to situations where some users cannot access Copilot services because they lack the authorization for using Copilot. For example, a user found that “GitHub Copilot could not connect to server”, because he was not authorized yet (Discussion #16795).

• Improper User Operation (IUO) refers to situations where Copilot exhibits unintended behavior because of user mistakes or oversights during the process of using Copilot such as registration, login, and subscription. For example, a user consistently received a prompt in Visual Studio stating that “Your Copilot experience is not fully configured, please complete your setup”. It was found that this issue arose because he “had signed up to Copilot with a different GitHub account” (Discussion #19556).

• Plug-in Compatibility Issue (PCI) refers to situations where Copilot fails to work properly because of incompatibility with other plug-ins. For example, “a bad extension interaction between Copilot and learnmarkdown” resulted in some users being unable to accept code suggestions from Copilot (Issue #441).

• Intentional Design of Copilot (IDC) refers to situations where what some users perceive as anomalies of Copilot are in fact features deliberately designed by the Copilot team. For example, a user noticed that Copilot’s “inline completions are trumping normal completions”. However, a member of the Copilot team confirmed that “this is intentional” and that the behavior can be disabled by modifying a certain setting (Issue #154).

• Unimplemented Feature (UF) refers to functionalities that users assume Copilot already has, but which the Copilot team has not yet provided. For example, a user could not activate Copilot behind a proxy in VSCode, and a Copilot team member explained that “currently the Copilot VSCode extension does not support proxies” (Discussion #11630).

• For Higher Coding Efficiency (FHCE) is the reason that some users wish for new functionalities in Copilot to enhance efficiency in coding tasks. For example, a user suggested that Copilot could “add mapping generator for classes” to facilitate the generation of mapping methods between different classes (Discussion #7870).

• Obsoleted Copilot Version (OCV) refers to an outdated version of Copilot that is no longer operational. For example, a user encountered an issue with Copilot not working and was informed that “your Copilot extension was outdated” (Discussion #17463).

• License Restriction (LR) refers to situations where some development platforms cannot be integrated with Copilot due to restrictions imposed by the Copilot license. For example, a repository contributor explained that Copilot “cannot be used in free or opensource software such as code-server” due to its license restriction (Issue #123).

• Code Telemetry Issue (CTI) is the reason that some users request new features, aiming to prevent the exposure of their code to Copilot. For example, a user wanted to “disable Copilot per workspace”, so that he could “use it for open-source projects but not private/work projects” (Discussion #47991).
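The Improper Configuration/Setting example above (inline suggestions disabled in the editor settings file) can be checked mechanically. The following is a minimal sketch, not part of the study: it assumes that the relevant VS Code setting is `editor.inlineSuggest.enabled`, that the setting counts as enabled when absent, and that the settings file is plain JSON without comments. The function name `inline_suggestions_disabled` is hypothetical.

```python
import json

# Assumption: this is the VS Code setting that toggles inline suggestions,
# and the editor treats it as enabled when the key is absent.
INLINE_SUGGEST_KEY = "editor.inlineSuggest.enabled"


def inline_suggestions_disabled(settings_text: str) -> bool:
    """Return True only if the settings explicitly turn inline suggestions off."""
    # Plain JSON only; comments in a real settings file are not handled here.
    settings = json.loads(settings_text)
    return settings.get(INLINE_SUGGEST_KEY, True) is False


# Illustrative settings content, not taken from the cited post.
example = '{"editor.inlineSuggest.enabled": false, "editor.fontSize": 14}'
print(inline_suggestions_disabled(example))
```

A check of this kind only covers one possible misconfiguration; other settings that fall under ICS would need their own keys.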

3.2.2. Causes to Problems Mapping

Table 3 illustrates the mapping relationship of Copilot related problems to their causes. We use abbreviations to represent each type of cause; for example, “CIE” represents Copilot Internal Error. The full names for all types of causes are provided in the note of Table 3.

Over one third of Operation Issues (37.2%) have associated causes. Specifically, AUTHENTICATION FAILURE is primarily induced by CIE, NCE, and UU; FUNCTIONALITY FAILURE is mainly caused by CIE, ICS, and PCI; INSTALLATION ISSUE typically stems from EICI and UP; STARTUP ISSUE commonly originates from CIE and NCE; ACCESSING FAILURE is primarily attributed to NCE; and VERSION CONTROL ISSUE is mainly brought about by CIE, EICI, and UP.

For Compatibility Issue, causes are identified in 9.5% of the cases. To be specific, EDITOR/IDE COMPATIBILITY ISSUE is mainly caused by UP and ICS, while PLUG-IN COMPATIBILITY ISSUE is attributed to ICS and IDC.

For Feature Request, four types of causes are identified in 31.0% of cases, which are FCH, PFE, FHCE, and CTI. PFE and FCH are the prime causes for users to raise FUNCTION REQUESTS, while INTEGRATION REQUESTS are mainly attributed to FCH.

For User Experience Issue, causes are identified in 19.0% of the cases. However, there are only a few causes identified for each type of User Experience Issue, with CIE identified as the prime cause for POOR AUTHENTICATION EXPERIENCE and POOR PERFORMANCE.

For Suggestion Content Issue, causes are identified in 8.6% of the cases. CIE and UF are the causes leading to LOW QUALITY SUGGESTION, while CIE and ICS are the causes for NONSENSICAL SUGGESTION.

Copyright and Policy Issue refers to the concerns of some users about code leakage to Copilot. These concerns are themselves the reason for raising such problems, so there is no need to further identify the underlying causes.
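The primary mappings described in this subsection can be captured as a simple lookup table. The sketch below transcribes only the problem-to-cause pairs named in the prose above; it does not reproduce the counts in Table 3, and the helper names `primary_causes` and `problems_caused_by` are hypothetical.

```python
# Primary causes per problem type, transcribed from the prose of this subsection.
# Abbreviations follow the cause taxonomy (e.g. CIE = Copilot Internal Error).
PRIMARY_CAUSES = {
    "AUTHENTICATION FAILURE": ["CIE", "NCE", "UU"],
    "FUNCTIONALITY FAILURE": ["CIE", "ICS", "PCI"],
    "INSTALLATION ISSUE": ["EICI", "UP"],
    "STARTUP ISSUE": ["CIE", "NCE"],
    "ACCESSING FAILURE": ["NCE"],
    "VERSION CONTROL ISSUE": ["CIE", "EICI", "UP"],
    "EDITOR/IDE COMPATIBILITY ISSUE": ["UP", "ICS"],
    "PLUG-IN COMPATIBILITY ISSUE": ["ICS", "IDC"],
    "FUNCTION REQUEST": ["PFE", "FCH"],
    "INTEGRATION REQUEST": ["FCH"],
    "POOR AUTHENTICATION EXPERIENCE": ["CIE"],
    "POOR PERFORMANCE": ["CIE"],
    "LOW QUALITY SUGGESTION": ["CIE", "UF"],
    "NONSENSICAL SUGGESTION": ["CIE", "ICS"],
}


def primary_causes(problem_type: str) -> list[str]:
    """Look up the main causes reported for a problem type (case-insensitive)."""
    return PRIMARY_CAUSES.get(problem_type.strip().upper(), [])


def problems_caused_by(cause: str) -> list[str]:
    """List the problem types for which the given cause is named as primary."""
    return sorted(p for p, causes in PRIMARY_CAUSES.items() if cause in causes)


print(primary_causes("Startup Issue"))
print(problems_caused_by("NCE"))
```

Inverting the table in this way makes it easy to see which problem types share a cause, for instance the several Operation Issues that trace back to CIE.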

3.2.3. Interpretation

Frequency of Causes: CIE is the most common type of cause leading to Copilot usage problems. Typically, the identification of CIE relies on user feedback regarding abnormal usage experiences of Copilot, and it often results in a group of users reporting the same problem within a certain time period. For example, “a bad deployment” of the Copilot server caused a group of users to report AUTHENTICATION FAILURES (Discussion #39533). The high number of NCE, EICI, and ICS indicates that some problems arise from the environment in which Copilot operates. A common NCE situation is that users connect to the Copilot server through an HTTP proxy, which may lead to the interception of Secure Sockets Layer (SSL) traffic. However, Copilot now offers support for access through an HTTP proxy, thus addressing such problems (GitHub, 2024b). PFE, FCH, FHCE, and CTI are the four types of causes for Feature Request. The remaining eight types of causes are less common, but can still provide insights into specific problems related to Copilot. For instance, UU is identified as the direct cause for many users experiencing AUTHENTICATION FAILURE and FUNCTIONALITY FAILURE when using Copilot.
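When diagnosing a suspected NCE, a first step is to see whether a proxy is configured in the user's environment at all. The following is a minimal sketch under stated assumptions: it inspects the conventional proxy environment variables only, and whether a particular editor or Copilot client honors these variables is an assumption rather than something the study reports. The function name `configured_proxies` and the sample proxy URL are hypothetical.

```python
import os

# Conventional proxy environment variable names. Whether a given editor or
# Copilot client reads them is an assumption, not a finding of the study.
PROXY_VARS = ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy")


def configured_proxies(env=None) -> dict:
    """Return the proxy-related variables that are set to a non-empty value."""
    env = os.environ if env is None else env
    return {name: env[name] for name in PROXY_VARS if env.get(name)}


# Illustrative environment; the proxy address is made up.
sample_env = {"HTTPS_PROXY": "http://proxy.example.internal:8080", "PATH": "/usr/bin"}
print(configured_proxies(sample_env))
```

An empty result does not rule out a proxy, since editors and operating systems can also carry their own proxy settings.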

Mapping of Causes to Problems: When Copilot users encounter Operation Issues, nearly one quarter (23.1%) of these problems are caused by errors originating from the Copilot server (i.e., CIE), while the rest of such problems are induced by the environment in which Copilot operates. According to the causes of Feature Request, it appears that users typically request new functionalities or enhancements of existing features based on their personal coding habits and suboptimal experiences when coding with Copilot. Fewer causes have been identified for Compatibility Issue. When users identify the problem as a Compatibility Issue, they tend to focus on finding solutions rather than further analyzing the causes. The causes identified for Suggestion Content Issue and Copyright and Policy Issue are limited in number. One possible reason is that the non-open source nature of Copilot prevents users from investigating the causes of the problems in these two categories. From the results of Table 3, we did not identify the main causes leading to User Experience Issue. However, when users encounter POOR AUTHENTICATION EXPERIENCE or POOR PERFORMANCE when using Copilot, consideration should be given to whether there are internal errors within Copilot, as indicated by the four cases in which CIE led to such problems.

Authors:

(1) Xiyu Zhou, School of Computer Science, Wuhan University, Wuhan, China;

(2) Peng Liang (Corresponding Author), School of Computer Science, Wuhan University, Wuhan, China;

(3) Beiqi Zhang, School of Computer Science, Wuhan University, Wuhan, China;

(4) Zengyang Li, School of Computer Science, Central China Normal University, Wuhan, China;

(5) Aakash Ahmad, School of Computing and Communications, Lancaster University Leipzig, Leipzig, Germany;

(6) Mojtaba Shahin, School of Computing Technologies, RMIT University, Melbourne, Australia;

(7) Muhammad Waseem, Faculty of Information Technology, University of Jyväskylä, Jyväskylä, Finland.


This paper is available on arXiv under the CC BY 4.0 DEED license.