4.3 Communication
According to Freudenberg et al. [24], “the key to the success of pair programming [is] the proliferation of talk at an intermediate level of detail in pair programmers’ conversations.” Researchers also found that pair programming eliminates distracting activity and enables programmers to focus on productive activity [75], which could be why engaging communication contributes to the success of pair programming. Murphy et al. [55] used transactive analysis to break down communication by different types of transactions, and they found that attempting more problems was associated with more completion transactions, and that debugging success correlated with more critique transactions. Other works pointed out the social-support aspect of communication [17] and an explanation effect, whereby verbalizing the thought process makes it clearer [12].
In human-human pair programming, programmers spend about one third of the time primarily focusing on communication [65], which forces them to concentrate, rationalize, and explain their thoughts [31, 75]. In human-AI pair programming, Mozannar et al. [53] have shown that an analogous share of time is spent communicating with Copilot: thinking about and verifying Copilot’s suggestions (22.4% of the time), which may replicate the self-explanation effect in some ways, and crafting prompts (11.56% of the time). These activities are arguably efforts to understand and communicate with Copilot. However, there is no other human to co-verify the answers, and no study has evaluated whether human-Copilot interaction is communicative in the way human-human pair programming is.
4.4 Collaboration
How well partners collaborate is an important factor affecting pair programming effectiveness [4, 79], and cooperative behavior and positive interdependence are key to pair programming success [67].
Collaboration can fail in various ways in a human-human pair. For example, the free-rider problem, where the entire workload falls on one partner while the other remains a marginal player, can result in less satisfaction and learning [57, 87]. In human-AI pair programming, educators worry that easily available code-generation tools may lead to cheating, and that over-reliance on AI may hinder students’ learning [10]. However, no study has formally evaluated these concerns.
For human-human pair programming, there is a suggested collaboration pattern of role-switching – two software developers periodically and regularly switch between writing code (driver) and suggesting code (navigator), aiming to ensure that both are engaged in the task and alleviate the physical and cognitive load borne by the driver [5, 65].
Some researchers, such as Freudenberg et al. [24], argue that the success of pair programming should be attributed to communication rather than “the differences in behavior or focus between the driver and navigator,” as they found that both driver and navigator worked at similar levels of abstraction. Nevertheless, instructors still recommend that drivers and navigators regularly alternate roles to ensure equitable learning experiences [83].
In human-AI interaction, given Copilot’s amazing capability to write code in different languages, some have argued that Copilot can take on the role of the “driver” in pair programming, allowing a solo programmer to take on the role of the “navigator” and focus on understanding the code at a higher level [35]. However, while it is possible for humans to offload some API lookup and syntax details to Copilot, humans still need to jump back into the driver’s seat frequently and fluidly switch between the thinking and writing activities [53]. It is ultimately the human programmer’s sole responsibility to understand code at the statement level [72].
4.5 Logistics
Logistical challenges, including scheduling difficulties, teaching and evaluating collaboration for the pair, and figuring out individual accountability and responsibility [11, 67], can add to the management cost of human-human pair programming [4, 79].
In human-AI pair programming, some may argue that the human is solely responsible in the human-AI pair [72], but the accountability of these LLM-based generative AI systems is still under debate [10]. There may be new logistical issues for the human-AI pair, such as teaching humans how best to collaborate with Copilot. There could also be challenges common to any human-AI interaction scenario, such as bias, trust, and technical limitations; much remains to be explored. More studies are needed to empirically and experimentally verify the moderating effects of different variables in human-AI pair programming.
Summary: The human-human pair programming literature has identified moderators including task type & complexity, compatibility, communication, collaboration, and logistics. However, current pAIr works lack an in-depth examination of these potential moderating effects.
Authors:
(1) Qianou Ma (Corresponding author), Carnegie Mellon University, Pittsburgh, USA ([email protected]);
(2) Tongshuang Wu, Carnegie Mellon University, Pittsburgh, USA ([email protected]);
(3) Kenneth Koedinger, Carnegie Mellon University, Pittsburgh, USA ([email protected]).