Abstract and 1 Introduction

2. Background

2.1 Effective Tutoring Practice

2.2 Feedback for Tutor Training

2.3 Sequence Labeling for Feedback Generation

2.4 Large Language Models in Education

3. Method

3.1 Dataset and 3.2 Sequence Labeling

3.3 GPT Facilitated Sequence Labeling

3.4 Metrics

4. Results

4.1 Results on RQ1

4.2 Results on RQ2

5. Discussion

6. Limitation and Future Works

7. Conclusion

8. Acknowledgments

9. References

APPENDIX

A. Lesson Principles

B. Input for Fine-Tuning GPT-3.5

C. Scatter Matrix of the Correlation on the Outcome-based Praise

D. Detailed Results of Fine-Tuned GPT-3.5 Model's Performance

7. CONCLUSION

In this study, we investigated how GPT models can enhance automated feedback systems, using two approaches: prompting GPT-3.5 and GPT-4, and fine-tuning GPT-3.5. Prompting showed potential for guiding the models to identify specific components of praise, and it underscored the critical role of prompt design in optimizing model outputs.

Fine-tuning GPT-3.5, by comparison, significantly improved the system’s ability to accurately highlight key components of tutor responses. This led to the development of an automated feedback system that delivers immediate, explanatory feedback for tutor training, addressing the need for feedback that is both scalable and effective. Our implementation showcases the potential of advanced large language models to provide explanatory feedback by highlighting parts of tutors’ open-ended responses, and it offers insights for future research on automated feedback systems.

This paper is available on arXiv under the CC BY 4.0 DEED license.

Authors:

(1) Jionghao Lin, Carnegie Mellon University;

(2) Eason Chen, Carnegie Mellon University;

(3) Zeifei Han, University of Toronto;

(4) Ashish Gurung, Carnegie Mellon University;

(5) Danielle R. Thomas, Carnegie Mellon University;

(6) Wei Tan, Monash University;

(7) Ngoc Dang Nguyen, Monash University;

(8) Kenneth R. Koedinger, Carnegie Mellon University.