Predicting the final grade using a machine learning regression model: insights from fifty percent of total course grades in CS1 courses

Carlos Giovanny Hidalgo Suarez; Jose Llanos; Víctor A Bucheli

doi:10.7717/peerj-cs.1689

PeerJ Comput Sci. 2023 Dec 11:9:e1689. doi: 10.7717/peerj-cs.1689. eCollection 2023.

Authors

Carlos Giovanny Hidalgo Suarez¹, Jose Llanos², Víctor A Bucheli²

Affiliations

¹ Software Systems Engineering, Universidad de San Buenaventura, Cali, Valle del Cauca, Colombia.
² School of Systems Engineering and Computing, Universidad del Valle, Cali, Valle del Cauca, Colombia.

Abstract

This article introduces a model that predicts students' final grades in the CS1 course from the grades they earn in the first half of the course. The methodology has three phases (training, testing, and validation) and uses four regression algorithms: AdaBoost, Random Forest, Support Vector Regression (SVR), and XGBoost. SVR outperformed the others, achieving a coefficient of determination (R²) ranging from 72% to 91%. The discussion section addresses four aspects: the selection of data features, the percentage of course grades used for training, the comparison between predicted and actual values to demonstrate reliability, and the model's performance relative to models in the existing literature.
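The R² figure reported above measures how much of the variance in the actual final grades is explained by the predicted ones. The following is a minimal sketch of that calculation in plain Python, not the authors' implementation; the function name `r_squared` and the sample grades are hypothetical and chosen only for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: spread of the actual grades around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical final grades (illustrative values, not data from the study).
actual_grades = [3.0, 4.0, 5.0, 2.0]
predicted_grades = [3.5, 4.0, 4.5, 2.0]
print(f"R^2 = {r_squared(actual_grades, predicted_grades):.2%}")
```

A value of 1 means the predictions match the actual grades exactly, while a value of 0 means the model does no better than always predicting the mean grade.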

Keywords: CS1; Course grade; Machine learning; Predicting final grade; Regression model.

Grants and funding

The authors received no funding for this work.