Predicting new student performances and identifying important attributes of admission data using machine learning techniques with hyperparameter tuning
Faculty of Science, Ubon Ratchathani University, Ubon Ratchathani, THAILAND
Online publication date: 2023-11-05
Publication date: 2023-12-01
EURASIA J. Math., Sci Tech. Ed 2023;19(12):em2369
Recently, many universities worldwide have faced high student failure and early dropout rates, which reflect on the quality of education. To tackle this problem, forecasting student success as early as possible with machine learning has become an important approach in modern universities. This study therefore analyzes and compares six machine learning classifiers for the early prediction of student performance under Thailand’s education curriculum. A large dataset was collected from the admission scores of 5,919 students enrolled during 2011-2021 in 10 programs of the Faculty of Science at Ubon Ratchathani University. The prediction models were built using Jupyter Notebook, Python 3, and Scikit-Learn. To improve the results, we not only searched for high-performance prediction models but also tuned hyperparameters across 138 possible configurations to identify the best-tuned model for each classifier. Furthermore, we investigated the most important predictors of student success for the 10 programs in our faculty. The experiments were divided into two parts. First, we evaluated the models using a confusion matrix with 10-fold cross-validation. The results showed that random forest (RF) had the highest F1-measure, 86.87%, while the fine-tuned RF models for the 10 programs achieved accuracies of about 72% to 93%. Second, we computed the importance of each feature with the fine-tuned RF classifiers. The results showed that national test scores (e.g., ONET-English, ONET-Math, ONET-Science, ONET-Social studies, ONET-Thai, and PAT2), entry type, and school grades (e.g., art, English, GPA, health, math, science, and technology) are highly influential features for predicting student success. In summary, these results can help other relevant educational institutions enhance student performance, plan class strategies, and support decision-making processes.
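The workflow summarized in the abstract (hyperparameter tuning, 10-fold cross-validation scored by F1, then feature importance from the tuned random forest) can be illustrated with a minimal Scikit-Learn sketch. This is not the authors' code: the data are synthetic, the feature names are hypothetical labels loosely echoing the abstract, and the small parameter grid is an assumption that stands in for the 138 configurations reported in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for admission data; the real dataset is not reproduced here.
# Feature names are hypothetical labels used only to make the output readable.
feature_names = ["onet_english", "onet_math", "onet_science",
                 "pat2", "gpa", "entry_type"]
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)

# Assumed, deliberately small grid (8 combinations), not the study's grid.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}

# 10-fold stratified cross-validation, selecting the configuration by F1.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, scoring="f1", cv=cv)
search.fit(X, y)

# Impurity-based importances from the refitted best model, highest first.
best_rf = search.best_estimator_
importances = best_rf.feature_importances_
ranking = sorted(zip(feature_names, importances),
                 key=lambda pair: pair[1], reverse=True)

print("Best parameters:", search.best_params_)
print("Mean cross-validated F1:", round(search.best_score_, 4))
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances are only one way to rank predictors, and the abstract does not say which measure the study used. On real admission data, a categorical column such as entry type would also need encoding before fitting.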