Exploratory Data Analysis and Feature Selection for Predictive Modeling of Student Academic Performance Using a Proposed Dataset
© 2024 by IJETT Journal
Volume-72 Issue-11
Year of Publication : 2024
Author : Hardik I. Patel, Dharmendra Patel
DOI : 10.14445/22315381/IJETT-V72I11P116
How to Cite?
Hardik I. Patel, Dharmendra Patel, "Exploratory Data Analysis and Feature Selection for Predictive Modeling of Student Academic Performance Using a Proposed Dataset," International Journal of Engineering Trends and Technology, vol. 72, no. 11, pp. 131-143, 2024. Crossref, https://doi.org/10.14445/22315381/IJETT-V72I11P116
Abstract
Academic performance prediction is vital for numerous applications. Previous research adopted methodologies that lacked scientific rigor, and most researchers predicted student academic performance from academic parameters alone. However, social and economic factors also influence academic outcomes. The proposed research therefore encompasses both academic and socio-economic parameters to predict student academic performance more comprehensively. This study describes in depth the steps of data collection, data cleansing, and Exploratory Data Analysis (EDA) used to improve model prediction accuracy, and it identifies gaps in previous research. The article focuses mainly on EDA, which is vital for understanding data thoroughly before applying a prediction model; in data science, understanding the data is more important than applying predictive algorithms. The research designs a novel EDA algorithm and applies it to the proposed dataset. The article also identifies the features that are essential for improving the accuracy of predictive algorithms. The research aims to improve the predictive modeling of student academic success by using a large dataset that includes academic and socio-economic characteristics. By going beyond conventional academic measures, the study fills a significant research gap and acknowledges the variety of factors affecting educational results. By presenting a comprehensive viewpoint, the study seeks to advance the understanding of the factors that influence academic achievement. The technique presented here provides a solid foundation for further studies in this area and highlights the significance of considering a variety of factors for a more thorough assessment of student performance.
Keywords
Exploratory data analysis, Predictive model, Feature selection, Data preprocessing, Machine learning.