Comparing Multiple Ensemble Classifiers for E-commerce Recommendation System
© 2025 by IJETT Journal
Volume 73, Issue 1
Year of Publication: 2025
Authors: Ignatius Michael Dinata, Vincentius Loanka Sinaga, Antoni Wibowo
DOI: 10.14445/22315381/IJETT-V73I1P121
How to Cite?
Ignatius Michael Dinata, Vincentius Loanka Sinaga, Antoni Wibowo, "Comparing Multiple Ensemble Classifiers for E-commerce Recommendation System," International Journal of Engineering Trends and Technology, vol. 73, no. 1, pp. 250-254, 2025. Crossref, https://doi.org/10.14445/22315381/IJETT-V73I1P121
Abstract
E-commerce recommendation systems face data sparsity and class imbalance, which degrade the accuracy of product recommendations and, in turn, user engagement. This research evaluates the performance of multiple Machine Learning classifiers, including Extreme Gradient Boosting (XGBoost), K-Nearest Neighbor (KNN), Random Forest, and Support Vector Machine (SVM), with and without the Synthetic Minority Over-sampling Technique (SMOTE). The results indicate that XGBoost with SMOTE achieves the highest performance across all evaluation metrics (accuracy, precision, recall, and F1-score), scoring 0.97 on each. Random Forest also performs well, achieving 0.95 for all metrics, while KNN scores moderately at 0.83. SVM shows the lowest performance, with an accuracy of 0.59 and an F1-score of 0.51. These findings highlight the robustness of XGBoost combined with SMOTE in handling imbalanced data and improving prediction accuracy in e-commerce recommendation systems, offering valuable insights for researchers and practitioners in this domain.
Keywords
XGBoost, KNN, Random Forest, SVM, E-Commerce.
References
[1] J. Anitha, and M. Kalaiarasu, “RETRACTED ARTICLE: Optimized Machine Learning Based Collaborative Filtering (OMLCF) Recommendation System in E-Commerce,” Journal of Ambient Intelligence and Humanized Computing, vol. 12, no. 6, pp. 6387-6398, 2021.
[2] Tianqi Chen, and Carlos Guestrin, “XGBoost: A Scalable Tree Boosting System,” KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, USA, pp. 785-794, 2016.
[3] Sunny Sharma, Vijay Rana, and Manisha Malhotra, “Automatic Recommendation System Based on Hybrid Filtering Algorithm,” Education and Information Technologies, vol. 27, no. 2, pp. 1523-1538, 2022.
[4] Riya Widayanti et al., “Improving Recommender Systems Using Hybrid Techniques of Collaborative Filtering and Content-Based Filtering,” Journal of Applied Data Sciences, vol. 4, no. 3, pp. 289-302, 2023.
[5] Peiyi Song, and Yutong Liu, “An XGBoost Algorithm for Predicting Purchasing Behaviour on E-Commerce Platforms,” Technical Bulletin, vol. 27, no. 5, pp. 1467-1471, 2020.
[6] Sumitra Nuanmeesri, and Wongkot Sriurai, “Second-Hand Cars Recommender System Model Using the SMOTE and the Random Forest Technique,” Journal of Xi’an University of Architecture and Technology, vol. 12, no. 4, pp. 3687-3695, 2020.
[7] Adyanata Lubis et al., “Leveraging K-Nearest Neighbors with SMOTE and Boosting Techniques for Data Imbalance and Accuracy Improvement,” Journal of Applied Data Sciences, vol. 5, no. 4, pp. 1625-1638, 2024.
[8] Alberto Fernandez et al., “SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-Year Anniversary,” Journal of Artificial Intelligence Research, vol. 61, pp. 863-905, 2018.
[9] Amit Pandey, and Achin Jain, “Comparative Analysis of KNN Algorithm using Various Normalization Techniques,” International Journal of Computer Network and Information Security, vol. 9, no. 11, pp. 36-42, 2017.
[10] Niklas Donges, Random Forest Algorithm: A Complete Guide, Built In, 2022. [Online]. Available: https://builtin.com/data-science/random-forest-algorithm
[11] Sajib Kabiraj et al., “Breast Cancer Risk Prediction using XGBoost and Random Forest Algorithm,” 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, pp. 1-4, 2020.
[12] Haifeng Wang, and Dejin Hu, “Comparison of SVM and LS-SVM for Regression,” 2005 International Conference on Neural Networks and Brain, Beijing, pp. 279-283, 2005.
[13] Danial Jahed Armaghani et al., “Examining Hybrid and Single SVM Models with Different Kernels to Predict Rock Brittleness,” Sustainability, vol. 12, no. 6, pp. 1-17, 2020.
[14] Arepalli Peda Gopi et al., “Classification of Tweets Data Based on Polarity Using Improved RBF Kernel of SVM,” International Journal of Information Technology, vol. 15, no. 2, pp. 965-980, 2023.
[15] R. Jackson Divakar, Online Shopping Dataset, Kaggle, 2024. [Online]. Available: https://www.kaggle.com/datasets/jacksondivakarr/online-shopping-dataset
[16] Bay Vo et al., “Efficient Methods for Clickstream Pattern Mining on Incremental Databases,” IEEE Access, vol. 9, pp. 161305-161317, 2021.