Optimizing Customer Experience Analysis Across Dataset Size Reduction and Relevant Features Selection

	© 2023 by IJETT Journal
	Volume-71 Issue-12
	Year of Publication : 2023
	Author : Sara AHSAIN, Yasyn EL YUSUFI, M’hamed AIT KBIR
	DOI : 10.14445/22315381/IJETT-V71I12P209

How to Cite?

Sara AHSAIN, Yasyn EL YUSUFI, M’hamed AIT KBIR, "Optimizing Customer Experience Analysis Across Dataset Size Reduction and Relevant Features Selection," International Journal of Engineering Trends and Technology, vol. 71, no. 12, pp. 78-89, 2023. Crossref, https://doi.org/10.14445/22315381/IJETT-V71I12P209

Abstract
In an era of data-driven business, customer sentiment analysis is becoming increasingly important. It allows organizations to identify the services and products in their operations that can be improved, which helps them make better decisions and improve the customer experience. The main goal of this study is to classify Amazon customer reviews. The dataset consists of a collection of product reviews, each accompanied by an overall appreciation. It is a rich source of information for academic researchers in natural language processing and machine learning who study how customers experience products. Despite its diversity in terms of product categories, its huge number of records makes exploring and using it time- and resource-consuming, so it is difficult to process on computers with standard performance. The proposed approach selects a representative subset of the original dataset and combines it with relevant feature selection, using ensemble learning techniques, in order to reduce the size of the processed data while achieving interesting results compared with other research on the same dataset. Specifically, for the ‘Magazine subscriptions’ category, using only 12% of the original collection of examples, the proposed approach reaches a high level of performance on the following metrics: accuracy (up to 0.94), sensitivity (up to 0.90), and specificity (up to 0.97).
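For readers less familiar with the three metrics quoted above, the following is a minimal sketch of how they are conventionally computed for a binary (positive/negative) review classifier. It assumes labels encoded as 1 (positive) and 0 (negative); the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity for 0/1 labels.

    Hypothetical helper for illustration only; 1 = positive review,
    0 = negative review (an assumed encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        # share of all reviews classified correctly
        "accuracy": (tp + tn) / total if total else 0.0,
        # share of actual positives that were detected (recall)
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # share of actual negatives that were detected
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy example with made-up labels (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy is useful when the classes are imbalanced, since accuracy alone can hide poor performance on the minority class.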

Keywords
Classification, Feature extraction, Feature selection, Sentiment analysis.

