An Enhanced Text Mining Approach using Ensemble Algorithm for Detecting Cyber Bullying

Zarapala Sunitha Bai; Sreelatha Malempati

doi:https://doi.org/10.14445/22315381/IJETT-V70I9P240

Research Article | Open Access

Volume 70 | Issue 9 | Year 2022 | Article Id. IJETT-V70I9P240

Received	Revised	Accepted	Published
15 Jun 2022	07 Sep 2022	20 Sep 2022	30 Sep 2022

Citation :

Zarapala Sunitha Bai, Sreelatha Malempati, "An Enhanced Text Mining Approach using Ensemble Algorithm for Detecting Cyber Bullying," International Journal of Engineering Trends and Technology (IJETT), vol. 70, no. 9, pp. 393-399, 2022. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I9P240

Abstract

Text mining (TM) is widely used to process unstructured text documents and to extract information from data in various domains. It is often referred to as text classification and is applied in many areas, such as movie reviews, product reviews on e-commerce websites, sentiment analysis, topic modeling, and cyberbullying detection in social media messages. Cyberbullying is the abuse of someone using insulting language; it includes personal abuse, sexual harassment, and other forms of abuse. Several existing systems have been developed to detect bullying words based on their context on social networking sites (SNS), which have become a platform for bullying. In this paper, an Enhanced Text Mining Approach using an Ensemble Algorithm (ETMA) is developed to address several problems in traditional algorithms and to improve accuracy, processing time, and the quality of the result. ETMA analyzes bullying text on SNS such as Facebook and Twitter. ETMA is applied to a synthetic dataset of 5k messages, collected from various data sources and belonging to the bullying and non-bullying classes. Performance is evaluated in terms of Precision, Recall, F1-Score, and Accuracy.
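
As a rough illustration of the evaluation described above, the following minimal Python sketch combines the outputs of several base classifiers by majority (hard) voting and then computes Precision, Recall, F1-Score, and Accuracy. The abstract does not state how ETMA combines its base models, so majority voting is an assumption made only for this example; the function names, the three hypothetical models, and the toy labels (1 = bullying, 0 = non-bullying) are likewise illustrative and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per message (hard voting).

    Assumption: an odd number of models and binary labels, so ties cannot occur.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # Pick the label predicted by the largest number of models.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, F1 and accuracy for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy data (hypothetical): true labels and three base-model predictions.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 1]
model_b = [1, 0, 1, 0, 1, 0]
model_c = [1, 1, 0, 0, 0, 0]

ensemble_pred = majority_vote([model_a, model_b, model_c])
print(ensemble_pred)
print(binary_metrics(y_true, ensemble_pred))
```

In this toy case each base model makes at least one error that the other two outvote, which is the usual motivation for an ensemble; whether ETMA behaves this way on the 5k-message dataset is not something this sketch can show.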

Keywords

Deep Learning, Cyber Bullying, Text Mining, Ensemble Algorithm.
