Research Article | Open Access
Volume 74 | Issue 3 | Year 2026 | Article Id. IJETT-V74I3P105 | DOI : https://doi.org/10.14445/22315381/IJETT-V74I3P105
Severity Detection of Cyberbullying in Saudi-Dialect Tweets: A Machine-Learning Approach
Bader Azi Alanazi, Chin-Teng Lin
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 06 Mar 2026 | 24 Jan 2026 | 29 Jan 2026 | 28 Mar 2026 |
Citation :
Bader Azi Alanazi, Chin-Teng Lin, "Severity Detection of Cyberbullying in Saudi-Dialect Tweets: A Machine-Learning Approach," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 3, pp. 54-74, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I3P105
Abstract
Social media platforms such as Twitter (now known as X) have become channels for global communication, but they have also contributed to an increase in cyberbullying, which carries serious psychological risks. Although much existing research has focused on detecting cyberbullying in English, few studies address the problem in Arabic, particularly for severity classification. This study evaluates machine learning classifiers trained on balanced, pre-processed Saudi-dialect data for four-level cyberbullying severity detection (non-cyberbullying, low, medium, and high) and assesses the impact of systematic class balancing on minority-class performance. Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers were applied, using Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) for feature extraction. A dataset of 5,819 Saudi-dialect tweets was annotated into four severity categories and evaluated across 28 experimental scenarios combining different pre-processing tools (CAMeL, NLTK, Araby) and balancing techniques (random insertion, random oversampling, synonym replacement). The highest accuracy, 92.23%, was achieved using BoW+SVM with NLTK pre-processing and stop word removal, an absolute improvement of 27.43 percentage points over the imbalanced baseline of 64.80% accuracy. Random oversampling proved to be the most effective balancing technique, accounting for 96-99% of the performance gains. Per-class F1-scores ranged from 0.88 (low severity) to 0.95 (high severity and non-cyberbullying), providing further evidence of the importance of balanced training data for achieving reliable performance across all severity levels. To the best of the authors’ knowledge, this is the first study to implement four-class cyberbullying severity detection for Saudi-dialect tweets.
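The random oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_oversample`, the toy class sizes, and the placeholder tweet strings are all assumptions made for illustration. The sketch duplicates randomly chosen samples from each smaller class until every class matches the size of the largest one.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Balance classes by duplicating randomly chosen minority samples.

    Hypothetical helper for illustration; not taken from the article.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw duplicates with replacement from the class's own samples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy, made-up data: placeholder strings stand in for annotated tweets.
labels = ["non"] * 6 + ["low"] * 3 + ["medium"] * 2 + ["high"] * 1
texts = [f"tweet_{i}" for i in range(len(labels))]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In common practice, oversampling of this kind is applied only to the training split so that duplicated samples do not leak into the evaluation data; whether the study followed that protocol is not stated in the abstract.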
Keywords
Text Classification, Machine Learning, Cyberbullying Detection, Arabic Social Media, Saudi Dialect, Support Vector Machine (SVM), Naïve Bayes (NB).