Experimental Evaluation of Resampling Algorithms on the Imbalance Violence Video Detection

International Journal of Engineering Trends and Technology (IJETT)
© 2022 by IJETT Journal
Volume-70 Issue-7
Year of Publication : 2022
Authors : Moch Arief Soeleman, Catur Supriyanto, Dwi Puji Prabowo, Pulung Nurtantio Andono
DOI : 10.14445/22315381/IJETT-V70I7P226

How to Cite?

Moch Arief Soeleman, Catur Supriyanto, Dwi Puji Prabowo, Pulung Nurtantio Andono, "Experimental Evaluation of Resampling Algorithms on the Imbalance Violence Video Detection" International Journal of Engineering Trends and Technology, vol. 70, no. 7, pp. 260-268, 2022. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I7P226

Violence detection is part of the video surveillance research area and has played an important role over the last decade. Convolutional Neural Networks (CNNs) have become very successful classifiers for violence video detection. The features learned by a CNN give superior results over the handcrafted features of traditional machine learning. A Long Short-Term Memory (LSTM) layer processes the learned features to capture temporal dependencies. Violence video detection is a binary classification task that categorizes a video instance as violence or non-violence. However, the number of video clips in each class is not balanced, because clips of the positive class are hard to collect. To address this, this work presents empirical results on resampling techniques for enhancing the performance of violence video detection. It compares four resampling techniques: Random Under Sampling (RUS), Synthetic Minority Oversampling Technique (SMOTE), Random Over Sampling (ROS), and the combination of SMOTE and RUS. The experiments are conducted on two popular benchmark datasets, the Hockey and Crowd datasets. The number of positive-class instances in these datasets is reduced to create imbalanced datasets for experimental purposes. The experimental results demonstrate that RUS produces superior performance compared to the other resampling techniques in terms of G-mean and AUC.

Convolution Neural Network (CNN), Imbalance dataset, Resampling algorithm, Long Short-Term Memory (LSTM), Violence video detection.
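
To make the resampling idea in the abstract concrete, the following is a minimal NumPy sketch of random under-sampling, random over-sampling, and the G-mean metric. It is an illustration under stated assumptions, not the authors' implementation: the feature matrix `X` is random stand-in data (the abstract does not say at which stage resampling is applied), the 90/10 class split is hypothetical, and the function names are made up for this example. SMOTE is not covered here because it synthesizes new minority samples by interpolating between neighbours rather than by simple row selection.

```python
import numpy as np


def random_under_sample(X, y, rng):
    """Randomly drop rows so every class is reduced to the minority-class size."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original row order
    return X[keep], y[keep]


def random_over_sample(X, y, rng):
    """Randomly duplicate rows so every class is raised to the majority-class size."""
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    parts = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        # Draw the missing rows with replacement from the same class.
        extra = rng.choice(members, size=n_max - n, replace=True)
        parts.append(np.concatenate([members, extra]))
    idx = np.concatenate(parts)
    return X[idx], y[idx]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return float(np.sqrt(sensitivity * specificity))


# Hypothetical imbalanced data: 90 negative and 10 positive feature vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = np.array([0] * 90 + [1] * 10)

X_rus, y_rus = random_under_sample(X, y, rng)
X_ros, y_ros = random_over_sample(X, y, rng)

print("original:", np.bincount(y))
print("after RUS:", np.bincount(y_rus))
print("after ROS:", np.bincount(y_ros))
print("G-mean example:", g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps its original class distribution. G-mean suits the imbalanced setting because a classifier that ignores the minority class gets a sensitivity of zero, and therefore a G-mean of zero, however high its accuracy is.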
