Bi-Attention LSTM with CNN based Multi-task Human Activity Detection in Video Surveillance

© 2021 by IJETT Journal
Volume-69 Issue-11
Year of Publication : 2021
Authors : Shankargoud Patil, Kappargaon S. Prabhushetty
DOI :  10.14445/22315381/IJETT-V69I11P225

How to Cite?

Shankargoud Patil, Kappargaon S. Prabhushetty, "Bi-Attention LSTM with CNN based Multi-task Human Activity Detection in Video Surveillance," International Journal of Engineering Trends and Technology, vol. 69, no. 11, pp. 192-204, 2021. Crossref, https://doi.org/10.14445/22315381/IJETT-V69I11P225

Abstract
In computer vision and pattern recognition, active research topics include crowd analysis and anomalous trajectory detection. Anomaly detection is a technique for distinguishing between different patterns and identifying uncommon patterns in a short amount of time. Abnormal event detection and localization, which aims to detect unusual events in surveillance videos automatically, is a difficult research challenge due to its complexity. In the proposed method, normal and abnormal human activities are detected through Deep Learning (DL) and image processing. The proposed Bi-Attention Long Short-Term Memory (Bi-Attention LSTM) model is used to extract only the necessary spatial and temporal information from videos, and the introduced Convolutional Neural Network (CNN) is used to predict multi-task human activities as normal or abnormal. A video is taken as input and converted into frames, and background subtraction is used to identify the moving objects (people) in each frame. The proposed CNN with Bi-Attention LSTM model then extracts temporal and spatial features and classifies a specific human action as normal or abnormal. A performance analysis compares the proposed system with the existing system in terms of accuracy, sensitivity, specificity, error, precision, F1 score, false positive rate (FPR), kappa, and Matthews correlation coefficient (MCC). On the UMN dataset, Area Under the Curve (AUC) and Equal Error Rate (EER) are also considered for comparison. The proposed method achieves an accuracy of 98.4436% in detecting human activities, which is higher than the existing methods compared. The results indicate the efficacy of the proposed system for classifying human activities from videos.
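As an illustration of the evaluation measures named in the abstract, the following minimal sketch computes them from a binary confusion matrix using their standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical; they are not taken from the paper, and the sketch assumes that "abnormal" is the positive class and that no denominator is zero.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes 'abnormal' is the positive class and every denominator is non-zero.
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    fpr = fp / (fp + tn)           # equals 1 - specificity

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1_score": f1,
        "fpr": fpr,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

AUC and EER are omitted here because they require per-sample scores rather than confusion-matrix counts.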

Keywords
Bi-Attention Long Short-Term Memory, Convolutional Neural Network, Human Activity Detection, Normal and Abnormal Activities, and Video Surveillance.
