A Significant Detection of APT using MD5 Hash Signature and Machine Learning Approach

International Journal of Engineering Trends and Technology (IJETT)
© 2022 by IJETT Journal
Volume-70 Issue-4
Year of Publication : 2022
Authors : R C. Veena, S H. Brahmananda


MLA Style: R C. Veena, and S H. Brahmananda "A Significant Detection of APT using MD5 Hash Signature and Machine Learning Approach." International Journal of Engineering Trends and Technology, vol. 70, no. 4, Apr. 2022, pp. 95-106. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I4P208

APA Style: R C. Veena, & S H. Brahmananda. (2022). A Significant Detection of APT using MD5 Hash Signature and Machine Learning Approach. International Journal of Engineering Trends and Technology, 70(4), 95-106. https://doi.org/10.14445/22315381/IJETT-V70I4P208

The widespread penetration of the internet has made day-to-day life easier. Along with its rich benefits come new threats and challenges. An Advanced Persistent Threat (APT) is one such threat, in which suspicious agents access data or surveil servers over a prolonged period. APT attacks use a variety of specialized tools and techniques, and APT hackers and malware are more common and more sophisticated than ever. Previously, attackers targeted systems for financial and personal gain. Today, such attacks also carry political motives and are supported by governments or nations. Nations such as the United States, India, Russia, and the U.K. are among the victims. An APT involves several stages and a definite operational strategy. Moreover, the techniques and technologies used in APT attacks vary in order to camouflage surveillance applications and penetrate unsuspecting networks. This work presents a Machine Learning (ML) based framework for detecting APT attacks. In cryptographic protocols, MD5 is even more hazardous than previously thought: attackers can impersonate clients to servers that support MD5 hashing for handshake transcripts. The proposed framework detects APT attacks highly effectively at the initial stage, based on the MD5 signature and using the ML approach. More than 50% of antivirus software validated the identified MD5 signatures as malicious. The framework prevents an APT from spreading rapidly from a single compromised computer to several systems or the complete infrastructure. The system was trained on 76 types of APT signatures, with a total of 645 threat variants. The proposed ML framework achieves an accuracy of 99%, compared with the published accuracy of 96.1% [23], for early detection of APTs from unknown domains.
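As a rough illustration of the signature-matching idea in the abstract, the following Python sketch computes an MD5 digest, looks it up in a signature store, and applies a "more than 50% of engines" rule to a list of antivirus verdicts. It is not the authors' implementation: the names `KNOWN_SIGNATURES`, `lookup_signature`, and `majority_flags_malicious`, the placeholder payload, and the family label `ExampleAPT` are all hypothetical, and the ML classifier itself is not shown.

```python
import hashlib
from typing import Mapping, Optional, Sequence


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of `data` as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical signature store mapping MD5 hex digests to a threat label.
# The payload below is a harmless placeholder, not a real APT sample.
KNOWN_SIGNATURES = {
    md5_of_bytes(b"simulated malicious payload"): "ExampleAPT",
}


def lookup_signature(data: bytes, signatures: Mapping[str, str]) -> Optional[str]:
    """Return the threat label whose MD5 matches `data`, or None if unknown."""
    return signatures.get(md5_of_bytes(data))


def majority_flags_malicious(verdicts: Sequence[bool]) -> bool:
    """True when strictly more than 50% of the verdicts are 'malicious'.

    `verdicts` is an assumed input format: one boolean per antivirus engine.
    An empty list is treated as 'not validated'.
    """
    return len(verdicts) > 0 and 2 * sum(verdicts) > len(verdicts)


if __name__ == "__main__":
    sample = b"simulated malicious payload"
    print(lookup_signature(sample, KNOWN_SIGNATURES))
    print(majority_flags_malicious([True, True, False]))
```

An exact hash lookup of this kind only recognizes byte-identical samples, which is presumably why the framework pairs signatures with a learned model for unseen variants.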

APT, MD5 Hashing, Network Security, Hackers, Machine Learning, Threat Hunting.

[1] Cole E, Advanced Persistent Threat: Understanding the Danger and How to Protect Your Organization. 1st Edition. Amsterdam: Elsevier. (2012).
[2] Hyunjoo Kim, Jonghyun Kim, Ikkyun Kim, Tai-myung Chung, Behaviour-Based Anomaly Detection on Big Data. Australian Information Security Management Conference. 13 (2015) 73-80.
[3] Luh R, Schrittwieser S, Marschalek S, Janicke H. Design of an Anomaly-Based Threat Detection & Explication System, International Conference on Information Systems Security and Privacy. 3 (2017) 397-402.
[4] Mees W, Multi-Agent Anomaly-Based APT Detection, Information Assurance and Cyber Defence. 111 (2012) 03.
[5] Vatamanu C, Gavrilut D, Benches R, A Practical Approach on Clustering Malicious Pdf Documents, Journal of Computer Virology and Hacking Techniques. 8 (2012) 151-163.
[6] Caldwell T, Spear-Phishing, How to Spot and Mitigate the Menace, Computer Fraud and Security. 1 (2013) 11-16.
[7] Xiaohua Yan, Joy Ying Zhang. Early Detection of Cyber Security Threats using Structured Behavior Modeling, ACM Transactions on Information and System Security. V(N) (2013) A.
[8] Zimba A, Chen H, & Wang Z, Bayesian Network-Based Weighted APT Attack Paths Modelling in Cloud Computing, Future Gener. Comput. Syst. Elsevier. 96 (2019) 525-537.
[9] Lajevardi, Amir & Amini, Morteza. A Semantic-Based Correlation Approach for Detecting Hybrid and Low-Level APTs. Future Generation Computer Systems. 96 (2019).
[10] D. Yan, F. Liu and K. Jia, Modelling an Information-Based Advanced Persistent Threat Attack on the Internal Network. ICC 2019 - 2019 IEEE International Conference on Communications (ICC). (2019) 1-7. doi: 10.1109/ICC.2019.8761077.
[11] Bai, Tim & Bian, Haibo & Abou Daya, Abbas & Salahuddin, Mohammad & Limam, Noura & Boutaba, Raouf. A Machine Learning Approach for RDP-based Lateral Movement Detection. (2019).
[12] Chu, Wen-Lin & Lin, Chih-Jer & Chang, Ke-Neng. Detection and Classification of Advanced Persistent Threats and Attacks Using the Support Vector Machine. Applied Sciences. 9 (2019) 4579.
[13] Alshamrani A, Myneni S, Chowdhary A, and Huang D, A Survey on Advanced Persistent Threats: Techniques, Solutions, Challenges, and Research Opportunities, IEEE Communications Surveys & Tutorials. 21(2) (2019) 1851–1877.
[14] Li Z, Chen Q. A, Yang R, and Chen Y, Threat Detection and Investigation with System-Level Provenance Graphs: A Survey. (2020).
[15] W. Ul Hassan, D. Li, K. Jee, X. Yu, K. Zou, D. Wang, Z. Chen, Z. Li, J. Gui, A. Bates, J.-i. Gui, This is Why We Can’t Cache Nice Things: Lightning-Fast Threat Hunting using Suspicion-Based Hierarchical Storage, in ACSAC. 14 (2020). doi:10.1145/3427228.3427255.
[16] Q. Wang, W. U. Hassan, D. Li, K. Jee, X. Yu, K. Zou, J. Rhee, Z. Chen, W. Cheng, C. A. Gunter, H. Chen, You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis. (2019) 17.
[17] M. N. Hossain, S. Sheikhi, R. Sekar, Combating Dependence Explosion in Forensic Analysis Using Alternative Tag Propagation Semantics, In IEEE S&P. (2020).
[18] W. U. Hassan, M. A. Noureddine, P. Datta, A. Bates, OmegaLog: High-Fidelity Attack Investigation via Transparent Multilayer Log Analysis. (2020) 16.
[19] S. Ndichu, S. Kim, S. Ozawa, T. Misu, K. Makishima, A Machine Learning Approach to Detecting Javascript-Based Attacks Using AST Features and Paragraph Vectors, Applied Soft Computing Journal. 84 (2019) 105721. doi: 10.1016/j.asoc.2019.105721.
[20] Z. Li, Y. Chen, Q. Chen, T. Zhu, C. Xiong, H. Yang, Effective and Light-Weight Deobfuscation and Semantic-Aware Attack Detection For Powershell Scripts, In Proceedings of the ACM Conference on Computer and Communications Security. (2019). doi:10.1145/3319535.3363187.
[21] H. Wang et al., An Evolutionary Study of IoT Malware, in IEEE Internet of Things Journal. 8(20) (2021) 15422-15440. doi: 10.1109/JIOT.2021.3063840.
[22] W. Zhang, H. Wang, H. He and P. Liu, DAMBA: Detecting Android Malware by ORGB Analysis, IEEE Trans. Rel. 69(1) (2020) 55-69.
[23] H.-S. Ham, H.-H. Kim, M.-S. Kim and M.-J. Choi, Linear SVM-Based Android Malware Detection, Proc. Front. Innov. Future Comput. Commun. (2014) 575-585.
[24] A. Calleja, J. Tapiador and J. Caballero, The Malsource Dataset: Quantifying Complexity and Code Reuse in Malware Development, IEEE Trans. Inf. Forensics Security. 14(12) (2019) 3175-3190.
[25] Lucian C. Ongoing MD5 support endangers cryptographic protocols. [Online]. Available: https://www.computerworld.com/article/3020066/ongoing-md5-support-endangers-cryptographic-protocols.html
[26] X. Y. Han, T. Pasquier, A. Bates, J. Mickens, and M. Seltzer, UNICORN: Runtime Provenance-Based Detector for Advanced Persistent Threats, In Proceedings of the Network and Distributed System Security Symposium, San Diego, CA, USA. (2020).
[27] (2019). Paganini, P. Phishers Continue to Abuse Adobe and Google Open Redirects. [Online]. Available: https://securityaffairs.co/wordpress/91877/cyber-crime/adobe-google-open-redirects.html.
[28] Geluvaraj B, Satwik P.M, Ashok Kumar T.A, The Future of Cybersecurity: Major Role of Artificial Intelligence, Machine Learning, and Deep Learning in Cyberspace. In Lecture Notes on Data Engineering and Communications Technologies; Springer Singapore: Singapore. 15 (2019) 739–747.
[29] Ahuja R, Chug A, Gupta S, Ahuja P, Kohli S, Classification and Clustering Algorithms of Machine Learning with their Applications. In Nature-Inspired Computation in Data Mining and Machine Learning; Yang, X.S., He, X.S., Eds.; Springer International Publishing: Cham, Switzerland.11 (2020) 225–248.
[30] Cho, Do & Nam, Ha. A Method of Monitoring and Detecting APT Attacks Based on Unknown Domains. Procedia Computer Science. 150 (2019) 316-323.
[31] Weina Niu, Xiaosong Zhang, GuoWu Yang, Jianan Zhu, Zhongwei Ren. Identifying APT Malware Domain Based on Mobile DNS Logging. Mathematical Problems in Engineering. 2 (2017) 1-9.
[32] Marchetti M, Pierazzi F, Colajanni M, Guido A. Analysis of High Volumes of Network Traffic for Advanced Persistent Threat Detection. Computer Networks. 109 (2016) 127–141.
[33] A. Zimba, H. Chen, Z. Wang, and M. Chishimba, Modeling and Detection of the Multi-Stages of Advanced Persistent Threats Attacks Based on Semi-Supervised Learning and Complex Networks Characteristics, Future Generation Computer Systems. 106 (2020) 501–517.
[34] [Online]. Available: https://apt.securelist.com/
[35] Do Xuan Choa, Ha Hai Nam. A Method of Monitoring and Detecting APT Attacks Based on Unknown Domains, Procedia Computer Science. 150 (2019) 316–323.
[36] Zitong Li, Xiang Cheng, Lixiao Sun, Ji Zhang, Bing Chen, A Hierarchical Approach for Advanced Persistent Threat Detection with Attention-Based Graph Neural Networks, Security and Communication Networks. (2021) 14. https://doi.org/10.1155/2021/9961342
[37] (2020). Zou, Q. An Approach for Detection of Advanced Persistent Threat Attacks. Computer, IEEE Computer Society. [Online]. Available: https://www.researchgate.net/publication/34726137_An_Approach_for_Detection_of_Advanced_Persistent_Threat_Attacks
[39] Yan D, Liu F, & Jia K, Modelling an Information-Based Advanced Persistent Threat Attack on the Internal Network. In ICC 2019-2019 IEEE International Conference on Communications (ICC) IEEE. (2019) 1-7. IEEE. https://ieeexplore.ieee.org/abstract/document/8761077
[40] Lv K, Chen Y, & Hu C, Dynamic Defence Strategy Against Advanced Persistent Threat Under Heterogeneous Networks, Information Fusion. 49 (2019) 216-226. https://doi.org/10.1016/j.inffus.2019.01.001
[41] Joloudari J. H, Haderbadi M, Mashmool A, GhasemiGol M, Band S. S, & Mosavi A, Early Detection of the Advanced Persistent Threat Attack Using Performance Analysis of Deep Learning. IEEE Access. 8 (2020) 186125-186137. https://ieeexplore.ieee.org/abstract/document/9214817
[42] Chen W, Helu X, Jin C, Zhang M, Lu H, Sun Y, & Tian Z, Advanced Persistent Threat Organization Identification Based on Software Gene of Malware. Transactions on Emerging Telecommunications Technologies. 31(12) (2020) e3884.
[43] Cheng X, Zhang J, Tu Y, & Chen B, Cyber Situation Perception for Internet of Things Systems Based on Zero Day Attack Activities Recognition within the Advanced Persistent Threat, Concurrency and Computation: Practice and Experience. (2020) e6001.
[44] Fraser N, Plan F, O Leary J, Cannon V, Leong R, Perez D, & Shen C, APT41—A dual espionage and cybercrime operation. FireEye Blog. (2019).
[45] Vencelin Gino V, Amit KR Ghosh, Enhancing Cyber Security Measures for Online Learning Platforms, SSRG International Journal of Computer Science and Engineering. 8(11) (2021) 1-5. https://doi.org/10.14445/23488387/IJCSE-V8I11P101
[46] Tara Kissoon, Optimum Spending on Cybersecurity Measures, Transforming Government: People, Process and Policy. (2020).
[47] Vencelin Gino V & Amit KR Ghosh. IJCS. 8(11) (2021) 1-5.
[48] Donald Somiari Ene, Isobo Nelson Davies, Godwin Fred Lenu, Ibiere Boma Cookey, Implementing ECC on Data Link Layer of the OSI Reference Model. SSRG International Journal of Computer Science and Engineering. 8(9) (2021) 12-16. https://doi.org/10.14445/23488387/IJCSE-V8I9P103
[49] Evans Mwasiaji, Kenneth Iloka, Cyber Security Concerns and Competitiveness for Selected Medium Scale Manufacturing Enterprises in the Context of Covid-19 Pandemic in Kenya. SSRG International Journal of Computer Science and Engineering. 8(8) (2021) 1-7. https://doi.org/10.14445/23488387/IJCSE-V8I8P101