Devising Malware Characteristics using Transformers

International Journal of Engineering Trends and Technology (IJETT)
  
© 2020 by IJETT Journal
Volume-68 Issue-5
Year of Publication : 2020
Authors : Simra Shahid, Tanmay Singh, Yash Sharma, Kapil Sharma
DOI :  10.14445/22315381/IJETT-V68I5P207S

Citation 

MLA Style: Simra Shahid, Tanmay Singh, Yash Sharma, Kapil Sharma. "Devising Malware Characteristics using Transformers." International Journal of Engineering Trends and Technology 68.5 (2020): 33-37.

APA Style: Simra Shahid, Tanmay Singh, Yash Sharma, Kapil Sharma (2020). Devising Malware Characteristics using Transformers. International Journal of Engineering Trends and Technology, 68(5), 33-37.

Abstract
In this paper, we present our approach to finding relevant malware behaviour texts in Malware Threat Reports, a task described by Lim et al. [1]. Our main contribution is a first attempt at applying Transfer Learning approaches to this task, together with an examination of how well they generalize to classification tasks such as malware behaviour analysis.

Reference

[1] Lim, Swee Kiat, et al. “MalwareTextDB: A database for annotated malware articles.” Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017.
[2] Sharma, Tanu. Software Bug Localization Using Topic Models. Diss. 2016.
[3] Tripathi, Ashish Kumar, Kapil Sharma, and Manju Bala. “Parallel Hybrid BBO Search Method for Twitter Sentiment Analysis of Large Scale Datasets Using MapReduce.” International Journal of Information Security and Privacy (IJISP) 13.3 (2019): 106-122.
[4] T. Sharma, K. Sharma and T. Sharma, “Software bug localization using Pachinko Allocation Model,” 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, 2016, pp. 3603-3608.
[5] Jain, Deepakshi. Cryptocurrency Price Prediction Using Transformer: A Deep Learning Architecture. Diss. 2019.
[6] Jatana, Nishtha, and Kapil Sharma. “Bayesian spam classification: Time-efficient radix encoded fragmented database approach.” 2014 International Conference on Computing for Sustainable Global Development (INDIACom). IEEE, 2014.
[7] Pennington, Jeffrey, Richard Socher, and Christopher D. Manning. “GloVe: Global vectors for word representation.” Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014.
[8] Loyola, Pablo, et al. “Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[9] Sikdar, Utpal Kumar, Biswanath Barik, and Björn Gambäck. “Flytxt NTNU at SemEval-2018 Task 8: Identifying and Classifying Malware Text Using Conditional Random Fields and Naïve Bayes Classifiers.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[10] Ma, Chunping, et al. “DM NLP at SemEval-2018 Task 8: neural sequence labelling with linguistic features.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[11] Ma, Xuezhe, and Eduard Hovy. “End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF.” arXiv preprint arXiv:1603.01354 (2016).
[12] Fu, Mingming, Xuemin Zhao, and Yonghong Yan. “HCCL at SemEval-2018 Task 8: An End-to-End System for Sequence Labeling from Cybersecurity Reports.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[13] Brew, Chris. “Digital Operatives at SemEval-2018 Task 8: Using dependency features for malware NLP.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[14] Crammer, Koby, et al. “Online passive-aggressive algorithms.” Journal of Machine Learning Research 7.Mar (2006): 551-585.
[15] Manikandan, R., Krishna Madgula, and Snehanshu Saha. “TeamDL at SemEval-2018 Task 8: Cybersecurity text analysis using convolutional neural network and conditional random fields.” Proceedings of The 12th International Workshop on Semantic Evaluation. 2018.
[16] Padia, Ankur, et al. “UMBC at SemEval-2018 Task 8: Understanding text about malware.” Proceedings of International Workshop on Semantic Evaluation (SemEval-2018). 2018.
[17] Ravikiran, Manikandan, and Krishna Madgula. “Fusing Deep Quick Response Code Representations Improves Malware Text Classification.” Proceedings of the ACM Workshop on Crossmodal Learning and Application. 2019.
[18] Howard, Jeremy, and Sebastian Ruder. “Universal language model fine-tuning for text classification.” arXiv preprint arXiv:1801.06146 (2018).
[19] Mikolov, Tomas, et al. “Distributed representations of words and phrases and their compositionality.” Advances in neural information processing systems. 2013.
[20] Devlin, Jacob, et al. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).
[21] Yang, Zhilin, et al. “XLNet: Generalized autoregressive pre-training for language understanding.” Advances in neural information processing systems. 2019.

Keywords
Transformer Models, BERT, XLNet, ULMFiT, Malware Characteristics, APT reports, binary classification, sampling, Transfer Learning.