Utilizing Machine Learning Algorithms to Automatically Categorize Software Test Cases

© 2023 by IJETT Journal
Volume-71 Issue-9
Year of Publication : 2023
Author : Abdullahi Ahmed Abdirahma, Abdirahman Osman Hashi, Siti Zaiton Mohd Hashim, Mohamed Abdirahman Elmi
DOI : 10.14445/22315381/IJETT-V71I9P203

How to Cite?

Abdullahi Ahmed Abdirahma, Abdirahman Osman Hashi, Siti Zaiton Mohd Hashim, Mohamed Abdirahman Elmi, "Utilizing Machine Learning Algorithms to Automatically Categorize Software Test Cases," International Journal of Engineering Trends and Technology, vol. 71, no. 9, pp. 27-35, 2023. Crossref, https://doi.org/10.14445/22315381/IJETT-V71I9P203

Abstract
Reducing the manual effort that software developers spend labelling software test cases has been the focus of many academic researchers. To ensure that all features and applications are fully tested, a framework is needed that can reliably match feature labels to the correct test cases. Irrelevant labelling of test cases leads to inaccuracy, so avoiding it is a key objective of this paper. The primary goal of this work is therefore to extend a previous method for automatic directory categorization of test cases based on their test case descriptions, using the K-nearest-neighbor (KNN) classifier, Logistic regression, Decision tree and multilayer perceptron (MLP). Bag-of-words (BoW) is used as the vector representation for all classifiers. The experimental results show that KNN-BOW and MLP-BOW score higher than Logistic regression and Decision tree: they achieved 77% accuracy, compared with 67% and 65% for Logistic regression and Decision tree, respectively, and 71% for KNN-TF-IDF. KNN-BOW and MLP-BOW are therefore excellent choices for directory categorization based on test case descriptions. The suggested strategy contributes to the domain by showing that machine learning algorithms can classify test-case descriptions directly.
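The approach summarized in the abstract, a bag-of-words representation fed to a K-nearest-neighbor classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the use of cosine similarity as the neighbor metric, the value k=3, and the sample descriptions and directory labels are all assumptions made for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Map a description to token counts (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(train, description, k=3):
    """Return the majority label among the k most similar training descriptions."""
    query = bag_of_words(description)
    scored = sorted(
        ((cosine_similarity(query, bag_of_words(text)), label) for text, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical test-case descriptions and directory labels (not from the paper).
TRAIN = [
    ("verify camera captures photo in low light", "camera"),
    ("check camera zoom and flash toggle", "camera"),
    ("record video with camera and save to gallery", "camera"),
    ("connect to wifi network with password", "connectivity"),
    ("verify bluetooth pairing with headset", "connectivity"),
    ("toggle wifi and bluetooth in airplane mode", "connectivity"),
]

prediction = knn_predict(TRAIN, "verify camera flash works when taking photo")
print(prediction)
```

In practice a library such as scikit-learn would likely replace the hand-written vectorizer and classifier, and the same count vectors could be passed to the other classifiers named in the abstract; the pure-Python version is used here only to keep the sketch self-contained.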

Keywords
Test cases, K nearest neighbors, Bag of words, Logistic regression, Decision tree.
