Efficient Dimensionality Reduction using Improved Fuzzy C-Means Entropy Approach with Caps-TripleGAN for Predicting Software Defect in Imbalanced Dataset

© 2022 by IJETT Journal
Volume-70 Issue-7
Year of Publication : 2022
Authors : Satya Srinivas Maddipati, Malladi Srinivas
DOI : 10.14445/22315381/IJETT-V70I7P201

How to Cite?

Satya Srinivas Maddipati, Malladi Srinivas, "Efficient Dimensionality Reduction using Improved Fuzzy C-Means Entropy Approach with Caps-TripleGAN for Predicting Software Defect in Imbalanced Dataset," International Journal of Engineering Trends and Technology, vol. 70, no. 7, pp. 1-9, 2022. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I7P201

Abstract
Detecting bugs or defects early in the software life cycle reduces the effort required to develop software. Software defect prediction faces two main problems: high dimensionality and class imbalance. Several methods have been proposed to predict software defects, but they do not provide sufficient accuracy and show increased error rates. To overcome these issues, this manuscript proposes an Improved Fuzzy C-means-based Entropy (IFCME) approach combined with a CapsNet triple generative adversarial network (Caps-Triple GAN) for predicting software defects in imbalanced datasets, addressing both dimensionality reduction and the class imbalance problem. IFCME converts the non-linear, high-dimensional data into a low-dimensional space to reduce the class imbalance problem. Caps-Triple GAN then classifies the data with high accuracy and reduces the prediction error rate. The simulation is executed on the MATLAB platform. The proposed method, IFCME-Caps-Triple GAN-DR-SDP, attains 23.84%, 32.94%, and 36.94% higher accuracy and 26.94%, 37.32%, and 28.94% higher precision than three existing methods, respectively: a software defect prediction model based on LASSO–SVM (LASSO–SVM-DR-SDP), multi-source heterogeneous cross-project defect prediction via multi-source transfer learning and autoencoder (MHCPDP-MAA-DR-SDP), and cluster-based over-sampling with filtering for tackling the class imbalance problem (KMFOS-DR-SDP).
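The abstract does not give the IFCME update rules, so the following is only a minimal sketch of the two standard ingredients the name suggests: plain fuzzy c-means clustering and the Shannon entropy of the resulting memberships. It is written in Python with NumPy rather than MATLAB. The function names, the toy imbalanced data, and all parameter values are assumptions for illustration, not the authors' method.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (not the paper's improved variant).

    Returns (centers, U), where U[i, k] is the membership of sample i
    in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    centers = None
    for _ in range(n_iter):
        # Cluster centres: membership-weighted means with fuzzifier m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance of every sample to every centre.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    return centers, U


def membership_entropy(U):
    """Shannon entropy (natural log) of each sample's membership row."""
    P = np.clip(U, 1e-12, 1.0)
    return -(P * np.log(P)).sum(axis=1)


# Hypothetical imbalanced toy data: 40 majority and 10 minority samples.
data_rng = np.random.default_rng(42)
majority = data_rng.normal(0.0, 0.5, size=(40, 3))
minority = data_rng.normal(4.0, 0.5, size=(10, 3))
X = np.vstack([majority, minority])

centers, U = fuzzy_c_means(X, n_clusters=2)
H = membership_entropy(U)
print(centers.shape, U.shape, H.shape)
```

In this sketch, a low entropy means a sample belongs almost entirely to one cluster, while a value near ln(2) means it sits between the two clusters. How the published approach uses such an entropy for dimensionality reduction is not stated in the abstract.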

Keywords
Improved Fuzzy C-means-based Entropy (IFCME), Software defect prediction, CapsNet triple generative adversarial network (Caps-Triple GAN), Dimensionality reduction, Class imbalance problem.
