Feature Ranking and Stacked Sparse Autoencoder based Framework for the Prediction of Breast Cancer

International Journal of Engineering Trends and Technology (IJETT)
© 2022 by IJETT Journal
Volume-70 Issue-5
Year of Publication: 2022
Authors: Atul Kumar Ramotra, Vibhakar Mansotra
DOI: 10.14445/22315381/IJETT-V70I5P213


MLA Style: Atul Kumar Ramotra, and Vibhakar Mansotra. "Feature Ranking and Stacked Sparse Autoencoder based Framework for the Prediction of Breast Cancer." International Journal of Engineering Trends and Technology, vol. 70, no. 5, May 2022, pp. 103-110. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I5P213

APA Style: Atul Kumar Ramotra, & Vibhakar Mansotra. (2022). Feature Ranking and Stacked Sparse Autoencoder based Framework for the Prediction of Breast Cancer. International Journal of Engineering Trends and Technology, 70(5), 103-110. https://doi.org/10.14445/22315381/IJETT-V70I5P213

Achieving high classification accuracy with machine learning is a challenging process. To accomplish this goal, it is important to understand each input variable's significance and contribution to the target class. Learning from a suitable representation of the original feature set also enhances the performance of learning algorithms. This work proposes a framework based on feature ranking and feature learning techniques for the prediction of breast cancer. The main components of the proposed framework are ranking the input variables using the Pearson correlation method and learning a feature representation of the dataset with a stacked sparse autoencoder. Experimental results show that the proposed framework achieved an accuracy of 98.42%.
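The ranking component described above can be illustrated with a short sketch: each feature is scored by the absolute value of its Pearson correlation with the target class, and features are sorted by that score. The function name and the synthetic data below are illustrative, not taken from the paper.

```python
import numpy as np

def pearson_rank(X, y):
    """Rank features by |Pearson correlation| with the target.

    Returns the feature indices in descending order of relevance,
    along with the raw correlation coefficients.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    yc = y - y.mean()                            # center the target
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    )
    order = np.argsort(-np.abs(r))               # strongest correlation first
    return order, r

# Synthetic binary-classification data: feature 2 tracks the class label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
X = rng.normal(size=(200, 5))
X[:, 2] += 3.0 * y
order, r = pearson_rank(X, y)
```

With this construction, `order[0]` is feature 2, since it is the only feature correlated with the label; in the paper's pipeline the top-ranked features of the breast cancer dataset would be kept for the feature-learning stage.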

Keywords: Disease Prediction, Breast Cancer, Feature Ranking, Pearson Correlation, Stacked Sparse Autoencoder.
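The feature-learning component can likewise be sketched in NumPy. This is a single sparse-autoencoder layer trained by gradient descent on reconstruction error plus a KL-divergence sparsity penalty; the paper stacks several such layers, and all hyperparameters here (hidden size, sparsity target rho, penalty weight beta, learning rate) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_autoencoder(X, hidden=4, rho=0.05, beta=3.0,
                             lr=0.1, epochs=500, seed=0):
    """Train one sparse-autoencoder layer.

    Loss = (1/2n) * sum((X_hat - X)^2)  +  beta * sum_j KL(rho || rho_hat_j),
    where rho_hat_j is the mean activation of hidden unit j over the batch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)                 # encoder activations
        Xh = sigmoid(H @ W2 + b2)                # reconstruction
        rho_hat = H.mean(axis=0)
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
        losses.append(0.5 / n * np.sum((Xh - X) ** 2) + beta * kl)
        # Backpropagation through decoder, then encoder.
        dZ2 = (Xh - X) * Xh * (1 - Xh) / n
        dH = dZ2 @ W2.T + beta * (-(rho / rho_hat)
                                  + (1 - rho) / (1 - rho_hat)) / n
        dZ1 = dH * H * (1 - H)
        W2 -= lr * (H.T @ dZ2); b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1); b1 -= lr * dZ1.sum(axis=0)
    return W1, b1, losses

# Toy data in [0, 1]; the encoder output would feed the next stacked layer
# or the final classifier in the paper's framework.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(100, 6))
W1, b1, losses = train_sparse_autoencoder(X)
features = sigmoid(X @ W1 + b1)
```

In a stacked configuration, `features` would be used as the input to train the next sparse layer in the same greedy layer-wise fashion before the whole encoder feeds a classifier.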
