Classifying Pap Smear Images with an Advanced Composite Random Forest Model

© 2022 by IJETT Journal
Volume-70 Issue-10
Year of Publication : 2022
Authors : Sharmistha Bhattacharjee, Dipankar Ray, Diganta Saha, D. Sobya
DOI : 10.14445/22315381/IJETT-V70I10P230

How to Cite?

Sharmistha Bhattacharjee, Dipankar Ray, Diganta Saha, D. Sobya, "Classifying Pap Smear Images with an Advanced Composite Random Forest Model," International Journal of Engineering Trends and Technology, vol. 70, no. 10, pp. 307-318, 2022.

Manual screening of conventional Pap smear slides for cervical cancer diagnosis is slow and prone to human error. We propose a hybrid deep-learning model, built from k-means clustering and Random Forest models, that identifies the most prevalent characteristics of cervical tissue and classifies them into cytopathological classes. Because the texture, shape (morphometry), and color of the nucleus and cytoplasm, together or individually, play a vital role in Pap smear image classification, fifteen prominent features based on these properties are extracted to classify images from the Herlev Pap Smear dataset. The Gray Level Covariance Matrix and Gabor filter are used to extract texture-based features, whereas morphometric and color-based features are extracted using Canny edge detection and histogram analysis. A new composite random forest model is then constructed to classify the Pap smear images. The proposed hybrid approach achieves up to 99% effectiveness. The study also presents a thorough comparison, in which the proposed model performs well against Support Vector Machine (SVM) and Deep-Multilayer Perceptron methods.
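
As a minimal illustration of the texture step mentioned in the abstract, the sketch below computes a normalised gray-level co-occurrence matrix for one pixel offset and derives three classic texture descriptors from it. It assumes that the matrix named in the abstract corresponds to the commonly used gray-level co-occurrence matrix. The abstract does not say which of the fifteen features are texture descriptors, so the three shown here, the function names `cooccurrence_matrix` and `texture_features`, and the toy 4x4 patch are all illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List

Image = List[List[int]]  # gray levels already quantised to 0 .. levels-1


def cooccurrence_matrix(image: Image, levels: int,
                        dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Symmetric, normalised co-occurrence matrix for the pixel offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset does not fit inside the image")
    return [[v / total for v in row] for row in counts]


def texture_features(p: List[List[float]]) -> Dict[str, float]:
    """Contrast, angular second moment, and homogeneity of a normalised matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Hypothetical 4x4 patch with four gray levels (not taken from the Herlev data).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(cooccurrence_matrix(patch, levels=4))
print(features)
```

In a full pipeline, descriptors of this kind would be concatenated with the shape and color features into one vector per cell image before being passed to the classifier.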

Cervical Cancer, Herlev Pap Smear dataset, Gray Level Covariance Matrix, Random Forest, Deep-Multilayer Perceptron.
