Kernel Perceptron Feature Selection Based on Sparse Bayesian Probabilistic Relevance Vector Machine Classification for Disease Diagnosis with Healthcare Data

International Journal of Engineering Trends and Technology (IJETT)
© 2020 by IJETT Journal
Volume-68 Issue-3
Year of Publication : 2020
Authors : Mr.G.Arun, Dr.C N Marimuthu
DOI :  10.14445/22315381/IJETT-V68I3P210S


MLA Style: Mr. G. Arun, Dr. C N Marimuthu. "Kernel Perceptron Feature Selection Based on Sparse Bayesian Probabilistic Relevance Vector Machine Classification for Disease Diagnosis with Healthcare Data." International Journal of Engineering Trends and Technology 68.3 (2020): 50-63.

APA Style: Mr. G. Arun, Dr. C N Marimuthu (2020). Kernel Perceptron Feature Selection Based on Sparse Bayesian Probabilistic Relevance Vector Machine Classification for Disease Diagnosis with Healthcare Data. International Journal of Engineering Trends and Technology, 68(3), 50-63.

Diagnosing disease from big healthcare data is an important problem, because it allows the presence of a disease to be detected at an early stage. Conventional classification techniques designed for disease prediction do not achieve a high diagnosis rate, and the feature selection accuracy of existing algorithms is also low. To address these limitations, a Kernel Perceptron Feature Selection based Sparse Bayesian Probabilistic Relevance Vector Machine (KPFS-SBPRVM) technique is proposed, which aims to diagnose disease with higher accuracy and in less time. The KPFS-SBPRVM technique comprises two steps, feature selection and classification, for detecting the presence of disease in big healthcare data. First, Kernel Perceptron Feature Selection (KPFS), a variant of the perceptron learning algorithm that uses a kernel function, extracts the significant medical features from the input big healthcare dataset. Next, using the relevant features, the Sparse Bayesian Probabilistic Relevance Vector Machine Classification (SBPRVMC) step classifies the healthcare data as normal or abnormal. SBPRVMC is a machine learning technique that uses Bayesian inference for probabilistic classification; within the KPFS-SBPRVM technique it constructs a hyperplane that separates the normal data from the abnormal data. In this way, the disease is diagnosed at an early stage with higher accuracy and minimal time consumption. The KPFS-SBPRVM technique is evaluated experimentally in terms of feature selection rate, disease diagnosis rate, disease diagnosis time, and false-positive rate with respect to the number of patient medical records.
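As a rough illustration of the perceptron-with-kernel idea mentioned in the abstract, the following is a minimal sketch of a generic kernel perceptron classifier in Python with NumPy. It is not the authors' KPFS procedure: the abstract does not state which kernel is used or how feature relevance is derived from the learned model, so the Gaussian (RBF) kernel, the `gamma` parameter, the function names, and the toy data are all assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between the rows of a and the rows of b.

    The choice of an RBF kernel is an assumption; the abstract does not
    name the kernel used by KPFS.
    """
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Guard against tiny negative values caused by floating-point error.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def train_kernel_perceptron(X, y, gamma=1.0, epochs=20):
    """Train a kernel perceptron; labels y must be in {-1, +1}.

    Returns alpha, where alpha[i] counts how often sample i was
    misclassified during training.
    """
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        mistakes = 0
        for i in range(len(X)):
            # Decision value: kernel-weighted vote of past mistakes.
            score = np.sum(alpha * y * K[:, i])
            if y[i] * score <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:
            break  # every training sample is classified correctly
    return alpha


def predict(X_train, y_train, alpha, X_new, gamma=1.0):
    """Predict labels in {-1, +1} for the rows of X_new."""
    K = rbf_kernel(X_train, X_new, gamma)
    scores = (alpha * y_train) @ K
    return np.where(scores > 0, 1, -1)


# Hypothetical toy data (XOR pattern), which no linear perceptron can
# separate but a kernel perceptron can.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

alpha = train_kernel_perceptron(X, y, gamma=2.0)
pred = predict(X, y, alpha, X, gamma=2.0)
print("alpha:", alpha)
print("predictions:", pred)
```

The kernel lets the perceptron draw a non-linear decision boundary while keeping the simple mistake-driven update. How such a model is turned into a ranking or selection of medical features is specific to the paper and is not reproduced here.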



Keywords : Big healthcare data, Disease diagnosis, Hyper-parameter Vector, Kernel Perceptron Feature Selection, Patient Medical Data, Similarity, Sparse Bayesian Probabilistic Relevance Vector Machine Classification