Towards Transparent Diabetes Prediction: Unveiling the Factors with Explainable AI
© 2024 by IJETT Journal
Volume-72 Issue-5
Year of Publication : 2024
Author : Tran Quang Vinh, Haewon Byeon
DOI : 10.14445/22315381/IJETT-V72I5P103
How to Cite?
Tran Quang Vinh, Haewon Byeon, "Towards Transparent Diabetes Prediction: Unveiling the Factors with Explainable AI," International Journal of Engineering Trends and Technology, vol. 72, no. 5, pp. 26-35, 2024. Crossref, https://doi.org/10.14445/22315381/IJETT-V72I5P103
Abstract
Automatic diabetes prediction using machine learning and Explainable AI (XAI) has emerged as a promising approach for early detection and improved patient outcomes. This study investigates the current landscape of XAI research in diabetes diagnosis. The paper examines the transition from basic machine learning algorithms to complex deep learning models, emphasizing the importance of data quality and data preprocessing for accurate and interpretable results, particularly when dealing with tabular data from medical records. The integration of XAI techniques allows us to understand how these models arrive at their predictions, fostering trust and transparency. Despite these advancements, limitations remain. The generalizability of findings based on limited datasets needs further exploration through studies using more diverse data sources and real-world clinical settings. Additionally, the potential of XAI in diabetes management can be further enhanced by integrating these models with mobile applications and Internet of Things (IoT) sensor technology, paving the way for personalized and continuous monitoring. In conclusion, XAI research in diabetes prediction holds immense potential for improving healthcare delivery. By addressing current limitations and exploring new avenues of research, XAI can empower healthcare professionals and patients in the fight against diabetes.
Keywords
Diabetes, Explainable Artificial Intelligence, LIME, SHAP, Deep Learning.
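To make the idea of attributing a prediction to individual factors more concrete, the following is a minimal sketch of exact Shapley values, the quantity that SHAP approximates, computed for a toy risk score. Everything here is an assumption made for illustration: the `risk_score` function, the feature names, the weights, and the single baseline point are hypothetical and are not taken from the paper or from any dataset. SHAP implementations typically average over a background dataset rather than one baseline.

```python
from itertools import combinations
from math import factorial


def risk_score(x):
    """Hypothetical toy risk score; features and weights are illustrative only."""
    return (0.03 * x["glucose"] + 0.05 * x["bmi"] + 0.01 * x["age"]
            + 0.001 * x["glucose"] * x["bmi"])


def shapley_values(model, instance, baseline):
    """Exact Shapley values of each feature relative to a single baseline point.

    Features outside a coalition are set to their baseline value, which is an
    assumed simplification of how 'missing' features are handled.
    """
    features = list(instance)
    n = len(features)
    values = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: instance[g] if (g in subset or g == f) else baseline[g]
                          for g in features}
                without_f = {g: instance[g] if g in subset else baseline[g]
                             for g in features}
                # Marginal contribution of f when added to this coalition.
                total += weight * (model(with_f) - model(without_f))
        values[f] = total
    return values


# Hypothetical patient and reference point.
patient = {"glucose": 160.0, "bmi": 32.0, "age": 50.0}
baseline = {"glucose": 100.0, "bmi": 25.0, "age": 40.0}
phi = shapley_values(risk_score, patient, baseline)

for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is what lets each value be read as that feature's share of the prediction. Exact enumeration grows exponentially with the number of features, so it is only practical for small feature sets like this one.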