Predicting Chronic Diseases with Health IT: a Survey on Popular Techniques

International Journal of Engineering Trends and Technology (IJETT)
© 2019 by IJETT Journal
Volume-67 Issue-6
Year of Publication : 2019
Authors : Tongbin Zhang, Li Cai, Chuandi Pan
DOI :  10.14445/22315381/IJETT-V67I6P204


MLA Style: Tongbin Zhang, Li Cai, Chuandi Pan. "Predicting Chronic Diseases with Health IT: a Survey on Popular Techniques." International Journal of Engineering Trends and Technology 67.6 (2019): 18-24.

APA Style: Tongbin Zhang, Li Cai, Chuandi Pan (2019). Predicting Chronic Diseases with Health IT: a Survey on Popular Techniques. International Journal of Engineering Trends and Technology, 67(6), 18-24.

In this paper, we study and compare popular techniques for predicting chronic diseases with the support of health information technology. Interestingly, we show that no single technique guarantees good predictive outcomes for all diseases. In many cases, well-known state-of-the-art techniques, such as the support vector machine, were significantly outperformed by simpler classical techniques. We also show that feature selection can improve predictive performance; however, choosing the right predictive technique remains the crucial factor. Therefore, in health information technology industrial practice, predictive healthcare systems should move from relying on a single technique to integrating multiple techniques on a case-by-case basis.
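
The case-by-case recommendation can be illustrated with a minimal sketch: score every candidate technique by cross-validation on each disease dataset and keep the best one per dataset. Everything below is an assumption made for illustration and is not taken from the paper. The two toy techniques (a majority-class baseline and a nearest-centroid classifier) stand in for the techniques the survey compares, the datasets `X_a`/`y_a` and `X_b`/`y_b` are made-up placeholders rather than clinical data, and the names `cv_accuracy` and `select_technique` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Row = Sequence[float]
Predictor = Callable[[Row], int]
Trainer = Callable[[List[Row], List[int]], Predictor]


def train_majority(X: List[Row], y: List[int]) -> Predictor:
    """Baseline: always predict the most frequent training label."""
    majority = max(set(y), key=y.count)
    return lambda row: majority


def train_nearest_centroid(X: List[Row], y: List[int]) -> Predictor:
    """Predict the label whose mean feature vector is closest to the row."""
    centroids: Dict[int, List[float]] = {}
    for label in set(y):
        members = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def predict(row: Row) -> int:
        def sq_dist(label: int) -> float:
            return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
        return min(centroids, key=sq_dist)

    return predict


def cv_accuracy(trainer: Trainer, X: List[Row], y: List[int], k: int = 3) -> float:
    """Mean accuracy over k folds; fold i holds out rows i, i+k, i+2k, ..."""
    fold_scores = []
    for i in range(k):
        test_idx = set(range(i, len(X), k))
        X_train = [row for j, row in enumerate(X) if j not in test_idx]
        y_train = [lab for j, lab in enumerate(y) if j not in test_idx]
        model = trainer(X_train, y_train)
        hits = sum(model(X[j]) == y[j] for j in test_idx)
        fold_scores.append(hits / len(test_idx))
    return sum(fold_scores) / k


def select_technique(techniques: Dict[str, Trainer],
                     X: List[Row], y: List[int], k: int = 3) -> str:
    """Return the name of the technique with the best cross-validated accuracy."""
    scores = {name: cv_accuracy(t, X, y, k) for name, t in techniques.items()}
    return max(scores, key=scores.get)


TECHNIQUES: Dict[str, Trainer] = {
    "majority": train_majority,
    "nearest_centroid": train_nearest_centroid,
}

# Made-up placeholder datasets (NOT clinical data), one per hypothetical disease.
# Dataset A: the single feature separates the two classes cleanly.
X_a = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y_a = [0, 0, 0, 1, 1, 1]
# Dataset B: the feature carries no signal and one class dominates.
X_b = [[0.0], [1.0], [0.0], [1.0], [0.0], [1.0]]
y_b = [0, 0, 0, 0, 1, 0]

for name, (X, y) in {"disease_a": (X_a, y_a), "disease_b": (X_b, y_b)}.items():
    print(name, "->", select_technique(TECHNIQUES, X, y))
```

On these toy inputs the selection differs between the two datasets, which is the point of the sketch: the winner is decided per dataset rather than fixed in advance. A real system would substitute actual classifiers and a proper evaluation protocol for the stand-ins used here.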


Keywords : Chronic disease prediction, Health IT, Random Forest