Data Mining Based Imputation Techniques to Handle Missing Values in Gene Expressed Dataset

Microarray analysis produces large datasets of expression levels, with genes as rows and experimental conditions as columns. Because of experimental errors, these datasets frequently contain missing values. Missing values significantly reduce the efficiency and accuracy of downstream analysis and can affect the outcome of visualization studies of gene expression. Predicting the missing entries is therefore an important step in examining the underlying structure of the data, and missing data imputation has attracted considerable attention from researchers. This paper summarizes most of the techniques proposed for the imputation of missing data. It discusses in detail the advantages and disadvantages of global, local, hybrid, and knowledge-assisted approaches. It also describes the three missingness mechanisms used to characterize the type of missing data: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In particular, this article compares the methods and offers a clearer understanding of these techniques.

Correlation Structure, Gene Expression Data, Imputation, Missing Value.

