An Empirical Data Cleaning Technique for CFDs
Authors: Satyanarayana Mummana, Ravi kiran Rompella
Satyanarayana Mummana, Ravi kiran Rompella. "An Empirical Data Cleaning Technique for CFDs". International Journal of Engineering Trends and Technology (IJETT). V4(9):3730-3735 Sep 2013.
Data cleaning is a basic preprocessing step applied before data is passed to a data mining approach, and it is also an interesting research area within data mining. Data cleaning is the process of finding and deleting noisy data/records from the database. The simplest technique used for data cleaning is based on Functional Dependencies (FDs). Because an FD applies to the entire instance of a table, we introduce a technique based on Conditional Functional Dependencies (CFDs). CFDs are like if-then rules: the dependence between the columns of a table is represented as conditions on column values. For example, consider an employee table that maintains the employee name, id, city, pincode, etc. In this table, employees who belong to the same city may all have the same pincode, so we can generate the FD city -> pincode. A CFD attaches a specific condition to the FD, e.g. city=vizag -> pincode=531005. The main aim of our project is to find the rows in a table that violate the created CFDs. These violating rows are then deleted to correct the data.
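The rule city=vizag -> pincode=531005 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sample rows, the representation of a CFD as a (condition, consequence) pair of dictionaries, and the function names `violates` and `split_by_cfds` are all assumptions made for illustration.

```python
# Hypothetical employee rows; names and values are illustrative only.
employees = [
    {"id": 1, "name": "Asha", "city": "vizag", "pincode": "531005"},
    {"id": 2, "name": "Ravi", "city": "vizag", "pincode": "500001"},
    {"id": 3, "name": "Mina", "city": "hyderabad", "pincode": "500001"},
]

# Assumed representation: each CFD is a (condition, consequence) pair.
# This one encodes city=vizag -> pincode=531005.
cfds = [({"city": "vizag"}, {"pincode": "531005"})]


def violates(row, condition, consequence):
    """Return True if the row matches the condition but not the consequence."""
    matches = all(row.get(col) == val for col, val in condition.items())
    if not matches:
        return False  # the rule does not apply to this row
    return any(row.get(col) != val for col, val in consequence.items())


def split_by_cfds(rows, cfds):
    """Separate rows into those satisfying all CFDs and those violating any."""
    clean, violating = [], []
    for row in rows:
        if any(violates(row, cond, cons) for cond, cons in cfds):
            violating.append(row)
        else:
            clean.append(row)
    return clean, violating


clean_rows, bad_rows = split_by_cfds(employees, cfds)
print([r["id"] for r in bad_rows])    # [2]
print([r["id"] for r in clean_rows])  # [1, 3]
```

Row 3 is kept even though its pincode differs from 531005, because the condition city=vizag does not hold for it. This is the sense in which a CFD restricts the dependency to a subset of the table rather than the entire instance.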
