A survey: classification of huge cloud Datasets with efficient Map - Reduce policy

International Journal of Engineering Trends and Technology (IJETT)
© 2014 by IJETT Journal
Volume-18 Number-2
Year of Publication : 2014
Authors : Miss Apexa B. Kamdar, Prof. Jay M. Jagani
DOI :  10.14445/22315381/IJETT-V18P218


Miss Apexa B. Kamdar, Prof. Jay M. Jagani, "A survey: classification of huge cloud Datasets with efficient Map - Reduce policy", International Journal of Engineering Trends and Technology (IJETT), V18(2), 103-107, Dec 2014. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.


Cloud computing has become a feasible mainstream solution for data processing, storage, and distribution. It offers on-demand, scalable, pay-as-you-go compute and storage capacity. To analyze the huge volumes of data held on clouds, it is important to study data mining approaches based on the cloud computing model from both theoretical and practical perspectives. Cloud databases and other cloud file systems hold large amounts of data, and mining must be applied to that data to extract knowledge. Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Naïve Bayes and the support vector machine, both classification algorithms, are used for this purpose. In this paper, both algorithms are used with the MapReduce policy to obtain high accuracy, efficiency, and performance. Hadoop is an open-source implementation of MapReduce that can achieve better performance.
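
To make the pairing of Naïve Bayes with MapReduce concrete, the following is a minimal single-process sketch in plain Python, not the implementation used in any of the surveyed work. It assumes a bag-of-words text classifier with Laplace smoothing. The function names (`map_phase`, `reduce_phase`, `train`, `classify`) and the key layout `(label, word)` are hypothetical choices made for illustration. On Hadoop, the map and reduce steps would run in parallel across nodes, whereas here they are ordinary function calls.

```python
import math
from collections import defaultdict


def map_phase(record):
    """Mapper: emit ((label, word), 1) for each word, plus a document count."""
    label, text = record
    yield (label, None), 1  # key with word=None counts documents per label
    for word in text.lower().split():
        yield (label, word), 1


def reduce_phase(pairs):
    """Reducer: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(records):
    """Run the map step over every record, then reduce the emitted pairs."""
    pairs = (pair for record in records for pair in map_phase(record))
    return reduce_phase(pairs)


def classify(counts, text):
    """Pick the label with the highest log-posterior (Laplace smoothing)."""
    labels = {label for (label, word) in counts if word is None}
    vocab = {word for (_, word) in counts if word is not None}
    n_docs = sum(counts[(label, None)] for label in labels)
    best_label, best_score = None, -math.inf
    for label in sorted(labels):
        # Total number of word occurrences observed for this label.
        n_words = sum(
            c for (l, w), c in counts.items() if l == label and w is not None
        )
        score = math.log(counts[(label, None)] / n_docs)  # log prior
        for word in text.lower().split():
            seen = counts.get((label, word), 0)
            score += math.log((seen + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Training reduces to counting, which is why the algorithm fits the MapReduce model: each mapper needs only its own records, and the reducer only adds integers. A call such as `classify(train(records), "some text")`, where `records` is a list of `(label, text)` tuples, exercises both steps.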


Cloud computing, data mining, Naïve Bayes, support vector machine, Hadoop, MapReduce.