A Novel Technique for Parallelization of Genetic Algorithm using Hadoop

International Journal of Engineering Trends and Technology (IJETT)

© 2013 by IJETT Journal
Volume-4 Issue-8
Year of Publication : 2013
Authors : Ms. Kanchan Sharadchandra Rahate, Prof. L.M.R.J. Lobo

MLA 

Ms. Kanchan Sharadchandra Rahate, Prof. L.M.R.J. Lobo. "A Novel Technique for Parallelization of Genetic Algorithm using Hadoop". International Journal of Engineering Trends and Technology (IJETT). V4(8):3328-3331 Jul 2013. ISSN:2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract

Document categorization is used in education, government, art, industry, and other sectors. The need to find a document quickly at a later time motivates classifying documents into categories. Manual document classification takes a lot of effort and is time-consuming. The idea implemented in this paper speeds up processing and reduces manual intervention by automating this categorization, which gives it an edge over existing classification systems. The implementation parallelizes a Genetic Algorithm (GA) to improve processing speed. Hadoop MapReduce and HDFS (Hadoop Distributed File System) are used to store big data and to speed up the calculations involved in the genetic algorithm. The work is motivated by the fact that MapReduce fares well in terms of scalability, fault tolerance, and ease of use. In addition, Hadoop is open source and written in Java.
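The general idea of splitting a genetic algorithm into map and reduce steps can be illustrated with a minimal in-process sketch. It does not use the Hadoop API and is not the authors' implementation: the OneMax fitness function, the constants, the elitism step, and all function names are assumptions chosen only for illustration. The map step scores each individual independently (the part that parallelizes naturally), a shuffle groups the scored individuals by partition key, and the reduce step applies selection, crossover, and mutation within each partition.

```python
import random

GENOME_LEN = 20    # assumed genome length (bits)
POP_SIZE = 40      # assumed population size
NUM_REDUCERS = 4   # assumed number of reduce partitions


def fitness(individual):
    """Toy OneMax fitness (an assumption): the number of 1-bits."""
    return sum(individual)


def mapper(index, individual):
    """Map step: score one individual and emit (partition key, (fitness, individual))."""
    return index % NUM_REDUCERS, (fitness(individual), individual)


def shuffle(pairs):
    """Group mapper output by key, as a MapReduce framework does between phases."""
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return groups


def reducer(scored, rng):
    """Reduce step: elitism, tournament selection, one-point crossover, bit-flip mutation."""
    def pick():
        a, b = rng.sample(scored, 2)
        return max(a, b, key=lambda s: s[0])[1]

    # Keep the best individual of the partition unchanged (elitism).
    children = [max(scored, key=lambda s: s[0])[1]]
    while len(children) < len(scored):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, GENOME_LEN)
        child = p1[:cut] + p2[cut:]
        child = [bit ^ 1 if rng.random() < 0.01 else bit for bit in child]
        children.append(child)
    return children


def run(generations=30, seed=0):
    """Run the GA and return the best fitness seen at each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(POP_SIZE)]
    history = []
    for _ in range(generations):
        pairs = [mapper(i, ind) for i, ind in enumerate(population)]
        history.append(max(value[0] for _key, value in pairs))
        groups = shuffle(pairs)
        population = [child
                      for key in sorted(groups)
                      for child in reducer(groups[key], rng)]
        rng.shuffle(population)  # mix individuals across partitions
    history.append(max(fitness(ind) for ind in population))
    return history


if __name__ == "__main__":
    print(run()[-1])
```

In this sketch the list comprehension over `mapper` runs sequentially; on a cluster each call would be an independent map task, which is where a speed-up of the kind the abstract describes would be expected to come from.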

Keywords
Hadoop, HDFS, MapReduce, PGA, OlexGA