Extracting Large Data using Big Data Mining

International Journal of Engineering Trends and Technology (IJETT)
© 2014 by IJETT Journal
Volume-9 Number-11                          
Year of Publication : 2014
Authors : Ms. Neha A. Kandalkar, Prof. Avinash Wadhe


Ms. Neha A. Kandalkar, Prof. Avinash Wadhe. "Extracting Large Data using Big Data Mining", International Journal of Engineering Trends and Technology (IJETT), V9(11), 576-582, March 2014. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.


Innovations in technology and the greater affordability of digital devices have ushered in today’s Age of Big Data, marked by growth in the quantity and diversity of high-frequency digital data. These data hold the potential to allow decision makers to track development progress, improve social protection, and understand where existing policies and programmes require adjustment. For example, turning Big Data (call logs, mobile-banking transactions, online user-generated content such as blog posts and Tweets, online searches, satellite images, etc.) into actionable information requires using computational techniques to unveil patterns within and between these extremely large socioeconomic datasets. Data-driven decision-making is now broadly recognized, and there is growing enthusiasm for the notion of “Big Data.” But there is currently a wide gap between its potential and its realization. Heterogeneity, scale, timeliness, complexity, and privacy problems with Big Data impede progress at all phases of the pipeline that can create value from data. The problems start right away during data acquisition, when the data requires us to make decisions, currently in an ad hoc manner, about what data to keep and what to discard, and how to store what we keep reliably with the right metadata. Much data today is not natively in a structured format: tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display, but not for semantic content and search. Transforming such content into a structured format for later analysis is a major challenge. A major investment in Big Data, properly directed, can not only result in major scientific advances, but also lay the foundation for the next generation of advances in science, medicine, and business.



Big data, Autonomous sources, Data mining, Data acquisition, Aggregation.