Extracting Large Data using Big Data Mining

Ms. Neha A. Kandalkar, Prof. Avinash Wadhe


Innovations in technology and the greater affordability of digital devices have ushered in today’s Age of Big Data, marked by rapid growth in the quantity and diversity of high-frequency digital data. These data hold the potential to allow decision makers to track development progress, improve social protection, and understand where existing policies and programmes require adjustment. Turning Big Data (for example call logs, mobile-banking transactions, online user-generated content such as blog posts and Tweets, online searches, and satellite images) into actionable information requires computational techniques that unveil patterns within and between these extremely large socioeconomic datasets. Data-driven decision-making is now broadly recognized, and there is growing enthusiasm for the notion of “Big Data.” However, a wide gap currently separates the potential of Big Data from its realization. Heterogeneity, scale, timeliness, complexity, and privacy problems impede progress at every phase of the pipeline that creates value from data. The problems start right away during data acquisition, when decisions must be made, currently in an ad hoc manner, about what data to keep, what to discard, and how to store what is kept reliably with the right metadata. Much of today’s data is not natively in a structured format: tweets and blogs are weakly structured pieces of text, while images and video are structured for storage and display but not for semantic content and search. Transforming such content into a structured format for later analysis is a major challenge. A major investment in Big Data, properly directed, can not only result in major scientific advances but also lay the foundation for the next generation of advances in science, medicine, and business.


Big data, Autonomous sources, Data mining, Data acquisition, Aggregation.