High Speed Improved Decision Tree for Mining Streaming Data

International Journal of Engineering Trends and Technology (IJETT)
  
© 2014 by IJETT Journal
Volume-18 Number-8
Year of Publication : 2014
Authors: M. Rajyalakshmi, Dr. P. Srinivasulu

Citation 

M. Rajyalakshmi, Dr. P. Srinivasulu, "High Speed Improved Decision Tree for Mining Streaming Data", International Journal of Engineering Trends and Technology (IJETT), V18(8), 386-392, Dec 2014. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract

Decision tree construction is a well-studied problem in data mining, and mining streaming data has recently attracted much interest. For data stream classification, time is a major issue: spatial datasets are too large to be classified effectively in a reasonable period of time using existing methods. Existing work contains two theorems that present McDiarmid's bound for both information gain, used in the ID3 algorithm, and the Gini index, used in the Classification and Regression Trees (CART) algorithm. However, that work does not compress or optimize the tree size, so the tree grows as the data size increases, and CART/ID3 with McDiarmid's bound gives a high false positive error rate. In this work, we develop a new method for decision tree classification on spatial data streams using a data structure called the Peano Count Tree (P-tree). The Peano Count Tree is a spatial data organization that gives a lossless compressed representation of a spatial dataset and facilitates efficient classification and other data mining techniques. With the P-tree structure, measurements such as information gain can be calculated quickly. We compare P-tree based Naive Bayes decision tree induction classification with a classical decision tree induction method with respect to speed and accuracy. Bayesian averaging over decision trees allows the class posterior distribution to be estimated from the attributes, and it estimates the risk of making misleading decisions. The clustering problem has been addressed in several contexts in many disciplines; because of this problem, experimental data should be cleaned before data mining techniques are applied. In this paper, a new framework is proposed that integrates decision tree based attribute selection for data clustering. In the proposed system, a robust Modified Boosting algorithm is applied to decision trees for clustering the outcomes. Experimental results show better accuracy compared to existing approaches.
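
The two ideas the abstract relies on, a quadrant count tree and split measures computed from counts, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a square bit matrix whose side is a power of two, stores each node as a hypothetical `(count, children)` tuple, and stops subdividing at pure quadrants (all 0s or all 1s), which is the commonly described source of P-tree compression. The function names `build_count_tree`, `entropy`, `gini`, and `information_gain` are chosen for illustration only.

```python
import math


def build_count_tree(bits, row=0, col=0, size=None):
    """Build a quadrant count tree over a square bit matrix.

    Assumes len(bits) is a power of two. Each node is (count, children),
    where count is the number of 1 bits in the quadrant and children is
    None for a pure quadrant (all 0s or all 1s) or a single cell.
    """
    if size is None:
        size = len(bits)
    count = sum(bits[r][c]
                for r in range(row, row + size)
                for c in range(col, col + size))
    # Pure quadrants are not subdivided; this is where compression comes from.
    if count == 0 or count == size * size or size == 1:
        return (count, None)
    half = size // 2
    # Children in order: top-left, top-right, bottom-left, bottom-right.
    children = [build_count_tree(bits, row + dr, col + dc, half)
                for dr in (0, half) for dc in (0, half)]
    return (count, children)


def entropy(counts):
    """Shannon entropy (in bits) of a class-count vector."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini index of a class-count vector."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def information_gain(parent_counts, partitions):
    """Entropy of the parent minus the weighted entropy of its partitions."""
    total = sum(parent_counts)
    remainder = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent_counts) - remainder


# A 4x4 bit plane: one pure-1 quadrant, two pure-0 quadrants, one mixed.
bits = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
tree = build_count_tree(bits)

# Class counts per attribute value, as they could be read from root counts.
gain = information_gain([5, 5], [[5, 0], [0, 5]])
```

Because both split measures depend only on class counts, a structure that can return those counts without scanning the raw data is what makes their fast calculation possible.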
