Pattern Discovery For Text Mining Using Pattern Taxonomy

International Journal of Engineering Trends and Technology (IJETT)
  
© 2013 by IJETT Journal
Volume-4 Issue-10
Year of Publication : 2013
Authors: Miss Dipti S. Charjan, Prof. Mukesh A. Pund

Citation 

Miss Dipti S. Charjan, Prof. Mukesh A. Pund. "Pattern Discovery For Text Mining Using Pattern Taxonomy". International Journal of Engineering Trends and Technology (IJETT). V4(10):4550-4555 Oct 2013. ISSN:2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract

In this paper, we focus on developing an efficient mining algorithm that discovers useful and interesting patterns in large data collections. In the field of text mining, pattern mining techniques can be used to find various text patterns, such as frequent itemsets, closed frequent itemsets, and co-occurring terms. This paper presents an innovative and effective pattern discovery technique that includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. The proposed system takes a sufficient number of .txt files as input, applies various algorithms to them, and generates the expected results. Text mining refers generally to the process of extracting interesting and non-trivial information and knowledge from unstructured text. An important difference from search is that search requires users to know what they are looking for, while text mining attempts to discover information in patterns that are not known beforehand.
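
To make the pattern types named in the abstract concrete, the following is a minimal Python sketch of frequent and closed frequent term sets. It is not the algorithm proposed in the paper, which the abstract does not describe. It assumes each paragraph is treated as one transaction of terms and uses brute-force enumeration, which is practical only for tiny inputs. The function names `frequent_termsets` and `closed_termsets` and the sample data are hypothetical.

```python
from itertools import combinations


def frequent_termsets(paragraphs, min_support):
    """Return {termset: support} for every term set that occurs in at
    least min_support paragraphs (brute force, illustrative only)."""
    counts = {}
    for terms in paragraphs:
        unique = sorted(set(terms))
        # Enumerate every non-empty subset of the paragraph's terms.
        for size in range(1, len(unique) + 1):
            for combo in combinations(unique, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {ts: n for ts, n in counts.items() if n >= min_support}


def closed_termsets(frequent):
    """Keep only term sets that have no proper superset with the same
    support. Any such superset would itself be frequent, so checking
    against the frequent sets is enough."""
    return {
        ts: n
        for ts, n in frequent.items()
        if not any(ts < other and n == m for other, m in frequent.items())
    }


# Hypothetical sample: four paragraphs, already tokenised into terms.
paragraphs = [
    ["text", "mining", "pattern"],
    ["text", "mining"],
    ["pattern", "taxonomy"],
    ["text", "mining", "pattern", "taxonomy"],
]

frequent = frequent_termsets(paragraphs, min_support=2)
closed = closed_termsets(frequent)

for termset, support in sorted(closed.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(termset), support)
```

With a minimum support of 2, this sample yields nine frequent term sets, of which four are closed. For instance, {text} is dropped from the closed set because {text, mining} occurs in exactly the same paragraphs, so the shorter pattern carries no extra information.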

Keywords
Text mining, text classification, pattern mining, pattern evolving, information filtering