Parts of Speech Taggers for Dravidian Languages

International Journal of Engineering Trends and Technology (IJETT)
  
© 2015 by IJETT Journal
Volume-21 Number-7
Year of Publication : 2015
Authors : Anjali M K, Babu Anto P
DOI :  10.14445/22315381/IJETT-V21P263

Citation 

Anjali M K, Babu Anto P, "Parts of Speech Taggers for Dravidian Languages", International Journal of Engineering Trends and Technology (IJETT), V21(7), 342-347, March 2015. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract

Parts-of-speech (POS) tagging is the process of assigning a part-of-speech label to each word in a text. It is an important pre-processing task for language processing activities. This paper presents a detailed study of the taggers available for the morphologically rich Dravidian languages Malayalam, Kannada, Tamil and Telugu. It also briefly describes the various approaches used for POS tagging.
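
To make the task concrete, the following is a minimal sketch of a most-frequent-tag baseline, one simple instance of the stochastic family of approaches. It is not any of the taggers surveyed in the paper. The toy English corpus, the tag names, and the function names `train_unigram_tagger` and `tag` are all assumptions introduced purely for illustration.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Learn, for every word, the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, pos in sentence:
            counts[word.lower()][pos] += 1
    # Keep only the single most frequent tag per word.
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(words, model, default_tag="NOUN"):
    """Assign each word its learned tag; unseen words get default_tag."""
    return [(w, model.get(w.lower(), default_tag)) for w in words]


# Invented toy corpus (hypothetical data, for illustration only).
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

model = train_unigram_tagger(TOY_CORPUS)
print(tag(["The", "cat", "runs", "fast"], model))
```

A baseline of this kind ignores context and word-internal structure, so it is likely to be weak on morphologically rich languages, where a single root can surface in many inflected forms that never appear in the training data.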

Keywords
POS Tagger; Dravidian Language; Rule based; Stochastic; Hybrid; Malayalam; Telugu; Kannada; Tamil.