Machine transliteration: A Review of Literature

International Journal of Engineering Trends and Technology (IJETT)
© 2016 by IJETT Journal
Volume-37 Number-6
Year of Publication : 2016
Authors : Kanwaljit Kaur
DOI :  10.14445/22315381/IJETT-V37P257


Kanwaljit Kaur, "Machine transliteration: A Review of Literature", International Journal of Engineering Trends and Technology (IJETT), V37(6), 327-336, July 2016. ISSN: 2231-5381. Published by Seventh Sense Research Group.

Machine transliteration is an emerging research area concerned with converting words from one language to another without losing their phonological characteristics. Transliteration is a supporting tool for machine translation and cross-language information retrieval. It is mainly used for handling named entities and out-of-vocabulary words in a machine translation system, since it preserves the phonetic structure of the words. This paper discusses the various challenges, approaches and existing systems in transliteration. The major challenges in developing a transliteration system include missing sounds, zero or multiple character mappings, and differences between scripts. Approaches to transliteration can be phoneme based, grapheme based, or a combination of both. Some of the research that has taken place in the field of transliteration is listed in this paper, although the list may not be exhaustive.
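As a minimal sketch of the grapheme based approach named in the abstract, the Python example below maps source character sequences directly to target characters using greedy longest match. The mapping table (romanized Russian to Cyrillic), the function name transliterate and the sample words are assumptions made purely for illustration; they are not taken from the paper, and the systems it surveys rely on much richer rules or on statistical models.

```python
# Toy grapheme-to-grapheme table (hypothetical, for illustration only).
# Keys are source graphemes; several keys span more than one character.
TABLE = {
    # many source characters -> one target character
    "shch": "щ", "sh": "ш", "ch": "ч", "zh": "ж", "kh": "х",
    # one source character -> several target characters
    "x": "кс",
    # sound missing in the target script, approximated by a near neighbour
    "w": "в",
    # plain one-to-one mappings
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
    "f": "ф",
}


def transliterate(word, table):
    """Greedy longest-match transliteration of `word` using `table`.

    At each position the longest key that matches is consumed, so "shch"
    is preferred over "sh" followed by "ch". Characters with no entry in
    the table are copied through unchanged.
    """
    text = word.lower()
    max_len = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No mapping found: keep the source character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


for name in ["Pushkin", "Chekhov", "shchuka", "Max"]:
    print(name, "->", transliterate(name, TABLE))
```

Trying the longest key first is what lets a single table express many-to-one mappings; a phoneme based system would instead convert the source word to a phonetic representation before choosing target characters.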



Keywords : Transliteration, Machine Translation, Cross Language Information Retrieval, Named Entities.