Speech Recognition: A Review of Literature

International Journal of Engineering Trends and Technology (IJETT)
© 2016 by IJETT Journal
Volume-37 Number-6
Year of Publication : 2016
Authors : Kirandeep Singh
DOI :  10.14445/22315381/IJETT-V37P254


Kirandeep Singh, "Speech Recognition: A Review of Literature", International Journal of Engineering Trends and Technology (IJETT), V37(6), 302-310, July 2016. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Speech recognition is the process of identifying what a person speaks into a microphone or similar hardware and rendering its meaning in a required form such as text, an image, or an event. This thesis describes the implementation of a speaker-independent isolated Punjabi and English digit recognition system. The system is developed using two different techniques: a pattern-based technique, Dynamic Time Warping (DTW), and a statistical technique, the Hidden Markov Model (HMM). Mel Frequency Cepstral Coefficients (MFCCs) are used for feature extraction. The developed system recognizes both Punjabi and English digits.
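As a rough illustration of the pattern-based approach named above, the following minimal sketch matches an utterance against stored digit templates using Dynamic Time Warping. It is not the system described in this work: the function names `dtw_distance` and `recognize`, the Euclidean frame distance, and the one-dimensional toy "feature vectors" (standing in for MFCC frames) are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j frames of b
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between two frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against the same frame of b
                cost[i][j - 1],      # frame of b repeated against the same frame of a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))


# Toy data (hypothetical): each frame is a one-element vector instead of an MFCC vector.
templates = {
    "one": [[0.0], [1.0], [2.0], [1.0]],
    "two": [[2.0], [2.0], [0.0], [0.0]],
}
# A time-stretched version of the "one" template.
utterance = [[0.0], [0.0], [1.0], [2.0], [2.0], [1.0]]
best_label = recognize(utterance, templates)
```

Because DTW allows frames to be repeated during alignment, the stretched utterance still aligns with the "one" template despite the difference in length. A statistical recognizer such as an HMM would instead score the utterance against a trained model per digit.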



Speech Recognition, Acoustic Vector, Mel-Frequency Cepstrum Coefficients, Hidden Markov Model, Fast Fourier Transform.