Context Dependent Tri-Phone Automatic Speech Recognition using Novel Spectrum Analysis Approach

International Journal of Engineering Trends and Technology (IJETT)
  
© 2015 by IJETT Journal
Volume-30 Number-5
Year of Publication : 2015
Authors : Amr M. Gody, Tamer M. Barakat, Sayed A. Zaky
DOI :  10.14445/22315381/IJETT-V30P241

Citation 

Amr M. Gody, Tamer M. Barakat, Sayed A. Zaky, "Context Dependent Tri-Phone Automatic Speech Recognition using Novel Spectrum Analysis Approach", International Journal of Engineering Trends and Technology (IJETT), V30(5), 217-222, December 2015. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract
This research implements speech recognition for English tri-phone units. A recently developed feature set, Best Tree Encoding (BTE) [1], is used to define the recognition parameters, and a hidden Markov model (HMM) serves as the recognition engine. To verify the proposed model, all results are compared against Mel Frequency Cepstral Coefficients (MFCC), the most popular features used in similar ASR approaches. The HMMs are designed and implemented with HTK, the most popular HMM toolkit, and all experiments use TIMIT, the most popular corpus database. The proposed model achieves a tri-phone recognition success rate of 94% relative to the success rate of MFCC on the same samples. The D and A (delta and acceleration) coefficients may be ignored in both MFCC and BTE.
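The 94% figure is a relative measure, so a small worked example may help. The sketch below shows one reading of it: the BTE success rate divided by the MFCC success rate on the same test samples. The function name and the counts are hypothetical illustrations and are not taken from the paper.

```python
def relative_success_rate(candidate_correct: int, baseline_correct: int) -> float:
    """Return the candidate's success rate as a fraction of the baseline's.

    Both counts are assumed to come from the same test samples, so the
    sample total cancels and only the counts of correct units matter.
    """
    if baseline_correct <= 0:
        raise ValueError("baseline must recognize at least one unit")
    return candidate_correct / baseline_correct


# Hypothetical counts for illustration only, NOT results from the paper.
bte_correct = 470   # tri-phones recognized correctly with BTE features
mfcc_correct = 500  # tri-phones recognized correctly with MFCC features

ratio = relative_success_rate(bte_correct, mfcc_correct)
print(f"BTE success rate relative to MFCC: {ratio:.0%}")
```

Under this reading, a relative rate below 100% means BTE recognizes fewer tri-phones than MFCC on the same data, so the figure should not be read as an absolute accuracy.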

References

[1] Amr M. Gody, "Wavelet Packets Best Tree 4 Points Encoded (BTE) Features", The Eighth Conference on Language Engineering, Ain-Shams University, Cairo, Egypt, pp. 189-198, December 2008.
[2] Barnard, E, Gouws, E, Wolvaardt, K and Kleynhans, N. "Appropriate baseline values for HMM-based speech recognition". 15th Annual Symposium of the Pattern Recognition Association of South Africa, Grabouw, South Africa, November 2004.
[3] Amr M. Gody, Rania Ahmed AbulSeoud, Mohamed Hassan, "Automatic Speech Annotation Using HMM based on Best Tree Encoding (BTE) Feature", The Eleventh Conference on Language Engineering, Ain-Shams University, Cairo, Egypt, pp. 153-159, December 2011.
[4] Amr M. Gody, Rania Ahmed AbulSeoud, Maha M. Adham, Eslam E. Elmaghraby, "Automatic Speech Using Wavelet Packets Increased Resolution Best Tree Encoding", The Twelfth Conference on Language Engineering, Ain-Shams University, Cairo, Egypt, pp. 126-134, December 2012.
[5] Amr M. Gody, Rania Ahmed AbulSeoud, Eslam E. Elmaghraby, "Automatic Speech Recognition Of Arabic Phones Using Optimal-Depth-Split-Energy Best Tree Encoding", The Twelfth Conference on Language Engineering, Ain-Shams University, Cairo, Egypt, pp. 144-156, December 2012.
[6] Amr M. Gody, Rania Ahmed AbulSeoud, Mai Ezz El-Din, "Using Mel-Mapped Best Tree Encoding for Baseline-Context-Independent-Mono-Phone Automatic Speech Recognition", Ain-Shams journal, 2015.
[7] Maha M. Adham, "Phone Level Speech Segmentation Using Wavelet Packets", Fayoum University, 2013.
[8] Eslam E. Elmaghraby, "Enhancement Speed Of Large Vocabulary Speech Recognition System", Fayoum University, 2013.
[9] HTK Book documentation, http://htk.eng.cam.ac.uk/docs/docs.shtml.
[10] Amr M. Gody, Rania Ahmed Abul Seoud, Marian M. Ibraheem, "Hybrid Model design for Baseline-Context-Independent-Mono-Phone Automatic Speech Recognition", International Journal of Engineering Trends and Technology (IJETT), Volume 27, ISSN: 2231-5381, September 2015.

Keywords
Automatic Speech Recognition, English Phone Recognition, Wavelet Packets, Mel Scale, MFCC, HTK, Best Tree Encoding.