Context Dependent Tri-Phone Automatic Speech Recognition using Novel Spectrum Analysis Approach
Citation
Amr M. Gody, Tamer M. Barakat, Sayed A. Zaky, "Context Dependent Tri-Phone Automatic Speech Recognition using Novel Spectrum Analysis Approach", International Journal of Engineering Trends and Technology (IJETT), V30(5), 217-222, December 2015. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.
Abstract
In this research, speech recognition is
implemented for English tri-phone unit recognition.
Newly developed features called Best Tree Encoding
(BTE) [1] are used to define the recognition parameters.
A Hidden Markov Model (HMM) is used as the recognition
engine. To verify the proposed model, all results are
compared against the most popular features used in
similar ASR approaches, the Mel Frequency Cepstral
Coefficients (MFCC). The most popular HMM toolkit,
HTK, is used for designing and implementing the HMM.
The most popular corpus, the TIMIT database, is used in
all experiments in this research. The proposed
model gives a tri-phone recognition success rate of
94% relative to the success rate of
MFCC for tri-phone recognition of the same samples.
The delta (D) and acceleration (A) coefficients may be ignored in both MFCC and BTE.
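If the 94% figure is read as the ratio of the BTE success rate to the MFCC success rate on the same samples, the arithmetic can be sketched as follows. This is a minimal illustration under that assumption: the function name `relative_success_rate` and both accuracy values are hypothetical and are not taken from the paper.

```python
def relative_success_rate(candidate_accuracy: float, baseline_accuracy: float) -> float:
    """Return the candidate accuracy as a fraction of the baseline accuracy."""
    if baseline_accuracy <= 0:
        raise ValueError("baseline accuracy must be positive")
    return candidate_accuracy / baseline_accuracy


# Hypothetical accuracies chosen only to illustrate the ratio (NOT reported results).
mfcc_accuracy = 0.50  # assumed baseline (MFCC) tri-phone accuracy
bte_accuracy = 0.47   # assumed candidate (BTE) tri-phone accuracy

ratio = relative_success_rate(bte_accuracy, mfcc_accuracy)
print(f"{ratio:.0%}")  # prints 94%
```

Under this reading, a relative rate below 100% means the candidate features recognize slightly fewer tri-phones than the baseline on the same test samples.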
References
[1] Amr M. Gody, "Wavelet Packets Best Tree 4 Points Encoded
(BTE) Features", The Eighth Conference on Language
Engineering, Ain-Shams University, Cairo, Egypt, pp. 189-198,
December 2008.
[2] E. Barnard, E. Gouws, K. Wolvaardt and N. Kleynhans,
"Appropriate baseline values for HMM-based speech
recognition". 15th Annual Symposium of the Pattern
Recognition Association of South Africa, Grabouw, South
Africa, November 2004.
[3] Amr M. Gody, Rania Ahmed AbulSeoud, Mohamed Hassan,
"Automatic Speech Annotation Using HMM based on Best
Tree Encoding (BTE) Feature", The Eleventh Conference on
Language Engineering, Ain-Shams University, Cairo, Egypt,
pp. 153-159, December 2011.
[4] Amr M. Gody, Rania Ahmed AbulSeoud, Maha M. Adham,
Eslam E. Elmaghraby, "Automatic Speech Using Wavelet
Packets Increased Resolution Best Tree Encoding", The
Twelfth Conference on Language Engineering, Ain-Shams
University, Cairo, Egypt, pp. 126-134, December 2012.
[5] Amr M. Gody, Rania Ahmed AbulSeoud, Eslam E.
Elmaghraby, "Automatic Speech Recognition Of Arabic
Phones Using Optimal-Depth-Split-Energy Best Tree
Encoding", The Twelfth Conference on Language
Engineering, Ain-Shams University, Cairo, Egypt,
pp. 144-156, December 2012.
[6] Amr M. Gody, Rania Ahmed AbulSeoud, Mai Ezz
El-Din, "Using Mel-Mapped Best Tree Encoding for
Baseline-Context-Independent-Mono-Phone Automatic
Speech Recognition", Ain-Shams Journal, 2015.
[7] Maha M. Adham, "Phone Level Speech Segmentation Using
Wavelet Packets", Fayoum University, 2013.
[8] Eslam E. Elmaghraby, "Enhancement Speed Of Large
Vocabulary Speech Recognition System", Fayoum University,
2013.
[9] HTK Book documentation,
http://htk.eng.cam.ac.uk/docs/docs.shtml.
[10] Amr M. Gody, Rania Ahmed Abul Seoud, Marian M. Ibraheem, "Hybrid Model design for
Baseline-Context-Independent-Mono-Phone Automatic
Speech Recognition", International Journal of Engineering
Trends and Technology (IJETT), Volume 27,
ISSN: 2231-5381, September 2015.
Keywords
Automatic Speech Recognition, English
Phone Recognition, Wavelet packets, Mel scale,
MFCC, HTK and Best Tree Encoding.