Feature Extraction Using Empirical Mode Decomposition of Speech Signal

International Journal of Engineering Trends and Technology (IJETT)
  
© 2012 by IJETT Journal
Volume-3 Issue-2                          
Year of Publication : 2012
Authors :  Nikil V Davis

Citation 

Nikil V Davis. "Feature Extraction Using Empirical Mode Decomposition of Speech Signal". International Journal of Engineering Trends and Technology (IJETT). V3(2):77-80 Mar-Apr 2012. ISSN:2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.

Abstract

A speech signal carries information not only about the message to be conveyed, but also about the speaker, the language, the speaker's emotional state, the environment, and so on. Speech is produced by exciting the time-varying vocal tract system with a time-varying excitation. Each sound is produced by a specific combination of excitation and vocal tract dynamics. This paper presents a speaker identification system that uses the empirical mode decomposition (EMD) feature extraction method. EMD is an adaptive multiresolution decomposition technique that appears to be suitable for non-linear, non-stationary data analysis. EMD sifts a complex time-series signal without losing its original properties and yields a set of useful intrinsic mode function (IMF) components. The FFT is the most useful method for frequency-domain feature extraction. The wavelet transform (WT) is another method for feature extraction.
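The sifting idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes piecewise-linear envelopes (standard EMD uses cubic splines) and a fixed number of sifting passes instead of a convergence criterion, and the names `local_extrema`, `envelope`, `sift_once`, and `emd`, as well as the two-tone test signal, are hypothetical choices made only for this example.

```python
import math


def local_extrema(x):
    """Return indices of strict local maxima and minima of the sequence x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    Assumption: linear interpolation, held constant outside the first and
    last extremum. Standard EMD uses cubic-spline envelopes instead.
    """
    env = []
    k = 0
    for n in range(len(x)):
        if n <= idx[0]:
            env.append(x[idx[0]])
        elif n >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            while idx[k + 1] < n:
                k += 1
            a, b = idx[k], idx[k + 1]
            t = (n - a) / (b - a)
            env.append(x[a] + t * (x[b] - x[a]))
    return env


def sift_once(x):
    """Subtract the mean of the upper and lower envelopes from x.

    Returns None when x has too few extrema to build envelopes.
    """
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [xi - 0.5 * (u + l) for xi, u, l in zip(x, upper, lower)]


def emd(x, max_imfs=4, sift_iters=10):
    """Decompose x into a list of IMF candidates plus a residue.

    Assumption: a fixed number of sifting passes per IMF replaces the
    usual stopping criterion.
    """
    imfs = []
    residue = list(x)
    for _ in range(max_imfs):
        h = sift_once(residue)
        if h is None:
            break
        for _ in range(sift_iters - 1):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        imfs.append(h)
        residue = [r - c for r, c in zip(residue, h)]
    return imfs, residue


# Hypothetical test signal: a slow tone plus a weaker fast tone.
signal = [
    math.sin(2 * math.pi * 5 * n / 200) + 0.5 * math.sin(2 * math.pi * 40 * n / 200)
    for n in range(200)
]
imfs, residue = emd(signal)
```

By construction, the extracted components and the residue sum back to the original signal, which is the sense in which sifting does not lose the signal's original properties. Per-IMF statistics such as energy could then serve as features, although the specific features used in the paper are not stated in this abstract.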

Keywords
Speaker identification, Empirical mode decomposition, Intrinsic Mode Function