A Novel Speech Enhancement Solution Using Hybrid Wavelet Transformation Least Means Square Method
How to Cite?
Jagadish S. Jakati, Shridhar S. Kuntoji, "A Novel Speech Enhancement Solution Using Hybrid Wavelet Transformation Least Means Square Method," International Journal of Engineering Trends and Technology, vol. 69, no. 7, pp. 233-243, 2021. Crossref, https://doi.org/10.14445/22315381/IJETT-V69I7P230
Abstract
Minimizing noise in speech and audio signals remains a challenging problem in speech recognition, speech enhancement, and other speech communication applications. These applications have attracted the research community because of their diverse use in real-time, online, and offline settings. Several approaches have been presented to enhance speech quality. Wavelet transform based approaches and Least Mean Square (LMS) based filtering schemes are currently widely adopted in the literature. However, the existing techniques suffer from computational complexity and performance-related issues. We therefore combine these schemes and present a hybrid approach that uses the wavelet packet transform and an adaptive LMS scheme. We present an extensive simulation study and comparative analysis using the NOIZEUS speech corpus. The experimental analysis shows a substantial improvement in speech enhancement performance.
Keywords
Speech enhancement, Noise filtering, DWT, LMS, NOIZEUS, STFT, MOS, CEP, STOI
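The abstract names an adaptive LMS scheme but does not give its update equations. As a rough illustration only, the sketch below applies the textbook LMS update to a noise-cancellation toy problem. It is not the authors' hybrid method: the wavelet packet stage is omitted, and the separate reference noise input, the two-tap noise path, the filter length, the step size, and all function and variable names are assumptions made for this example.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=4, step_size=0.01):
    """Textbook LMS: estimate the noise in `primary` from `reference`.

    Returns the error signal, which serves as the enhanced output.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    enhanced = []
    for d, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = d - noise_estimate
        # Standard LMS weight update: w <- w + mu * e * x
        weights = [w + step_size * error * x for w, x in zip(weights, history)]
        enhanced.append(error)
    return enhanced


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic data (assumed for illustration, not taken from NOIZEUS).
rng = random.Random(0)
n_samples = 4000
clean = [math.sin(2 * math.pi * 0.01 * n) for n in range(n_samples)]
reference = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
# Hypothetical noise path: the reference passes through a 2-tap FIR filter.
noise = [
    0.8 * reference[n] + (0.3 * reference[n - 1] if n > 0 else 0.0)
    for n in range(n_samples)
]
noisy = [s + v for s, v in zip(clean, noise)]

enhanced = lms_noise_canceller(noisy, reference)

# Compare errors on the second half, after the filter has had time to adapt.
half = n_samples // 2
mse_noisy = mean_squared_error(noisy[half:], clean[half:])
mse_enhanced = mean_squared_error(enhanced[half:], clean[half:])
print(f"MSE before: {mse_noisy:.4f}, after: {mse_enhanced:.4f}")
```

In a hybrid scheme of the kind the abstract describes, a filter like this would presumably operate on wavelet packet coefficients rather than on raw samples; how the two stages are actually combined is specified in the full paper, not here.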