Applied mel-frequency discrete wavelet coefficients and parallel model compensation for noise-robust speech recognition


Tufekci Z., Gowdy J. N., Gurbuz S., Patterson E.

SPEECH COMMUNICATION, vol.48, no.10, pp.1294-1307, 2006 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 48 Issue: 10
  • Publication Date: 2006
  • DOI Number: 10.1016/j.specom.2006.06.006
  • Journal Name: SPEECH COMMUNICATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1294-1307
  • Çukurova University Affiliated: No

Abstract

Interfering noise severely degrades the performance of a speech recognition system. The Parallel Model Compensation (PMC) technique is one of the most efficient techniques for dealing with such noise. Another approach is to use features that are local in the frequency domain, such as Mel-Frequency Discrete Wavelet Coefficients (MFDWCs). In this paper, we investigate the use of PMC and MFDWC features to take advantage of both noise compensation and local features (MFDWCs) to decrease the effect of noise on recognition performance. We also introduce a practical weighting technique based on the noise level of each coefficient. We evaluate the performance of several wavelet schemes using the NOISEX-92 database for various noise types and noise levels. Finally, we compare the performance of these features with that of Mel-Frequency Cepstral Coefficients (MFCCs), both using PMC. Experimental results show significant performance improvements for MFDWCs versus MFCCs, particularly after compensating the HMMs using the PMC technique. The best feature vector among the six MFDWCs we tried gave performance improvements of 13.72 and 5.29 points, on average, over MFCCs for -6 and 0 dB SNR, respectively. This corresponds to 39.9% and 62.8% error reductions, respectively. Weighting the partial score of each coefficient based on the noise level further improves the performance. The average error rates for the best MFDWCs dropped from 19.57% to 16.71% and from 3.14% to 2.14% for -6 dB and 0 dB noise levels, respectively, using the weighting scheme. These improvements correspond to 14.6% and 31.8% error reductions for -6 dB and 0 dB noise levels, respectively. © 2006 Elsevier B.V. All rights reserved.
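
The error reductions quoted for the weighting scheme are relative figures, not differences in percentage points. The sketch below is not taken from the paper: it assumes the usual definition of relative error reduction, (baseline - improved) / baseline, and the function name `relative_error_reduction` is a hypothetical helper chosen for illustration. Under that assumption, the error rates stated in the abstract give the reported 14.6% and 31.8%.

```python
def relative_error_reduction(baseline_error: float, improved_error: float) -> float:
    """Return the relative error reduction, in percent.

    Assumed definition (not stated in the abstract):
    100 * (baseline - improved) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Average error rates (%) for the best MFDWCs, without and with weighting,
# as stated in the abstract.
reported = [(-6, 19.57, 16.71), (0, 3.14, 2.14)]

for snr_db, before, after in reported:
    reduction = relative_error_reduction(before, after)
    print(f"{snr_db} dB SNR: {before}% -> {after}% ({reduction:.1f}% error reduction)")
```

For example, at -6 dB the absolute drop is 19.57 - 16.71 = 2.86 points, which is about 14.6% of the 19.57% baseline.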
