Mel-scaled discrete wavelet coefficients for speech recognition


Gowdy J., TÜFEKCİ Z.

2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, İstanbul, Turkey, 5-9 June 2000, vol.5, pp.1351-1354

  • Publication Type: Conference Paper / Full Text
  • Volume: 5
  • City: İstanbul
  • Country: Turkey
  • Page Numbers: pp.1351-1354
  • Çukurova University Affiliated: No

Abstract

In this paper we propose a new feature vector consisting of Mel-Frequency Discrete Wavelet Coefficients (MFDWC). The MFDWC are obtained by applying the Discrete Wavelet Transform (DWT) to the mel-scaled log filterbank energies of a speech frame. The purpose of using the DWT is to benefit from its localization property in the time and frequency domains. MFDWC are similar to subband-based (SUB) features and multi-resolution (MULT) features in that all three attempt to achieve good time and frequency localization. However, MFDWC have better time/frequency localization than SUB features and MULT features. We evaluated the performance of the new features for clean speech and noisy speech and compared the performance of MFDWC with Mel-Frequency Cepstral Coefficients (MFCC), SUB features, and MULT features. Experimental results on a phoneme recognition task showed that an MFDWC-based recognizer gave better results than recognizers based on MFCC, SUB features, and MULT features for the white Gaussian noise, band-limited white Gaussian noise, and clean speech cases.
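
The core step described in the abstract, a DWT applied to the mel-scaled log filterbank energies of one frame, can be illustrated with a minimal sketch. Several things here are assumptions and do not come from the paper: the Haar wavelet (the abstract does not name the wavelet used), the power-of-two number of filterbank channels, the energy floor, and the function names haar_dwt and mfdwc_sketch, which are hypothetical. The mel filterbank energies are taken as given input rather than computed from audio.

```python
import math


def haar_dwt(values):
    """Full multi-level orthonormal Haar DWT.

    Assumption: the Haar wavelet is used only because it is the simplest
    to write out; the abstract does not say which wavelet the authors used.
    The input length must be a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(values)
    details = []
    scale = 1.0 / math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients: scaled differences of neighbouring values.
        detail = [(a - b) * scale for a, b in pairs]
        # Approximation coefficients: scaled sums, passed to the next level.
        approx = [(a + b) * scale for a, b in pairs]
        # Prepend so that coarser scales end up first in the output.
        details = detail + details
    return approx + details


def mfdwc_sketch(filterbank_energies, floor=1e-10):
    """Hypothetical helper: log of mel filterbank energies, then the DWT.

    `floor` is an assumed guard against log(0), not a value from the paper.
    """
    log_energies = [math.log(max(e, floor)) for e in filterbank_energies]
    return haar_dwt(log_energies)


if __name__ == "__main__":
    # Made-up energies for an assumed 8-channel mel filterbank.
    frame = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    disturbed = [math.e] + frame[1:]  # disturb only the first channel
    print(mfdwc_sketch(frame))
    print(mfdwc_sketch(disturbed))
```

With this Haar sketch, disturbing a single channel changes only the coefficients whose support covers that channel (4 of the 8 here). A DCT, as used for MFCC, would instead spread the disturbance across every coefficient. This is one way to picture the localization property the abstract refers to.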