Affine-invariant visual features contain supplementary information to enhance speech recognition

Gurbuz, S.; Patterson, E.; Tufekci, Zekeriya; Gowdy, J. N.

AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS, vol. 2091, pp. 175-181, 2001 (SCI-Expanded)

Publication Type: Article / Full Article
Volume Number: 2091
Publication Date: 2001
Journal Name: AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, PROCEEDINGS
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED)
Pages: pp. 175-181
Affiliated with Çukurova Üniversitesi: No

The performance of audio-based speech recognition systems degrades severely when there is a mismatch between training and usage environments due to background noise. This degradation is due to a loss of ability to extract and distinguish important information from audio features. One of the emerging techniques for dealing with this problem is the addition of visual features in a multimodal recognition system. This paper presents an affine-invariant, multimodal speech recognition system and focuses on the supplementary information that is available from video features.