Spectral Noise Tracking for Improved Nonstationary Noise Robust ASR
Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
24.09.2014 - 26.09.2014 in Erlangen, Germany
Proceedings: Speech Communication
Pages: 4; Language: English; Type: PDF
Authors:
Chinaev, Aleksej; Puels, Marc; Haeb-Umbach, Reinhold (Department of Communications Engineering, University of Paderborn, 33098, Paderborn, Germany)
Abstract:
One approach to nonstationary noise robust automatic speech recognition (ASR) is to first estimate the changing noise statistics and then clean up the features accordingly prior to recognition. Here, the first step is accomplished by noise tracking in the spectral domain, while the second relies on Bayesian enhancement in the feature domain. To this end, we take advantage of our recently proposed maximum a-posteriori based (MAP-B) noise power spectral density estimation algorithm, which is able to estimate the noise statistics even in time-frequency bins dominated by speech. We show that MAP-B noise tracking leads to an improved noise model estimate in the feature domain compared to estimating noise in speech absence periods only, provided that the bias resulting from the nonlinear transformation from the spectral to the feature domain is accounted for. Consequently, ASR results are improved, as shown by experiments conducted on the Aurora IV database.
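
The bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' compensation scheme. It assumes a single frequency bin whose noise periodogram is exponentially distributed around the true noise power spectral density, and a feature that is simply the natural logarithm of the periodogram; the function name `log_domain_bias` and its parameters are hypothetical. Under these assumptions, taking the logarithm of the spectral-domain noise estimate overshoots the mean of the log-domain noise feature by the Euler-Mascheroni constant, which is the kind of offset that has to be accounted for when a spectral noise estimate is carried over to the feature domain.

```python
import math
import random

# Euler-Mascheroni constant: the theoretical gap ln(E[X]) - E[ln X]
# for an exponentially distributed X (assumption of this sketch).
EULER_GAMMA = 0.5772156649015329


def log_domain_bias(noise_psd, num_frames=200_000, seed=0):
    """Compare log(noise PSD) with the mean log-periodogram of simulated noise.

    noise_psd  : assumed true noise power in one frequency bin (hypothetical value)
    num_frames : number of simulated noise-only frames
    """
    rng = random.Random(seed)
    # random.expovariate expects the rate, i.e. 1 / mean.
    # The tiny floor guards against log(0) for a degenerate draw.
    periodograms = [
        max(rng.expovariate(1.0 / noise_psd), 1e-300) for _ in range(num_frames)
    ]
    mean_of_log = sum(math.log(p) for p in periodograms) / num_frames
    log_of_mean = math.log(noise_psd)
    return log_of_mean, mean_of_log


log_of_mean, mean_of_log = log_domain_bias(noise_psd=2.0)
bias = log_of_mean - mean_of_log
print(f"log of PSD estimate : {log_of_mean:.4f}")
print(f"mean log-periodogram: {mean_of_log:.4f}")
print(f"empirical bias      : {bias:.4f} (theory: {EULER_GAMMA:.4f})")
```

In a real front end the features are typically computed from mel filterbank outputs that pool several bins, so the size of the offset would differ from this single-bin case; the sketch only shows that a nonlinear mapping of an unbiased spectral estimate does not yield an unbiased feature-domain estimate.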