Robust DNN-Based Speech Enhancement with Limited Training Data
Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10/10/2018 - 10/12/2018 at Oldenburg, Deutschland
Proceedings: Speech Communication
Pages: 5Language: englishTyp: PDF
Personal VDE Members are entitled to a 10% discount on this title
Authors:
Rehr, Robert; Gerkmann, Timo (Signal Processing (SP), Department of Informatics, Universität Hamburg, Germany)
Abstract:
In conventional speech enhancement, statistical models for speech and noise are used to derive clean speech estimators. The parameters of the models are estimated blindly from the noisy observation using carefully designed algorithms. These algorithms generalize well to unseen acoustic conditions, but are unable to reduce highly non-stationary noise types. This shortcoming motivated the usage of machine-learning-based (ML-based) algorithms, in particular deep neural networks (DNNs). But if only limited training data are available, the noise reduction performance in unseen acoustic conditions suffers. In this paper, motivated by conventional speech enhancement, we propose to use the a priori and a posteriori signal-to-noise ratios (SNRs) for DNN-based speech enhancement systems. Instrumental measures show that the proposed features increase the robustness in unknown noise types even if only limited training data are available.