Uncertainty Decoding Using a Sampling Strategy Based on the Eigenvalue Decomposition

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
10/05/2016 - 10/07/2016 at Paderborn, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Huemmer, Christian; Stadter, Philipp; Kellermann, Walter (Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, 91058 Erlangen, Germany)

Abstract:
Uncertainty decoding combines a probabilistic distortion model with the acoustic model of a speech recognition system. This can be realized for DNN-based acoustic models by drawing feature samples from an estimated probability distribution and averaging the resulting set of posterior likelihoods at the output of the DNN. According to this principle, we consider a probabilistic feature description in the logmelspec domain to model the front-end estimation errors produced by a coherence-based Wiener filter. As the main innovation with respect to previous work, we employ a sampling strategy based on the eigenvalue decomposition to capture (rather than neglect) the cross-correlations between the acoustic features as part of the uncertainty decoding scheme. The experimental results for real recordings provided by the REVERB Challenge task highlight the effectiveness of this sampling strategy in improving the recognition accuracy of a DNN-HMM hybrid system.
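
The abstract does not spell out the sampling scheme, so the following is only a minimal sketch of the general idea under stated assumptions: the feature uncertainty is taken to be Gaussian with a mean vector and a full covariance matrix, samples are drawn randomly through the eigenvalue decomposition of that covariance (so the cross-correlations between features are preserved), and the DNN is replaced by a single softmax layer. The function names `sample_via_eig`, `toy_dnn_posteriors`, and `uncertainty_decode` are hypothetical and not taken from the paper, whose actual sampling strategy may differ (for example, it may be deterministic).

```python
import numpy as np


def sample_via_eig(mean, cov, num_samples, rng):
    """Draw samples from N(mean, cov) via the eigenvalue decomposition of cov.

    Assumption: Gaussian feature uncertainty with a full (non-diagonal)
    covariance, so cross-correlations between features are retained.
    """
    # cov = V diag(lam) V^T for a symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Guard against tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    # Independent standard-normal draws, one row per sample.
    z = rng.standard_normal((num_samples, mean.shape[0]))
    # Scale along each eigen-direction, then rotate back to feature space.
    return mean + (z * np.sqrt(eigvals)) @ eigvecs.T


def toy_dnn_posteriors(x, weights, bias):
    """Hypothetical stand-in for the acoustic-model DNN: one softmax layer."""
    logits = x @ weights + bias
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)


def uncertainty_decode(mean, cov, weights, bias, num_samples=100, seed=0):
    """Average the network outputs over feature samples drawn from N(mean, cov)."""
    rng = np.random.default_rng(seed)
    samples = sample_via_eig(mean, cov, num_samples, rng)
    return toy_dnn_posteriors(samples, weights, bias).mean(axis=0)
```

A diagonal-covariance variant would scale each feature independently and skip the rotation by the eigenvectors; the rotation is the step that reintroduces the correlations between feature dimensions in this sketch.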