Research on Active Learning for NN-based Automatic Speech Recognition

Conference: ICMLCA 2021 - 2nd International Conference on Machine Learning and Computer Application
17 - 19 December 2021 in Shenyang, China

Proceedings: ICMLCA 2021

Pages: 6
Language: English
Type: PDF

Authors:
Huang, Lijie (School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China)
Gao, Ziqi (Experimental High School North Campus of Liaoning Province, Shenyang, China)
Huang, Yucheng (The Kings Academy, West Palm Beach, USA)
Wang, Ziqing (Century High School, Vancouver, Canada)

Abstract:
In recent years, Automatic Speech Recognition (ASR) has changed substantially. With the development of deep learning, Neural Networks (NNs) are increasingly used in ASR, for example in Deep Neural Network - Hidden Markov Model (DNN-HMM) and End-to-End models. Manual speech transcription is very costly. Active Learning (AL) selects only the most informative samples for labeling, reducing labeling cost while maintaining the performance of the machine learning model. However, despite its current popularity, AL is rarely used in conjunction with ASR. This paper lists the AL scenarios, categorizes query strategies into those based on confidence, uncertainty, version space reduction and expected error reduction, and investigates the prevalence of these strategies in recent studies. We analyze AL for ASR based on Gaussian Mixture Model - Hidden Markov Model (GMM-HMM), DNN-HMM and End-to-End models. In addition, we assess recent NN-based ASR research that uses AL methods, investigating the current state of the art at the intersection of AL, NNs and related recent advances in ASR. Finally, we analyze recent work on AL for ASR by type of query strategy and outline the commonalities and weaknesses of previous experiments. On this basis, we identify current research gaps and open research questions. We argue that comparing AL methods is difficult because datasets and models are inconsistent across studies, and that estimating the uncertainty of NNs remains challenging. The existing approach for low-resource conditions is a combination of semi-supervised/unsupervised learning and AL.
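
As an illustration of the uncertainty-based query strategies mentioned in the abstract, the following is a minimal sketch of pool-based selection by entropy. It is not taken from the paper: the function names, the mean per-frame entropy score, and the toy posteriors are all assumptions chosen for illustration, and a real ASR system would obtain the posteriors from its acoustic or End-to-End model.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one posterior distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool, budget):
    """Return the ids of the `budget` most uncertain utterances.

    `pool` maps a (hypothetical) utterance id to a list of per-frame
    posterior distributions. Assumption: every utterance has at least
    one frame. The score is the mean per-frame entropy, one simple
    uncertainty measure among many.
    """
    scores = {}
    for utt_id, frame_posteriors in pool.items():
        total = sum(entropy(frame) for frame in frame_posteriors)
        scores[utt_id] = total / len(frame_posteriors)
    # Highest mean entropy first: these are the samples sent to annotators.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:budget]


# Toy unlabeled pool with made-up posteriors over three classes.
pool = {
    "utt_a": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],  # confident model
    "utt_b": [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]],  # very uncertain
    "utt_c": [[0.70, 0.20, 0.10], [0.90, 0.05, 0.05]],  # in between
}

print(select_for_labeling(pool, budget=2))  # ['utt_b', 'utt_c']
```

With a labeling budget of two, only the two utterances on which the model is least certain are transcribed manually; the confidently recognized one is left unlabeled, which is where the cost saving comes from.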