Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

Conference: Speech Communication - 15th ITG Conference
09/20/2023 - 09/22/2023 in Aachen


Proceedings: ITG-Fb. 312: Speech Communication

Pages: 5
Language: English
Type: PDF

Luescher, Christoph; Zeineldeen, Mohammad; Yang, Zijian; Schlueter, Ralf; Ney, Hermann (Machine Learning and Human Language Technology, RWTH Aachen University, Aachen, Germany & AppTek GmbH, Aachen, Germany)
Raissi, Tina; Vieting, Peter; Le-Duc, Khai (Machine Learning and Human Language Technology, RWTH Aachen University, Aachen, Germany)
Wang, Weiyue (AppTek GmbH, Aachen, Germany)

Language barriers present a great challenge in our increasingly connected and global world. Especially in the medical domain, e.g. in a hospital or emergency room, communication difficulties and delays may lead to malpractice and suboptimal patient care. In the HYKIST project, we consider patient-physician communication, more specifically between a German-speaking physician and an Arabic-, Vietnamese-, or Ukrainian-speaking patient. Currently, a doctor can call the Triaphon service to get assistance from an interpreter to facilitate communication. The goal of HYKIST is to support the usually non-professional bilingual interpreter with an automatic speech translation system to improve patient care and help overcome language barriers. In this work, we present our ASR system development efforts for this conversational telephone speech translation task in the medical domain for two language pairs, covering data collection, various acoustic model architectures, and dialect-induced difficulties.