Comparative Study of LC3plus and Lyra codec on DNN-based Source Localisation for Hearing Aids
Conference: Speech Communication - 15th ITG Conference
20.09.2023-22.09.2023 in Aachen
doi:10.30420/456164013
Proceedings: ITG-Fb. 312: Speech Communication
Pages: 5
Language: English
Type: PDF
Authors:
Song, Siyuan; Kindt, Stijn; Maes, Jasper; Bohlender, Alexander; Madhu, Nilesh (IDLab, Department of Electronics and Information Systems, Ghent University - imec, Belgium)
Abstract:
Lossy codecs are often used to exchange audio data in bandwidth-constrained applications. However, this can have a detrimental effect on the subsequent signal processing stages, especially with regard to multi-channel source localisation and enhancement. Understanding and circumventing these effects is, therefore, crucial. We contrast the effect of LC3plus (developed for Bluetooth Low Energy (BLE) communications) against Lyra, a recently proposed neural codec, when used for audio exchange in hearing aids. Specifically, the effect of these codecs is benchmarked on the binaural localisation task, using a state-of-the-art deep neural network (DNN). We first examine codec influence on models trained with unencoded data. Next, we investigate to what extent localisation is possible when using codec-in-training. Model generalisation across bitrates and codecs is also studied. Evaluation results indicate that while lossy codecs significantly degrade localisation, including the codec during training can recover performance. For LC3plus, performance comparable to unencoded data is obtainable even at the lowest bitrate. Lyra, in contrast, only significantly recovers performance in less challenging, single-source situations, implying stronger distortion of localisation-related cues, especially in multi-source scenarios. This is likely a consequence of the generative model used in Lyra, and further improvement of this codec is crucial for multi-channel applications.
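The codec-in-training idea mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_lossy_codec` is a hypothetical stand-in (plain uniform quantisation, not LC3plus or Lyra), each ear signal is assumed to be coded independently, and the cross-correlation lag is used only as a simple example of a localisation-related cue, not as the feature set of the paper's DNN.

```python
import numpy as np

def toy_lossy_codec(x, n_bits=4):
    """Hypothetical encode/decode round trip (NOT LC3plus or Lyra):
    uniform quantisation of samples assumed to lie in [-1, 1]."""
    levels = 2 ** (n_bits - 1)
    return np.clip(np.round(x * levels) / levels, -1.0, 1.0)

def codec_round_trip(binaural, codec):
    """Pass each ear signal through the codec independently (an assumption).
    `binaural` has shape (2, num_samples); `codec` is any callable."""
    return np.stack([codec(channel) for channel in binaural])

def estimate_itd_samples(binaural, max_lag=16):
    """Return the lag (in samples) that maximises the left/right
    cross-correlation, a simple proxy for the interaural time difference."""
    left, right = binaural
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = [
        np.dot(left[max_lag:n - max_lag],
               right[max_lag + lag:n - max_lag + lag])
        for lag in lags
    ]
    return int(lags[int(np.argmax(scores))])

# Synthetic binaural pair: the right channel is the left delayed by 5 samples.
rng = np.random.default_rng(0)
left = np.clip(0.3 * rng.standard_normal(4000), -1.0, 1.0)
binaural = np.stack([left, np.roll(left, 5)])

# Codec-in-training: apply the round trip before computing training features.
coded = codec_round_trip(binaural, toy_lossy_codec)
itd_clean = estimate_itd_samples(binaural)
itd_coded = estimate_itd_samples(coded)
```

In a codec-in-training setup, the round trip would be applied to the training signals before feature extraction, so the model sees the same kind of distortion at training and test time. The toy quantiser here is deterministic and sample-wise, so it does not reproduce the cue distortions the abstract attributes to real codecs; a real codec would replace `toy_lossy_codec`.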