Investigating disentanglement of speaker identity and characteristics through user experience

Conference: Speech Communication - 15th ITG Conference
09/20/2023 - 09/22/2023 in Aachen


Proceedings: ITG-Fb. 312: Speech Communication

Pages: 5
Language: English
Type: PDF

Rallabandi, Sai Sirisha (Quality and Usability Lab, Technische Universität Berlin, Germany)
Moeller, Sebastian (Quality and Usability Lab, Technische Universität Berlin, Germany & Speech and Language Technology, Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Berlin, Germany)

In this paper, we investigate the disentanglement of speaker-specific information in voice-converted female synthetic voices. We categorize this speaker-specific information into a) speaker identity and b) social speaker characteristics. The separability and interdependence of these two categories were investigated based on user experience, using five evaluation methods: a) speech quality, b) intelligibility, c) a semantic differential scaling test, d) a speaker similarity test, and e) a characteristic similarity test. The analysis of the subjective results shows that intelligibility significantly affects the perceptions measured by the other evaluation methods. We also observed statistically significant differences between the speaker similarity and characteristic similarity tests.