Intelligibility of low bit rate MPEG-coded Japanese speech in virtual 3-D audio space
In this paper, we investigated the influence of stereo coding on Japanese speech localized in 3-D virtual space. We encoded localized speech using the Joint Stereo and Parametric Stereo modes of the HE-AAC (High-Efficiency Advanced Audio Coding) encoder at identical data rates. First, the sound quality of the localized speech signal was evaluated using MUSHRA subjective tests. The results showed that speech quality for Joint Stereo was higher than for Parametric Stereo by 20 to 30 MUSHRA score points when the speech was localized at 45° (where 0° refers to localization directly in front of the listener). The scores for Joint Stereo were roughly proportional to bit rate, whereas the Parametric Stereo scores remained fairly constant regardless of bit rate. Next, Japanese word intelligibility tests were conducted using the Japanese Diagnostic Rhyme Test (JDRT). Test speech was localized in front of the listener, while competing noise was localized at various angles. The results showed that, with Joint Stereo, speech could not be separated from the noise when the noise was located in the frontal region, from -45° to 45°, and intelligibility degraded significantly. At other azimuths, however, intelligibility improved dramatically. With Parametric Stereo, on the other hand, intelligibility remained constant at about 70 to 80%.