Spemoticons: Text to Speech Based Emotional Auditory Cues
Csapo, Tamas Gabor
There are various methods of providing auditory cues in human-computer interfaces. The two basic traditional methods are the application of real-life sounds (auditory icons) and artificially generated audio signals (earcons). Recently, in-between solutions based on text-to-speech (TTS) technology have been developed. Spearcons are speeded-up versions of the TTS output of a particular text template, while spindex cues are generated as auditory index items from the first letter of menu list elements. Auditory emoticons are the non-verbal, human-sound-based audible equivalents of emoticons. However, we are not aware of any attempt to generate auditory representations of emotional and intentional states (comparable to emoticon characters) based on a TTS solution. We denote these meaningless cues as spemoticons. The interactive development environment of our TTS system is applied as a modification tool for generating spemoticons: the intensity, duration, and pitch structure of the generated speech are manipulated. An experimental sound inventory of 44 elements was compiled and tested with 54 adult subjects for the selection of spemoticons.
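As a rough illustration of one of the manipulations mentioned above (duration modification of a TTS waveform, as in spearcons), the sketch below time-compresses a mono signal by simple resampling. This is only a minimal, hypothetical example assuming NumPy; the paper's actual manipulations are performed in the interactive development environment of the authors' TTS system, and practical tools would use pitch-preserving time-scale modification rather than this naive approach.

```python
import numpy as np

def time_compress(signal: np.ndarray, factor: float) -> np.ndarray:
    """Naively shorten a mono waveform by linear-interpolation resampling.

    Note: this also raises pitch by the same factor; pitch-preserving
    methods (e.g. PSOLA-style time-scale modification) avoid that side
    effect and are what TTS-based cue tools would typically use.
    """
    n_out = int(len(signal) / factor)
    old_idx = np.linspace(0, len(signal) - 1, num=n_out)
    return np.interp(old_idx, np.arange(len(signal)), signal)

# Toy input standing in for TTS output: 1 s of a 440 Hz sine at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

fast = time_compress(tone, 2.0)  # half the duration (spearcon-like speed-up)
print(len(tone), len(fast))
```

A pitch or intensity manipulation could be sketched analogously (e.g. scaling the sample amplitudes for intensity), but the duration case above is the simplest to show in a few lines.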