
    Discriminating visible speech tokens using multi-modality

    View/Open
    CampbellShafae2003.pdf (74.50 KB)
    Date
    2003-07
    Author
    Campbell, Christopher S
    Shafae, Michael M
    Lodha, Suresh K
    Massaro, Dominic W
    Abstract
    We present a multimodal interactive data exploration tool that facilitates discrimination between visible speech tokens. The tool uses visualization and sonification (non-speech sound) of data. Visible speech tokens are a class of multidimensional data that have been used extensively in designing a talking head, which in turn has been used to train deaf individuals by watching speech [1]. Visible speech tokens (consonants), referred to as categories, differ along a set of pre-measured feature dimensions such as mouth height, mouth narrowing, jaw rotation, and upper-lip retraction. The data set was visualized with a series of 1D scatter plots, with a different color for each category. Sonification was performed by mapping three qualities of the data (within-category variability, between-category variability, and category identity) to three sound parameters (noise amplitude, duration, and pitch). An experiment was conducted to assess the utility of multimodal information compared to visual information alone for exploring this multidimensional data set. Tasks involved answering a series of questions to determine how well each feature, or set of features, discriminates among categories, which categories are discriminated, and how many. Performance was assessed by measuring accuracy and reaction time for 36 questions varying in scale of understanding and level of dimension integrality. Scale varied at three levels (ratio, ordinal, and nominal), and integrality also varied at three levels (1, 2, and 3 dimensions). A between-subjects design was used, assigning subjects to either the multimodal group or the visual-only group. Results show that accuracy was better for the multimodal group as the number of dimensions required to answer a question (integrality) increased. Accuracy was also 10% better for the multimodal group on ordinal questions. For discriminating visible speech tokens, sonification provides useful information in addition to that given by visualization, particularly for representing three dimensions simultaneously.
    URI
    http://hdl.handle.net/1853/50455
    Collections
    • International Conference on Auditory Display, 2003 [72]
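
    The sonification mapping described in the abstract (within-category variability to noise amplitude, between-category variability to duration, and category identity to pitch) can be sketched roughly as follows. This is not the authors' implementation: the synthetic feature values, the specific variability measures, and the parameter ranges below are illustrative assumptions.

import numpy as np

SAMPLE_RATE = 44100  # samples per second

def sonify_feature(values, labels, base_pitch=220.0):
    """Return one audio clip per category for a single 1D feature.

    values : 1D array of feature measurements (e.g. mouth height)
    labels : array of category identifiers, same length as values
    """
    categories = np.unique(labels)
    grand_mean = values.mean()
    clips = {}
    for i, cat in enumerate(categories):
        group = values[labels == cat]
        within_var = group.std()                      # within-category variability
        between_var = abs(group.mean() - grand_mean)  # spread of this category from the grand mean

        # Map data qualities to sound parameters (ranges are assumptions).
        noise_amp = 0.3 * np.clip(within_var / values.std(), 0.0, 1.0)        # noisier = more variable
        duration = 0.2 + 0.6 * np.clip(between_var / values.std(), 0.0, 1.0)  # longer = more separated
        pitch = base_pitch * (2 ** (i / 12.0))                                # one semitone per category

        t = np.linspace(0.0, duration, int(SAMPLE_RATE * duration), endpoint=False)
        tone = 0.5 * np.sin(2 * np.pi * pitch * t)
        noise = noise_amp * np.random.default_rng(0).standard_normal(t.size)
        clips[cat] = tone + noise
    return clips

# Example usage with synthetic "mouth height" measurements for three consonant categories.
rng = np.random.default_rng(42)
labels = np.repeat(["b", "d", "g"], 50)
values = np.concatenate([rng.normal(m, s, 50) for m, s in [(1.0, 0.1), (1.5, 0.3), (2.2, 0.2)]])
for cat, clip in sonify_feature(values, labels).items():
    print(cat, f"{clip.size / SAMPLE_RATE:.2f}s clip")

    The returned arrays can be written to disk or played with any audio library; only the mapping itself, not the playback, is the point of the sketch.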

    © Georgia Institute of Technology
