Towards Automatic Analysis of Audio Recordings from Children with Autism Spectrum Disorder
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that can negatively impact learning, behavior, and social communication and interaction. In the United States, 1 in 59 children aged eight had been diagnosed with ASD, according to the CDC's 2014 report. Unfortunately, manual analysis of audio recordings of children with ASD is expensive, time-consuming, and does not scale well. This dissertation addresses general approaches for the automatic analysis of audio recordings of children with ASD. First, we demonstrate that representing environmental features in the i-vector space can improve diarization of the audio recordings. Next, we address the problem of diarizing audio recordings of infants and toddlers. We design a fine-tuning mechanism for a time-delay neural network (TDNN) that improves classification accuracy on recordings from infants and toddlers. One metric of interest to clinicians is the child's response rate to questions from parents. We build an interrogative utterance detector consisting of a stack of convolutional neural network (CNN) layers with a self-attention mechanism. With this architecture, we identify question segments from parents and subsequently analyze the child's response rates to those questions. Other vocalization metrics evaluated in this work are conversational turns, child utterance frequency and duration, and adult question rates.
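As a rough illustration of the self-attention step applied on top of CNN feature maps, the following NumPy sketch computes scaled dot-product self-attention over a sequence of per-frame embeddings. All sizes, projection matrices, and the random initialization here are assumptions for demonstration, not the dissertation's actual architecture, in which these projections would be learned jointly with the CNN layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over frame embeddings.

    X:          (T, d) sequence of per-frame features (e.g. CNN outputs).
    Wq, Wk, Wv: (d, d_k) projection matrices (random here; learned in
                a real model).
    Returns:    (T, d_k) attended features.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (T, T) frame-to-frame affinities
    A = softmax(scores, axis=-1)             # each row sums to 1
    return A @ V

# Hypothetical sizes: 50 frames, 64-dim CNN features, 32-dim attention head.
rng = np.random.default_rng(0)
T, d, d_k = 50, 64, 32
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (50, 32)
```

In a full question detector, the attended features would typically be pooled over time and passed to a classifier that labels the segment as interrogative or not.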