Audio diarization for LENA data and its application to computing language behavior statistics for individuals with autism
Pawar, Rahul Shivaji
The objective of this dissertation is to develop diarization algorithms for LENA data and to study their application to computing language behavior statistics for individuals with autism. The LENA device is one of the most commonly used devices for collecting audio data in autism and language development studies. The LENA child and adult detector algorithms were evaluated on two datasets: i) an older-children dataset consisting of children already diagnosed with autism spectrum disorder and ii) an infants dataset consisting of infants at risk for autism. I-vector based diarization algorithms were developed for the two datasets to handle two scenarios: a) some labeled data is available for every speaker in the audio recording and b) no labeled data is available for the recording to be diarized. Further, the i-vector based diarization methods were applied to compute objective measures of assessment, which were analyzed to show that they can reveal some aspects of autism severity. Also, a method was developed to extract a 5-minute window of high child vocalization from a 16-hour daylong recording; this window was then used to compute canonical babble statistics using human annotation.
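The abstract does not describe how the 5-minute window is selected. As a rough illustration only, one plausible approach is a sliding-window search over diarization output for the window with the most child vocalization time. The function name `densest_window`, the `(start, end)` segment format in seconds, and the 1-second step are all assumptions made for this sketch, not details taken from the dissertation.

```python
from itertools import accumulate


def densest_window(child_segments, total_sec, window_sec=300, step_sec=1):
    """Return (window_start_sec, child_speech_sec) for the densest window.

    child_segments: hypothetical diarization output, a list of
    (start, end) times in seconds labeled as child vocalization,
    assumed not to overlap one another.
    """
    if total_sec < window_sec:
        raise ValueError("recording is shorter than the window")

    # Seconds of child vocalization falling inside each 1-second bin.
    occupancy = [0.0] * total_sec
    for start, end in child_segments:
        for s in range(max(int(start), 0), min(int(end) + 1, total_sec)):
            occupancy[s] += max(min(end, s + 1) - max(start, s), 0.0)

    # Prefix sums let each candidate window be scored in constant time.
    prefix = [0.0] + list(accumulate(occupancy))

    best_start, best_total = 0, -1.0
    for w in range(0, total_sec - window_sec + 1, step_sec):
        total = prefix[w + window_sec] - prefix[w]
        if total > best_total:  # strict '>' keeps the earliest best window
            best_start, best_total = w, total
    return best_start, best_total


# Example with made-up segments in a 16-hour recording.
segments = [(10, 20), (1000, 1100), (1150, 1200)]
start, speech = densest_window(segments, total_sec=16 * 3600)
```

In this sketch the selected window would then be handed to human annotators; ranking windows by total vocalization time is only one possible criterion, and counting vocalizations would be another.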