Acoustic segment modeling and preference ranking for music information retrieval


Title: Acoustic segment modeling and preference ranking for music information retrieval
Author: Reed, Jeremy T.
Abstract: This dissertation focuses on improving content-based recommendation systems for music. Progress in the development of content-based music recommendation systems has stalled in recent years due to several faulty assumptions:
1. Most acoustic content-based systems for music information retrieval (MIR) assume a bag-of-frames model, which treats a song as a simplistic, global audio texture.
2. Genre, style, mood, and author are assumed to be appropriate categories for machine-oriented recommendation.
3. Similarity is assumed to be a universal construct that does not vary among users.
The main contribution of this dissertation is to address these faulty assumptions with a novel approach in MIR that provides user-centric, content-based recommendations based on statistics of acoustic sound elements. First, this dissertation presents the acoustic segment modeling framework, which describes a piece of music as a temporal sequence of acoustic segment models (ASMs), each representing an individual polyphonic sound element. A dictionary of ASMs, generated in an unsupervised process, defines a vocabulary of acoustic tokens with which new musical pieces can be transcribed. Standard text-based information retrieval algorithms then use statistics of ASM counts to perform various retrieval tasks. Despite a simpler feature set than other content-based genre recommendation algorithms, the acoustic segment modeling approach is highly competitive on standard genre classification databases. Fundamental to its success is the ability to model acoustic semantics within a musical piece, demonstrated by the detection of musical attributes based on temporal characteristics. Further, the acoustic segment modeling procedure captures the inherent structure of melody, providing near state-of-the-art performance on an automatic chord recognition task.
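The abstract's pipeline, once songs are transcribed into ASM token sequences, reduces to standard text-IR statistics over token counts. The dissertation's exact weighting scheme is not stated here; the following is a minimal sketch, assuming a TF-IDF weighting and cosine similarity over hypothetical toy "transcriptions" (the token labels and song data are invented for illustration):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each ASM token sequence to a sparse TF-IDF vector (a dict)."""
    n_docs = len(docs)
    df = Counter()                      # document frequency of each token
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({tok: (count / len(doc)) * math.log(n_docs / df[tok])
                     for tok, count in tf.items()})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy "transcriptions": each song is a sequence of ASM token labels.
songs = [["a", "a", "b"],   # song 0
         ["a", "b", "b"],   # song 1: shares vocabulary with song 0
         ["c", "c", "d"]]   # song 2: disjoint vocabulary
vecs = tfidf_vectors(songs)
```

With these toy sequences, songs 0 and 1 score as similar while song 2 does not, which is the sense in which standard text-retrieval machinery applies directly once a song has been tokenized into ASMs.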
This dissertation demonstrates that some classification tasks, such as genre, depend on information that is not contained in the acoustic signal; attempts to model these categories from acoustic content alone are therefore ill-fated. Further, notions of music similarity are personal in nature and are not derived from a universal ontology. This dissertation therefore addresses the second and third limitations of previous content-based retrieval approaches by presenting a user-centric preference rating algorithm. Individual users possess their own cognitive constructs of similarity, so retrieval algorithms must accommodate this variability. The proposed rating algorithm is based on the principle of minimum classification error (MCE) training, which is robust against outliers and minimizes the Parzen estimate of the theoretical classification risk. The outlier immunity property limits the effect of labels that arise from non-content-based sources. The MCE-based algorithm outperforms a comparable ratings prediction algorithm. Finally, this dissertation discusses extensions and future work.
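The dissertation's actual MCE formulation is not reproduced here; the following is a minimal sketch of MCE-style training for pairwise preferences, assuming a hypothetical linear scorer. The key property claimed in the abstract is visible in the loss: the sigmoid-smoothed misclassification measure saturates for badly violated pairs, so mislabeled (outlier) preferences contribute only a bounded gradient.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mce_rank_train(pairs, dim, lr=0.1, epochs=200, alpha=2.0):
    """Learn a linear scoring function w from pairwise preferences.

    pairs: list of (x_pref, x_other), where x_pref is the item the user
    prefers. The misclassification measure is d = g(x_other) - g(x_pref);
    the smoothed loss sigmoid(alpha * d) saturates for large |d|, which
    is the source of MCE's robustness to outlier labels.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for x_pref, x_other in pairs:
            g_pref = sum(wi * xi for wi, xi in zip(w, x_pref))
            g_other = sum(wi * xi for wi, xi in zip(w, x_other))
            loss = sigmoid(alpha * (g_other - g_pref))
            # Gradient step on the smoothed loss with respect to w:
            coef = lr * alpha * loss * (1.0 - loss)
            w = [wi - coef * (xo - xp)
                 for wi, xo, xp in zip(w, x_other, x_pref)]
    return w

# Toy preferences (invented): the user consistently prefers items with a
# larger first feature.
pairs = [([1.0, 0.0], [0.0, 1.0]),
         ([0.9, 0.1], [0.1, 0.9])]
w = mce_rank_train(pairs, dim=2)
```

After training on the toy pairs, the learned scorer ranks the preferred items above the others; because loss * (1 - loss) vanishes as the sigmoid saturates, a pair whose label grossly contradicts the model moves the weights only slightly.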
Type: Dissertation
URI: http://hdl.handle.net/1853/37189
Date: 2010-10-27
Publisher: Georgia Institute of Technology
Subject: Acoustic modeling
Music information retrieval
Preference ranking
Unsupervised learning
Acoustic segment modeling
Music and technology
Music and the Internet
Automatic speech recognition
Acoustic models
Department: Electrical and Computer Engineering
Committee Members: Anderson, David; Chordia, Parag; Clements, Mark; Hunt, William; Lee, Chin-Hui
Degree: Ph.D.

All materials in SMARTech are protected under U.S. Copyright Law and all rights are reserved, unless otherwise specifically indicated on or in the materials.

Files in this item

Files Size Format
reed_jeremy_t_201012_phd.pdf 1.362 MB PDF
