Towards expressive melodic accompaniment using parametric modeling of continuous musical elements in a multi-attribute prediction suffix trie framework
MetadataShow full item record
Elements of continuous variation such as tremolo, vibrato and portamento enable dimensions of their own in expressive melodic music in styles such as in Indian Classical Music. There is published work on parametrically modeling some of these elements individually, and to apply the modeled parameters to automatically generated musical notes in the context of machine musicianship, using simple rule-based mappings. There have also been many systems developed for generative musical accompaniment using probabilistic models of discrete musical elements such as MIDI notes and durations, many of them inspired by computational research in linguistics. There however doesn't seem to have been a combined approach of parametrically modeling expressive elements in a probabilistic framework. This documents presents a real-time computational framework that uses a multi-attribute trie / n-gram structure to model parameters like frequency, depth and/or lag of the expressive variations such as vibrato and portamento, along with conventionally modeled elements such as musical notes, their durations and metric positions in melodic audio input. This work proposes storing the parameters of expressive elements as metadata in the individual nodes of the traditional trie structure, along with the distribution of their probabilities of occurrence. During automatic generation of music, the expressive parameters as learned in the above training phase are applied to the associated re-synthesized musical notes. The model is aimed at being used to provide automatic melodic accompaniment in a performance scenario. The parametric modeling of the continuous expressive elements in this form is hypothesized to be able to capture deeper temporal relationships among musical elements and thereby is expected to bring about a more expressive and more musical outcome in such a performance than what has been possible using other works of machine musicianship using only static mappings or randomized choice. A system was developed on Max/MSP software platform with this framework, which takes in a pitched audio input such as human singing voice, and produces a pitch track which may be applied to synthesized sound of a continuous timbre. The system was trained and tested with several vocal recordings of North Indian Classical Music, and a subjective evaluation of the resulting audio was made using an anonymous online survey. The results of the survey show the output tracks generated from the system to be as musical and expressive, if not more, than the case where the pitch track generated from the original audio was directly rendered as output, and also show the output with expressive elements to be perceivably more expressive than the version of the output without expressive parameters. The results further suggest that more experimentation may be required to conclude the efficacy of the framework employed in relation to using randomly selected parameter values for the expressive elements. This thesis presents the scope, context, implementation details and results of the work, suggesting future improvements.