Multi-tree Monte Carlo methods for fast, scalable machine learning
(Georgia Institute of Technology, 2009-01-09)
As modern applications of machine learning and data mining are forced to deal with ever more massive quantities of data, practitioners quickly run into difficulty with the scalability of even the most basic and fundamental ...
Robust clustering algorithms
(Georgia Institute of Technology, 2011-04-05)
One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields ranging from computational biology to social sciences to computer ...
New support vector machine formulations and algorithms with application to biomedical data analysis
(Georgia Institute of Technology, 2011-06-13)
The Support Vector Machine (SVM) classifier seeks to find the separating hyperplane $wx = r$ that maximizes the margin distance $1/\|w\|_2^2$. It can be formalized as an optimization problem that minimizes the hinge loss Σ ...
A distributed kernel summation framework for machine learning and scientific applications
(Georgia Institute of Technology, 2012-05-11)
The class of computational problems I consider in this thesis share the common trait of requiring consideration of pairs (or higher-order tuples) of data points. I focus on the problem of kernel summation operations ...
Generalized N-body problems: a framework for scalable computation
(Georgia Institute of Technology, 2013-08-26)
In the wake of the Big Data phenomenon, the computing world has seen a number of computational paradigms developed in response to the sudden need to process ever-increasing volumes of data. Most notably, MapReduce has ...
New formulations for active learning
(Georgia Institute of Technology, 2014-01-10)
In this thesis, we provide computationally efficient algorithms with provable statistical guarantees for the problem of active learning, using ideas from sequential analysis. We provide a generic algorithmic framework ...