• Login
    View Item 
    •   SMARTech Home
    • Georgia Tech Theses and Dissertations
    • Georgia Tech Theses and Dissertations
    • View Item
    •   SMARTech Home
    • Georgia Tech Theses and Dissertations
    • Georgia Tech Theses and Dissertations
    • View Item
    JavaScript is disabled for your browser. Some features of this site may not work without it.

    Multi-tree algorithms for computational statistics and phyiscs

    Thumbnail
    View/Open
    MARCH-DISSERTATION-2013.pdf (2.398Mb)
    Date
    2013-07-02
    Author
    March, William B.
    Metadata
    Show full item record
    Abstract
    The Fast Multipole Method of Greengard and Rokhlin does the seemingly impossible: it approximates the quadratic scaling N-body problem in linear time. The key is to avoid explicitly computing the interactions between all pairs of N points. Instead, by organizing the data in a space-partitioning tree, distant interactions are quickly and efficiently approximated. Similarly, dual-tree algorithms, which approximate or eliminate parts of a computation using distance bounds, are the fastest algorithms for several fundamental problems in statistics and machine learning -- including all nearest neighbors, kernel density estimation, and Euclidean minimum spanning tree construction. We show that this overarching principle -- that by organizing points spatially, we can solve a seemingly quadratic problem in linear time -- can be generalized to problems involving interactions between sets of three or more points and can provide orders-of-magnitude speedups and guarantee runtimes that are asymptotically better than existing algorithms. We describe a family of algorithms, multi-tree algorithms, which can be viewed as generalizations of dual-tree algorithms. We support this thesis by developing and implementing multi-tree algorithms for two fundamental scientific applications: n-point correlation function estimation and Hartree-Fock theory. First, we demonstrate multi-tree algorithms for n-point correlation function estimation. The n-point correlation functions are a family of fundamental spatial statistics and are widely used for understanding large-scale astronomical surveys, characterizing the properties of new materials at the microscopic level, and for segmenting and processing images. We present three new algorithms which will reduce the dependence of the computation on the size of the data, increase the resolution in the result without additional time, and allow probabilistic estimates independent of the problem size through sampling. We provide both empirical evidence to support our claim of massive speedups and a theoretical analysis showing linear scaling in the fundamental computational task. We demonstrate the impact of a carefully optimized base case on this computation and describe our distributed, scalable, open-source implementation of our algorithms. Second, we explore multi-tree algorithms as a framework for understanding the bottleneck computation in Hartree-Fock theory, a fundamental model in computational chemistry. We analyze existing fast algorithms for this problem, and show how they fit in our multi-tree framework. We also show new multi-tree methods, demonstrate that they are competitive with existing methods, and provide the first rigorous guarantees for the runtimes of all of these methods. Our algorithms will appear as part of the PSI4 computational chemistry library.
    URI
    http://hdl.handle.net/1853/49116
    Collections
    • Georgia Tech Theses and Dissertations [22398]
    • School of Computational Science and Engineering Theses and Dissertations [76]
    • College of Computing Theses and Dissertations [1071]

    Browse

    All of SMARTechCommunities & CollectionsDatesAuthorsTitlesSubjectsTypesThis CollectionDatesAuthorsTitlesSubjectsTypes

    My SMARTech

    Login

    Statistics

    View Usage StatisticsView Google Analytics Statistics
    • About
    • Terms of Use
    • Contact Us
    • Emergency Information
    • Legal & Privacy Information
    • Accessibility
    • Accountability
    • Accreditation
    • Employment
    • Login
    Georgia Tech

    © Georgia Institute of Technology

    • About
    • Terms of Use
    • Contact Us
    • Emergency Information
    • Legal & Privacy Information
    • Accessibility
    • Accountability
    • Accreditation
    • Employment
    • Login
    Georgia Tech

    © Georgia Institute of Technology