Parallel algorithms for generalized N-body problem in high dimensions and their applications for Bayesian inference and image analysis
In this dissertation, we explore parallel algorithms for generalized N-body problems in high dimensions and their applications to machine learning and image analysis on distributed infrastructures. In the first part of this work, we propose and develop a set of basic tools, built on top of the Message Passing Interface (MPI) and OpenMP, for massively parallel nearest-neighbor search. In particular, we present a distributed tree structure that indexes data in an arbitrary number of dimensions, and a novel algorithm that eliminates the need for collective coordinate exchanges during tree construction. To the best of our knowledge, our nearest-neighbor package is the first attempt that scales to millions of cores in up to a thousand dimensions. Building on our nearest-neighbor search algorithms, we present "ASKIT", a parallel fast kernel summation tree code with a new near-far field decomposition and a new compact representation for the far field. Notably, our algorithm is kernel independent. The efficiency of the new near-far decomposition depends only on the intrinsic dimensionality of the data, and the new far-field representation relies only on the rank of sub-blocks of the kernel matrix. In the second part, we develop a Bayesian inference framework and a variational formulation for maximum a posteriori (MAP) estimation of the label field in medical image segmentation. In particular, we propose new representations for both the likelihood and the prior probability functions, together with fast methods for evaluating them. We then give a parallel, matrix-free optimization algorithm to solve the MAP estimation problem. Our new prior function is suitable for many spatial inverse problems. Experimental results show that our framework is robust to noise, shape variations, and artifacts.
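As a rough illustration of why a far-field representation can rely on the rank of kernel sub-blocks, the sketch below forms a dense Gaussian kernel block between two well-separated point sets and compresses it. This is not the ASKIT algorithm. The Gaussian kernel, the bandwidth, the point sets, and the function names are all assumptions made for this example. It also uses a truncated SVD of the full block purely for simplicity, which a scalable code would avoid. The points lie on a 2-dimensional plane embedded in 64 ambient dimensions, so the block's numerical rank reflects the intrinsic rather than the ambient dimension.

```python
import numpy as np


def gaussian_kernel(targets, sources, h):
    """Dense block K[i, j] = exp(-||t_i - s_j||^2 / (2 h^2)). Kernel choice is an assumption."""
    sq_dist = (
        (targets ** 2).sum(axis=1)[:, None]
        + (sources ** 2).sum(axis=1)[None, :]
        - 2.0 * targets @ sources.T
    )
    # Clamp tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * h * h))


def truncated_factors(block, tol):
    """Truncated SVD factors; keeps singular values above tol * largest one."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    r = int((s > tol * s[0]).sum())
    return u[:, :r] * s[:r], vt[:r, :]


rng = np.random.default_rng(0)
ambient_dim, intrinsic_dim, n = 64, 2, 200

# Orthonormal embedding of a 2-D plane into 64-D (hypothetical test data).
embed = np.linalg.qr(rng.standard_normal((ambient_dim, intrinsic_dim)))[0].T
targets = rng.uniform(-1.0, 1.0, (n, intrinsic_dim)) @ embed
sources = (rng.uniform(-1.0, 1.0, (n, intrinsic_dim)) + 4.0) @ embed
weights = rng.uniform(0.0, 1.0, n)

bandwidth, tol = 2.0, 1e-6
K = gaussian_kernel(targets, sources, bandwidth)

# Direct kernel summation: u_i = sum_j K(t_i, s_j) * w_j, costing O(n^2).
exact = K @ weights

# Compressed far-field interaction: two thin products instead of one dense one.
left, right = truncated_factors(K, tol)
approx = left @ (right @ weights)
rank = left.shape[1]

print("numerical rank of the far-field block:", rank, "of", n)
print("max abs error of compressed sum:", np.abs(exact - approx).max())
```

Once the factors are available, applying the block costs O(n r) for rank r instead of O(n^2). That saving is the motivation for representing well-separated interactions in compressed form.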