High-Performance Software for Quantum Chemistry and Hierarchical Matrices
Erlandson, Lucas Alden
MetadataShow full item record
Linear algebra is the underpinning of a significant portion of the computation done in the modern age. Applications relying on linear algebra include physical and chemical simulations, machine learning, artificial intelligence, optimization, partial differential equations, and many more. However, the direct use of mathematically exact linear algebra is often infeasible for the large problems of today. Numerical and iterative methods provide a way of solving the underlying problems only to the required accuracy, allowing problems that are many magnitudes larger to be solved magnitudes more quickly than if the problems were to be solved using exact linear algebra. In this dissertation, we discuss, test existing methods, and develop new high-performance numerical methods for scientific computing kernels, including matrix-multiplications, linear solves, and eigensolves, which accelerate applications including Gaussian processes and quantum chemistry simulations. Notably, we use preconditioned hierarchical matrices for the hyperparameter optimization and prediction phases of Gaussian process regression, develop a sparse triple matrix product on GPUs, and investigate 3D matrix-matrix multiplications for Chebyshev-filtered subspace iteration for Kohn-Sham density functional theory calculations. The exploitation of the structural sparsity of many practical scientific problems can achieve a significant speedup over the dense formulations of the same problems. Even so, many problems cannot be accurately represented or approximated in a structurally sparse manner. Many of these problems, such as kernels arising from machine learning and the Electronic-Repulsion-Integral (ERI) matrices from electronic structure computations, can be accurately represented in data-sparse structures, which allows for rapid calculations. We investigate hierarchical matrices, which provide a data-sparse representation of kernel matrices. In particular, our SMASH approximation can construct and provide matrix multiplications in near-linear time, which can then be used in matrix-free methods to find the optimal hyperparameters for Gaussian processes and to do prediction asymptotically more rapidly than direct methods. To accelerate the use of hierarchical matrices further, we provide a data-driven approach (where we consider the distribution of the data points associated with a kernel matrix) that reduces a given problem's memory and computation requirements. Furthermore, we investigate the use of preconditioning in Gaussian process regression. We can use matrix-free algorithms for hyperparameter optimization and prediction phases of Gaussian process. This provides a framework for Gaussian process regression that scales to large-scale problems and is asymptotically faster than state-of-the-art methods. We provide an exploration and analysis of the conditioning and numerical issues that arise from the near-rank-deficient matrices that occur during hyperparameter optimizations. Density Functional Theory (DFT) is a valuable method for electronic structure calculations for simulating quantum chemical systems due to its high accuracy to cost ratio. However, even with the computational power of modern computers, the O(n^3) complexity of the eigensolves and other kernels mandate that new methods are developed to allow larger problems to be solved. Two promising methods for tackling these problems are using modern architectures (including state-of-the-art accelerators and multicore systems) and 3D matrix-multiplication algorithms. We investigate these methods to determine if using these methods will result in an overall speedup. Using these kernels, we provide a high-performance framework for Chebyshev-filtered subspace iteration. GPUs are a family of accelerators that provide immense computational power but must be used correctly to achieve good efficiency. In algebraic multigrid, there arises a sparse triple matrix product, which due to the sparse (and relatively unstructured) nature, is challenging to perform efficiently on GPUs, and is typically done as two successive matrix-matrix products. However, by doing a single triple-matrix product, reducing the overhead associated with sparse matrix-matrix products on the GPU may be possible. We develop a sparse triple-matrix product that reduces the computation time required for a few classes of problems.