EFFICIENTLY ACCELERATING SPARSE PROBLEMS BY ENABLING STREAM ACCESSES TO MEMORY USING HARDWARE/SOFTWARE TECHNIQUES
MetadataShow full item record
The objective of this research is to improve the performance of sparse problems that have a wide range of applications but still, suffer from serious challenges when running on modern computers. In summary, the challenges include the underutilization of available memory bandwidth because of lack of spatial locality, dependencies in computation, or slow mechanisms for decompressing the sparse data, and the underutilization of concurrent compute engines because of the distribution of non-zero values in sparse data. Our key insight to address the aforementioned challenges is that based on the type of the problem, we either use an intelligent reduction tree near memory to process data while gathering them from random locations of memory, transform the computations mathematically to extract more parallelism, modify the distribution of non-zero elements, or change the representation of sparse data. By applying such techniques, the execution adapts more effectively to given hardware resources. To this end, this research introduces hardware/software techniques to enable stream accesses to memory for accelerating four main categories of sparse problems including the inference of recommendation systems, iterative solvers of partial differential equations (PDEs), deep neural networks (DNNs), and graph algorithms.