dc.contributor.advisor: Yalamanchili, Sudhakar
dc.contributor.author: Saeed, Ifrah
dc.date.accessioned: 2014-06-09T18:05:36Z
dc.date.available: 2014-06-09T18:05:36Z
dc.date.issued: 2014-04-09
dc.identifier.uri: http://hdl.handle.net/1853/51967
dc.description.abstract: A growing number of industries are turning to data warehousing applications such as forecasting and risk assessment to process large volumes of data. These data warehousing applications, which utilize queries comprised of a mix of arithmetic and relational algebra (RA) operators, currently run on systems that utilize commodity multi-core CPUs. If we acknowledge the data-intensive nature of these applications, general purpose graphics processing units (GPUs) with high throughput and memory bandwidth seem to be natural candidates to host these applications. However, since such relational queries exhibit irregular parallelism and data accesses, their efficient implementation on GPUs remains challenging. Thus, although tailored solutions for individual processors using their native programming environments have evolved, these solutions are not accessible to other processors. This thesis addresses this problem by providing a portable implementation of RA, mathematical, and related primitives required to implement and accelerate relational queries over large data sets in the form of a library. These primitives can run on any modern multi- and many-core architecture that supports OpenCL, thereby enhancing the performance potential of such architectures for warehousing applications. In essence, this thesis describes the implementation of primitives and the results of their performance evaluation on a range of platforms and concludes with insights, the identification of opportunities, and lessons learned. One of the major insights from our analysis is that for complex relational queries, the time taken to transfer data between host CPUs and discrete GPUs can render the performance of discrete and integrated GPUs comparable in spite of the higher computing power and memory bandwidth of discrete GPUs. Therefore, data movement optimization is the key to effectively harnessing the high performance of discrete GPUs; otherwise, cost effectiveness would encourage the use of integrated GPUs. Furthermore, portability also enables the complete utilization of all GPUs and CPUs in the system at run time by opportunistically using any type of available processor when a kernel is ready for execution. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Georgia Institute of Technology [en_US]
dc.subject: Data-intensive query processing [en_US]
dc.subject: RA operators [en_US]
dc.subject: OpenCL [en_US]
dc.subject: GPUs [en_US]
dc.subject: CPUs [en_US]
dc.subject.lcsh: Graphics processing units
dc.subject.lcsh: Data warehousing
dc.subject.lcsh: Big data
dc.subject.lcsh: Relation algebras
dc.title: A portable relational algebra library for high performance data-intensive query processing [en_US]
dc.type: Thesis [en_US]
dc.description.degree: M.S.
dc.contributor.department: Electrical and Computer Engineering
dc.embargo.terms: null [en_US]
thesis.degree.level: Masters
dc.contributor.committeeMember: Riley, George F.
dc.contributor.committeeMember: Kim, Hyesoon

