On contention management for data accesses in parallel and distributed systems
MetadataShow full item record
Data access is an essential part of any program, and is especially critical to the performance of parallel computing systems. The objective of this work is to investigate factors that affect data access parallelism in parallel computing systems, and design/evaluate methods to improve such parallelism - and thereby improving the performance of corresponding parallel systems. We focus on data access contention and network resource contention in representative parallel and distributed systems, including transactional memory system, Geo-replicated transactional systems and MapReduce systems. These systems represent two widely-adopted abstractions for parallel data accesses: transaction-based and distributed-system-based. In this thesis, we present methods to analyze and mitigate the two contention issues. We first study the data contention problem in transactional memory systems. In particular, we present a queueing-based model to evaluate the impact of data contention with respect to various system configurations and workload parameters. We further propose a profiling-based adaptive contention management approach to choose an optimal policy across different benchmarks and system platforms. We further develop several analytical models to study the design of transactional systems when they are Geo-replicated. For the network resource contention issue, we focus on data accesses in distributed systems and study opportunities to improve upon the current state-of-art MapReduce systems. We extend the system to better support map task locality for dual-map-input applications. We also study a strategy that groups input blocks within a few racks to balance the locality of map and reduce tasks. Experiments show that both mechanisms significantly reduce off-rack data communication and thus alleviate the resource contention on top-rack switch and reduce job execution time. In this thesis, we show that both the data contention and the network resource contention issues are key to the performance of transactional and distributed data access abstraction and our mechanisms to estimate and mitigate such problems are effective. We expect our approaches to provide useful insight on future development and research for similar data access abstractions and distributed systems.