dc.contributor.advisor: Pu, Calton
dc.contributor.author: Li, Jack
dc.date.accessioned: 2016-08-22T12:22:27Z
dc.date.available: 2016-08-22T12:22:27Z
dc.date.created: 2016-08
dc.date.issued: 2016-05-20
dc.date.submitted: August 2016
dc.identifier.uri: http://hdl.handle.net/1853/55597
dc.description.abstract: Modern data centers are shifting to shared clusters where the resources are shared among multiple users and frameworks. A key enabler for such shared clusters is a cluster resource management system that allocates resources among different frameworks. One key problem in these shared clusters is how to efficiently share cluster resources between multiple applications and users in an elastic and non-disruptive manner. Current cluster schedulers typically use kill-based preemption to coordinate resource sharing, achieve fairness, and satisfy SLOs during resource contention by simply killing low-priority jobs and restarting them later when resources are available. This simple preemption policy ensures fast service times for high-priority jobs and prevents a single user or application from occupying too many resources and starving others; however, without saving the progress of preempted jobs, this policy causes significant resource waste and delays the response time of long-running or low-priority jobs. The issue of dynamic resource sharing becomes even more problematic when different types of applications run on the same cluster (e.g., batch processing systems running alongside real-time streaming systems). Different application types often have different quality-of-service metrics (e.g., higher throughput versus lower latency), which can make resource sharing among these applications contentious. In this dissertation, we show the impact of kill-based preemption in modern shared clusters and propose two solutions to share resources more efficiently in shared cluster environments: utilizing checkpoint-based preemption and supporting elasticity in distributed data stream processing systems.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Georgia Institute of Technology
dc.subject: Shared clusters
dc.subject: Resource management
dc.subject: Cloud
dc.subject: Preemption
dc.subject: Scheduling
dc.subject: Multi-tenancy
dc.subject: Distributed stream processing
dc.subject: Elasticity
dc.title: Efficient resource sharing for big data applications in shared clusters
dc.type: Dissertation
dc.description.degree: Ph.D.
dc.contributor.department: Computer Science
thesis.degree.level: Doctoral
dc.contributor.committeeMember: Liu, Lin
dc.contributor.committeeMember: Navathe, Shamkant B.
dc.contributor.committeeMember: Omiecinski, Edward R.
dc.contributor.committeeMember: Wang, Qingyang
dc.date.updated: 2016-08-22T12:22:27Z

