Show simple item record

dc.contributor.authorGedik, Bugra
dc.contributor.authorWu, Kun-Lung
dc.contributor.authorYu, Philip S.
dc.contributor.authorLiu, Ling
dc.date.accessioned2006-02-07T19:38:47Z
dc.date.available2006-02-07T19:38:47Z
dc.date.issued2005
dc.identifier.urihttp://hdl.handle.net/1853/7711
dc.description.abstractDropping tuples has been commonly used for load shedding. However, tuple dropping generally is inadequate to shed load for multiway windowed stream joins. The output rate can be unnecessarily and severely degraded because tuple dropping does not recognize time correlations likely to exist among the streams. This paper introduces GrubJoin: an adaptive multi-way windowed stream join that efficiently performs time correlation-aware CPU load shedding. GrubJoin maximizes the output rate by achieving nearoptimal window harvesting within an operator throttling framework, i.e., regulating the fractions of the join windows that are processed by the multi-way join. Window harvesting performs the join using only certain more useful segments of the join windows. Due mainly to the combinatorial explosion of possible multi-way join sequences involving various segments of individual join windows, GrubJoin faces a set of unique challenges, such as determining the optimal window harvesting configuration and learning the time correlations among the streams. To tackle these challenges, we formalize window harvesting as an optimization problem, develop greedy heuristics to determine near-optimal window harvesting configurations and use approximation techniques to capture the time correlations among the streams. Experimental results show that GrubJoin is vastly superior to tuple dropping when time correlations exist among the streams and is equally effective as tuple dropping in the absence of time correlations.en
dc.format.extent693904 bytes
dc.format.mimetypeapplication/pdf
dc.language.isoen_USen
dc.publisherGeorgia Institute of Technologyen
dc.relation.ispartofseriesCERCS;GIT-CERCS-05-19en
dc.subjectData streams
dc.subjectGrubJoin
dc.subjectHeuristics
dc.subjectLoad shedding
dc.subjectJoin algorithms
dc.subjectTime correlation-aware CPU load shedding
dc.subjectTuples
dc.titleGRUBJOIN: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Sheddingen
dc.typeTechnical Reporten


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record