dc.contributor.author: Tang, Yuzhe
dc.contributor.author: Iyengar, Arun
dc.contributor.author: Tan, Wei
dc.contributor.author: Fong, Liana
dc.contributor.author: Liu, Ling
dc.date.accessioned: 2015-06-09T16:58:46Z
dc.date.available: 2015-06-09T16:58:46Z
dc.date.issued: 2014
dc.identifier.uri: http://hdl.handle.net/1853/53629
dc.description.abstract: The recent shift towards write-intensive workloads on big data (e.g., financial trading, social user-generated data streams) has driven the proliferation of log-structured key-value stores, represented by Google’s BigTable, HBase and Cassandra; these systems optimize write performance by adopting a log-structured merge design. While providing key-based access methods based on a Put/Get interface, these key-value stores do not support value-based access methods, which significantly limits their applicability in many web and Internet applications, such as real-time search for all tweets or blogs containing “government shutdown”. In this paper, we present HINDEX, a write-optimized indexing scheme for log-structured key-value stores. To index intensively updated big data in real time, index maintenance is made lightweight by a design tailored to the unique characteristics of the underlying log-structured key-value stores. Concretely, HINDEX performs append-only index updates, which avoid reading historic data versions, an expensive operation in a log-structured store. To fix potentially obsolete index entries, HINDEX uses an offline index repair process tightly coupled with routine compactions. HINDEX’s system design is generic to the Put/Get interface; we implemented a prototype of HINDEX based on HBase without internal code modification. Our experiments show that HINDEX offers a significant performance advantage for write-intensive index maintenance. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Georgia Institute of Technology [en_US]
dc.relation.ispartofseries: CERCS ; GIT-CERCS-14-01 [en_US]
dc.subject: Index maintenance [en_US]
dc.subject: Key-based access methods [en_US]
dc.subject: Key-value stores [en_US]
dc.subject: Value-based access methods [en_US]
dc.subject: Write-intensive index maintenance [en_US]
dc.title: Write-Optimized Indexing for Log-Structured Key-Value Stores [en_US]
dc.type: Technical Report [en_US]
dc.contributor.corporatename: Georgia Institute of Technology. Center for Experimental Research in Computer Systems [en_US]
dc.contributor.corporatename: Georgia Institute of Technology. College of Computing [en_US]
dc.contributor.corporatename: IBM Thomas J. Watson Research Center [en_US]
dc.embargo.terms: null [en_US]
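
The abstract describes two ideas: index updates that are append-only (no read of the previous value on the write path) and an offline repair step that runs alongside compaction. The following is a minimal Python sketch of that general pattern, not the HINDEX implementation. The class names, the in-memory data structures, and the read-time staleness check in `lookup` are all assumptions made for illustration and do not come from the report.

```python
from collections import defaultdict


class LogStructuredStore:
    """Toy versioned key-value store (hypothetical): each put appends a version."""

    def __init__(self):
        self.versions = defaultdict(list)  # key -> [(timestamp, value), ...]
        self.clock = 0

    def put(self, key, value):
        self.clock += 1
        self.versions[key].append((self.clock, value))
        return self.clock

    def get(self, key):
        history = self.versions.get(key)
        return history[-1][1] if history else None


class AppendOnlyIndex:
    """Value-to-key index (hypothetical) maintained without reading old versions."""

    def __init__(self, store):
        self.store = store
        self.entries = defaultdict(set)  # value -> {(key, timestamp), ...}

    def put(self, key, value):
        # Write path: append to the base table and to the index.
        # The previous value of `key` is never read, so its old index
        # entry is left in place and becomes stale.
        ts = self.store.put(key, value)
        self.entries[value].add((key, ts))

    def lookup(self, value):
        # Assumption: stale entries are filtered at read time by checking
        # the current value in the base table.
        candidates = self.entries.get(value, ())
        return sorted({k for k, _ in candidates if self.store.get(k) == value})

    def compact(self):
        # Offline repair: while old versions are being discarded anyway,
        # delete the index entries that point at them.
        for key, history in self.store.versions.items():
            for ts, old_value in history[:-1]:
                self.entries[old_value].discard((key, ts))
            del history[:-1]
```

In this sketch the write path costs two appends and no reads; the cost of cleaning up stale entries is deferred to `compact`, which already has to scan old versions.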

