Please use this identifier to cite or link to this item: http://hdl.handle.net/1853/5281

Full metadata record

DC Field | Value | Language
dc.contributor.author | Tang, Wei | en_US
dc.date.accessioned | 2005-03-04T16:41:32Z | -
dc.date.available | 2005-03-04T16:41:32Z | -
dc.date.issued | 2003-12-08 | en_US
dc.identifier.uri | http://hdl.handle.net/1853/5281 | -
dc.description.abstract | Information monitoring systems are publish-subscribe systems that continuously track information changes and notify users (or programs acting on behalf of humans) of relevant updates according to specified thresholds. Internet-scale information monitoring presents a number of new challenges. First, automated change detection is harder when sources are autonomous and updates are performed asynchronously. Second, information source heterogeneity makes the problem of modelling and representing changes harder than ever. Third, efficient and scalable mechanisms are needed to handle a large and growing number of users and thousands or even millions of monitoring triggers fired at multiple sources. In this dissertation, we model users' monitoring requests using continual queries (CQs) and present a suite of efficient and scalable solutions to large-scale information monitoring over structured or semi-structured data sources. A CQ is a standing query that monitors information sources for interesting events (triggers) and notifies users when new information changes meet specified thresholds. We first present the system-level facilities for building an Internet-scale continual query system, including the design and development of two operational CQ monitoring systems, OpenCQ and WebCQ, the engineering issues involved, and our solutions. We then describe a number of research challenges that are specific to large-scale information monitoring and the techniques developed in the context of OpenCQ and WebCQ to address these challenges. Example issues include how to efficiently process a large number of continual queries, what mechanisms are effective for building a scalable distributed trigger system capable of handling tens of thousands of triggers firing at hundreds of data sources, and how to effectively disseminate fresh information to the right users at the right time.
We have developed a suite of techniques to optimize the processing of continual queries, including an effective CQ grouping scheme, an auxiliary data structure to support group-based indexing of CQs, and a differential CQ evaluation algorithm (DRA). The third contribution is the design of an experimental evaluation model and testbed to validate the solutions. We have conducted our evaluation using both measurements on real systems (OpenCQ/WebCQ) and a simulation-based approach. To our knowledge, the research documented in this dissertation is, to date, the first to present a focused study of research and engineering issues in building large-scale information monitoring systems using continual queries. | en_US
dc.format.extent | 3199538 bytes | -
dc.format.mimetype | application/pdf | -
dc.language.iso | en_US | -
dc.publisher | Georgia Institute of Technology | en_US
dc.subject | Differential re-evaluation | en_US
dc.subject | Continual queries | -
dc.subject | Web page monitoring | -
dc.subject | Semi-structured data | -
dc.subject | Information monitoring | -
dc.subject.lcsh | Web sites Management | en_US
dc.subject.lcsh | Internet programming | en_US
dc.subject.lcsh | Information technology | en_US
dc.title | Internet-Scale Information Monitoring: A Continual Query Approach | en_US
dc.type | Dissertation | en_US
dc.description.degree | Ph.D. | en_US
dc.contributor.department | Computing | en_US
dc.description.advisor | Committee Chair: Ling Liu; Committee Member: Calton Pu; Committee Member: Constantinos Dovrolis; Committee Member: Edward Omiecinski; Committee Member: Leo Mark; Committee Member: Thomas E. Potok | en_US
Appears in Collections: Georgia Tech Theses and Dissertations
                        College of Computing Theses and Dissertations

Files in This Item:

File | Description | Size | Format
tang_wei_200312.pdf | | 3.12 MB | Adobe PDF

Items in SMARTech are protected by copyright, with all rights reserved, unless otherwise indicated.
