Processing count queries over event streams at multiple time granularities

Date
2006-07-22
Source Title
Information Sciences
Print ISSN
0020-0255
Publisher
Elsevier Inc.
Volume
176
Issue
14
Pages
2066 - 2096
Language
English
Type
Article
Abstract

Management and analysis of streaming data have become crucial, with applications to the web, sensor data, network traffic data, and the stock market. Data streams consist mostly of numeric data, but what is more interesting are the events derived from that numeric data that need to be monitored. The events obtained from streaming data form event streams. Event streams have properties similar to those of data streams, i.e., they are seen only once, in a fixed order, as a continuous stream. Events appearing in an event stream have time stamps associated with them at a certain time granularity, such as second, minute, or hour. One frequently asked type of query over event streams is the count query, i.e., a query for the frequency of an event's occurrence over time. Count queries can be answered easily over event streams at the granularity of the stream itself; however, users may also pose queries at different time granularities. For example, a broker may ask how many times a stock increased in the same time frame, where the time frame specified could be an hour, a day, or both. Such queries are challenging, especially for event streams, where only a window of the stream is available at any given time rather than the whole stream. In this paper, we propose a technique for predicting the frequencies of event occurrences in event streams at multiple time granularities. The proposed approximation method efficiently estimates the count of events in an event stream, with high accuracy and at any time granularity, by examining the distance distributions of event occurrences. The proposed method has been implemented and tested on different real data sets, including daily price changes in two different stock exchange markets. The obtained results show its effectiveness. © 2005 Elsevier Inc. All rights reserved.
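
To make the terms in the abstract concrete, the following is a minimal Python sketch, not the method proposed in the paper. It assumes a window of the event stream is held as a list of 0/1 flags at the finest granularity, and it assumes that "the count at a coarser granularity g" means the number of consecutive blocks of g time units containing at least one occurrence. The function names `count_at_granularity` and `distance_distribution` are hypothetical. The sketch computes exact counts on a small window and the histogram of distances between consecutive occurrences; how the paper uses such distances to estimate counts is not stated in the abstract and is not reproduced here.

```python
from collections import Counter


def count_at_granularity(events, g):
    """Count the length-g blocks of the window that contain at least one event.

    Assumption: this block-based definition of a coarse-granularity count
    is illustrative only and is not taken from the abstract.
    """
    if g < 1:
        raise ValueError("granularity must be a positive integer")
    return sum(1 for i in range(0, len(events), g) if any(events[i:i + g]))


def distance_distribution(events):
    """Histogram of the gaps between consecutive event occurrences."""
    positions = [i for i, flag in enumerate(events) if flag]
    return Counter(b - a for a, b in zip(positions, positions[1:]))


# Hypothetical window: 1 marks a time unit in which the event occurred.
window = [1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1]

counts = {g: count_at_granularity(window, g) for g in (1, 3, 6)}
gaps = distance_distribution(window)
print(counts)
print(dict(gaps))
```

Here the exact counts are recomputed from the raw window for every granularity. An approximation method such as the one described in the abstract would instead aim to answer these queries from a compact summary, because the full stream is not available.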

Keywords
Count queries, Data streams, Event streams, Time granularity, Association rules, Data mining, Approximation theory