Continuously maintaining quantile summaries of the most recent N elements over a data stream
Refereed conference paper presented and published in conference proceedings

替代計量分析
.

其它資訊
摘要Statistics over the most recently observed data elements are often required in applications involving data streams, such as intrusion detection in network monitoring, stock price prediction in financial markets, web log mining for access prediction, and user click stream mining for personalization. Among various statistics, computing quantile summary is probably most challenging because of its complexity. In this paper, we study the problem of continuously maintaining quantile summary of the most recently observed N elements over a stream so that quantile queries can be answered with a guaranteed precision of εN. We developed a space efficient algorithm for pre-defined N that requires only one scan of the input data stream and O(log(ε2N)/ ε + 1/ε2) space in the worst cases. We also developed an algorithm that maintains quantile summaries for most recent N elements so that quantile queries on any most recent n elements (n ≤ N) can be answered with a guaranteed precision of εn. The worst case space requirement for this algorithm is only O(log2(εN)/ε2). Our performance study indicated that not only the actual quantile estimation error is far below the guaranteed precision but the space requirement is also much less than the given theoretical bound.
著者Lin X., Lu H., Xu J., Yu J.X.
會議名稱Proceedings - 20th International Conference on Data Engineering - ICDE 2004
會議開始日30.03.2004
會議完結日02.04.2004
會議地點Boston, MA.
會議國家/地區美國
出版年份2004
月份6
日期1
卷號20
頁次362 - 373
國際標準書號0-7695-2065-0
國際標準期刊號1063-6382
語言英式英語

上次更新時間 2021-03-12 於 00:12