Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1183614.1183663acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
Article

Efficient range-constrained similarity search on wavelet synopses over multiple streams

Published: 06 November 2006 Publication History

Abstract

Due to the resource limitation in the data stream environment, it has been reported that answering user queries according to the wavelet synopsis of a stream is an essential ability of a Data Stream Management System (DSMS). In this paper, motivated by the fact that a user may be interested in an arbitrary range of the data streams, we investigate two important types of range-constrained queries in time series streaming environments: the distance queries (which aim at obtaining the Euclidean distance between two streams) and the kNN queries (which aim at discovering k nearest neighbors to a reference stream). To achieve high efficiency in processing these two types of queries, we propose procedure RED (standing for Range-constrained Euclidean Distance) and algorithm EKS (standing for Enhanced KNN Search). Compared to the existing methods in the prior research, the advantageous features of our approaches are in two folds. First, our approaches are capable of processing the queries directly from the wavelet synopses retained in the main memory without using IDWT to reconstruct the data cells. This feature allows us to save the complexity in both memory and time. Moreover, our approaches enable the users to query the DSMS within their range of interest. Unlike the conventional methods which only support the full-range query processing, this feature will enhance the flexibility at the client side. We evaluate procedure RED and algorithm EKS on live and synthetic datasets empirically and show that the proposed approaches are efficient in similarity search and kNN discovery within arbitrary ranges in the time series streaming environments.

References

[1]
S. Arya, D. M. Mount, R. Silverman, and A. Y. Wu. An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions. JACM, 45(6), 1998.
[2]
F. K.-P. Chan, A. W.-C. Fu, and C. Yu. Haar Wavelets for Efficient Similarity Search of Time-Series: With and Without Time Warping. IEEE TKDE, 15(3), 2003.
[3]
K.-P. Chan and A. W.-C. Fu. Efficient Time Series Matching by Wavelets. In Proc. of IEEE ICDE, 1999.
[4]
L. Gao, Z. Yao, and X. S. Wang. Evaluating Continuous Nearest Neighbor Queries for Streaming Time Series via Pre-fetching. In Proceedings of ACM CIKM, 2002.
[5]
A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. One-Pass Wavelet Decompositions of Data Streams. IEEE TKDE, 15(3), 2003.
[6]
S. Guha and B. Harb. Wavelet Synopsis for Data Streams: Minimizing Non-Euclidean Error. In Proc. of KDD, 2005.
[7]
P. Karras and N. Mamoulis. One-Pass Wavelet Synopses for Maximum-Error Metrics. In Proceedings of VLDB, 2005.
[8]
F. Korn, S. Muthukrishnan, and D. Srivastava. Reverse Nearest Neighbor Aggregates over Data Streams. In Proc. VLDB, 2002.
[9]
N. Koudas, B. C. Ooi, K.-L. Tan, and R. Zhang. Approximate NN queries on Streams with Guaranteed Error/performance Bounds. In Proc. of VLDB, 2004.
[10]
I. Liabotis, B. Theodoulidis, and M. Saraaee. Improving Similarity Search in Time Series Using Wavelets. International Journal of Data Warehousing and Mining, 2(2), 2006.
[11]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-Based Histograms for Selectivity Estimation. In Proceedings of ACM SIGMOD, 1998.
[12]
K. Mouratidis, M. Hadjieleftheriou, and D. Papadia. Conceptual Partitioning: An Efficient Method for Continuous Nearest Neighbor Monitoring. In Proc. ACM SIGMOD, 2005.
[13]
I. Popivanov and R. J. Miller. Similarity Search Over Time-Series Data Using Wavelets. In Proc. of IEEE ICDE, 2002.
[14]
N. Roussopoulos and S. K. F. Vincent. Nearest Neighbor Queries. In Proc. ACM SIGMOD, 1995.
[15]
E. J. Stollnitz, T. D. Derose, and D. H. Salesin. Wavelets for Computer Graphics: Theory and Application. Morgan Kaufmann, 1996.
[16]
C. Wang and X. S. Wang. Supporting Subseries Nearest Neighbor Search via Approximation. In Proc. of ACM CIKM, 2000.
[17]
Y. Zhao and S. Zhang. Generalized Dimension-Reduction Framework for Recent-Biased Time Series Analysis. IEEE Trans. on Knowledge and Data Engineering, 18(2), 2006.

Cited By

View all
  • (2013)Communication-Efficient Distributed Multiple Reference Pattern Matching for M2M Systems2013 IEEE 13th International Conference on Data Mining10.1109/ICDM.2013.161(787-796)Online publication date: Dec-2013
  • (2012)Probabilistic distance based abnormal pattern detection in uncertain series dataKnowledge-Based Systems10.1016/j.knosys.2012.06.00336(182-190)Online publication date: 1-Dec-2012
  • (2010)Research on interval skyline over data stream2010 2nd International Conference on Future Computer and Communication10.1109/ICFCC.2010.5497539(V3-484-V3-488)Online publication date: May-2010
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
November 2006
916 pages
ISBN:1595934332
DOI:10.1145/1183614
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 November 2006

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data stream
  2. similarity search
  3. wavelet synopses

Qualifiers

  • Article

Conference

CIKM06
CIKM06: Conference on Information and Knowledge Management
November 6 - 11, 2006
Virginia, Arlington, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)1
Reflects downloads up to 25 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2013)Communication-Efficient Distributed Multiple Reference Pattern Matching for M2M Systems2013 IEEE 13th International Conference on Data Mining10.1109/ICDM.2013.161(787-796)Online publication date: Dec-2013
  • (2012)Probabilistic distance based abnormal pattern detection in uncertain series dataKnowledge-Based Systems10.1016/j.knosys.2012.06.00336(182-190)Online publication date: 1-Dec-2012
  • (2010)Research on interval skyline over data stream2010 2nd International Conference on Future Computer and Communication10.1109/ICFCC.2010.5497539(V3-484-V3-488)Online publication date: May-2010
  • (2009)PROUDProceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology10.1145/1516360.1516439(684-695)Online publication date: 24-Mar-2009
  • (2008)LeeWaveProceedings of the VLDB Endowment10.14778/1453856.14539211:1(586-597)Online publication date: 1-Aug-2008
  • (2007)Approximate Query Processing in Cube StreamsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2007.19062219:11(1557-1570)Online publication date: 1-Nov-2007

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media