Article
DOI: 10.1145/1055558.1055596

Flexible time management in data stream systems

Published: 14 June 2004

Abstract

Continuous queries in a Data Stream Management System (DSMS) rely on time as a basis for windows on streams and for defining consistent semantics for multiple streams and updatable relations. The system clock in a centralized DSMS provides a convenient and well-behaved notion of time, but often it is more appropriate for a DSMS application to define its own notion of time: its own clock(s), sequence numbers, or other forms of ordering and timestamping. Flexible application-defined time poses challenges to the DSMS, since streams may be out of order and uncoordinated with each other, they may incur latency reaching the DSMS, and they may pause or stop. We formalize these challenges and specify how to generate heartbeats so that queries can be evaluated correctly and continuously in an application-defined time domain. Our heartbeat generation algorithm is based on parameters capturing skew between streams, unordering within streams, and latency in streams reaching the DSMS. We also describe how to estimate these parameters at run-time, and we discuss how heartbeats can be used for processing continuous queries.
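
To make the idea of a heartbeat concrete, here is a minimal Python sketch. It is not the paper's algorithm, which the abstract does not spell out. It is a simplified watermark-style scheme under stated assumptions: timestamps are integers, each stream has a known bound on how far a late tuple may trail the largest timestamp already seen on that stream, and a heartbeat `t` is taken to mean "no later tuple on any stream carries a timestamp below `t`". The class name `HeartbeatGenerator`, the `disorder_bound` parameter, and the stream names are hypothetical, and skew and latency are deliberately left out.

```python
from typing import Dict, Optional


class HeartbeatGenerator:
    """Toy heartbeat generator over application timestamps (hypothetical names)."""

    def __init__(self, disorder_bound: Dict[str, int]) -> None:
        # disorder_bound[s]: ASSUMED maximum amount by which a tuple on stream s
        # may trail the largest timestamp already seen on s.
        self.disorder_bound = dict(disorder_bound)
        self.max_seen: Dict[str, Optional[int]] = {s: None for s in disorder_bound}
        self.last_heartbeat: Optional[int] = None

    def observe(self, stream: str, timestamp: int) -> Optional[int]:
        """Record one tuple; return a new heartbeat if it advanced, else None."""
        prev = self.max_seen[stream]
        if prev is None or timestamp > prev:
            self.max_seen[stream] = timestamp
        return self._advance()

    def _advance(self) -> Optional[int]:
        # No heartbeat until every stream has produced at least one tuple.
        if any(m is None for m in self.max_seen.values()):
            return None
        # Per stream, future timestamps are at least (max seen - disorder bound);
        # the heartbeat is the smallest such lower bound across all streams.
        candidate = min(m - self.disorder_bound[s] for s, m in self.max_seen.items())
        if self.last_heartbeat is None or candidate > self.last_heartbeat:
            self.last_heartbeat = candidate
            return candidate
        return None


gen = HeartbeatGenerator({"a": 2, "b": 0})
first = gen.observe("a", 10)   # stream "b" not seen yet, so no heartbeat
second = gen.observe("b", 5)   # lower bounds are 8 for "a" and 5 for "b"
third = gen.observe("a", 9)    # late tuple within the bound; nothing advances
fourth = gen.observe("b", 12)  # "a" now holds the heartbeat back at 8
```

Note that this sketch stalls as soon as one stream pauses, because its lower bound never advances. Bounding skew between streams and latency to the DSMS, as the abstract describes, is what would let a generator keep advancing in that situation.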

Published In

PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2004
350 pages
ISBN: 158113858X
DOI: 10.1145/1055558
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

SIGMOD/PODS04

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%
