DOI: 10.1007/11890850_6
Article

Towards low overhead provenance tracking in near real-time stream filtering

Published: 03 May 2006

Abstract

Data streams flowing from the physical environment are as unpredictable as the environment itself. Radars go down, long-haul networks drop packets, and readings are corrupted on the wire. Yet data-driven scientific models and data mining algorithms do not necessarily account for these inaccuracies when assimilating the data. Low overhead provenance collection partially solves this problem. We propose a data model and a collection model for near real-time provenance collection. We define a system architecture for stream provenance tracking and motivate it with a real-world application in meteorological forecasting.

    Published In

    IPAW'06: Proceedings of the 2006 international conference on Provenance and Annotation of Data
    May 2006
    288 pages
    ISBN: 354046302X
    Editors: Luc Moreau, Ian Foster

    Sponsors

    • Springer
    • Microsoft

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Cited By

    • (2021) Ananke. Proceedings of the VLDB Endowment 14:3, 391-403. DOI: 10.14778/3430915.3430928. Online publication date: 9-Dec-2021
    • (2020) Incremental Inference of Provenance Types. Provenance and Annotation of Data and Processes, 145-162. DOI: 10.1007/978-3-030-80960-7_9. Online publication date: 22-Jun-2020
    • (2018) GeneaLog. Proceedings of the 19th International Middleware Conference, 227-238. DOI: 10.1145/3274808.3274826. Online publication date: 26-Nov-2018
    • (2016) Analysis of Memory Constrained Live Provenance. Proceedings of the 6th International Workshop on Provenance and Annotation of Data and Processes - Volume 9672, 42-54. DOI: 10.5555/3090188.3090193. Online publication date: 7-Jun-2016
    • (2014) Efficient Stream Provenance via Operator Instrumentation. ACM Transactions on Internet Technology 14:1, 1-26. DOI: 10.1145/2633689. Online publication date: 7-Aug-2014
    • (2013) Ariadne. Proceedings of the 7th ACM international conference on Distributed event-based systems, 39-50. DOI: 10.1145/2488222.2488256. Online publication date: 29-Jun-2013
    • (2012) Demonstrating a lightweight data provenance for sensor networks. Proceedings of the 2012 ACM conference on Computer and communications security, 1022-1024. DOI: 10.1145/2382196.2382312. Online publication date: 16-Oct-2012
    • (2011) Provenance security guarantee from origin up to now in the e-Science environment. Journal of Systems Architecture: the EUROMICRO Journal 57:4, 425-440. DOI: 10.1016/j.sysarc.2010.04.006. Online publication date: 1-Apr-2011
    • (2010) Visual debugging for stream processing applications. Proceedings of the First international conference on Runtime verification, 18-35. DOI: 10.5555/1939399.1939403. Online publication date: 1-Nov-2010
    • (2010) Assuring data trustworthiness. Proceedings of the 7th VLDB conference on Secure data management, 1-12. DOI: 10.5555/1889159.1889161. Online publication date: 17-Sep-2010
