Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1989323.1989375acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Labeling recursive workflow executions on-the-fly

Published: 12 June 2011 Publication History

Abstract

This paper presents a compact labeling scheme for answering reachability queries over workflow executions. In contrast to previous work, our scheme allows nodes (processes and data) in the execution graph to be labeled on-the-fly, i.e., in a dynamic fashion. In this way, reachability queries can be answered as soon as the relevant data is produced. We first show that, in general, for workflows that contain recursion, dynamic labeling of executions requires long (linear-size) labels. Fortunately, most real-life scientific workflows are linear recursive, and for this natural class we show that dynamic, yet compact (logarithmic-size) labeling is possible. Moreover, our scheme labels the executions in linear time, and answers any reachability query in constant time. We also show that linear recursive workflows are, in some sense, the largest class of workflows that allow compact, dynamic labeling schemes. Interestingly, the empirical evaluation, performed over both real and synthetic workflows, shows that our proposed dynamic scheme outperforms the state-of-the-art static scheme for large executions, and creates labels that are shorter by a factor of almost 3.

References

[1]
R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In SIGMOD Conference, pages 253--262, 1989.
[2]
S. Alstrup, P. Bille, and T. Rauhe. Labeling schemes for small distances in trees. In SODA, pages 689--698, 2003.
[3]
S. Alstrup and T. Rauhe. Improved labeling scheme for ancestor queries. In SODA, pages 947--953, 2002.
[4]
I. Altintas, C. Berkley, E. Jaeger, M. B. Jones, B. Ludascher, and S. Mock. Kepler: An extensible system for design and execution of scientific workflows. In SSDBM, pages 423--424, 2004.
[5]
Z. Bao, S. B. Davidson, S. Khanna, and S. Roy. An optimal labeling scheme for workflow provenance using skeleton labels. In SIGMOD Conference, pages 711--722, 2010.
[6]
S. P. Callahan, J. Freire, E. Santos, C. E. Scheidegger, C. T. Silva, and H. T. Vo. Vistrails: visualization meets data management. In SIGMOD Conference, pages 745--747, 2006.
[7]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. In SODA, pages 937--946, 2002.
[8]
E. Cohen, H. Kaplan, and T. Milo. Labeling dynamic xml trees. In PODS, pages 271--281, 2002.
[9]
C. Demetrescu and G. F. Italiano. Fully dynamic transitive closure: Breaking through the o(n2) barrier. In FOCS, pages 381--389, 2000.
[10]
P. Fraigniaud and A. Korman. Compact ancestry labeling schemes for xml trees. In SODA, pages 458--466, 2010.
[11]
T. Heinis and G. Alonso. Efficient lineage tracking for scientific workflows. In SIGMOD Conference, pages 1007--1018, 2008.
[12]
D. Hull, K. Wolstencroft, R. Stevens, C. A. Goble, M. R. Pocock, P. Li, and T. Oinn. Taverna: a tool for building and running workflows of services. Nucleic Acids Research, 34(Web-Server-Issue):729--732, 2006.
[13]
H. V. Jagadish. A compression technique to materialize transitive closure. ACM Trans. Database Syst., 15(4):558--598, 1990.
[14]
R. Jin, Y. Xiang, N. Ruan, and D. Fuhry. 3-hop: a high-compression indexing scheme for reachability query. In SIGMOD Conference, pages 813--826, 2009.
[15]
R. Jin, Y. Xiang, N. Ruan, and H. Wang. Efficiently answering reachability queries on very large directed graphs. In SIGMOD Conference, pages 595--608, 2008.
[16]
H. Kaplan, T. Milo, and R. Shabo. A comparison of labeling schemes for ancestor queries. In SODA, pages 954--963, 2002.
[17]
V. King and G. Sagert. A fully dynamic algorithm for maintaining the transitive closure. In STOC, pages 492--498, 1999.
[18]
P. E. O'Neil, E. J. O'Neil, S. Pal, I. Cseri, G. Schaller, and N. Westbury. Ordpaths: Insert-friendly xml node labels. In SIGMOD Conference, pages 903--908, 2004.
[19]
D. D. Roure, C. A. Goble, and R. Stevens. The design and realisation of the myExperiment virtual research environment for social sharing of workflows. Future Generation Comp. Syst., 25(5):561--567, 2009.
[20]
N. Santoro and R. Khatib. Labelling and implicit routing in networks. Comput. J., 28(1):5--8, 1985.
[21]
L. Xu, T. W. Ling, H. Wu, and Z. Bao. Dde: from dewey to a fully dynamic xml labeling scheme. In SIGMOD Conference, pages 719--730, 2009.
[22]
H. Yildirim, V. Chaoji, and M. J. Zaki. Grail: Scalable reachability index for large graphs. PVLDB, 3(1):276--284, 2010.

Cited By

View all
  • (2023)Efficiently Cleaning Structured Event Logs: A Graph Repair ApproachACM Transactions on Database Systems10.1145/357128148:1(1-44)Online publication date: 13-Mar-2023
  • (2018)Efficient provenance tracking for datalog using top-k queriesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-018-0496-727:2(245-269)Online publication date: 1-Apr-2018
  • (2017)Privacy-preserving network provenanceProceedings of the VLDB Endowment10.14778/3137628.313766110:11(1550-1561)Online publication date: 1-Aug-2017
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011
1364 pages
ISBN:9781450306614
DOI:10.1145/1989323
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2011

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. dynamic labeling scheme
  2. reachability query
  3. recursive workflow

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 14 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Efficiently Cleaning Structured Event Logs: A Graph Repair ApproachACM Transactions on Database Systems10.1145/357128148:1(1-44)Online publication date: 13-Mar-2023
  • (2018)Efficient provenance tracking for datalog using top-k queriesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-018-0496-727:2(245-269)Online publication date: 1-Apr-2018
  • (2017)Privacy-preserving network provenanceProceedings of the VLDB Endowment10.14778/3137628.313766110:11(1550-1561)Online publication date: 1-Aug-2017
  • (2015)Answering regular path queries on workflow provenance2015 IEEE 31st International Conference on Data Engineering10.1109/ICDE.2015.7113299(375-386)Online publication date: Apr-2015
  • (2015)Cleaning structured event logs: A graph repair approach2015 IEEE 31st International Conference on Data Engineering10.1109/ICDE.2015.7113270(30-41)Online publication date: Apr-2015
  • (2013)Search and result presentation in scientific workflow repositoriesProceedings of the 25th International Conference on Scientific and Statistical Database Management10.1145/2484838.2484847(1-12)Online publication date: 29-Jul-2013
  • (2012)Labeling workflow views with fine-grained dependenciesProceedings of the VLDB Endowment10.14778/2350229.23502405:11(1208-1219)Online publication date: 1-Jul-2012

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media