The Aspect-Oriented Architecture of the CAPS Framework for Capturing, Analyzing and Archiving Provenance Data

Peer C. Brauer,
Florian Fittkau &
Wilhelm Hasselbring

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8628)

Included in the following conference series:

International Provenance and Annotation Workshop

Abstract

With aspect-oriented programming techniques, modularity can be achieved by separating cross-cutting concerns. Data provenance can be considered a cross-cutting concern: code for collecting provenance data is usually scattered across various places in a software system. Aspect-oriented programming makes it possible to integrate cross-cutting concerns seamlessly into existing software applications without interfering with the original system.

Following this approach, CAPS^{Footnote 1} is a framework for weaving provenance-capturing mechanisms into existing Java applications that are not yet provenance-aware. The CAPS framework employs AspectJ [5],^{Footnote 2} the Kieker framework [4, 7],^{Footnote 3} the Java Management Extensions (JMX),^{Footnote 4} and some Java security mechanisms to collect the provenance information automatically. Woven into the application as a minimally invasive integration of the provenance-capturing mechanisms, CAPS monitors the execution of the software. Whenever a data set is processed, CAPS creates the corresponding provenance graph entry. The graph itself is stored in an integrated provenance archive built on top of the Neo4j graph database.^{Footnote 5} CAPS is implemented and evaluated in the context of the PubFlow workflow system for semi-automatic research data publication [2]. In particular, workflow-generated provenance data is gathered automatically via CAPS, without mixing program logic with provenance mechanisms.
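The core idea, capturing provenance around a call without editing the code that performs it, can be illustrated in plain Java. The following is only a minimal sketch: it uses the JDK's dynamic proxies (`java.lang.reflect.Proxy`) as a stand-in for AspectJ weaving, which is not how CAPS itself works, and the names `DataProcessor`, `ProvenanceJournal`, and `withProvenance` are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a JDK dynamic proxy stands in for AspectJ weaving. */
final class ProvenanceProxySketch {

    /** Hypothetical business interface that knows nothing about provenance. */
    interface DataProcessor {
        String process(String dataSet);
    }

    /** Hypothetical in-memory journal; CAPS writes to a provenance archive instead. */
    static final class ProvenanceJournal {
        private final List<String> entries = new ArrayList<>();

        void record(String entry) {
            entries.add(entry);
        }

        List<String> entries() {
            return List.copyOf(entries);
        }
    }

    /** Wraps a processor so that each call is journaled before and after it runs. */
    static DataProcessor withProvenance(DataProcessor target, ProvenanceJournal journal) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                // Pass Object methods such as toString() straight through, unjournaled.
                if (method.getDeclaringClass() == Object.class) {
                    return method.invoke(target, args);
                }
                journal.record("used: " + args[0]);
                Object result = method.invoke(target, args);
                journal.record("generated: " + result + " by " + method.getName());
                return result;
            } catch (InvocationTargetException e) {
                // Rethrow the original exception thrown by the wrapped processor.
                throw e.getCause();
            }
        };
        return (DataProcessor) Proxy.newProxyInstance(
                DataProcessor.class.getClassLoader(),
                new Class<?>[] {DataProcessor.class},
                handler);
    }

    public static void main(String[] args) {
        ProvenanceJournal journal = new ProvenanceJournal();
        DataProcessor plain = dataSet -> dataSet.toUpperCase();
        DataProcessor monitored = withProvenance(plain, journal);
        monitored.process("sample-data-set");
        journal.entries().forEach(System.out::println);
    }
}
```

The processor implementation contains no provenance code at all; the journaling lives entirely in the wrapper. An AspectJ aspect achieves the same separation without requiring an interface or an explicit wrapping step.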

For deployment, CAPS provides a GWT-based web interface,^{Footnote 6} which allows users to upload their own scientific Java applications to the CAPS runtime environment. While uploading the application, the user has to provide basic information about the application and its runtime environment. This information includes:

the deployment type of the application (e.g., web based, Java archive),
virtual machine parameters,
application parameters and
the URL of an existing CAPS Provenance Archive instance in case of standalone applications.

Based on the provided information, CAPS suggests so-called application profiles for the application to be deployed. A profile contains a predefined selection of aspects and Kieker monitoring probes that are applicable to the type of the given application. CAPS also provides profiles for Java-based workflow systems such as jBPM.^{Footnote 7} The user can refine the suggested profile or switch to another profile that collects more detailed information.
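The relationship between the uploaded deployment information and a suggested profile might be modeled roughly as follows. This is a sketch under stated assumptions, not the CAPS API: every type name, enum constant, profile name, aspect name, and probe name below is invented for illustration.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;

/** Illustrative only: hypothetical types, not the CAPS API. */
final class ProfileSuggestionSketch {

    /** Assumed deployment types; the text names web-based and Java archive as examples. */
    enum DeploymentType { WEB_BASED, JAVA_ARCHIVE, WORKFLOW_SYSTEM }

    /** The four pieces of information requested during upload. */
    record DeploymentInfo(
            DeploymentType type,
            List<String> vmParameters,
            List<String> applicationParameters,
            Optional<URI> archiveUrl) { } // only present for standalone applications

    /** A profile bundles aspects and monitoring probes for one kind of application. */
    record ApplicationProfile(String name, List<String> aspects, List<String> probes) { }

    /** Picks a default profile by deployment type; all names are made up. */
    static ApplicationProfile suggest(DeploymentInfo info) {
        return switch (info.type()) {
            case WEB_BASED -> new ApplicationProfile(
                    "web-default",
                    List.of("HttpRequestAspect", "FileAccessAspect"),
                    List.of("OperationExecutionProbe"));
            case JAVA_ARCHIVE -> new ApplicationProfile(
                    "jar-default",
                    List.of("FileAccessAspect"),
                    List.of("OperationExecutionProbe"));
            case WORKFLOW_SYSTEM -> new ApplicationProfile(
                    "workflow-default",
                    List.of("TaskExecutionAspect", "FileAccessAspect"),
                    List.of("OperationExecutionProbe", "WorkflowStepProbe"));
        };
    }

    public static void main(String[] args) {
        DeploymentInfo info = new DeploymentInfo(
                DeploymentType.JAVA_ARCHIVE,
                List.of("-Xmx512m"),
                List.of("--input", "cruise-data.csv"),
                Optional.empty());
        System.out.println(suggest(info));
    }
}
```

Keeping the profile as plain data makes the refinement step described above straightforward: the user edits the lists of aspects and probes before the runtime configuration is generated.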

After a profile has been selected for the application, CAPS creates a runtime configuration based on the provided information. Once the configuration has been created, the user may check it via a profiling run.

If the user chooses to initiate a profiling run, the system starts the application and displays the provenance information captured by CAPS. This gives the user the opportunity to check whether all relevant aspects of the system are being monitored and whether the monitoring level should be increased or decreased. The user can repeat this process to optimize the provenance trace produced by CAPS.

CAPS uses the Java sandbox security mechanism to intercept I/O and network calls.^{Footnote 8} We employ these components by weaving our monitoring probes directly into the methods that are responsible for checking the application's calls against the JVM security constraints. CAPS also alters the JVM configuration for the client application so that the sandbox is always activated whenever the application starts. It also obtains additional basic runtime information about the client application by querying the JMX interface.
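As an illustration of the kind of basic runtime information that JMX exposes, the sketch below reads a few values from the platform MXBeans of the JVM it runs in. Which values CAPS actually queries is not stated here, so the selection is an assumption, and CAPS would query the client application's JVM rather than its own. The class name `JmxRuntimeInfoSketch` and the map keys are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.RuntimeMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: reads runtime facts from the local JVM's platform MXBeans. */
final class JmxRuntimeInfoSketch {

    /** Collects a small, assumed selection of runtime values as strings. */
    static Map<String, String> collect() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        Map<String, String> info = new LinkedHashMap<>();
        info.put("vmName", runtime.getVmName());
        info.put("vmVersion", runtime.getVmVersion());
        // The JVM flags the application was started with, e.g. -Xmx settings.
        info.put("vmArguments", String.join(" ", runtime.getInputArguments()));
        info.put("startTimeMillis", Long.toString(runtime.getStartTime()));
        info.put("heapUsedBytes", Long.toString(memory.getHeapMemoryUsage().getUsed()));
        return info;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Values such as the start time and the virtual machine parameters are natural candidates for annotating a provenance trace, since they describe the environment in which a data set was processed.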

Next, the user has to decide whether the application should be exported as a standalone application, so that it can be used without CAPS, or whether it should be added to the CAPS application library. For standalone applications, CAPS creates a so-called CAPS connector and embeds it into the application. The connector is responsible for connecting the application to the CAPS server, so that the provenance data created by the application can be analyzed and archived.

To extract the provenance information from the collected monitoring data, CAPS utilizes the existing data analysis functionality of the Kieker framework, i.e. the analysis framework and the Kieker WebGUI [3].

CAPS provides specific Kieker filters that can be used to extract the provenance data from the stream of monitoring records. These filters are described in [1]. CAPS comes with predefined analysis components and also lets users create their own. Predefined analyses are available, for example, for creating the PROV-O^{Footnote 9} provenance graph or for reconstructing workflows in scientific workflow environments.
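Conceptually, such a filter passes on only those monitoring records that matter for provenance. The sketch below shows that idea with plain Java streams; it deliberately does not use the Kieker API, and the record type, the `Kind` constants, and the rule for what counts as provenance-relevant are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;

/** Illustrative only: plain Java streams, not the Kieker filter API. */
final class ProvenanceFilterSketch {

    /** Hypothetical record kinds; Kieker's own record types differ. */
    enum Kind { OPERATION_EXECUTION, FILE_READ, FILE_WRITE, NETWORK_CALL }

    /** Hypothetical monitoring record with a kind, a target, and a timestamp. */
    record MonitoringRecord(Kind kind, String target, long timestampNanos) { }

    /** Assumption: only data accesses are treated as provenance-relevant. */
    private static final Set<Kind> PROVENANCE_KINDS =
            Set.of(Kind.FILE_READ, Kind.FILE_WRITE, Kind.NETWORK_CALL);

    /** Keeps the provenance-relevant records and preserves their order. */
    static List<MonitoringRecord> provenanceOnly(List<MonitoringRecord> records) {
        return records.stream()
                .filter(r -> PROVENANCE_KINDS.contains(r.kind()))
                .toList();
    }

    public static void main(String[] args) {
        List<MonitoringRecord> records = List.of(
                new MonitoringRecord(Kind.OPERATION_EXECUTION, "Parser.parse", 1L),
                new MonitoringRecord(Kind.FILE_READ, "/data/in.csv", 2L),
                new MonitoringRecord(Kind.FILE_WRITE, "/data/out.csv", 3L));
        provenanceOnly(records).forEach(System.out::println);
    }
}
```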

To store the provenance information collected by the framework, CAPS uses an integrated provenance archive. The archive is built on top of the Eclipse Modeling Framework Project (EMF),^{Footnote 10} the Google Web Toolkit (GWT),^{Footnote 11} the PubFlow Graphframework,^{Footnote 12} and Neo4j. It resulted from the W3C call for implementations of the PROV-O data model.^{Footnote 13} The provenance archive is based on an extended version of the PROV-DM [6], implemented with the Eclipse Modeling Framework. We made small additions to the PROV-DM model so that we can store some additional information, such as execution time stamps and user roles. However, we keep our model compatible with the original W3C PROV-DM. As the persistence layer for our provenance archive, we chose a Neo4j graph database. This allows us to benefit from the specific graph algorithms provided by the database engine. To store our EMF model in the graph database, we are currently building a new persistence layer based on neo4emf,^{Footnote 14} a framework that allows mapping an EMF model to a Neo4j database.
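To make the shape of such a graph concrete, the sketch below models a small in-memory property graph with the three PROV-DM node types and three of its relations (used, wasGeneratedBy, wasAssociatedWith). It uses neither EMF nor Neo4j. The attribute keys `caps:executionTime` and `caps:userRole` are hypothetical stand-ins for the additions mentioned above, and where exactly CAPS attaches them is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: an in-memory PROV-style property graph, not the CAPS archive. */
final class ProvGraphSketch {

    enum NodeType { ENTITY, ACTIVITY, AGENT }

    enum RelationType { USED, WAS_GENERATED_BY, WAS_ASSOCIATED_WITH }

    /** A node with free-form attributes, similar to a property-graph node. */
    record Node(String id, NodeType type, Map<String, String> attributes) { }

    record Edge(Node from, RelationType relation, Node to) { }

    private final List<Edge> edges = new ArrayList<>();

    /** Adds an edge after checking the endpoint types that PROV-DM prescribes. */
    void relate(Node from, RelationType relation, Node to) {
        boolean valid = switch (relation) {
            case USED -> from.type() == NodeType.ACTIVITY && to.type() == NodeType.ENTITY;
            case WAS_GENERATED_BY -> from.type() == NodeType.ENTITY && to.type() == NodeType.ACTIVITY;
            case WAS_ASSOCIATED_WITH -> from.type() == NodeType.ACTIVITY && to.type() == NodeType.AGENT;
        };
        if (!valid) {
            throw new IllegalArgumentException(
                    relation + " cannot link " + from.type() + " to " + to.type());
        }
        edges.add(new Edge(from, relation, to));
    }

    /** Returns the entities used by the activities that generated the given entity. */
    List<Node> inputsOf(Node entity) {
        return edges.stream()
                .filter(e -> e.relation() == RelationType.WAS_GENERATED_BY && e.from().equals(entity))
                .map(Edge::to) // the generating activities
                .flatMap(activity -> edges.stream()
                        .filter(e -> e.relation() == RelationType.USED && e.from().equals(activity))
                        .map(Edge::to)) // the entities those activities used
                .toList();
    }

    public static void main(String[] args) {
        ProvGraphSketch graph = new ProvGraphSketch();
        Node raw = new Node("raw.csv", NodeType.ENTITY, Map.of());
        Node clean = new Node("clean.csv", NodeType.ENTITY, Map.of());
        Node run = new Node("cleaning-run", NodeType.ACTIVITY,
                Map.of("caps:executionTime", "2014-06-09T10:00:00Z")); // hypothetical key
        Node curator = new Node("curator", NodeType.AGENT,
                Map.of("caps:userRole", "data manager")); // hypothetical key

        graph.relate(run, RelationType.USED, raw);
        graph.relate(clean, RelationType.WAS_GENERATED_BY, run);
        graph.relate(run, RelationType.WAS_ASSOCIATED_WITH, curator);

        System.out.println("Inputs of clean.csv: " + graph.inputsOf(clean));
    }
}
```

Because the additional information is stored as extra node attributes rather than as new node or relation types, a graph like this stays readable by tools that only understand the core PROV-DM vocabulary, which matches the compatibility goal stated above.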

Notes

1. CAPS stands for Capturing and Archiving Provenance in Scientific workflows.
2. http://eclipse.org/aspectj/
3. http://kieker-monitoring.net/
4. http://docs.oracle.com/javase/tutorial/jmx/
5. http://www.neo4j.org/
6. http://www.gwtproject.org/
7. http://www.jboss.org/jbpm
8. http://docs.oracle.com/javase/7/docs/technotes/guides/security/spec/security-spec.doc1.html
9. http://www.w3.org/TR/prov-o/
10. http://www.eclipse.org/modeling/emf/
11. http://www.gwtproject.org/
12. http://www.pubflow.uni-kiel.de/en/the-framework/the-graphframework
13. http://www.w3.org/TR/2013/NOTE-prov-implementations-20130430/
14. http://neo4emf.com/

References

1. Brauer, P.C., Hasselbring, W.: Capturing provenance information with a workflow monitoring extension for the Kieker framework. In: Proceedings of the 3rd International Workshop on Semantic Web in Provenance Management, CEUR-WS, May 2012. http://eprints.uni-kiel.de/19636/
2. Brauer, P.C., Hasselbring, W.: PubFlow: a scientific data publication framework for marine science. In: Proceedings of the International Conference on Marine Data and Information Systems (IMDIS 2013), vol. 54, pp. 29–31, September 2013. http://eprints.uni-kiel.de/22399/
3. Ehmke, N.C.: Everything in sight: Kieker’s WebGUI in action. In: Proceedings of the Symposium on Software Performance: Joint Kieker/Palladio Days 2013, pp. 11–19. CEUR-WS, November 2013. http://eprints.uni-kiel.de/22528/
4. van Hoorn, A., Waller, J., Hasselbring, W.: Kieker: a framework for application performance monitoring and dynamic software analysis. In: Proceedings of the 3rd joint ACM/SPEC International Conference on Performance Engineering (ICPE 2012), pp. 247–248. ACM, April 2012. http://eprints.uni-kiel.de/14418/
5. Kiczales, G., Hilsdale, E., Hugunin, J., Kersten, M., Palm, J., Griswold, W.G.: An overview of AspectJ. In: Lindskov Knudsen, J. (ed.) ECOOP 2001. LNCS, vol. 2072, p. 327. Springer, Heidelberg (2001)
6. Moreau, L., Missier, P.: PROV-DM: the PROV data model. Technical report, World Wide Web Consortium (2013)
7. Rohr, M., van Hoorn, A., Matevska, J., Sommer, N., Stoever, L., Giesecke, S., Hasselbring, W.: Kieker: continuous monitoring and on demand visualization of Java software behavior. In: Proceedings of the IASTED International Conference on Software Engineering 2008 (SE’08), pp. 80–85, February 2008

Author information

Authors and Affiliations

Software Engineering Group, Kiel University, Kiel, Germany
Peer C. Brauer, Florian Fittkau & Wilhelm Hasselbring

Corresponding author

Correspondence to Wilhelm Hasselbring.

Editor information

Editors and Affiliations

University of Illinois, Urbana-Champaign, USA
Bertram Ludäscher
Indiana University, Bloomington, USA
Beth Plale

About this paper

Cite this paper

Brauer, P.C., Fittkau, F., Hasselbring, W. (2015). The Aspect-Oriented Architecture of the CAPS Framework for Capturing, Analyzing and Archiving Provenance Data. In: Ludäscher, B., Plale, B. (eds) Provenance and Annotation of Data and Processes. IPAW 2014. Lecture Notes in Computer Science, vol 8628. Springer, Cham. https://doi.org/10.1007/978-3-319-16462-5_19

DOI: https://doi.org/10.1007/978-3-319-16462-5_19
Published: 21 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16461-8
Online ISBN: 978-3-319-16462-5
eBook Packages: Computer Science, Computer Science (R0)
