IPAW'06: Proceedings of the 2006 international conference on Provenance and Annotation of Data

May 2006

2006 Proceeding

Editors:
Luc Moreau
School of Electronics and Computer Science, University of Southampton, Southampton, UK
Ian Foster
Mathematics & Computer Science Division, Argonne National Laboratory

Publisher:

Springer-Verlag
Berlin, Heidelberg

Conference:

Chicago, IL, May 3-5, 2006

ISBN:

978-3-540-46302-3

Published:

03 May 2006

Sponsors:

Springer, Microsoft

Proceeding Downloads

Front matter (Preface)

Back matter (Author Index)

SECTION: Keynotes

Article

Automatic generation of workflow provenance

Pages 1–9, https://doi.org/10.1007/11890850_1

While workflow is playing an increasingly important role in e-Science, current systems lack support for the collection of provenance data. We argue that workflow provenance data should be automatically generated by the enactment engine and managed over ...

Article

Managing rapidly-evolving scientific workflows

Pages 10–18, https://doi.org/10.1007/11890850_2

We give an overview of VisTrails, a system that provides an infrastructure for systematically capturing detailed provenance and streamlining the data exploration process. A key feature that sets VisTrails apart from previous visualization and scientific ...

SECTION: Applications

Article

Virtual logbooks and collaboration in science and software development

Pages 19–27, https://doi.org/10.1007/11890850_3

A key feature of collaboration is having a log of what is being done and how – for private use/reuse and for sharing selected parts with collaborators in today's complex, large-scale scientific/software environments. Even better if this log is automatic,...

Article

Applying provenance in distributed organ transplant management

Pages 28–36, https://doi.org/10.1007/11890850_4

The use of ICT solutions applied to Healthcare in distributed scenarios should not only provide improvements in the distributed processes and services they are targeted to assist but also provide ways to trace all the meaningful events and decisions ...

Article

Provenance implementation in a scientific simulation environment

Pages 37–45, https://doi.org/10.1007/11890850_5

Many of today's engineering applications for simulations lack mechanisms to trace the generation of results and the underlying processes. Especially computations conducted in distributed computing environments such as Grids are lacking suitable means ...

Article

Towards low overhead provenance tracking in near real-time stream filtering

Pages 46–54, https://doi.org/10.1007/11890850_6

Data streams flowing from the physical environment are as unpredictable as the environment itself. Radars go down, long haul networks drop packets, and readings are corrupted on the wire. Yet the data driven scientific models and data mining algorithms ...

Article

Enabling provenance on large scale e-science applications

Pages 55–63, https://doi.org/10.1007/11890850_7

Large-scale e-Science experiments present unprecedented data handling requirements with their multi-petabyte data storages. Complex software applications, such as the ATLAS High Energy Physics experiment at CERN, run throughout Grid computing sites ...

SECTION: Semantics 1

Article

Harvesting RDF triples

Joe Futrelle

Pages 64–72, https://doi.org/10.1007/11890850_8

Managing scientific data requires tools that can track complex provenance information about digital resources and workflows. RDF triples are a convenient abstraction for combining independently-generated factual statements, including statements about ...

Article

Mapping physical formats to logical models to extract data and metadata: the Defuddle parsing engine

Pages 73–81, https://doi.org/10.1007/11890850_9

Scientists, motivated by the desire for systems-level understanding of phenomena, increasingly need to share their results across multiple disciplines. Accomplishing this requires data to be annotated, contextualized, and readily searchable and ...

Article

Annotation and provenance tracking in semantic web photo libraries

Pages 82–89, https://doi.org/10.1007/11890850_10

As the volume of digital images available on the Web continues to increase, there is a clear need for more advanced techniques for their effective retrieval and management. In this paper, we present a domain independent framework for both annotating and ...

Article

Metadata catalogs with semantic representations

Pages 90–100, https://doi.org/10.1007/11890850_11

Metadata catalogs store descriptive information about logical data items. These catalogs can then be queried to retrieve the particular logical data item that matches the criteria. However, the query has to be formulated in terms of the metadata ...

Article

Combining provenance with trust in social networks for semantic web content filtering

Jennifer Golbeck

Pages 101–108, https://doi.org/10.1007/11890850_12

Social networks are a popular movement on the web. On the Semantic Web, it is simple to make trust annotations to social relationships. In this paper, we present a two level approach to integrating trust, provenance, and annotations in Semantic Web ...

SECTION: Workflow

Article

Recording actor state in scientific workflows

Pages 109–117, https://doi.org/10.1007/11890850_13

The process which leads to a particular data item, or its provenance, may be documented in a number of ways. The recording of actor state assertions – essentially data that a client or service actor may assert about itself regarding an interaction, is ...

Article

Provenance collection support in the Kepler scientific workflow system

Pages 118–132, https://doi.org/10.1007/11890850_14

In many data-driven applications, analysis needs to be performed on scientific information obtained from several sources and generated by computations on distributed resources. Systematic analysis of this scientific information unleashes a growing need ...

Article

A model for user-oriented data provenance in pipelined scientific workflows

Pages 133–147, https://doi.org/10.1007/11890850_15

Integrated provenance support promises to be a chief advantage of scientific workflow systems over script-based alternatives. While it is often recognized that information gathered during scientific workflow execution can be used automatically to ...

Article

Applying the virtual data provenance model

Pages 148–161, https://doi.org/10.1007/11890850_16

In many domains of science, engineering, and commerce, data analysis systems are employed to derive new data (and ultimately, one hopes, knowledge) from datasets describing experimental results or simulated phenomena. To support such analyses, we have ...

SECTION: Models of provenance, annotations and processes

Article

A provenance model for manually curated data

Pages 162–170, https://doi.org/10.1007/11890850_17

Many curated databases are constructed by scientists integrating various existing data sources “by hand”, that is, by manually entering or copying data from other sources. Capturing provenance in such an environment is a challenging problem, requiring a ...

Article

Issues in automatic provenance collection

Pages 171–183, https://doi.org/10.1007/11890850_18

Automatic provenance collection describes systems that observe processes and data transformations, inferring, collecting, and maintaining provenance about them. Automatic collection is a powerful tool for analysis of objects and processes, providing a ...

Article

Electronically querying for the provenance of entities

Simon Miles

Pages 184–192, https://doi.org/10.1007/11890850_19

The provenance of entities, whether electronic data or physical artefacts, is crucial information in practically all domains, including science, business and art. The increased use of software in automating activities provides the opportunity to add ...

Article

AstroDAS: sharing assertions across astronomy catalogues through distributed annotation

Pages 193–202, https://doi.org/10.1007/11890850_20

As diverse scientific data collections migrate online, researchers want the ability to share their assertions regarding the entities that span these disparate databases. We focus on a case study provided by the astronomical community's Virtual ...

SECTION: Systems

Article

Security issues in a SOA-based provenance system

Pages 203–211, https://doi.org/10.1007/11890850_21

Recent work has begun exploring the characterization and utilization of provenance in systems based on the Service Oriented Architecture (such as Web Services and Grid based environments). One of the salient issues related to provenance use within any ...

Article

Implementing a secure annotation service

Pages 212–221, https://doi.org/10.1007/11890850_22

Annotation systems enable “value-adding” to digital resources by the attachment of additional data in the form of comments, explanations, references, reviews and other types of external, subjective remarks. They facilitate group discourse and capture ...

Article

Performance evaluation of the Karma provenance framework for scientific workflows

Pages 222–236, https://doi.org/10.1007/11890850_23

Provenance about workflow executions and data derivations in scientific applications helps estimate data quality, track resources, and validate in silico experiments. The Karma provenance framework provides a means to collect workflow, process, and data ...

Article

Exploring provenance in a distributed job execution system

Pages 237–245, https://doi.org/10.1007/11890850_24

We examine provenance in the context of a distributed job execution system. It is crucial to capture provenance information during the execution of a job in a distributed environment because often this information is lost once the job has finished. In ...

Article

gLite job provenance

Pages 246–253, https://doi.org/10.1007/11890850_25

The Job Provenance (JP) service is designed to automate keeping track of computations on large-scale Grids, thus giving users a tool to correctly archive information about their jobs and to re-submit any job in a reconstructed environment. JP provides a ...

SECTION: Semantics 2

Article

An identity crisis in the life sciences

Pages 254–269, https://doi.org/10.1007/11890850_26

myGrid is an e-Science project assisting life scientists to build workflows that gather data from distributed, autonomous, replicated and heterogeneous resources. The provenance logs of workflow executions are recorded as RDF graphs. The log of one ...

Article

CombeChem: a case study in provenance and annotation using the semantic web

Pages 270–277, https://doi.org/10.1007/11890850_27

The CombeChem e-Science project has demonstrated the advantages of using Semantic Web technology, in particular RDF and triplestores, to describe and link diverse and complex chemical information, covering the whole process of the generation of chemical ...

Article

Principles of high quality documentation for provenance: a philosophical discussion

Pages 278–286, https://doi.org/10.1007/11890850_28

Computer technology enables the creation of detailed documentation about the processes that create or affect entities (data, objects, etc.). Such documentation of the past can be used to answer various kinds of questions regarding the processes that led ...
