Enhancing and abstracting scientific workflow provenance for data publishing

Pinar Alper, Khalid Belhajjame, Carole A. Goble, Pinar Karagoz

In: EDBT '13 Proceedings of the Joint EDBT/ICDT 2013 Workshops; 18 Mar 2013-22 Mar 2013; Genoa, Italy. ACM; 2013.

Abstract

Many scientists are using workflows to systematically design and run computational experiments. Once the workflow is executed, the scientist may want to publish the dataset generated as a result, e.g. to be reused by other scientists as input to their experiments. In doing so, the scientist needs to curate such a dataset by specifying metadata that describes it, e.g. its derivation history, origins and ownership. To assist the scientist in this task, we explore in this paper the use of provenance traces collected by workflow management systems when enacting workflows. Specifically, we identify the shortcomings of such raw provenance traces in supporting the data publishing task, and propose an approach whereby distilled, yet more informative, provenance traces that are fit for the data publishing task can be derived.

Keyword(s)

data; provenance; workflow

Bibliographic metadata

Conference title:
EDBT '13 Proceedings of the Joint EDBT/ICDT 2013 Workshops
Conference venue:
Genoa, Italy
Conference start date:
2013-03-18
Conference end date:
2013-03-22
Publisher:
ACM
Digital Object Identifier:
http://dx.doi.org/10.1145/2457317.2457370
Language:
eng
Related website(s):
  • eScience-WF-Motifs-Taverna-DataSet http://www.myexperiment.org/files/789.html
  • Motif ontology source (outdated) https://github.com/wf4ever/ro/blob/master/motifs.owl
  • The Workflow Motif Ontology http://purl.org/net/wf-motifs

Record metadata

Manchester eScholar ID:
uk-ac-man-scw:282297
Created by:
Soiland-Reyes, Stian
Created:
7th December, 2015, 13:46:25
Last modified by:
Soiland-Reyes, Stian
Last modified:
7th December, 2015, 14:23:00
