Provenance for distributed biomedical workflow execution

S. Madougou, M. Santcroos, A. Benabdelkader, B.D. van Schaik, S. Shahand, V. Korkhov, A.H.C. van Kampen, S.D. Olabarriaga

Research output: Contribution to journal › Article › Academic › peer-review

3 Citations (Scopus)

Abstract

Scientific research has become highly data- and compute-intensive owing to progress in data acquisition and measurement devices, particularly in the life sciences. To cope with this deluge of data, scientists use distributed computing and storage infrastructures. Using such infrastructures properly and efficiently is itself a new challenge for scientists. Scientific workflow management systems play an important role in facilitating the use of the infrastructure by hiding some of its complexity. Although most scientific workflow management systems are provenance-aware, not all of them come with provenance functionality out of the box. In this paper we describe the improvement and integration of a provenance system into an e-infrastructure for biomedical research based on the MOTEUR workflow management system. The main contributions of the paper are: (1) presenting an Open Provenance Model (OPM) implementation that uses a relational database backend for the provenance store; (2) providing an e-infrastructure with a comprehensive provenance system; (3) defining a generic approach to provenance implementation that is potentially suitable for other workflow systems and application domains; and (4) demonstrating the value of this system through use cases that present the provenance data via a user-friendly web interface.
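
To make the idea of an OPM provenance store on a relational backend more concrete, the following is a minimal sketch, not the schema used in the paper. It assumes a two-table layout (`node`, `edge`) in an in-memory SQLite database. The table and column names, the helper functions, and the example workflow (`input.fastq`, `align`, `output.bam`, `user`) are all hypothetical. Only the node kinds (artifact, process, agent) and the five causal relations come from the Open Provenance Model itself.

```python
import sqlite3

# Hypothetical schema: names are illustrative and NOT taken from the paper.
# OPM nodes are artifacts, processes, or agents; edges point from effect to cause.
SCHEMA = """
CREATE TABLE node (
    id    TEXT PRIMARY KEY,
    kind  TEXT NOT NULL CHECK (kind IN ('artifact', 'process', 'agent')),
    label TEXT
);
CREATE TABLE edge (
    relation TEXT NOT NULL CHECK (relation IN
        ('used', 'wasGeneratedBy', 'wasControlledBy',
         'wasTriggeredBy', 'wasDerivedFrom')),
    effect TEXT NOT NULL REFERENCES node(id),
    cause  TEXT NOT NULL REFERENCES node(id)
);
"""


def build_store():
    """Create an empty in-memory provenance store."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_example(conn):
    """Insert a made-up one-step workflow run (assumption, for illustration)."""
    conn.executemany(
        "INSERT INTO node (id, kind, label) VALUES (?, ?, ?)",
        [
            ("input.fastq", "artifact", "raw reads"),
            ("align", "process", "alignment step"),
            ("output.bam", "artifact", "aligned reads"),
            ("user", "agent", "submitting scientist"),
        ],
    )
    conn.executemany(
        "INSERT INTO edge (relation, effect, cause) VALUES (?, ?, ?)",
        [
            ("used", "align", "input.fastq"),
            ("wasGeneratedBy", "output.bam", "align"),
            ("wasControlledBy", "align", "user"),
        ],
    )
    conn.commit()


def lineage(conn, node_id):
    """Return the ids of every node that node_id causally depends on."""
    rows = conn.execute(
        """
        WITH RECURSIVE upstream(id) AS (
            SELECT cause FROM edge WHERE effect = ?
            UNION
            SELECT e.cause FROM edge e JOIN upstream u ON e.effect = u.id
        )
        SELECT id FROM upstream ORDER BY id
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = build_store()
    load_example(store)
    print(lineage(store, "output.bam"))
```

Storing each causal dependency as an effect-to-cause row lets a single recursive query walk back from a result to everything that contributed to it, which is the kind of question a provenance web interface would typically answer.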
Original language: English
Pages (from-to): 91-100
Journal: Studies in health technology and informatics
Volume: 175
Publication status: Published - 2012