IIC Seminar: Provenance: an open approach to experiment validation in e-Science
Date: Wednesday, March 1, 2006
Time: 4-5pm, refreshments at 3:45
Location: 60 Oxford Street, Room 330
Parking: Event parking is available at the 52 Oxford Street Garage. Please
inform the parking attendant that you are attending the IIC Seminar.
Speaker: Luc Moreau, Electronics and Computer Science, University of Southampton
Abstract:
Understanding the process by which a result was generated in a computation is
fundamental to science, engineering and business. For example, without such
information, other scientists cannot reproduce, analyse or validate
experiments. Likewise, businesses must demonstrate that their systems' results
were produced in a regulatory-compliant manner. Provenance is therefore
important: it enables users to trace how a particular result was arrived at.
Based on the common-sense definition of provenance, we propose a new
definition of provenance that is suited to the computational model
underpinning service-oriented architectures: the provenance of a piece of data
is the process that led to the data. Since our aim is to conceive a
computer-based representation of provenance that allows us to perform useful
reasoning about the origin of results, we examine the nature of such a
representation, which is articulated around the documentation of execution.
We next examine the architecture of a provenance system, centred on the notion
of a provenance store designed to support the provenance lifecycle: during a
recording phase, some documentation of execution is archived in the provenance
store, whereas a reasoning phase operates over the archived documentation. We
then discuss, in turn, a protocol for recording execution documentation, a
query facility to gain access to the contents of the store, and a reasoning
system to make inferences. Realising such an architecture is particularly
challenging for e-Science experiments, since it must be scalable.
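
As a rough illustration of the lifecycle described above, the following
minimal Python sketch shows a recording phase, a query facility and a simple
reasoning step. It is not the PASOA or EU Provenance interface: the names
Assertion, ProvenanceStore, record, query and trace, the in-memory list and
the sample service names are all assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Assertion:
    """One piece of execution documentation (hypothetical shape)."""
    service: str   # name of the service that ran
    inputs: tuple  # identifiers of the data it consumed
    output: str    # identifier of the data it produced


@dataclass
class ProvenanceStore:
    """In-memory stand-in for a provenance store (illustrative only)."""
    assertions: list = field(default_factory=list)

    def record(self, assertion):
        # Recording phase: archive documentation of execution.
        self.assertions.append(assertion)

    def query(self, output):
        # Query facility: return the assertions that produced `output`.
        return [a for a in self.assertions if a.output == output]


def trace(store, data_id):
    """Reasoning phase: collect the steps that led to `data_id`."""
    steps, pending, seen = [], [data_id], set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        for assertion in store.query(current):
            steps.append(assertion)
            pending.extend(assertion.inputs)  # follow inputs backwards
    return steps


# Hypothetical two-step workflow: align two sequences, then score the result.
store = ProvenanceStore()
store.record(Assertion("align", ("seq_a", "seq_b"), "alignment"))
store.record(Assertion("score", ("alignment",), "result"))
services = [a.service for a in trace(store, "result")]
print(services)
```

Here trace walks backwards from a result through the recorded inputs, which is
one simple reading of "the process that led to the data". A scalable store
would need persistent, distributed storage rather than a Python list.
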
The presentation will draw upon our experience in the PASOA (www.pasoa.org)
and EU Provenance (www.gridprovenance.org) projects and will rely on explicit
use cases derived from e-Science applications in the domains of
bioinformatics, high energy physics, organ transplant management and aerospace
engineering.
For more information about the IIC seminar series, please check our website at
iic.harvard.edu.