Audit Service Phase1: Stakeholder Feedback

Andrew Woods

Apr 16, 2015, 1:28:17 PM

to Critchlow, Matt, Nick Ruest, Mark Jordan, Joshua Westgard, fedora-...@googlegroups.com, fedor...@googlegroups.com

Hello Audit Service Stakeholders,

As you likely know, the first sprint of the Audit Service was completed last week. It is now time to get your feedback on whether this service meets your expectations. To facilitate your acceptance testing, we have created a Vagrant box (fcrepo4-vagrant [1]) that contains the following:

- Fedora4

- Fuseki triplestore

- Camel-based Fedora4/Fuseki integration application

You should simply be able to start fcrepo4-vagrant, create some resources in Fedora4, and inspect the audit events in the Fuseki triplestore. To further simplify the feedback process, some start-up instructions and interaction recipes have been documented on the wiki [2].

We are currently working on Phase2 of the Audit Service, which is focused on optionally persisting audit events within the repository. However, since the Audit Service sprint team is available through the end of next week (4/24), any feedback you have now could potentially be addressed immediately.

Thanks in advance for your testing and feedback.

Andrew

[1] https://github.com/fcrepo4-labs/fcrepo4-vagrant

[2] https://wiki.duraspace.org/display/FF/Using+Audit+Events+Phase+1

Nick Ruest

Apr 17, 2015, 12:04:42 PM

to Andrew Woods, Critchlow, Matt, Mark Jordan, Joshua Westgard, fedora-...@googlegroups.com, fedor...@googlegroups.com

...err I should send this from my Gmail account, not YorkU.

Hi Andrew-

I can verify that I was able to fire up the vagrant machine, and was
able to verify and perform all the queries listed. In addition, I added
some of my own objects and modified the queries, and was able to verify
and perform all the queries successfully.

cheers!

-nruest

Joshua Allan Westgard

Apr 17, 2015, 3:49:32 PM

to Critchlow, Matt, Andrew Woods, Nick Ruest, Mark Jordan, fedora-...@googlegroups.com, fedor...@googlegroups.com

I've also been testing the audit service and have found one unexpected thing so far. I uploaded a binary to my test instance, and fixity checks are not appearing in the Fuseki list of all events. So far I've only been using the "Fixity" button in the GUI interface to perform the check. Will try one via curl as well.

Josh

On Apr 17, 2015, at 13:08, Critchlow, Matt <MCrit...@ucsd.edu> wrote:

> Hi Andrew,
>
> Like Nick, I was also able to successfully set up the vagrant machine and verify that the sample instructions and queries worked as expected. I also tried a few more variations, and thus far everything is working great.
>
> Thanks to you and the Sprint team!
>
> Matt

Esmé Cowles

Apr 17, 2015, 4:27:12 PM

to fedor...@googlegroups.com, Matt Critchlow, Andrew Woods, Nick Ruest, Mark Jordan, fedora-...@googlegroups.com

Josh-

We forgot to mention that, at the 4/9/15 committers call[1], we decided to hold off on generating fixity events. There is a PR ready to add fixity event generation[2], but we decided it would be better to rethink how we are handling fixity a little more thoroughly before proceeding with that approach.

An alternate approach proposed by Ben Armintor was to have a fixity GET request just report the most recent fixity check status, and have a POST do a new fixity check and save the outcome. In this scenario, we wouldn't need to generate fixity events explicitly: saving the outcome would create nodes, which would automatically create events.
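A minimal sketch of that GET/POST split, using an in-memory object rather than Fedora's actual API. The class and method names (`FixityResource`, `get`, `post`), the status strings, and the choice of SHA-1 are assumptions for illustration only.

```python
import hashlib
from datetime import datetime, timezone


class FixityResource:
    """Hypothetical model of a binary whose fixity results are saved, not recomputed on read."""

    def __init__(self, content: bytes, expected_sha1: str):
        self.content = content
        self.expected_sha1 = expected_sha1
        self.results = []  # saved fixity outcomes, oldest first

    def get(self):
        """GET: report the most recent saved fixity result without running a check."""
        return self.results[-1] if self.results else None

    def post(self):
        """POST: run a new fixity check and save the outcome."""
        actual = hashlib.sha1(self.content).hexdigest()
        result = {
            "checked_at": datetime.now(timezone.utc).isoformat(),
            "status": "SUCCESS" if actual == self.expected_sha1 else "BAD_CHECKSUM",
            "sha1": actual,
        }
        # In the proposal, saving the outcome as a node is what would trigger an event.
        self.results.append(result)
        return result


data = b"hello fedora"
resource = FixityResource(data, hashlib.sha1(data).hexdigest())
print(resource.get())             # None: no check has been run yet
print(resource.post()["status"])  # SUCCESS
print(resource.get()["status"])   # SUCCESS: GET repeats the saved result
```

The point of the split is that reads stay cheap and side-effect free, while every new check leaves a persisted record behind.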

-Esme

1. https://wiki.duraspace.org/display/FF/2015-04-09+-+Fedora+Tech+Meeting
2. https://github.com/fcrepo4/fcrepo4/pull/766

> --
> You received this message because you are subscribed to the Google Groups "Fedora Tech" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to fedora-tech...@googlegroups.com.
> To post to this group, send email to fedor...@googlegroups.com.
> Visit this group at http://groups.google.com/group/fedora-tech.
> For more options, visit https://groups.google.com/d/optout.

Joshua Westgard

Apr 17, 2015, 4:36:28 PM

to fedor...@googlegroups.com, fedora-...@googlegroups.com, MCrit...@ucsd.edu, rue...@gmail.com, awo...@duraspace.org, mjo...@sfu.ca

Thanks, Esme. I figured there must have been some change I was missing, since everything else seemed to be working exactly as expected. Cheers, Josh

Eric James

Apr 22, 2015, 5:03:54 PM

to fedor...@googlegroups.com, fedora-...@googlegroups.com, MCrit...@ucsd.edu, rue...@gmail.com, Andrew Woods, mjo...@sfu.ca

Hi fedora-tech,

I know this is so April 16th, but after being away, I kicked the tires on the fcrepo4-vagrant audit implementation, and FWIW it behaves pretty much as advertised. I was thinking about plugging in fixity, but saw Esme's post about Ben's approach of creating fixity nodes to leverage the existing JCR event structure rather than a custom fixity event, which sounds good to me. I am also considering an option that would fake the event-node creation just to trigger the JCR event, short of persisting an event object, for those who would rather not have the event-node baggage in the repo (if anyone considers it excess baggage).

Also was looking at Nick's islandora 3->4 event type mapping vs the premis import event types [1] [2]:

Where the 3->4 mapping is defined:

addDatastream -> premis:create

modifyDatastreamByReference -> audit:contentModification/metadataModification

modifyObject -> audit:resourceModification

modifyObject (checksum validation) -> premis:validation

modifyDatastreamByValue -> audit:contentModification/metadataModification

purgeDatastream -> audit:contentRemoval
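For anyone scripting a migration, the mapping above can be held in a simple lookup table. This is only a sketch: the names `F3_TO_F4_EVENT_TYPES` and `map_event` are hypothetical, and reading the slash in `contentModification/metadataModification` as "one of these two" is an assumption.

```python
# Fedora 3 audit actions mapped to Fedora 4 event types, as listed above.
# A tuple with two entries means the action maps to one of them depending on context
# (an assumed reading of the "/" in the original mapping).
F3_TO_F4_EVENT_TYPES = {
    "addDatastream": ("premis:create",),
    "modifyDatastreamByReference": ("audit:contentModification", "audit:metadataModification"),
    "modifyObject": ("audit:resourceModification",),
    "modifyObject (checksum validation)": ("premis:validation",),
    "modifyDatastreamByValue": ("audit:contentModification", "audit:metadataModification"),
    "purgeDatastream": ("audit:contentRemoval",),
}


def map_event(action: str):
    """Return the candidate Fedora 4 event types for a Fedora 3 action, or () if unmapped."""
    return F3_TO_F4_EVENT_TYPES.get(action, ())


print(map_event("purgeDatastream"))  # ('audit:contentRemoval',)
```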

3 Comments:

1) audit:contentRemoval could use #del (which it is in AuditSparqlProcessor):

http://id.loc.gov/vocabulary/preservation/eventType/del

2) for premis:validation, there is a distinction between:

http://id.loc.gov/vocabulary/preservation/eventType/fix (proving object is the same)

http://id.loc.gov/vocabulary/preservation/eventType/mes (that a digest was calculated)

http://id.loc.gov/vocabulary/preservation/eventType/val (a validation of some kind, not necessarily a checksum (maybe file format))

3) It is surprising that this vocabulary lacks a modification event (the closest thing is migration). That is not a problem, since we can use the audit namespace terms (http://fedora.info/definitions/v4/audit#) contentModification/metadataModification; it is just surprising.

-Eric

[1] https://github.com/Islandora-Labs/islandora/blob/7.x-2.x/docs/technical-documentation/migration.md#audit-log-migration

[2] http://id.loc.gov/vocabulary/preservation/eventType.html

Nick Ruest

Apr 22, 2015, 5:53:14 PM

to Eric James, fedor...@googlegroups.com, fedora-...@googlegroups.com, MCrit...@ucsd.edu, Andrew Woods, mjo...@sfu.ca

Hi Eric-

Thanks for taking a look at the auditTrail mappings!

I have a couple follow-up questions/clarifications.

1. audit:contentRemoval could use #del (which it is in
AuditSparqlProcessor):
http://id.loc.gov/vocabulary/preservation/eventType/del

Are you suggesting we should use that mapping as well? Maybe owl:sameAs
or skos:closeMatch to link them?

2. Instead of using premis:validation, use premis:fixityCheck and/or
premis:messageDigestCalculation?
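To make the first question concrete, here is a rough sketch of what a skos:closeMatch link could look like as an N-Triples statement. It assumes `audit:contentRemoval` expands under the audit namespace URL quoted earlier in the thread; the helper name `close_match` is hypothetical.

```python
AUDIT = "http://fedora.info/definitions/v4/audit#"
PREMIS_EVENT_TYPE = "http://id.loc.gov/vocabulary/preservation/eventType/"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def close_match(subject: str, obj: str) -> str:
    """Serialize one skos:closeMatch statement as an N-Triples line."""
    return f"<{subject}> <{SKOS_CLOSE_MATCH}> <{obj}> ."


# Link the audit term to the Library of Congress "del" event type.
triple = close_match(AUDIT + "contentRemoval", PREMIS_EVENT_TYPE + "del")
print(triple)
```

Swapping in owl:sameAs would only change the predicate URI, though it makes a much stronger claim than skos:closeMatch.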

cheers!

-nruest


John Doyle

Apr 22, 2015, 6:03:47 PM

to fedor...@googlegroups.com, awo...@duraspace.org, MCrit...@ucsd.edu, fedora-...@googlegroups.com, mjo...@sfu.ca, rue...@gmail.com

We at NLM were also just looking at Nick's audit event mappings and were wondering if PREMIS, or PREMIS + PROV-O, might overlap with some of the current Fedora audit: events.

Being behind on the list posts, we weren't sure whether this had already been hashed out, however.

Could we use PREMIS:deletion in place of audit:contentRemoval (rather than in addition to), for both F3 event mappings and for F4-managed delete events?

John

Eric James

Apr 23, 2015, 5:32:27 PM

to fedor...@googlegroups.com, Andrew Woods, MCrit...@ucsd.edu, fedora-...@googlegroups.com, mjo...@sfu.ca, rue...@gmail.com

Hi Nick,

Yes, that's what I was suggesting on both of your questions. But your follow-up on inference mappings is interesting. I know Ben Armintor and Aaron Coburn have started work on pulling these together [1] [2]. Loading a triplestore with inferences seems useful, especially for federation.

-Eric

[1] https://github.com/duraspace/pcdm/pull/2

[2] https://github.com/AmherstCollege/acdc-ontology/blob/master/rdf/objectTypes.rdf

Mark Jordan

Apr 27, 2015, 1:31:09 PM

to Andrew Woods, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com, fedor...@googlegroups.com

Hi Andrew,

Sorry this feedback is late. Running the sample REST requests works as expected, but I have a suggestion for a feature. I notice that on object creation, the API returns the object's URI. Along the same lines, it might be useful to return in the response body the UUID of a successful event. External services that add events might find this convenient for internal record keeping.

Mark

Andrew Woods

Apr 27, 2015, 5:50:34 PM

to Mark Jordan, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com, fedor...@googlegroups.com

Hello Mark,

Thanks for the feedback. If I understand you correctly, you are saying that based on the requirements for the Audit Service that we collectively defined, Phase 1 works as expected. Is that right?

Additionally, you are saying that a previously undiscussed feature around returning event UUIDs would be useful in a subsequent iteration. In terms of this new requirement, I think some further elaboration would be helpful. Currently, when any update action happens in the repository, an event is emitted. Those events contain multiple header values with the event-specific information defined in the wiki:

https://wiki.duraspace.org/display/FF/Audit+Service+Repository+Events+and+Agents

When you say, "the API returns the object's URI", are you calling the content of the emitted message "the API"? You go on to say "return in the response body...". However, the Phase 1 flow is:

* Action in repository happens

* Repository emits event

* External Apache Camel component gets emitted event

* External Apache Camel component generates an event UUID

* External Apache Camel component publishes event details into other connected components (Fuseki and Solr, in this case).
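The flow above can be sketched in plain Python (this is not Camel code). The `Sink` class, the `handle_repository_event` function, and the header names are hypothetical placeholders, not the names documented on the wiki.

```python
import uuid


class Sink:
    """Hypothetical stand-in for a downstream store such as Fuseki or Solr."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def publish(self, record):
        self.records.append(record)


def handle_repository_event(headers: dict, sinks: list) -> dict:
    """Play the role of the external component: assign a UUID, then fan out."""
    record = dict(headers)  # copy, so the emitted message itself is left untouched
    record["event_id"] = str(uuid.uuid4())  # the ID is created here, outside the repository
    for sink in sinks:
        sink.publish(record)
    return record


fuseki, solr = Sink("fuseki"), Sink("solr")
# Placeholder header names and values for an emitted repository event.
emitted = {"resource": "/rest/object1", "event_type": "NODE_ADDED"}
handle_repository_event(emitted, [fuseki, solr])
print(fuseki.records[0]["event_id"] == solr.records[0]["event_id"])  # True
print("event_id" in emitted)  # False: the repository's own message carries no event ID
```

Note that the event ID never exists inside the repository in this flow, which is why there is nothing for the HTTP response to return.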

Again, are you calling the content of the emitted message the "response body"?

You likely have a useful addition to the Audit Service in mind. If you could elaborate (and potentially create a JIRA ticket), that would be very helpful.

Regards,

Andrew

p.s. Stay tuned for a Phase 2 update soon!

Andrew Woods

Apr 27, 2015, 6:41:45 PM

to Mark Jordan, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com, fedor...@googlegroups.com

Hello Mark,

On further reflection, I think I understand your suggestion. When a client creates a new resource in Fedora via POST (or PUT), the response contains HTTP headers and an HTTP body consisting of the URI of the created resource. Are you suggesting that, in addition to the URI of the resource, an event ID also be included in the response body?

If that is indeed the correct interpretation of your suggestion, it is something we could discuss. There would be potential complications on other update requests that do not or should not contain HTTP response bodies. Another complication is that, as noted previously, there currently is no "event ID" from Fedora's perspective. The UUID that you likely saw was the one generated by the Apache Camel component.

It strikes me as a slightly unconventional use of HTTP interactions (maybe HTTP headers would be more appropriate). But I would be interested to see if this addition would generally be viewed as important from other Audit Service stakeholders.

Regards,

Andrew

Esmé Cowles

Apr 27, 2015, 7:05:59 PM

to fedor...@googlegroups.com, Mark Jordan, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com

I think including the eventID in the response (either in the body or a header) would require a pretty big design change. Right now, the events are processed at the event-handling layer, and there is no way to pass any information back to the REST API where the responses are generated (in fact, the response is probably already finished before the event processing happens). This allows even the in-repository storage mechanism to be in a completely separate module, with no kernel or REST API code.

Related to this, I started to write up a ticket for this earlier. But by the time I got to the end of writing up the ticket, I realized I had gone in a different direction: at the event-handling layer, we should generate an ID for the audit node, and use that consistently in both the external triplestore and repository storage of the events:

https://jira.duraspace.org/browse/FCREPO-1507

Right now, the external triplestore and in-repository storage mechanisms both generate their own random IDs, leading to different identifiers being assigned to the same event.
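A rough sketch of the difference, with hypothetical function names and plain dicts standing in for the two storage mechanisms:

```python
import uuid


def store_with_own_id(event: dict) -> dict:
    """Current behaviour as described: each storage mechanism mints its own random ID."""
    return {**event, "id": str(uuid.uuid4())}


def store_with_shared_id(event: dict) -> dict:
    """Proposed behaviour: keep the ID assigned once at the event-handling layer."""
    return {**event}


event = {"resource": "/rest/object1", "type": "create"}

# Today: two stores, two different identifiers for the same event.
triplestore_copy = store_with_own_id(event)
repository_copy = store_with_own_id(event)
print(triplestore_copy["id"] == repository_copy["id"])  # False

# Proposal: assign the ID before fan-out so both copies agree.
event_with_id = {**event, "id": str(uuid.uuid4())}
shared_a = store_with_shared_id(event_with_id)
shared_b = store_with_shared_id(event_with_id)
print(shared_a["id"] == shared_b["id"])  # True
```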

-Esme


Andrew Woods

Apr 27, 2015, 7:10:14 PM

to fedor...@googlegroups.com, Mark Jordan, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com

Hello Esmé,

It may not exactly address Mark's suggestion, but I think your new ticket (FCREPO-1507) is a logical improvement. Thanks!

Andrew

Mark Jordan

Apr 28, 2015, 12:31:49 AM

to Andrew Woods, fedor...@googlegroups.com, Matt Critchlow, Nick Ruest, Joshua Westgard, fedora-...@googlegroups.com

Andrew, Esmé,

Since I'm responding after so many messages, I hope you don't mind if I contextualize my responses with URLs to the relevant entries from the fedora-tech list archives:

In https://groups.google.com/d/msg/fedora-tech/-7xfBGLdAaA/dhK5DcvzAdoJ Andrew states:

"If I understand you correctly, you are saying that based on the requirements for the Audit Service that we collectively defined, Phase 1 works as expected. Is that right?"

Correct.

In https://groups.google.com/d/msg/fedora-tech/-7xfBGLdAaA/_vkOBVBzcuYJ, Esmé says:

"I think including the eventID in the response (either in the body or a header) would require a pretty big design change. Right now, the events are processed at the event-handling layer, and there is no way to pass any information back to the REST API where the responses are generated (in fact, the response is probably already finished before the event processing happens). This allows even the in-repository storage mechanism to be in a completely separate module, with no kernel or REST API code."

I appreciate that what I am describing is a new feature, and that implementing it given the current codebase may require architectural changes that are more of a liability than a benefit. But I have to ask: if the response is already finished before the event processing happens, how does the HTTP response inform a client that the event processing has encountered an error? I have created a new feature JIRA ticket at https://jira.duraspace.org/browse/FCREPO-1508 to allow the conversation to continue there in a more focused manner, or to die on the table based on support from the community.

Regardless of the outcome of the above question, I completely agree with Andrew's opinion in https://groups.google.com/d/msg/fedora-tech/-7xfBGLdAaA/KHxLLuOjlVYJ that "(FCREPO-1507) is a logical improvement.".

Mark

aj...@virginia.edu

Apr 28, 2015, 9:22:58 AM

to fedor...@googlegroups.com

+1 to this. Coupling the event up into the HTTP layer would be difficult now and would become an overflowing bug bin as time goes by. That's partially because our use of HTTP is fundamentally synchronous. That is unlikely to change anytime soon.

---
A. Soroka
The University of Virginia Library
