Request for Input: Approach for Ingesting External Content

Andrew Woods

Jun 5, 2017, 5:08:49 PM
to fedor...@googlegroups.com, fedora-specifi...@googlegroups.com
Hello All,
During the process of editing the Fedora API Specification [1], it has surfaced that there are potentially two mechanisms for creating a binary resource whose body is supplied as a URL reference in the creation request.

The supported scenario is, "Create this binary resource and ingest as the body of the binary the content found at this URL".

There is a proposal to the Fedora API Specification to support this scenario with the use of the "message/external-body" Content-Type combined with an "expires" parameter:
https://github.com/fcrepo/fcrepo-specification/pull/121/files#diff-eacf331f0ffc35d4b482f1d15a887d3bR931

However, there is an existing capability in the Fedora implementation that supports this same scenario with the use of the Content-Location header on a create request (PUT/POST/PATCH):
https://github.com/fcrepo4/fcrepo4/blob/master/fcrepo-http-commons/src/main/java/org/fcrepo/http/commons/domain/ContentLocationMessageBodyReader.java
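
For readers comparing the two, here is a minimal Python sketch of the headers a client might send under each approach. It only builds header dictionaries and sends nothing. The function names and the example URL are hypothetical, and the `access-type=URL` and `URL` parameters are assumed from the RFC 2017 form of message/external-body rather than taken from the pull request; the date parameter is spelled `expiration` here, as in the RFC.

```python
from typing import Dict, Optional


def external_body_headers(url: str, expiration: Optional[str] = None) -> Dict[str, str]:
    """Headers for the proposed message/external-body approach (assumed parameter layout)."""
    params = ["access-type=URL", f'URL="{url}"']
    if expiration is not None:
        # RFC 822/1123 date-time string, passed through unchanged.
        params.append(f'expiration="{expiration}"')
    return {"Content-Type": "message/external-body; " + "; ".join(params)}


def content_location_headers(url: str) -> Dict[str, str]:
    """Headers for the existing Content-Location approach."""
    return {"Content-Location": url}
```

In both cases the request carries no entity body of its own; the only difference is which header names the source URL.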

The question for fedora-tech is, "Is there anyone currently using the second capability (Content-Location) who would not be able to refactor their client-side code to eventually use the 'message/external-body' approach?"

Thanks,
Andrew

Joshua Westgard

Jun 6, 2017, 9:19:20 AM
to Fedora Tech, fedora-specifi...@googlegroups.com
We would be fine with having just option #1 available. // Josh

Aaron Birkland

Jun 6, 2017, 12:26:25 PM
to Andrew Woods, fedor...@googlegroups.com, fedora-specifi...@googlegroups.com

Hi Andrew,

 

I’m having a hard time understanding what the spec is saying about “ingest as the body of the binary the content found at the URL”, understanding what the concept of “expiration” means, and figuring out how the two are related. 

 

It looks to me like the thing that is supposed to trigger Fedora to fetch external content and place it into a resource is the presence of an “expiration” parameter, but that does not make logical sense to me, and I fear I’m reading the section wrong.  Can you state, in plain English, what the distinguishing factor is in order to trigger (or not trigger) this fetching behavior?

 

Secondly, what does “expiration” mean in the context of external content, and what does it mean to “respect” it?  Does it have any bearing on the content once it’s in Fedora (i.e. does Fedora have to make it unavailable once it reaches the expiration date, or relay that expiration date to the user, etc.)?

 

Thanks,

 

  -Aaron

--
You received this message because you are subscribed to the Google Groups "Fedora Tech" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fedora-tech...@googlegroups.com.
To post to this group, send email to fedor...@googlegroups.com.
Visit this group at https://groups.google.com/group/fedora-tech.
For more options, visit https://groups.google.com/d/optout.

Benjamin Armintor

Jun 6, 2017, 12:47:09 PM
to fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com
From RFC 1521 section 7.3.3:

"EXPIRATION -- The date (in the RFC 822 "date-time" syntax, as
      extended by RFC 1123 to permit 4 digits in the year field) after
      which the existence of the external data is not guaranteed."

In the context of external content (here, content which is not provided as the request entity), it could mean a temporary file, or a one-time-use URL, or any number of other things. The spec says only that message/external-body's EXPIRATION parameter should be respected, i.e., the content should be fetched and stored, or the implementation should somehow expire the resource, or some other behavior. The EXPIRATION parameter shouldn't be ignored, but the spec doesn't say what an implementation should do.
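
As a rough illustration of the date syntax only, an EXPIRATION value could be parsed and compared against the current time along these lines. This is a sketch, not anything the spec prescribes: the function name is hypothetical, and treating a value without a usable time zone as UTC is an assumption made here for simplicity.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expiration: str, now: datetime) -> bool:
    """Return True if `now` is later than the RFC 822/1123 date in `expiration`."""
    deadline = parsedate_to_datetime(expiration)
    if deadline.tzinfo is None:
        # Assumption: a date with no usable zone information is read as UTC.
        deadline = deadline.replace(tzinfo=timezone.utc)
    return now > deadline
```

What to do once the date has passed (or before it does) is exactly the part left open to the implementation.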

- Ben

Benjamin Armintor

Jun 6, 2017, 1:28:31 PM
to Aaron Birkland, fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com
Hi Aaron-

The critical thing here is that the EXPIRATION parameter is in the RFC for message/external-body, and it's applicable to all access-types, so I wanted to make sure its implications were in the spec. I would really caution against regarding expiration as a new feature, since that behavior may not be how an implementation deals with the issue.

There's no explicit mechanism in the spec for what you describe. I'd argue that using Content-Location is a little strange, in fact (since it is supposed to accompany an entity that it also identifies, and seems a bit more like a redirect instruction than a copy instruction to me).

- Ben

On Tue, Jun 6, 2017 at 1:04 PM, Aaron Birkland <a...@jhu.edu> wrote:

Hi Ben,

 

Ah, so maybe I was confused with the implication that the “expires” header was somehow related to the notion of instructing Fedora to download an external resource and create an LDP-NR from its content.  The notion of “expiration” seems like the important thing here, not “fetch content”.  i.e. a Fedora impl could simply redirect to that URL until the expiration date is hit, then respond with 410 once it expires, for example?

 

It seems like the concept of expiration of a resource (or resource’s content) is a new feature/capability.  Is it applicable only to external resources, or is the intent to make other sorts of content expirable?

 

As for Andrew’s original phrasing of the question, is there a mechanism that specifically/explicitly supports "Create this binary resource and ingest as the body of the binary the content found at this URL"?

 

Thanks!

 

  -Aaron

Aaron Birkland

Jun 6, 2017, 1:54:59 PM
to Benjamin Armintor, fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com

Hi Ben,

 

In any case, it seems the notion of expiration is unrelated to Andrew’s question about whether message/external-body is sufficient to support the stated scenario: “Create this binary resource and ingest as the body of the binary the content found at this URL”.  Since there’s no explicit mechanism in the spec to do so, then the answer is “no”?

Benjamin Armintor

Jun 6, 2017, 2:06:14 PM
to Aaron Birkland, fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com
Aaron,

I think Andrew is asking something slightly different - he would know better than I, but I think his message intends to:

1. suggest that the MODE Fedora implementation would move to implementing "Content-Type: message/external-body;expiration=X" as "ingest to Fedora's binary store from this URL"

2. note that this is currently handled with a different header interaction ("Content-Location: X")

3. ask if clients of the MODE Fedora implementation that make use of this functionality would be in a position to refactor if this interaction pattern is adopted
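
To make points 1 and 2 concrete, here is a hedged Python sketch of how a server might work out which URL to ingest from when it accepts both interaction patterns. It is not the actual fcrepo4 code: the function name is hypothetical, the `URL` parameter is assumed from RFC 2017, and header lookup is a plain case-sensitive mapping for brevity.

```python
from email.message import Message
from typing import Mapping, Optional


def ingest_source(headers: Mapping[str, str]) -> Optional[str]:
    """Return the URL to ingest from, or None if the request names no external source."""
    content_type = headers.get("Content-Type")
    if content_type is not None:
        # Reuse the standard library's MIME header parser for the parameters.
        msg = Message()
        msg["Content-Type"] = content_type
        if msg.get_content_type() == "message/external-body":
            url = msg.get_param("URL")
            if isinstance(url, str):
                return url
    # Fall back to the existing Content-Location interaction.
    return headers.get("Content-Location")
```

A client refactoring from one pattern to the other would only change which header carries the URL; the value returned here is the same either way.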

Since it sounds like JHU isn't using this pattern right now, maybe it's a moot point; however, if you have thoughts about this type of feature more broadly, it would be welcome on the PR currently open to specify it: https://github.com/fcrepo/fcrepo-specification/pull/121

- Ben


Andrew Woods

Jun 6, 2017, 2:29:55 PM
to Benjamin Armintor, Aaron Birkland, fedor...@googlegroups.com, fedora-specifi...@googlegroups.com
Thank you, Ben.
Your re-articulation of my question is correct.
...and if no one is using the current Fedora implementation "Content-Location: X" approach, this question is more broadly moot.
Andrew

Aaron Birkland

Jun 6, 2017, 2:30:43 PM
to Benjamin Armintor, fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com

Hi Ben,

 

Ok, so just to clarify point 1, “suggest that the MODE Fedora implementation would move to implementing ‘Content-Type: message/external-body;expiration=X’ as ‘ingest to Fedora's binary store from this URL’”:

 

Is this simply saying that “ingest to Fedora’s binary store” is how the MODE implementation is deciding to handle external content with an “expiration” parameter?  Is it correct to say that the specification itself does *not* consider the “expiration” parameter to indicate that the client wishes to have Fedora ingest the content into its binary store?  So the MODE impl will use ‘expiration’ as a de facto instruction to “durably persist that ephemeral content over there”, but the spec doesn’t explicitly treat it that way?

Aaron Birkland

Jun 7, 2017, 9:33:33 AM
to Benjamin Armintor, fedor...@googlegroups.com, Andrew Woods, fedora-specifi...@googlegroups.com

Hi Ben,

 

Ah, so maybe I was confused with the implication that the “expires” header was somehow related to the notion of instructing Fedora to download an external resource and create an LDP-NR from its content.  The notion of “expiration” seems like the important thing here, not “fetch content”.  i.e. a Fedora impl could simply redirect to that URL until the expiration date is hit, then respond with 410 once it expires, for example?
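
The redirect-then-410 idea could be sketched in Python roughly as follows. This is only an illustration of one possible reading of "respecting" the expiration: the function name is hypothetical, and the choice of 307 for the redirect is an assumption rather than anything the spec states.

```python
from datetime import datetime
from typing import Dict, Tuple


def respond_to_get(url: str, deadline: datetime, now: datetime) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a GET on a resource backed by external content."""
    if now > deadline:
        # Past the expiration date: the content is treated as gone.
        return 410, {}
    # Before expiration: send the client to the external URL.
    return 307, {"Location": url}
```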

 

It seems like the concept of expiration of a resource (or resource’s content) is a new feature/capability.  Is it applicable only to external resources, or is the intent to make other sorts of content expirable?

 

As for Andrew’s original phrasing of the question, is there a mechanism that specifically/explicitly supports "Create this binary resource and ingest as the body of the binary the content found at this URL"?

 

Thanks!

 

  -Aaron

 

 

