Re: [NISO resourcesync_d2d] ResourceSync specification -- beta draft


Simeon Warner

Jan 30, 2013, 2:55:03 PM
to resource...@list.niso.org, resour...@googlegroups.com
On 1/29/13 7:36 PM, Martin Klein wrote:
> We would like to share the beta draft of the ResourceSync
> specification with you. The document is currently available at:
>
> http://megalodon.lanl.gov/resync/spec/spec_draft_2013-01-29.html

I think this is in good shape; incorporating the results of our
recent calls has resolved a number of issues. Suggested minor
corrections and comments are below. I point out a few places where the
language still assumes ZIP-only for dumps (this seems OK in the
walkthrough or specific examples but not elsewhere).

Cheers,
Simeon


1. Introduction

- I still don't think we should use <urlset> etc. in the intro. I suggest
simply removing it from the bullet; it is nicely introduced later:

"o Section 3. Sitemap Document Formats describes the way in which the
<urlset> and <sitemapindex> document formats introduced by the Sitemap
protocol are used to convey synchronization related information in all
ResourceSync capabilities."
=>
"o Section 3. Sitemap Document Formats describes the way in which the
document formats introduced by the Sitemap protocol are used to convey
synchronization related information in all ResourceSync capabilities."

1.1 Motivating Examples

Simplify: "Publishing a Resource Dump that points at content packaged in
ZIP files that also contains a Resource Dump Manifest that describes the
packaged content, makes this more efficient for the client and less
burdensome for the server." => "Publishing a Resource Dump that points
to content described and packaged in ZIP files makes this more efficient
for the client and less burdensome for the server."

2.1 Walkthrough

I think this section reads well.

"with optimizing" -> "optimize"
"straight-forward" -> "straightforward"

2.2 Overview

"aligned" seems like an odd term to introduce; I suggest sticking with
"synchronized".

3. Sitemap Document Formats

for later: I wonder about urlset/rs:md@modified. If we want this value
inside the sitemap to match the HTTP Last-Modified then, for use cases
where the sitemap is written out to a file, it will be necessary for the
writing app to adjust the mtime (or Windows equivalent) after writing
the file. We should probably talk about this when we come to writing
some implementation guidelines. (Obviously not an issue for a dynamic
response.)
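
To make the mtime adjustment concrete, here is a minimal sketch in Python. It
assumes a static sitemap file and a modified value in UTC "Z" form only. The
function name write_sitemap_with_mtime is made up for illustration, and nothing
here is prescribed by the draft spec.

```python
import os
from datetime import datetime, timezone


def write_sitemap_with_mtime(path, xml_text, modified_iso):
    """Write a sitemap file, then set its mtime to the rs:md modified value.

    Assumption: modified_iso looks like 2013-01-30T14:55:03Z (UTC only).
    A web server that derives Last-Modified from the file mtime would then
    report the same time that the sitemap carries internally.
    """
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(xml_text)
    dt = datetime.strptime(modified_iso, "%Y-%m-%dT%H:%M:%SZ")
    ts = dt.replace(tzinfo=timezone.utc).timestamp()
    # os.utime takes (atime, mtime); both are set to the modified time here.
    os.utime(path, (ts, ts))
    return ts
```

Whether a given server really takes Last-Modified from the file mtime is
deployment specific, so this only covers the file-writing side.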

urlset/url/rs:md@change: "Defined values are create, update, and
delete..." => "Defined values are created, updated, and deleted..."

urlset/url/rs:md@etag: should we follow the form of the definition of
length and refer to rfc2616? Perhaps add a second sentence: "The value
of the etag attribute should be equal to the value of the ETag header in
the HTTP response as defined in RFC 2616, Sec. 3.11."
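
As an illustration of "equal to the value of the ETag header", a small Python
sketch that copies header values into an rs:md element. The namespace URI, the
helper name md_from_headers, and the use of a plain dict for headers are
assumptions for the example, not text from the draft.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the rs: prefix.
RS_NS = "http://www.openarchives.org/rs/terms/"


def md_from_headers(headers):
    """Build an rs:md element whose attributes mirror HTTP response headers.

    headers is a plain dict with exact-case keys (an assumption; real HTTP
    header names are case-insensitive).
    """
    md = ET.Element("{%s}md" % RS_NS)
    if "ETag" in headers:
        # Copied verbatim, including the surrounding double quotes.
        md.set("etag", headers["ETag"])
    if "Content-Length" in headers:
        md.set("length", headers["Content-Length"])
    return md
```

For example, md_from_headers({"ETag": '"abc123"', "Content-Length": "8876"})
yields an element whose etag attribute is the header value unchanged.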

urlset/url/rs:md@path: "...in the ZIP file. That is the relative file
system path where the bitstream would reside if the ZIP were unpacked."
=> "...in the ZIP or other package file. That is the relative file
system path where the bitstream would reside if the package were unpacked."
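
A quick way to picture the path attribute: each value should name a member of
the package. A minimal Python sketch for the ZIP case follows. The helper name
missing_paths is hypothetical, and the handling of a leading "/" on path values
is an assumption rather than something stated here.

```python
import io
import zipfile


def missing_paths(zip_bytes, manifest_paths):
    """Return the manifest path values that have no matching ZIP member.

    Assumption: path values may start with "/" while ZIP member names do
    not, so a leading slash is stripped before comparing.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        members = set(zf.namelist())
    return [p for p in manifest_paths if p.lstrip("/") not in members]
```

An empty result means every path in the manifest resolves to a bitstream that
would exist after unpacking.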

urlset/url/rs:ln@{etag, hash, length, modified, path, type}: I think we
need to say that these apply to the related resource. Suggest changing
"Optional attributes with meanings as described above" => "Optional
attributes [describing the related resource] with meanings as above".

4.2 Resource List Index

Thanks for incorporating some of my suggestions on introducing the
index, I feel pretty happy with this section now. (I'll drop the notion
of wanting three terms since it seems that nobody else followed up on my
mail about the issue :-) )

missing " and" after the bullet for "http://example.com/resourcelist3.xml,"

5.1 Resource Dump

Should the link for "ZIP file format" be in the references?

5.1.2. Resource Dump Manifest Index

"single ZIP package" => "single package"

6.1 Change List

I think the current form misses the idea that a Source would update
a Change List. I suggest this change in the intro paragraph: "It is up to
the Source to determine the frequency with which it publishes [or
updates] Change Lists and also the time period that the Change List covers."

7.1.1 Change Dump Manifest

"top level of the ZIP file" => "top level of the ZIP or other format
package"

8.1 Mirrored Content

clarity: "Per mirror location, a <rs:ln> element is introduced." I think
this would be clearer as "A <rs:ln> element is used to express each mirror
location for a resource."

should all the rel= and pri= in the example be red?

8.2 Alternate Representations

"Yet another client" => "Another client" -- it is only the second client

"...in a ZIP file." => "in a ZIP or other package file."

8.3 Partial Content

Is "http://www.openarchives.org/rs/terms/diff" the best term? What about
"http://www.openarchives.org/rs/terms/partial" which doesn't make any
suggestions about the format but ties with the section name?

8.4 Resources and Metadata about Resources

missing "as" in "...are regarded [as] resources with distinct URIs"

"describedBy" => "describedby"

would example 8.5 be clearer if the two entries were a symmetric pair?
(ie. m -> describes -> r and r -> describedBy -> m instead of different
m and r at present)

10.2 Capability List Index

Example 10.2
- spacing in first <loc>
- should we include at least one describedby link (to an html doc for
example)? This is something I think we need to encourage (I think we
might want to add even more to the intro of section 10.2 because it is a
new idea really quite different from the rest of the spec -- kinda like
rolling multiple sources into one)

10.3.1 ResourceSync Well-Known URI

Seems wrong to have example.com in the URI, I think it was better before
as a local URI "/.well-known/resourcesync"
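
The local form still gives a client everything it needs, since the path can be
resolved against any URI at the Source. A minimal sketch in Python, where the
function name well_known_uri is made up for illustration:

```python
from urllib.parse import urljoin

# The local URI as suggested above.
WELL_KNOWN_PATH = "/.well-known/resourcesync"


def well_known_uri(any_uri_at_source):
    """Resolve the well-known path against the host of a URI at the Source."""
    return urljoin(any_uri_at_source, WELL_KNOWN_PATH)
```

With http://example.com/dir/resourcelist.xml as input this gives
http://example.com/.well-known/resourcesync, so example.com only needs to
appear in examples, not in the definition.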



Herbert van de Sompel

Jan 30, 2013, 4:00:53 PM
to Simeon Warner, resource...@list.niso.org, resour...@googlegroups.com
Thanks a lot, Simeon. We are implementing the changes.

I actually had been wondering about your following comment too:

<snip>

8.3 Partial Content

Is "http://www.openarchives.org/rs/terms/diff" the best term? What
about "http://www.openarchives.org/rs/terms/partial" which doesn't
make any suggestions about the format but ties with the section name?

</snip>

I came from the perspective that the title "Partial Content" doesn't
really convey what this is about. I mean "Partial Content" could
readily be misinterpreted to be related to "HTTP 206 Partial Content",
which this section is not about. But I'm not sure about another title
and hence another relation URI. I was playing with "Describing
Changes" but that's not OK because we already have that title for the
Change List section. "Describing Resource Changes" could work, maybe,
but then I don't see what relation URI would go with that.

Ideas?

Herbert



--
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/

==

Herbert van de Sompel

Jan 30, 2013, 4:19:26 PM
to Simeon Warner, resource...@list.niso.org, resour...@googlegroups.com
hi Simeon, all,

Regarding the recurring "ZIP or other packaging format" theme in your comments:

I seem to have a different recollection of the decision we made in
this respect in the last call. I thought that we had decided to, for
now, put the spec out with just ZIP as the format and then see what
the public feedback to that approach is.

I am in favor of that approach, for now, because:

- It allows us to hear whether people have issues with a single format
and, if so, what their requirements are
- It allows us to, at this point, put a spec out that has no loose
ends regarding packaging formats. The issue with opening the door to
other formats in this version of the spec is that we don't know how to
deal with them yet, cf our conference call, eg how to deal with
multiple formats at the same time? So, we would "allow" something
without knowing how to do it, we provide no examples, etc.
- It allows us to reflect on this matter at the upcoming in-person
meeting, informed by community feedback

So, my proposal is to only refer to ZIP in this version. Can we live with that?

Herbert


On Wed, Jan 30, 2013 at 12:55 PM, Simeon Warner
<simeon...@cornell.edu> wrote:

Simeon Warner

Jan 30, 2013, 4:46:14 PM
to resource...@list.niso.org, resour...@googlegroups.com
On 1/30/13 4:19 PM, Herbert van de Sompel wrote:
> hi Simeon, all,
>
> Regarding the recurring "ZIP or other packaging format" theme in your comments:
>
> I seem to have a different recollection of the decision we made with
> this respect in the last call. I thought that we had decided to, for
> now, put the spec out with just ZIP as the format and then see what
> the public feedback to that approach is.

In Martin's notes from the call:

> 10) is the ZIP format the recommended or THE ONLY format
> - if only recommended, meaning more than one are possible, should we
> include the "type" attribute in link
> decision: ONLY ONE packaging format at a time, in a capability list
> - mandate "type" attribute
> - ZIP is the recommended format (tar.* possible)

> I am in favor of that approach, for now, because:
>
> - It allows us to hear whether people have issues with a single format
> and, if so, what their requirements are
> - It allows us to, at this point, put a spec out that has no loose
> ends regarding packaging formats. The issue with opening the door to
> other formats in this version of the spec is that we don't know how to
> deal with them yet, cf our conference call, eg how to deal with
> multiple formats at the same time? So, we would "allow" something
> without knowing how to do it, we provide no examples, etc.
> - It allows us to reflect on this matter at the upcoming in-person
> meeting, informed by community feedback

I agree that there is an awkwardness that resulted in us concluding on
the call that we would permit only one format at a time. However, I
prefer that to allowing just one format.

> So, my proposal is to only refer to ZIP in this version. Can we live with that?

If we want to reverse that decision then we have to change the spec in
the places where it currently says "ZIP or other" (it is mixed at the
moment). I think the change to say "ZIP or other" everywhere is the
easier fix and also the better approach.

I could live with putting something out that has "thou must use ZIP" but
I'll argue again at the in-person meeting for a more extensible approach.

Cheers,
Simeon

Simeon Warner

Jan 30, 2013, 4:56:53 PM
to resource...@list.niso.org, resour...@googlegroups.com
On 1/30/13 4:00 PM, Herbert van de Sompel wrote:
> I actually had been wondering about your following comment too:
>
> <snip>
>
> 8.3 Partial Content
>
> Is "http://www.openarchives.org/rs/terms/diff" the best term? What
> about "http://www.openarchives.org/rs/terms/partial" which doesn't
> make any suggestions about the format but ties with the section name?
>
> </snip>
>
> I came from the perspective that the title "Partial Content" doesn't
> really convey what this is about. I mean "Partial Content" could
> readily be misinterpreted to be related to "HTTP 206 Partial Content",
> which this section is not about.

Good point Herbert, I've worried about that too. I agree that another
name entirely would be good.... but:

> But I'm not sure about another title
> and hence another relation URI. I was playing with "Describing
> Changes" but that's not OK because we already have that title for the
> Change List section. "Describing Resource Changes" could work, maybe,
> but then I don't see what relation URI would go with that.
>
> Ideas?

Tricky... how about "Patching Resources" and rel=".../patch"? There is
some baggage from the Unix patch utility but I think the term is used
for updates more generally. Both of the MIME types we quote in that
section have the word "patch" in them.

Cheers,
Simeon

Herbert Van de Sompel

Jan 30, 2013, 5:04:41 PM
to Simeon Warner, resource...@list.niso.org, resour...@googlegroups.com
Simeon

I like Patching Resources.

HTTP Patch also exists, see https://tools.ietf.org/html/rfc5789, but the semantics of "patch" are the same: updating a resource without having to fully replace it.

H


Martin Klein

Jan 30, 2013, 7:19:11 PM
to Herbert Van de Sompel, Simeon Warner, resource...@list.niso.org, resour...@googlegroups.com
Hi all,

This is just to inform you that we
- incorporated Simeon's comments and
- added 2 editor's notes
to the version available at:

http://megalodon.lanl.gov/resync/spec/spec_draft_2013-01-29.html


cheers
Martin

Simeon Warner

Jan 31, 2013, 9:29:28 AM
to Martin Klein, resource...@list.niso.org, resour...@googlegroups.com
Putting an editors' note about ZIP seems like a good compromise for now.
(Since there are multiple editors I think the apostrophe should go after
the s though.)

Unless I hear otherwise I'll plan on getting this version up on the OAI
website as

http://www.openarchives.org/rs/0.5/resourcesync

either late tonight or tomorrow morning (probably not until the morning).

Cheers,
Simeon

Herbert van de Sompel

Jan 31, 2013, 10:59:32 AM
to Simeon Warner, Martin Klein, resource...@list.niso.org, resour...@googlegroups.com
On Thu, Jan 31, 2013 at 7:29 AM, Simeon Warner
<simeon...@cornell.edu> wrote:
> Putting an editors' note about ZIP seems like good compromise for now.
> (Since there are multiple editors I think the apostrophe should go after the
> s though.)
>
> Unless I hear otherwise I'll plan on getting this version up on the OAI
> website as
>
> http://www.openarchives.org/rs/0.5/resourcesync
>
> either late tonight to tomorrow morning (probably not until the morning).
>

That sounds good. We are still making tiny edits, so please download
the version from the megalodon URI at the moment you are ready to put
it up on openarchives. And please let us all know once you have.

A few things:

- If further, substantial, comments come in prior to you publishing on
openarchives, I propose to consider those part of the public feedback
cycle. Smaller issues (e.g. typos) could possibly still be taken into
account.

- May I repeat my request to put a link on the OAI home page under
"Current OAI Projects" that points to the most recent version of the
spec (via a generic URI)? Also, I think it would be good to list those
projects in reverse chronological order: ResourceSync, ORE, PMH.

- I will prepare a mail that can be sent to several lists to announce
the availability of the draft spec and to request feedback. I will
share a draft of that mail with the editors. Once we agree on the
wording, we can decide who sends to which lists. We should also put
that text under "News" on the OAI site. This should all be done early
next week.

Cheers

Herbert