Re: The Case for Dereferencable Canvas URIs

Benjamin L Albritton

Jul 22, 2016, 12:15:48 PM

to Jeffrey C. Witt, IIIF Manuscripts, IIIF Discuss

Dear Patrick & Jeff,

I also agree that this is an important service to support from a linked-data point of view - and not just for medievalists, but for institutions that support annotation of any resource (newspapers, art works, archival materials, etc.). Many institutions have not made Canvas and Annotation URIs de-referencable for any number of reasons, including low demand to date from the community. I notice that you have added some underlying use-cases to https://github.com/IIIF/iiif-stories/issues and already received some comment there, which is an excellent way of gaining some exposure on this issue.

In order to engage the broader community - the community that can encourage institutions into a best-practice, or can exert some influence over the way the spec is currently written - I'm going to bring this conversation over to the more general IIIF-discuss list and I encourage continuing this discussion there.

Best,

Ben

From: iiif-man...@googlegroups.com <iiif-man...@googlegroups.com> on behalf of Jeffrey C. Witt <jeffre...@gmail.com>
Sent: Friday, July 22, 2016 9:07:04 AM
To: IIIF Manuscripts
Subject: Re: The Case for Dereferencable Canvas URIs

Dear Patrick,

I quite agree. Our TEI transcriptions link to canvases at every column break or page break. The problem is these canvas ids are useless unless I also know the manifest in which they live. Usually, I'm interested in using this canvas to find an available image to display. It can also be fairly computationally heavy to have to loop through a long manifest to find the desired canvas and then find an available image.

This problem was part of the impetus of my (likely temporary) decision to take all the manifests relevant to my data and ingest them into a triple store. Now I can simply query for the canvas in my own data set and get the desired data without relying on a given institution to make their canvases de-referencable.

Best,

jw

On Wednesday, July 20, 2016 at 6:43:21 PM UTC-4, Patrick Cuba wrote:

I wanted to start within this group, as I think the case is most applicable to Manuscripts hosting institutions and the pressure would come from users of the manuscript resources. Here's the short version:

The IIIF Presentation API allows that "Canvases may be dereferenced separately from the manifest via their URIs", but I would like to strongly encourage the dereferencability of Canvas objects. The "Protocol Behavior" also holds it as a recommendation, even if the API language is weak.

The use cases I have already come across are in IIIF-Stories (#53, #54, #55, #56) and individual detail is available. Please comment there to undermine my argument and provide a way out in each case if you can.

The problem in more detail is that the standard misleads a little by assuming that each Canvas is uniquely sequenced and that the Manifest which sequences it is always available. The Canvas @id is always like: "http://example.org/iiif/book1/canvas/p1" because IIIF says so:

Canvases must be identified by a URI and it must be an HTTP(s) URI.

which means that even if http://example.org/iiif/book1/canvas/p1 doesn't go anywhere, it looks like it might. In fact, most repositories I have tested return some version of 404 when you try to resolve their Canvas URIs. A non-dereferencable URI means that any application given only an "oa:Annotation" with an "on" property will never know how to find the Canvas object for display unless also given a Manifest URI. It means that any application seeking to describe, select, comment, transcribe or otherwise annotate a Canvas cannot display the Canvas it intends to annotate. If this application is a service, an iframe, or even an embeddable component, it is not enough to have a Canvas URI string; it must instead know the entire Manifest, the sequences index and the canvases index (or at least the Manifest and a Canvas identifier to which to iterate).

So if you work for a hosting institution or you are using one for your research, please ask them to include the Canvas pattern in their support of IIIF. They don't have to, but it would make me so happy.

Respectfully,

Patrick Cuba

Center for Digital Humanities
Saint Louis University

--
You received this message because you are subscribed to the Google Groups "IIIF Manuscripts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iiif-manuscrip...@googlegroups.com.
To post to this group, send email to iiif-man...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/iiif-manuscripts/724dbaac-38f9-4dc9-bb9b-0ee9aeaa7a1c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Benjamin L Albritton

Aug 3, 2016, 12:40:49 PM

to IIIF Manuscripts, iiif-d...@googlegroups.com

Per discussion on community call today, bumping this up in folks' inboxes. Hopefully we can continue this discussion with the general group.

Best,

B.

From: iiif-d...@googlegroups.com <iiif-d...@googlegroups.com> on behalf of Benjamin L Albritton <blal...@stanford.edu>
Sent: Friday, July 22, 2016 9:15:41 AM
To: Jeffrey C. Witt; IIIF Manuscripts
Cc: IIIF Discuss
Subject: [IIIF-Discuss] Re: The Case for Dereferencable Canvas URIs

--
-- You received this message because you are subscribed to the IIIF-Discuss Google group. To post to this group, send email to iiif-d...@googlegroups.com. To unsubscribe from this group, send email to iiif-discuss...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/iiif-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "IIIF Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iiif-discuss...@googlegroups.com.

Robert Sanderson

Aug 4, 2016, 8:03:30 PM

to Benjamin L Albritton, IIIF Manuscripts, iiif-d...@googlegroups.com

Trimming the thread down to what I believe the problem statement is:

The problem in more detail is that the standard misleads a little by assuming that each Canvas is uniquely sequenced and that the Manifest which sequences it is always available. The Canvas @id is always like: "http://example.org/iiif/book1/canvas/p1" because IIIF says so:

Canvases MUST be identified by a URI and it must be an HTTP(s) URI.

which means that even if http://example.org/iiif/book1/canvas/p1 doesn't go anywhere, it looks like it might. In fact, most repositories I have tested return some version of 404 when you try to resolve their Canvas URIs.

I don't think that the specification is misleading, it explicitly says:

Canvases MAY be dereferenced separately from the manifest via their URIs [...]

Which is as intended, as there has yet to be any use case articulated that requires them to be separately dereferenceable.

A non-dereferencable URI means that any application given only an "oa:Annotation" with an "on" property will never know how to find the Canvas object for display unless also given a Manifest URI.

Correct. And thus we recommend in the Search API to include a within reference from the target Canvas to the Manifest that it is part of.

As per: http://iiif.io/api/search/1.0/#target-resource-structure
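As a rough illustration of that recommendation, the sketch below pulls both the Canvas URI and the Manifest URI out of an annotation whose `on` carries a `within`. The function name `split_target` and the example URIs are hypothetical, and the `on` shape is assumed to follow the Search API target structure linked above; annotations from other sources may look different.

```python
def split_target(annotation):
    """Return (canvas_uri, manifest_uri); manifest_uri is None when no `within` is present."""
    on = annotation["on"]
    if isinstance(on, str):
        # A bare string target only tells us the Canvas URI.
        return on, None
    within = on.get("within")
    if isinstance(within, dict):
        # `within` may be an object with its own @id, or just a URI string.
        within = within.get("@id")
    return on.get("@id"), within


# Hypothetical annotation, shaped like a Search API result.
annotation = {
    "@type": "oa:Annotation",
    "on": {
        "@id": "http://example.org/iiif/book1/canvas/p1#xywh=0,0,600,900",
        "within": {
            "@id": "http://example.org/iiif/book1/manifest",
            "type": "sc:Manifest",
        },
    },
}

canvas_uri, manifest_uri = split_target(annotation)
```

With a plain string `on`, the second value comes back as `None`, which is exactly the situation described in the quoted paragraph: the client has a Canvas URI and nowhere to look it up.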

It means that any application seeking to describe, select, comment, transcribe or otherwise annotate a Canvas cannot display the Canvas it intends to annotate.

How did the application know the URI of the Canvas in the first place, if it didn't find it from a Manifest?

Even if the canvas were dereferencable, I don't see how that would help you find its Manifest without the above within link?

Also, as a Canvas might be part of multiple Manifests at the same time (not commonly, but possibly) it might be confusing to not pass the manifest URI if the Canvas was actually selected from a particular manifest. If not, then the user might find themselves in a different object than the one they started in. Without the context of the object, as provided by the Manifest, how would the user know what they're looking at in the Canvas? It would go from being "page 16 of The Lord of the Rings by Tolkien" to just "page 16".

Rob

Patrick Cuba

Aug 5, 2016, 2:15:45 PM

to iiif-d...@googlegroups.com, IIIF Manuscripts

Thank you for distilling this, Rob. I regret missing the discussion in the call. Response inline.

The problem in more detail is that the standard misleads a little ...

Canvases MUST be identified by a URI and it must be an HTTP(s) URI.

which means that even if http://example.org/iiif/book1/canvas/p1 doesn't go anywhere, it looks like it might...

I don't think that the specification is misleading, it explicitly says:
Canvases MAY be dereferenced separately from the manifest via their URIs [...]

The point is that it MUST be an HTTP URI, so it always looks like it will resolve, even though it only MAY. This isn't a critical point, because what I am encouraging is for everyone assigning a Canvas URI to come down on the positive side of the MAY option.

Which is as intended, as there has yet to be any use case articulated that requires them to be separately dereferenceable.

I have placed several in the IIIF-stories. None of these require more than the `within` from the Search API, but they are all made almost absurdly complex without a dereferenceable Canvas. Here's a summary:

https://github.com/IIIF/iiif-stories/issues/54 - Display for image annotations, currently used at soundingtennyson.org (see comment on issue). With a dereferenceable Canvas URI the process is 1) read `on`; 2) load Canvas; 3) crop image to display. Without (and assuming a `within` as below, since it is impossible otherwise) the process is 1) read `within`; 2) load Manifest; 3) loop through [`sequences`] and [`sequences.canvases`] to find the Canvas from the `on`; and proceed as above. With several annotations displayed from different Manifests, this becomes very unwieldy.
https://github.com/IIIF/iiif-stories/issues/56 - Broken Books, currently used at brokenbooks.org. Reassembling dispersed manuscripts into virtual collections right now (without dereferenceable Canvas URIs) means we have to upload/link the images themselves instead of the Canvases, which is contrary to interoperability. If a quire of 10 at Stanford and Yale each were already catalogued in their own Manifests, I should be able to generate a new Manifest ordering them together through reference to the Canvas URI each repository maintains. Forking them into the new Manifest means the `within` property becomes an unhelpful array and the aggregated version will not update when better description or photography becomes available, for example.
https://github.com/IIIF/iiif-stories/issues/53 - Annotation tool, currently in development at CDH. Any 3rd party annotation tool would need the entire Canvas object passed to it to function, rather than just the URI, as Mirador can. Liz Fischer has shared a great little tool to crop IIIF images, but it is notable that this would not work if you tried to give it a Canvas URI instead of the `images[0].resource.@id` value.
https://github.com/IIIF/iiif-stories/issues/55 - Transcription button, currently in development at CDH and used by the Newberry's French Renaissance Paleography. Like the annotation tool, sending a URI for a single Canvas is preferable for very large Manifests with small areas of interest, such as a single article in a newspaper, a map within a manuscript, a letter in a collection, an entry in a large catalog or index, or any crowdsourced project like Zooniverse where the individual user's path is not linear through the material.
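The "without" path in the first use case can be sketched roughly as follows, assuming a Presentation API 2.x Manifest that has already been loaded into a Python dict. The function names `find_canvas` and `first_image_uri` are hypothetical; the property paths (`sequences`, `canvases`, `images[0].resource.@id`) are the ones named in the list above.

```python
from urllib.parse import urldefrag


def find_canvas(manifest, on_uri):
    """Loop through every sequence and canvas to find the one targeted by `on_uri`."""
    # Drop any fragment such as #xywh=... before comparing against Canvas @id values.
    canvas_uri, _fragment = urldefrag(on_uri)
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            if canvas.get("@id") == canvas_uri:
                return canvas
    return None


def first_image_uri(canvas):
    """Return images[0].resource.@id for a Canvas, or None if it has no image."""
    images = canvas.get("images", [])
    return images[0]["resource"]["@id"] if images else None
```

The loop itself is short; the cost discussed in this thread is the extra Manifest request and the need to know the Manifest URI at all, repeated for every distinct Manifest among the annotations being displayed.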

A non-dereferencable URI means that any application given only an "oa:Annotation" with an "on" property will never know how to find the Canvas object for display unless also given a Manifest URI.
Correct. And thus we recommend in the Search API to include a within reference from the target Canvas to the Manifest that it is part of.
As per: http://iiif.io/api/search/1.0/#target-resource-structure

Big fan. I hope people use it as well.

It means that any application seeking to describe, select, comment, transcribe or otherwise annotate a Canvas cannot display the Canvas it intends to annotate.
How did the application know the URI of the Canvas in the first place, if it didn't find it from a Manifest?
Even if the canvas were dereferencable, I don't see how that would help you find its Manifest without the above within link?

Digitally, I believe there is useful freedom from the item-first approach physical libraries require. If I host an index of seals and want to display 116 representations of stags from repositories all over the world, I would love to be able to read an array of Annotations (from storage or query) that clips each of the image fragments needed from someone else's hosted Canvases, so that I would have access to the Canvas URI in the `on` property. Fully compliant IIIF images will simulate this for me, but prevent drilling up to the metadata of the containing Canvas and Manifest, if interested. If I am hosting a collection of Soviet Posters and want to include a button to enable commentary, translation, transcription, or description, it is more complex for my collection software to send both the Manifest (even though it has it available) and the Canvas of interest to the annotation software than it is to pass just the URI into an annotator, as the `uri=` parameter available in many Mirador installs allows.
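For the clipping step in the seals example, a minimal sketch: given an Image API service base URI and an `on` value with an `xywh` fragment, build an Image API 2.x region request. The name `region_url` and the example URIs are hypothetical, and the sketch assumes the Canvas and the image share the same pixel dimensions so the fragment can be used as the region unchanged; where they differ, the coordinates would need scaling first.

```python
from urllib.parse import urldefrag


def region_url(image_service, on_uri):
    """Build {service}/{region}/full/0/default.jpg from an `on` URI with #xywh=x,y,w,h."""
    _canvas_uri, fragment = urldefrag(on_uri)
    if fragment.startswith("xywh="):
        region = fragment[len("xywh="):]
    else:
        # No spatial fragment: request the whole image.
        region = "full"
    return f"{image_service.rstrip('/')}/{region}/full/0/default.jpg"


url = region_url(
    "http://example.org/images/book1-p1",
    "http://example.org/iiif/book1/canvas/p1#xywh=10,20,300,400",
)
```

Note that the service URI has to come from the Canvas's image resource, so this step still depends on being able to obtain the Canvas in the first place.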

Also, as a Canvas might be part of multiple Manifests at the same time (not commonly, but possibly) it might be confusing to not pass the manifest URI if the Canvas was actually selected from a particular manifest. If not, then the user might find themselves in a different object than the one they started in. Without the context of the object, as provided by the Manifest, how would the user know what they're looking at in the Canvas? It would go from being "page 16 of The Lord of the Rings by Tolkien" to just "page 16".

Yes! This is the perfect example of why it is needed. Let there be a Manifest of all of the Lord of the Rings, a Collection of Manifests for each book within, another Collection including books, compendiums, and The Hobbit, or There and Back Again, and a final Manifest comprising only the illustrations from all of the above. A perfect interoperable world would use the same URIs for all identical objects (or at least a reliable `sameAs` reference) and the hosting repository for the first instance would be an ad hoc authority. If my annotation relies on the context of one Manifest over the other, it had best identify it, but if it can stand alone (for example, a transcription of "ERIADOR" on the map), it will be available to every Manifest that includes it. In a world with non-dereferenceable Canvas URIs, even if the same URI is used in every case, the initial object is forked, not referenced, so updating the images or changing dimensions will break all the annotations born in another version.

Patrick

Glen Robson

Aug 5, 2016, 7:37:37 PM

to Patrick Cuba, iiif-d...@googlegroups.com, IIIF Manuscripts

Hi Patrick,

I’m also grappling with this issue with the SimpleAnnotationServer as I’m working on implementing the IIIF Search API. To be able to offer a search endpoint for the annotations I need to know which manifest they apply to. I’m currently working on functionality to register a manifest which will create a Search API endpoint and it will also go through the stored annotations and add ‘within’ links to the manifest. Once the manifest is registered, new annotations get a ‘within’ field by doing a look up on the cached manifest. If Mirador added the within field when it created and edited annotations and sent it to the annotation store, it would simplify the above process. I haven’t looked yet at how easy this would be to do in Mirador, but at the National Library of Wales we have existing annotations without the within field so have to have this post update functionality anyway.
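The backfill described above might look something like the sketch below. This is not the SimpleAnnotationServer implementation; `add_within` is a hypothetical name, annotations are assumed to be plain dicts, and the `within` is placed on the `on` target in the shape the Search API recommends.

```python
def add_within(annotations, manifest):
    """Add a `within` link to every annotation that targets a Canvas of `manifest`.

    Returns the number of annotations that were updated in place.
    """
    known = {
        canvas["@id"]
        for sequence in manifest.get("sequences", [])
        for canvas in sequence.get("canvases", [])
    }
    updated = 0
    for annotation in annotations:
        on = annotation.get("on")
        if isinstance(on, str):
            # Promote a bare string target to an object so it can carry `within`.
            on = {"@id": on}
        if not isinstance(on, dict) or "within" in on:
            continue  # unsupported target shape, or already linked
        canvas_uri = on.get("@id", "").split("#", 1)[0]
        if canvas_uri in known:
            on["within"] = {"@id": manifest["@id"], "type": "sc:Manifest"}
            annotation["on"] = on
            updated += 1
    return updated
```

Because the lookup runs against a cached Manifest, registering one Manifest costs a single request to the host institution regardless of how many pages it has.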

I was investigating dereferenceable canvases as a method, but thought that if you were transcribing a large manuscript, working with the manifest would mean fewer requests to a host institution, as opposed to a request per canvas (the one we’re working on has over 1,000 pages). I didn’t want to make a request to resolve the canvas id at the time the annotation was created either, as that would delay the response for the user creating the annotation, and if you’re batching the resolving requests, just getting the manifest seemed easier.

I also worried about what happens if someone re-uses canvas ids; if the canvas was present in two manifests, I only want to offer the Search API on the one that’s being transcribed with the current annotation store. I also considered that the context of the annotation may only work in one manifest, which is the same as Rob’s “page 16 of The Lord of the Rings by Tolkien” as opposed to “page 16”, so associating the annotation with both the canvas and the manifest the canvas came from made sense.

I haven’t yet completed the implementation of the Search API and may come across problems with the above but hope it will help with the discussion.

Cheers

Glen


Robert Sanderson

Aug 6, 2016, 1:21:07 PM

to Patrick Cuba, iiif-d...@googlegroups.com, IIIF Manuscripts

I think I'm still not following some of the rationale, sorry! Hopefully clarifying questions inline below.

But first, it's not that I disagree with you at all! In principle it would be great if everyone would publish all of their resources individually, but the concern is the complexity and costs of doing that versus the benefits.

On Fri, Aug 5, 2016 at 11:15 AM, Patrick Cuba <cu...@slu.edu> wrote:

I have placed several in the IIIF-stories. None of these require more than the `within` from the Search API, but they are all made almost absurdly complex without a dereferenceable Canvas. Here's a summary:
https://github.com/IIIF/iiif-stories/issues/54 - Display for image annotations, currently used at soundingtennyson.org (see comment on issue). With a dereferenceable Canvas URI the process is 1) read `on`; 2) load Canvas; 3) crop image to display. Without (and assuming a `within` as below, since it is impossible otherwise) the process is 1) read `within`; 2) load Manifest; 3) loop through [`sequences`] and [`sequences.canvases`] to find the Canvas from the `on`; and proceed as above. With several annotations displayed from different Manifests, this becomes very unwieldy.

In /this/ case (not the next one), wouldn't you want to have the next and previous canvases from the one that's being annotated? To do that you'd still need to go through the exact same process as you describe in finding the canvas in the sequence(s). And to render the object level metadata, it's actually:

1) read `on`; 2) load Canvas; 3) read `within`; 4) load Manifest; 5) read metadata. So to me, the tradeoff seems to be:

A) Every institution publishes their canvas information separately (if not everyone does it, then you'll still have to do the current steps)

B) Loop through a list of json objects to find a match on a string property (canvas.id)

The second seems lower cost, so I'm not as convinced by this use case as ...

https://github.com/IIIF/iiif-stories/issues/56 - Broken Books, currently used at brokenbooks.org. Reassembling dispersed manuscripts into virtual collections right now (without dereferenceable Canvas URIs) means we have to upload/link the images themselves instead of the Canvases, which is contrary to interoperability. If a quire of 10 at Stanford and Yale each were already catalogued in their own Manifests, I should be able to generate a new Manifest ordering them together through reference to the Canvas URI each repository maintains. Forking them into the new Manifest means the `within` property becomes an unhelpful array and the aggregated version will not update when better description or photography becomes available, for example.

This case is more convincing to me in that there isn't an obvious requirement to retrieve the manifest -- you already have the new one. However, it would mean that when the client goes to render the canvas, it needs to first dereference the URI. We tried this early on when working on the Presentation API and the network costs were pretty high :( Hence the decision to embed all of the canvas information in the Manifest, even if the client only ever renders the first few Canvases.

So what could go wrong with just copying the Canvas into the new manifest?

1 -- the information about the Canvas could change. I think this is very low risk, and is the typical cache invalidation problem. Note that I'm only thinking of the direct properties of the canvas that are in the manifest, not the annotation data.

2 -- there could be more non-image annotations available. This is much much more likely, but these would appear anyway when the annotation lists are dereferenced, so I think not an issue

3 -- there could be more image annotations available. This seems unlikely to occur frequently -- digitization is expensive. This could be solved with either notification of changes, or recrawling periodically. Given the rate of change, recrawling seems reasonable.

I don't think I'm missing anything?

So the choice between copying the Canvas, including its URI, into a new Manifest with periodic freshness checks, versus making all Canvases dereferenceable and fetching them at run time ... I can see both sides as it relies on the pace of change of canvas descriptions and digitization being slow.
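A periodic freshness check of the kind mentioned here could be sketched as below. Everything in it is an assumption for illustration: `stale_canvases` is a hypothetical name, each copied Canvas is assumed to carry a `within` pointing at the Manifest it was copied from, and `load_manifest` stands in for whatever fetch-and-parse routine the crawler uses.

```python
def _canvases(manifest):
    """Yield every Canvas dict in every sequence of a Manifest."""
    for sequence in manifest.get("sequences", []):
        yield from sequence.get("canvases", [])


def _without_within(canvas):
    """Copy of a Canvas dict minus `within`, so the bookkeeping link is not compared."""
    return {key: value for key, value in canvas.items() if key != "within"}


def stale_canvases(aggregate, load_manifest):
    """Return @ids of copied Canvases that no longer match their source Manifest."""
    stale = []
    for copy in _canvases(aggregate):
        within = copy.get("within")
        source_uri = within.get("@id") if isinstance(within, dict) else within
        if not source_uri:
            continue  # no recorded source, nothing to compare against
        source = load_manifest(source_uri)
        original = next(
            (c for c in _canvases(source) if c.get("@id") == copy.get("@id")), None
        )
        # A Canvas that vanished from its source, or whose properties changed, is stale.
        if original is None or _without_within(copy) != _without_within(original):
            stale.append(copy["@id"])
    return stale
```

As written, this reloads the source Manifest once per copied Canvas; a real crawler would presumably cache Manifests per run, which is cheap when the rate of change is as slow as suggested above.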

https://github.com/IIIF/iiif-stories/issues/53 - Annotation tool, currently in development at CDH. Any 3rd party annotation tool would need the entire Canvas object passed to it to function, rather than just the URI, as Mirador can. Liz Fischer has shared a great little tool to crop IIIF images, but it is notable that this would not work if you tried to give it a Canvas URI instead of the `images[0].resource.@id` value.

I don't follow this one. Annotation tools that don't understand Canvases can't use them. They would probably fall under the first case if they did, in that it would be important to display more than just the image but also information captured in the Manifest. If there was a tool that only wanted to render a single canvas and didn't want to render contextual information, then yes, having it be separately dereferenceable would be significant.

https://github.com/IIIF/iiif-stories/issues/55 - Transcription button, currently in development at CDH and used by the Newberry's French Renaissance Paleography. Like the annotation tool, sending a URI for a single Canvas is preferable for very large Manifests with small areas of interest, such as a single article in a newspaper, a map within a manuscript, a letter in a collection, an entry in a large catalog or index, or any crowdsourced project like Zooniverse where the individual user's path is not linear through the material.

Seems like a subset of the above. If it's internal, it's the first. If it's external, it's the third.

A non-dereferencable URI means that any application given only an "oa:Annotation" with an "on" property will never know how to find the Canvas object for display unless also given a Manifest URI.
Correct. And thus we recommend in the Search API to include a within reference from the target Canvas to the Manifest that it is part of.
As per: http://iiif.io/api/search/1.0/#target-resource-structure

Big fan. I hope people use it as well.

So wouldn't a change recommending that `within` be recorded for the Canvas in Annotations be sufficient, other than when the Annotation is directly in the Manifest? We kind of already do that in search, we can make it more explicit in the Presentation API too. That's much easier than making all of the Canvases separately dereferenceable AND including it :)

Rob

--

Rob Sanderson

Semantic Architect

The Getty Trust

Los Angeles, CA 90049

David Newbury

Aug 6, 2016, 10:22:08 PM

to iiif-d...@googlegroups.com, Patrick Cuba, IIIF Manuscripts

At the risk of butting into something that I don't fully understand, I feel like the issue here is whether canvases are first-class entities or just nodes within a hierarchy.

IIIF is a linked data graph and canvases are an RDF entity, with a type and a URI. As such, they're also a primary source of truth and should only exist once. All references to them should point to the same entity.

IIIF is also a JSON object, somewhat optimized for over-the-wire efficiency, and represents an underlying data store. As such, canvases are just strings representing hierarchical object nodes that can be parsed, composed and combined.

Rob, it sounds like a lot of what you're arguing for assumes that a given canvas should only appear in a single manifest. And if a manifest is fundamentally a JSON file, that's very true; a separate canvas with identical data can exist in a different manifest and be a different canvas. But if a canvas is an RDF entity representing something abstract, that doesn't make sense.

- David Newbury
-----------------------------------
p. (773) 547-2272
e. david....@gmail.com


Robert Sanderson

Aug 7, 2016, 2:15:13 PM

to iiif-d...@googlegroups.com, Patrick Cuba, IIIF Manuscripts

Hi David,

On Sat, Aug 6, 2016 at 7:21 PM, David Newbury <david....@gmail.com> wrote:

At the risk of butting into something that I don't fully understand, I feel like the issue here is whether canvases are first-class entities or just nodes within a hierarchy.

I think they're both -- from a model perspective, they're certainly first class entities. Arguably they are the most important ones, with the others just wrappers around them to provide context. However, from an API perspective, I'm trying to find out whether they need to have their own separate interaction pattern, or whether the use cases can be fulfilled with their role being just nodes in the Manifest rooted hierarchy.

Currently we have three primary API level interactions:

* Collection: Most collections must be individually dereferencable, as their intent is to provide shape over a reasonably large set of other resources. Some collections can be embedded within others, if they're only useful in their parent context. Of course no collections = no implementation requirement.

* Manifest: The primary interaction, as our focus has been on representations of single objects.

* AnnotationList: An optimization, to ensure that the Manifest representation isn't overloaded but has the most useful information to the client embedded (the image annotations, versus all the rest). Given this, if there are only images (the 80% case) even this interaction isn't mandatory as there's likely to be either a separate annotation server that takes care of it, or there's simply no data.

If there are multiple sequences, only the first is embedded, the second and subsequent become separate resources.

But that's a much less than 1% edge case. And one that I don't believe is even implemented by the current viewers?

To me the question is thus: Are the use cases provided sufficient to warrant adding another mandatory interaction pattern for all implementers, given that every implementer will have canvases? I know the request was SHOULD not MUST, but SHOULD for patterns like this isn't actually useful. You have to always assume that it isn't implemented or deal with the errors when it isn't -- at which point you just do that and the canvases are never dereferenced. Given that it's a network request, rather than an optional part of the representation, the cost of it being inconsistent is much higher than just a hasOwnProperty() function call.
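To make the cost concrete, here is a sketch of what a client has to do when dereferencing is only a SHOULD: try the Canvas URI, then fall back to the Manifest. The names `fetch_json` and `resolve_canvas` are hypothetical, and the check on `@type` assumes a Presentation API 2.x response.

```python
import json
from urllib.request import urlopen


def fetch_json(uri, timeout=10):
    """Fetch a URI and parse the body as JSON."""
    with urlopen(uri, timeout=timeout) as response:
        return json.load(response)


def resolve_canvas(canvas_uri, manifest_uri=None, fetch=fetch_json):
    """Try to dereference the Canvas; fall back to searching its Manifest."""
    try:
        document = fetch(canvas_uri)
        if isinstance(document, dict) and document.get("@type") == "sc:Canvas":
            return document
    except (OSError, ValueError):
        # Covers HTTP errors such as 404, network failures, and non-JSON bodies.
        pass
    if manifest_uri is None:
        return None  # nothing left to try
    manifest = fetch(manifest_uri)
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            if canvas.get("@id") == canvas_uri:
                return canvas
    return None
```

On a host that does not publish Canvases separately, every call pays for one failed request before the fallback runs, which is the inconsistency cost described above, and the fallback still needs the Manifest URI from somewhere such as `within`.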

So I'm trying to explore whether there are simpler options that would accomplish the same goal.

IIIF is a linked data graph and canvases are an RDF entity, with a type and a URI. As such, they're also a primary source of truth and should only exist once. All references to them should point to the same entity.

IIIF is also a JSON object, somewhat optimized for over-the-wire efficiency, and represents an underlying data store. As such, canvases are just strings representing hierarchical object nodes that can be parsed, composed and combined.

Yes! The trick is finding the appropriate middle ground such that it can be both of these at once :)

Rob, it sounds like a lot of what you're arguing for assumes that a given canvas should only appear in a single manifest. And if a manifest is fundamentally a JSON file, that's very true; a separate canvas with identical data can exist in a different manifest and be a different canvas. But if a canvas is an RDF entity representing something abstract, that doesn't make sense.

They should be the same canvas, with the optimization of being included in two representations of different manifests, to ensure that the network traffic requirements are sustainable, resulting in an acceptable user experience.

If we see a lot of implementations duplicating canvases, and that those canvases then get out of date, that would be a big cause for concern! So far there are use cases from one institution ... if there are other institutions that need this (and simply adding within would not solve the problem) it would be great to hear them :)

Rob

Patrick Cuba

Aug 7, 2016, 10:39:24 PM

to iiif-d...@googlegroups.com, IIIF Manuscripts

Rob's argument makes sense to me. Though it is more complex to go up to the Manifest when the Canvas is not dereferenceable, I admit that it may be much more expensive to consider hosting a `canvases` that is just a list of URIs (though many use cases are only a page at a time). In the land of shims, this seems like it wouldn't be a terrifically expensive effort, since Drupal, ContentDM, and others regard things like Manifests as "compound objects" and have separate records for each page within.

If `within` is always on a Canvas, and `within` refers to the authoritative version of the Canvas (even if it is reused), then being separately dereferenceable is not necessary. In fact, it is redundant because the process to dereference when the Manifest is not in hand and where the Canvas does not hold its own metadata is nearly the same, as Rob has stated. However, if the inclusion of `within` is intended to offer a way to dereference the Canvas, it becomes a de facto URI, replacing the `@id` as the necessary piece of information to have to get the object when it is not already in hand.

At the start, I meant only to enthusiastically encourage a dereferenceable Canvas URI, not change the standard, since I appreciate that the API is designed for maximum adoption and ease of implementation. Seeing this discussion, there are several things within IIIF that have `@id` properties that are not required to be anything but unique. The typical URI pattern looks like `{scheme}://{host}/{prefix}/{identifier}/{resource}/{name}` and for things like the initial Sequence, it does not have to be HTTP(s) because it is always embedded in the Manifest. Canvas resources are special because they are required to have an HTTP(s) scheme even when they do not resolve. As David has pointed out, having a Canvas that cannot be brought into other collections outside of the Manifest or Sequence that completely describes it removes the requirement for dereferenceability, but may also demote it beyond good intent. Perhaps, as an assist to folks like me, the suggestion/requirement should be that only obviously UID values be used ("tag:", for example) if the `@id` is not resolvable and that HTTP(s) be reserved for those that can be resolved. Alternately, the URI for the Canvas could include the Manifest ("http://example.org/great-book/manifest.json#page-16"), or like a fragment selector, state the parameter ("http://example.org/great-book/manifest.json#canvas=page-16"), so one string can find what the application is looking for. At least in the `on` of pointing resources, I would wish for references that were complete enough to find the resource if it is not already in hand. The work that has gone and continues to go into these standards and APIs is herculean and appreciated. I have been regarding Canvases to be as important as Manifests or AnnotationLists, but my personal projections should not derail the intended arrangement of resources.
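A sketch of how a client might read the two URI shapes suggested above, where the Canvas reference is the Manifest URI plus a fragment. This is only an illustration of the proposal, not an existing IIIF convention; `split_canvas_reference` is a hypothetical name.

```python
from urllib.parse import parse_qs, urldefrag


def split_canvas_reference(uri):
    """Split 'manifest#page-16' or 'manifest#canvas=page-16' into (manifest_uri, canvas_name)."""
    manifest_uri, fragment = urldefrag(uri)
    if not fragment:
        return manifest_uri, None
    # Parameter form: #canvas=page-16
    params = parse_qs(fragment)
    if "canvas" in params:
        return manifest_uri, params["canvas"][0]
    # Plain form: #page-16
    return manifest_uri, fragment


manifest_uri, canvas_name = split_canvas_reference(
    "http://example.org/great-book/manifest.json#canvas=page-16"
)
```

Either form gives the application one string that locates both the Manifest to load and the Canvas to find inside it, at the price of tying the Canvas identifier to a single Manifest.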

We should also be honest, though, that if an AnnotationList must be dereferenceable but its `resources` (Annotations) are not, it means that each Annotation is meaningless as a discrete object. The `on` property, whether attached to the list or its members, as it points to a Canvas, is also meaningless as a discrete reference, since you already have to know the Canvas to make the connection. Similarly, Manifests.Sequences[0] and Canvases need each other (which means /only/ each other) to find meaning. This changes (or makes unreliable) plans for a small raft of tools that allow for dynamic aggregations like Broken Books, authority lists of annotations, annotation categorical collections and indices, rich academic article citations, and lists targeting multiple resources. We can find another way, obviously, but it would have been cool if IIIF+OAC "just worked" for us.

That said, I'm headed on paternity leave within the next 6 weeks, so the burning fuse on this being resolved to a point I can do real work with it is quite long.

Patrick

Patrick Cuba

Center for Digital Humanities

Saint Louis University
314-977-4249

Régis Robineau

Aug 9, 2016, 9:42:40 AM

to iiif-d...@googlegroups.com, Patrick Cuba, IIIF Manuscripts

Hi all,

This topic is highly relevant for a demo that I am currently working on. Its principle is very simple and quite similar to Patrick's use case #54: "I would like to display an image annotation on a website" (https://github.com/IIIF/iiif-stories/issues/54)

The story is: a group of users annotates a set of manuscripts in Mirador in order to identify and provide basic descriptions of bookplates. Users should be able to load in Mirador any object from their manifest url.
(I'm using the simpleAnnotationServer made by Glen Robson as the annotation endpoint and Sesame as the backend where the annotations are stored).
Then on top of that I want to display all the annotations dynamically on a web page (by making a simple SPARQL request to the Sesame endpoint). The main elements that I would like to be displayed are:
- the text of the annotation
- the corresponding image region that was originally selected and annotated in Mirador by a given user
- a link to a Mirador view of the entire object initialized at the right page/canvas.

Two pieces of information are missing in the annotation in order to build the web page as I wish:
- the manifest URL
- the IIIF base URL of the image which "paints" the Canvas that is being annotated

A workaround would be to tweak Mirador in order to add two properties to the annotation:
- "within", with the manifest url (as previously pointed out in this thread)
- "service", with the Image API service info of the annotated canvas

Thereby it would be very easy to build a basic UI which displays all the annotations along with the images of the bookplates in a grid view, with a link to the corresponding Mirador view to be able to see each bookplate in context.
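
Purely to illustrate the workaround being asked about, here is a small Python sketch of what such an annotation might look like and how a grid view could be fed from it. Everything specific is an assumption: the `example.org` URLs are placeholders, the placement of `within` inside `on` follows the pattern used by the Search API, the annotation-level `service` is the non-standard addition under discussion, and the image request assumes Image API 2.x URL syntax and an image whose pixel dimensions equal the Canvas dimensions.

```python
import re

# Hypothetical annotation, shaped like the workaround described above.
annotation = {
    "@type": "oa:Annotation",
    "motivation": "oa:commenting",
    "resource": {"@type": "dctypes:Text", "chars": "Bookplate of a former owner"},
    "on": {
        "@type": "oa:SpecificResource",
        "full": "http://example.org/iiif/book1/canvas/p1",
        "selector": {"@type": "oa:FragmentSelector", "value": "xywh=100,200,300,400"},
        "within": {"@id": "http://example.org/iiif/book1/manifest", "@type": "sc:Manifest"},
    },
    # Non-standard placement: whether this is valid is exactly the open question.
    "service": {"@id": "http://example.org/images/book1-p1"},
}


def card_for(anno):
    """Collect what a grid view needs from one annotation."""
    target = anno["on"]
    match = re.fullmatch(r"xywh=(\d+),(\d+),(\d+),(\d+)", target["selector"]["value"])
    if match is None:
        raise ValueError("expected an xywh fragment selector")
    region = ",".join(match.groups())
    return {
        "text": anno["resource"]["chars"],
        "manifest": target["within"]["@id"],
        "canvas": target["full"],
        # Image API 2.x style request: {region}/{size}/{rotation}/{quality}.{format}
        "image": f'{anno["service"]["@id"]}/{region}/full/0/default.jpg',
    }


card = card_for(annotation)
```

The `manifest` and `canvas` values are what a link back to a viewer would need; how that link is built depends on the viewer and is left out here.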

I wonder if this approach makes sense and if it is a consistent use case (maybe not... it is only for demo purposes). I have a few doubts about the validity of this solution and how to represent it in the JSON-LD.
Here is an attempt to represent the structure of such an annotation: https://gist.github.com/regisrob/a8c3897ad58ca60a49528b252ccd2c4a

1/ is it correct to add the "within" property into the "on" of an annotation? (as Search API does: http://iiif.io/api/search/1.0/#target-resource-structure) Is it still valid if the canvas ID is in "full"?
2/ is it valid to add a "service" at the annotation level?

If I wanted to display some contextual metadata in my grid of annotations on the web page, I could also embed the manifest "label", "description" and so on into the annotation...
But how would I do that for the canvas label for instance?

Any advice or thoughts would be helpful.

Thanks!

Régis

--
-- You received this message because you are subscribed to the IIIF-Discuss Google group. To post to this group, send email to iiif-d...@googlegroups.com. To unsubscribe from this group, send email to iiif-discuss...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/iiif-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "IIIF Discuss" group.

To unsubscribe from this group and stop receiving emails from it, send an email to iiif-discuss...@googlegroups.com.

Jeffrey C. Witt

Aug 9, 2016, 12:14:58 PM

to IIIF Discuss, iiif-man...@googlegroups.com

Patrick,

Will you be able to be on the Manuscript call tomorrow at 11am EST? If so, I wonder if you could give the group a 5 to 10 minute summary of this thread?

Let me know,

jw

Patrick Cuba

Aug 9, 2016, 12:18:43 PM

to iiif-d...@googlegroups.com

No problem. I plan to be there and will make sure I've got a microphone.

Patrick Cuba

Center for Digital Humanities

Saint Louis University
314-977-4249

Robert Sanderson

Aug 19, 2016, 6:29:54 PM

to iiif-d...@googlegroups.com, Patrick Cuba, IIIF Manuscripts

Hi Regis,

The Image service being associated with the annotation doesn't really make sense. The image is associated with the Canvas not the annotation. There might also be 10 images associated with the canvas, rather than just one, or with part of the canvas such as the reconstruction of the BNF / BVMM manuscripts.

Given that you have the manifest URI, you could retrieve it and then have all of the information about the Canvas with all of the images and other resources to render.

I do agree that if the canvas were dereferencable, you would have the content resources to render it by fetching its representation ... but not the object's description to provide the context. I guess if you only wanted the label of the object, you could include that in the manifest part (the `within`) of the annotation, without having to duplicate everything from the canvas in the annotation. So there would still be two requests (one for the annotation, one for the canvas), but it would be a smaller representation to retrieve.
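
As a rough sketch of the "retrieve the manifest and look the Canvas up" route, the Python below scans a Presentation API 2.x style manifest (already parsed into a dict) for a Canvas `@id` and lists the image services painted onto it. The function names and the `example.org` sample data are illustrative assumptions, and the sketch deliberately ignores `oa:Choice` resources, so it would miss images offered as alternatives.

```python
def find_canvas(manifest, canvas_id):
    """Return the Canvas dict with the given @id, or None if absent."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            if canvas.get("@id") == canvas_id:
                return canvas
    return None


def image_services(canvas):
    """List the Image API service @ids of the images painted on a Canvas."""
    services = []
    for anno in canvas.get("images", []):
        service = anno.get("resource", {}).get("service")
        if isinstance(service, dict) and "@id" in service:
            services.append(service["@id"])
    return services


# Hypothetical, heavily trimmed manifest used only to exercise the functions.
sample_manifest = {
    "@id": "http://example.org/iiif/book1/manifest",
    "label": "Example book",
    "sequences": [{
        "canvases": [{
            "@id": "http://example.org/iiif/book1/canvas/p1",
            "label": "f. 1r",
            "height": 1000,
            "width": 750,
            "images": [{
                "resource": {
                    "@id": "http://example.org/images/book1-p1/full/full/0/default.jpg",
                    "service": {"@id": "http://example.org/images/book1-p1"},
                },
            }],
        }],
    }],
}

found = find_canvas(sample_manifest, "http://example.org/iiif/book1/canvas/p1")
services = image_services(found)
```

Because the whole manifest is in hand, the object's `label` is available alongside the Canvas, which is the context a dereferenced Canvas alone would not supply.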

Rob


Régis Robineau

Aug 22, 2016, 3:56:10 AM

to iiif-d...@googlegroups.com, Robert Sanderson, Patrick Cuba, IIIF Manuscripts

Hi Rob,

Thanks for your answer.
I was looking for an easy solution for a little experiment but I agree the workaround with "service" is not a proper way and does not handle every case defined by the P-API (even if it would have worked with the manifests involved in my demo).
So I just tweaked Mirador (to add "within" to the annotation) and SimpleAnnotationServer (to retrieve and store the whole manifest into the triplestore), it works fine like that. In this way I have everything in hand to get information about the Canvas or other resources, and thus display contextual metadata along with the annotations if needed.

Cheers,

Régis


Ben Brumfield

Sep 8, 2016, 2:31:22 PM

to IIIF Discuss, jeffre...@gmail.com, iiif-man...@googlegroups.com

I'd like to add my voice to a vote for dereferencable canvases.

Because 1) OpenSeadragon can read info.json for all the configuration it needs, and 2) we're using OSD to display pages to be transcribed in FromThePage, then 3) we don't need to persist any attributes of canvases we import beyond @id, label, and image service information.

Now that we're exporting transcripts/translations as derivative manifests (pending widespread support for annotation supplementary layers), we need to hunt down the height and width attributes of the originating canvases in order to generate valid manifests. But how do we get the necessary attributes?

* We could re-query the original manifest, then parse it to look for canvas @ids that match the canvas IDs we've saved (unpleasant and slow in some cases).
* We could parse the image service's info.json for each canvas (mixing the Image API and Presentation API)
* At import time, we could persist the height/width for each canvas, then re-use them at export time (hoping they haven't changed)
* Or we could use the canvas ID we've saved, dereference it, and re-present it.

The last option certainly seems best to me, but won't work if canvases aren't dereferencable.
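
One way to combine the last option with the first is to try the Canvas URI and fall back to the manifest only when that fails. The Python sketch below assumes a caller-supplied `fetch_json` callable (a hypothetical stand-in for whatever HTTP client is in use) that returns parsed JSON or raises on failure; the in-memory `fake_fetch` and the `example.org` URLs exist only to make the example runnable.

```python
def resolve_canvas(canvas_id, manifest_url, fetch_json):
    """Try the Canvas URI first; fall back to scanning the manifest.

    fetch_json is any callable that takes a URL and returns parsed JSON,
    raising an exception when the URL cannot be retrieved.
    """
    try:
        canvas = fetch_json(canvas_id)
        if isinstance(canvas, dict) and canvas.get("@id") == canvas_id:
            return canvas
    except Exception:
        pass  # not dereferenceable; fall through to the manifest
    manifest = fetch_json(manifest_url)
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            if canvas.get("@id") == canvas_id:
                return canvas
    return None


# In-memory stand-in for HTTP: only the manifest URL "resolves".
_store = {
    "http://example.org/iiif/book1/manifest": {
        "sequences": [{
            "canvases": [{
                "@id": "http://example.org/iiif/book1/canvas/p1",
                "height": 1000,
                "width": 750,
            }],
        }],
    },
}


def fake_fetch(url):
    return _store[url]  # a KeyError plays the role of a 404


canvas = resolve_canvas(
    "http://example.org/iiif/book1/canvas/p1",
    "http://example.org/iiif/book1/manifest",
    fake_fetch,
)
```

The fallback still requires knowing the manifest URL, so this does not answer the question of how to get from a bare canvas ID to its manifest.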

On the same topic, how can one get from a canvas ID (perhaps occurring within an annotation) to a manifest, or a layer, or an image service, if canvases aren't dereferencable?

Ben


Robert Sanderson

Sep 8, 2016, 2:40:46 PM

to iiif-d...@googlegroups.com

On Thu, Sep 8, 2016 at 11:31 AM, Ben Brumfield <benw...@gmail.com> wrote:

> Because 1) OpenSeadragon can read info.json for all the configuration it needs, and 2) we're using OSD to display pages to be transcribed in FromThePage, then 3) we don't need to persist any attributes of canvases we import beyond @id, label, and image service information.

That's dangerous, as the height and width of the Canvas might be different from the height and width of any given image. Particularly when there's a choice of image involved, you wouldn't know which set of dimensions might have been chosen for the Canvas. Then it would be impossible to align the annotations again.

> Now that we're exporting transcripts/translations as derivative manifests (pending widespread support for annotation supplementary layers), we need to hunt down the height and width attributes of the originating canvases in order to generate valid manifests. But how do we get the necessary attributes?
> * We could re-query the original manifest, then parse it to look for canvas @ids that match the canvas IDs we've saved (unpleasant and slow in some cases).

Slow ... but feasible to validate a cache.

> * We could parse the image service's info.json for each canvas (mixing the Image API and Presentation API)

Very risky, as above.

> * At import time, we could persist the height/width for each canvas, then re-use them at export time (hoping they haven't changed)

This is the way that I would suggest, if you're creating new manifests rather than just publishing annotations. If the dimensions of the Canvas have changed, your annotations would need to be updated for the new dimensions anyway, no? If you don't cache them, how would you know how to transform the target region?

> * Or we could use the canvas ID we've saved, dereference it, and re-present it.

And use this (or the manifest) to validate your cache of the dimensions.

> The last option certainly seems best to me, but won't work if canvases aren't dereferencable.
> On the same topic, how can one get from a canvas ID (perhaps occurring within an annotation) to a manifest, or a layer, or an image service, if canvases aren't dereferencable?

If there's a within link, you could follow that, but the same Canvas might be within multiple Manifests. Dereferenceability doesn't solve /that/ of course :)
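
A minimal Python sketch of the cache-and-validate approach might look like the following. It assumes the cached record and the live Canvas both carry integer `width` and `height`, and that a change in dimensions is a uniform resize of the same content (no cropping or re-framing); the function names are illustrative.

```python
def dimensions_changed(cached, canvas):
    """True when the live Canvas no longer matches the cached size."""
    return (cached["width"], cached["height"]) != (canvas["width"], canvas["height"])


def rescale_xywh(xywh, cached, canvas):
    """Map an (x, y, w, h) region from the cached Canvas size to the current one."""
    sx = canvas["width"] / cached["width"]
    sy = canvas["height"] / cached["height"]
    x, y, w, h = xywh
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))


# Dimensions persisted at import time versus what the source publishes now.
cached_size = {"width": 1000, "height": 2000}
live_canvas = {"width": 2000, "height": 4000}

region = (100, 200, 300, 400)
if dimensions_changed(cached_size, live_canvas):
    region = rescale_xywh(region, cached_size, live_canvas)
```

Without the cached dimensions there is nothing to compute the scale factors from, which is why the cache is needed even when the Canvas can be dereferenced.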

Rob


Ben Brumfield

Sep 8, 2016, 3:46:58 PM

to IIIF Discuss

On Thursday, September 8, 2016 at 1:40:46 PM UTC-5, Rob Sanderson wrote:

> On Thu, Sep 8, 2016 at 11:31 AM, Ben Brumfield <benw...@gmail.com> wrote:
>> Because 1) OpenSeadragon can read info.json for all the configuration it needs, and 2) we're using OSD to display pages to be transcribed in FromThePage, then 3) we don't need to persist any attributes of canvases we import beyond @id, label, and image service information.
>
> That's dangerous, as the height and width of the Canvas might be different from the height and width of any given image. Particularly when there's a choice of image involved, you wouldn't know which set of dimensions might have been chosen for the Canvas. Then it would be impossible to align the annotations again.

As yet, we haven't encountered any cases of transcribable multi-image canvases like fragments of the same manuscript page, so it hasn't come up. I'd love to see some examples of this, however, since it's one of the features of IIIF I always like to talk about.

>> * At import time, we could persist the height/width for each canvas, then re-use them at export time (hoping they haven't changed)
>
> This is the way that I would suggest, if you're creating new manifests rather than just publishing annotations.

This is what we'll be doing. While attempting other solutions, the first manifest we encountered didn't have dereferencable canvas URIs, so that configuration must be a lot more widespread than I'd assumed.

> If the dimensions of the Canvas have changed, your annotations would need to be updated for the new dimensions anyway, no? If you don't cache them, how would you know how to transform the target region?

All of our annotations are on the canvas itself, rather than regions of the canvas. At the hackathon last September, I added full-canvas regions to the annotation targets to get them to show up in Mirador, but I think that may not be necessary any longer. (At least I hope to test canvas annotations on both Mirador and UV and find out.)

Ben
