[Discovery] Schema.org and discovery at NL Israel?

Antoine Isaac

Jun 12, 2017, 4:44:13 AM
to eyal....@nli.org.il, iiif-d...@googlegroups.com
Dear Eyal,

Last week, at the IIIF Discovery session at the Vatican conference, you mentioned that you've been working with Schema.org. As I said then, this could be very interesting for the group. Could you share more details?

Kind regards,

Antoine

Eyal Reuven

Jun 14, 2017, 3:56:55 AM
to IIIF Discuss, eyal....@nli.org.il
Hi Antoine, Hi all!

It was a pleasure to meet you all last week at the Vatican conference.

I'd love to share our experience thus far with Discovery:

We are currently running a pilot, consisting of a collection or two, for discovery purposes, as follows:
* We create Sitemap.xml files (based on https://www.sitemaps.org ) that point to our items, each of which is an HTML page on our website.
* Each HTML page consists of a IIIF image (through an embedded viewer) and metadata with Microdata (based on https://schema.org )
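
As a rough sketch of the Sitemap step above: the snippet below builds a minimal sitemap.xml with one <url> entry per item page, using the namespace defined by the sitemaps.org protocol. The example.org URLs and the build_sitemap name are placeholders for illustration, not the URLs or code used in the pilot.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(item_urls):
    """Return a sitemap document (as a string) with one <url><loc> per item page."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in item_urls:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = page_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical item pages; the real URLs are not given in this thread.
sitemap_xml = build_sitemap([
    "https://example.org/items/123",
    "https://example.org/items/456",
])
print(sitemap_xml)
```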

The idea is to let Google and other search engines reach our items easily (Sitemaps) and "understand" them semantically (Microdata).

How we implemented Microdata and the challenges we have in light of IIIF:

1. The Image

If we could semantically point to the image, it should appear in image searches (such as Google Images).

We looked into how to reference the IIIF image in our webpage with Microdata. The issue is that a IIIF image is actually many tiles displayed together. I opened an issue about that on Schema.org's GitHub but got no response: https://github.com/schemaorg/schemaorg/issues/1521
Therefore we decided to insert a hidden link pointing to a thumbnail of the image displayed in the page. We did this according to another issue we opened on Schema.org's GitHub: https://github.com/schemaorg/schemaorg/issues/1554
The big drawback is that in an image search you can't click "see the full image" to get the full image: you still get only a thumbnail. Your only option is to load the webpage that contains the image, where the viewer loads the tiled image at a larger scale and better resolution.
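
To make the hidden-link workaround concrete, here is a minimal sketch that renders an item fragment with Schema.org Microdata. It assumes the item is typed as a CreativeWork and that the thumbnail is exposed through the generic `image` property on a non-rendering <link> element; the actual type and property used in the pilot are not stated in this thread, and the URLs and the microdata_item name are placeholders.

```python
from html import escape


def microdata_item(title, page_url, thumbnail_url):
    """Render an item fragment with Microdata and a non-visible thumbnail link."""
    return (
        '<div itemscope itemtype="https://schema.org/CreativeWork">\n'
        f'  <h1 itemprop="name">{escape(title)}</h1>\n'
        f'  <link itemprop="url" href="{escape(page_url, quote=True)}">\n'
        # <link> renders nothing, so only crawlers see the thumbnail reference.
        f'  <link itemprop="image" href="{escape(thumbnail_url, quote=True)}">\n'
        '</div>'
    )


# Hypothetical values; the thumbnail is an IIIF Image API request for a small size.
fragment = microdata_item(
    "Sample manuscript",
    "https://example.org/items/123",
    "https://example.org/iiif/123/full/!100,100/0/default.jpg",
)
print(fragment)
```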

Working with Schema.org we might find a solution for that.
I think the solution should consist of marking the size and region parameters of the IIIF Image API URL in a semantic way, so the IIIF Image URL (e.g. /123/full/!100,100/0/default.jpg ) wouldn't be treated just as a text string.
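
To illustrate what "not just a text string" could mean, the sketch below splits an IIIF Image API URL into its named parameters, following the {identifier}/{region}/{size}/{rotation}/{quality}.{format} pattern of the Image API. The ImageRequest type and parse_image_request function are hypothetical names, and how such parameters would be expressed in Schema.org remains an open question.

```python
from typing import NamedTuple


class ImageRequest(NamedTuple):
    identifier: str
    region: str
    size: str
    rotation: str
    quality: str
    fmt: str  # the "format" parameter; renamed to avoid shadowing the builtin


def parse_image_request(url_or_path):
    """Split the last five path segments of an IIIF Image API URL into parameters."""
    parts = url_or_path.strip("/").split("/")
    if len(parts) < 5:
        raise ValueError(f"not an IIIF Image API request: {url_or_path!r}")
    identifier, region, size, rotation, filename = parts[-5:]
    quality, dot, fmt = filename.rpartition(".")
    if not dot or not quality:
        raise ValueError(f"missing quality.format segment: {filename!r}")
    return ImageRequest(identifier, region, size, rotation, quality, fmt)


req = parse_image_request("/123/full/!100,100/0/default.jpg")
print(req.region, req.size)  # prints: full !100,100
```

With the parameters separated, a consumer could in principle swap the thumbnail size for a larger one while keeping the same identifier and region.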

2. The Manifest
We would like to include a link to the item's IIIF Manifest as part of the metadata. This could enable future uses by engines that incorporate the IIIF standard (Google might do that if we're really good): while crawling and indexing pages, they could fetch the Manifest and index everything.

Currently, the only API references in Schema.org I could find are:
* https://schema.org/accessibilityAPI
* http://schema.org/EntryPoint
* http://schema.org/Service (which has an interesting pending extension - WebAPI)

Working with Schema.org we might be able to have a generic API reference field that also includes the ability to reference IIIF APIs (the Presentation API in this case).

-Eyal

Shaun Ellis

Jun 14, 2017, 7:10:25 AM
to iiif-d...@googlegroups.com
Eyal,
Thanks for sharing your thoughts on your schema.org pilot. Have you looked at Google Custom Search? It works with sitemaps and Microdata (among other structured data formats), and also has a customizable JSON API that I believe will help solve the problems you ran into (i.e., supplying a full Image URL and link to the Manifest). 

Best,
Shaun

--
-- You received this message because you are subscribed to the IIIF-Discuss Google group. To post to this group, send email to iiif-d...@googlegroups.com. To unsubscribe from this group, send email to iiif-discuss+unsubscribe@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/iiif-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "IIIF Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iiif-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Eyal Reuven

Jun 15, 2017, 11:31:02 AM
to IIIF Discuss
Dear Shaun,
Thanks for your link. I was relying on the Google Developers guide for Google Search (rather than Google Custom Search), but both pages seem to have quite similar information.
How do you think the JSON API (do you mean this?) can help solve the problem?


Additionally, does anyone think it would make more sense just to include the Manifest as embedded JSON-LD in the HTML page (or a reference to the Manifest)? (see here Google's documentation about JSON-LD)
It seems that Google prefers JSON-LD over Microdata (according to this)
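
A minimal sketch of the "reference to the Manifest" variant, as JSON-LD in a <script> element. Since Schema.org has no IIIF-specific property, this assumes the Manifest can be pointed to as a generic MediaObject through the CreativeWork `encoding` property; that modelling choice, the URLs, and the jsonld_script name are assumptions for illustration, not an agreed practice.

```python
import json


def jsonld_script(title, page_url, manifest_url):
    """Return a <script> element carrying Schema.org JSON-LD for one item page."""
    data = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": title,
        "url": page_url,
        # Assumption: the Manifest is referenced as a generic MediaObject,
        # because Schema.org defines no IIIF-specific property.
        "encoding": {
            "@type": "MediaObject",
            "contentUrl": manifest_url,
            "encodingFormat": "application/ld+json",
        },
    }
    # Escape "</" so a value can never close the <script> element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical URLs for illustration only.
snippet = jsonld_script(
    "Sample manuscript",
    "https://example.org/items/123",
    "https://example.org/iiif/123/manifest",
)
print(snippet)
```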

-Eyal

Shaun Ellis

Jun 15, 2017, 12:30:37 PM
to iiif-d...@googlegroups.com
Eyal,
Yes, that is the JSON API I was referring to. I do not think it would be wise to include the manifest in the HTML page for indexing purposes. I think it's preferable to include only the information that needs to be indexed. 

Here's how I think it can solve the problem:

Google Custom Search enables you to create a search engine for a collection of websites (or a single website). You can fine-tune the ranking and customize the look and feel of the search results. Custom Search also allows custom data that does not correspond to a defined microformat through the use of PageMaps. Custom Search recognizes this data when indexing your webpages and returns it directly in XML results or in JSON format in the Custom Search element.

So, if IIIF content publishers add the data necessary for import, as PageMaps, to their HTML pages or Sitemaps, then it's fairly easy for IIIF to create a custom "IIIF Universe" search.

The upfront data necessary for viewers to "import" a Resource are:
  1. URL of the Resource
  2. Presentation API version
  3. Resource type
That is the custom data that IIIF publishers would provide in their PageMaps for indexing (among other non-custom data).
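
As a sketch of what such a PageMap might look like, the snippet below generates a PageMap block carrying the three values listed above, wrapped in an HTML comment for placement in a page's <head>. The PageMap/DataObject/Attribute element names follow Google's PageMap format as I understand it; the "iiif" object type and the attribute names are hypothetical, since no IIIF convention for them exists.

```python
import xml.etree.ElementTree as ET


def build_pagemap(resource_url, api_version, resource_type):
    """Return a PageMap block, wrapped in an HTML comment, for a page's <head>."""
    pagemap = ET.Element("PageMap")
    # "iiif" and the attribute names below are hypothetical, not a standard.
    obj = ET.SubElement(pagemap, "DataObject", {"type": "iiif"})
    fields = (
        ("resource_url", resource_url),
        ("presentation_api_version", api_version),
        ("resource_type", resource_type),
    )
    for name, value in fields:
        attr = ET.SubElement(obj, "Attribute", {"name": name})
        attr.text = value
    xml = ET.tostring(pagemap, encoding="unicode")
    # Note: values containing "--" would need escaping to stay inside a comment.
    return f"<!--\n{xml}\n-->"


# Hypothetical Manifest URL; "sc:Manifest" is the Presentation API 2.x type.
pagemap_comment = build_pagemap(
    "https://example.org/iiif/123/manifest", "2.1", "sc:Manifest"
)
print(pagemap_comment)
```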

Furthermore, since Custom Search also provides an API, viewers can also implement a search UI for the IIIF Universe. This could be an alternative to drag and drop import for users who don't use a mouse and it would provide all users the ability to import different Resource types for manifest creation in the client. A viewer could then filter or grey out any items it knows it can't render (HTTP/HTTPS, unsupported API versions, a single Canvas, etc.). It basically gives implementers the tools to provide an optimal User Experience for all.
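
The filtering step could be sketched roughly as follows. This assumes search results arrive as a list of items whose "pagemap" entry holds the custom data under an "iiif" key with the hypothetical attribute names resource_url, presentation_api_version and resource_type; the supported versions and types are example values for an imaginary viewer, and no network call is made.

```python
# Example capabilities of an imaginary viewer (assumptions, not a real product).
SUPPORTED_VERSIONS = {"2.0", "2.1"}
SUPPORTED_TYPES = {"sc:Manifest", "sc:Collection"}


def can_render(resource, viewer_is_https=True):
    """Decide whether the viewer can import a resource described by PageMap data."""
    url = resource.get("resource_url", "")
    # A viewer served over HTTPS cannot load plain-HTTP resources (mixed content).
    if viewer_is_https and not url.startswith("https://"):
        return False
    if resource.get("presentation_api_version") not in SUPPORTED_VERSIONS:
        return False
    return resource.get("resource_type") in SUPPORTED_TYPES


def partition_results(items, viewer_is_https=True):
    """Split search results into importable resources and ones to grey out."""
    renderable, greyed_out = [], []
    for item in items:
        for resource in item.get("pagemap", {}).get("iiif", []):
            target = renderable if can_render(resource, viewer_is_https) else greyed_out
            target.append(resource)
    return renderable, greyed_out


# Hand-written sample in the assumed result shape.
sample_items = [
    {"pagemap": {"iiif": [{"resource_url": "https://example.org/iiif/1/manifest",
                           "presentation_api_version": "2.1",
                           "resource_type": "sc:Manifest"}]}},
    {"pagemap": {"iiif": [{"resource_url": "https://example.org/iiif/2/canvas/1",
                           "presentation_api_version": "2.1",
                           "resource_type": "sc:Canvas"}]}},
    {"pagemap": {"iiif": [{"resource_url": "http://example.org/iiif/3/manifest",
                           "presentation_api_version": "2.1",
                           "resource_type": "sc:Manifest"}]}},
]
renderable, greyed_out = partition_results(sample_items)
print(len(renderable), len(greyed_out))  # prints: 1 2
```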

Hope that helps,
Shaun


David Beaudet

Jun 16, 2017, 4:07:11 PM
to IIIF Discuss
+1, with the acknowledgement that this approach might carry a significant Google deprecation risk. It seems like it could coexist with a IIIF Discovery spec; ideally, the Discovery spec would benefit in some way from leveraging this approach, yet still be perfectly viable should Google pull the plug on Custom Search.