Tom,
Yes, right now I'm only interested in dynamic ContentAsText
annotations from indexed OCR.
We currently have search inside, but it only gets you to the page and
shows snippets. In order to replace this with a IIIF-compliant
service, I only need to get that far at first. I want the minimal
viable search response, then from there we could add other features
and annotation types.
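
As a rough sketch of what that minimal response might look like: the Python below builds an annotation list whose resources are ContentAsText annotations pointing at whole canvases, modeled on my reading of the IIIF Content Search API. The base URL, the canvas URI, the annotation ids, and the function name are all made up for illustration, and the @context and motivation values should be checked against the spec before relying on them.

```python
from urllib.parse import quote


def minimal_search_response(base_url, query, hits):
    """Build a bare-bones annotation list for OCR text hits.

    hits is a list of (canvas_uri, snippet) pairs. All URIs here are
    hypothetical placeholders, not anything the real service exposes.
    """
    search_uri = f"{base_url}/search?q={quote(query)}"
    resources = []
    for i, (canvas_uri, snippet) in enumerate(hits):
        resources.append({
            "@id": f"{base_url}/annotation/{i}",  # assumed id scheme
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "resource": {"@type": "cnt:ContentAsText", "chars": snippet},
            # Page-level hit only: the target is the whole canvas.
            "on": canvas_uri,
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": search_uri,
        "@type": "sc:AnnotationList",
        "resources": resources,
    }
```

This only gets a client to the page and a snippet, which matches what our current search inside does.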
Try "albee":
http://d.lib.ncsu.edu/collections/catalog/technician-v59n50-1979-01-26
Yes, the "pile of OCR" search server is what I'm after. The furthest
I've gotten in migrating to IIIF search inside is using an existing API,
our IIIF image server, and Tesseract to OCR pages and output a text
file, hOCR, and a PDF. [1] The next step is to build a Solr or
Elasticsearch index from this pile of OCR.
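
To illustrate the indexing step without committing to either engine, here is a tiny in-memory inverted index over per-page OCR text. It is only a stand-in for what Solr or Elasticsearch would do, not a sketch of their APIs; the page ids and function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """pages maps a page id to its OCR text; returns term -> set of page ids."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)
```

A real index would also store the text for snippets and handle stemming and OCR noise, which is the main reason to use a proper search engine here.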
This made me think that serving up search inside for OCR could be a
standalone service--and possibly something useful enough for others to
use for simple use cases. I know I can't get away with just static
content like a level 0 image server, but what's the closest I can get
to a "level 0" IIIF content search server for OCR? A service that just
returns OCR text hits shouldn't need to know anything else about the
resource or content, so it could stand on its own.
Beyond the basics, highlighting words on the image would be nice.
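
For the highlighting piece, the hOCR output already carries word bounding boxes, so one possible approach is to turn matching words into xywh fragments on the canvas. The sketch below uses only the Python standard library and assumes the canvas has the same pixel dimensions as the image that was OCRed (otherwise the boxes would need scaling). The class name, function name, and canvas URI are hypothetical.

```python
import re
from html.parser import HTMLParser


class WordBoxParser(HTMLParser):
    """Collect (word, (x0, y0, x1, y1)) from hOCR ocrx_word spans."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            self._bbox = tuple(int(n) for n in m.groups()) if m else None

    def handle_data(self, data):
        if self._bbox and data.strip():
            self.words.append((data.strip(), self._bbox))

    def handle_endtag(self, tag):
        if tag == "span":
            self._bbox = None


def highlight_targets(hocr, canvas_uri, query):
    """Return canvas URIs with xywh fragments for words matching the query."""
    parser = WordBoxParser()
    parser.feed(hocr)
    targets = []
    for word, (x0, y0, x1, y1) in parser.words:
        # Ignore surrounding punctuation and case when comparing.
        if word.lower().strip(".,;:!?\"'()") == query.lower():
            targets.append(f"{canvas_uri}#xywh={x0},{y0},{x1 - x0},{y1 - y0}")
    return targets
```

Each returned URI could then be used as the target of a hit annotation instead of the bare canvas URI.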
Eventually we may get to wanting to search other types of annotations,
but that can wait until we've migrated.
So that's what I'm thinking of right now.
Jason
[1] The hope is that as open OCR engines improve (or we license a better
scriptable OCR engine) we can automate updating our OCR.
--
You received this message because you are subscribed to the IIIF-Discuss
Google group. To post to this group, send email to
iiif-d...@googlegroups.com. To unsubscribe from this group, send email to
iiif-discuss...@googlegroups.com. For more options, visit this group at
https://groups.google.com/d/forum/iiif-discuss?hl=en