Extending Image API to support image processing actions


Javier de la Rosa

Dec 19, 2016, 11:54:55 AM
to IIIF Discuss
Hi,

After some discussion in the #general channel in Slack, I'm writing to the list to get some more feedback.

In one of our projects at Stanford, we are feeding images from IIIF-compliant servers to a series of image processing steps to perform computer vision tasks. Such tasks require the images to be "clean" first. Part of the cleaning actions include:
- Contrast adjustment
- Brightness adjustment
- Color reduction

We also perform bilateral filtering and denoising, but these are way more specific. We know that viewers like Mirador already support contrast and brightness, and thought those could make a good addition to a future version of the Image API, probably something that might depend on the implementation. The questions are, is there any plan for supporting further image processing tasks as part of the specification? Would it be useful for anybody to have a more generalizable plugin framework to extend what the Image API can do?

Cheers.

Shaun Ellis

Dec 19, 2016, 12:10:25 PM
to iiif-d...@googlegroups.com
Javier,
This is certainly an emerging use case.  If we look at where content-analysis innovation is headed, it's easy to forecast the case for automated generation of OCR, translations, transcriptions, content analysis, natural language processing, and image/face recognition of our networked image resources.  In my opinion, the transformations necessary should certainly be considered as part of a future Image API.  

Is a more nuanced "Color reduction" needed for your purposes, or does the existing "grayscale" quality option cover that?  And how does "bitonal" quality compare with the "contrast adjustment" you need?

Best,
Shaun

--
-- You received this message because you are subscribed to the IIIF-Discuss Google group. To post to this group, send email to iiif-d...@googlegroups.com. To unsubscribe from this group, send email to iiif-discuss+unsubscribe@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/iiif-discuss?hl=en
---
You received this message because you are subscribed to the Google Groups "IIIF Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to iiif-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Robert Casties

Dec 19, 2016, 12:14:01 PM
to iiif-d...@googlegroups.com
On 19.12.16 17:49, Javier de la Rosa wrote:
> In one of our projects at Stanford, we are feeding images from
> IIIF-compliant servers to a series of image processing steps to perform
> computer vision tasks. Such tasks require the images to be "clean" first.
> Part of the cleaning actions include:
> - Contrast adjustment
> - Brightness adjustment
> - Color reduction

Our image server digilib also does image manipulation on the server via
its own API:

http://digilib.sourceforge.net/scaler-api.html

The implementation does simple pixel value addition (brgt) and
multiplication (2^cont) for all color channels together as well as
per channel (rgba, rgbm). It also does color manipulation (colop) like
reduction to grayscale (GRAYSCALE, NTSC_GRAY, BITONAL), color inversion
(INVERT), and level-to-false-color mapping (MAP_GRAY_BGR).
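
As a rough illustration of the addition and multiplication described above, here is a minimal sketch in plain Python. It is not digilib's code: the function names are hypothetical, the order of operations (multiply first, then add) is an assumption, and the per-channel variants (rgba, rgbm) are left out.

```python
def adjust_pixel(value, brgt=0.0, cont=0.0):
    """Adjust one 8-bit channel value: multiply by 2**cont, then add brgt."""
    # Assumption: multiplication happens before addition. The message above
    # does not state which order the server actually uses.
    result = value * (2.0 ** cont) + brgt
    # Clamp to 0..255 so the result is still a valid 8-bit channel value.
    return max(0, min(255, int(round(result))))


def adjust_image(pixels, brgt=0.0, cont=0.0):
    """Apply the same adjustment to every channel of every pixel.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    return [
        [tuple(adjust_pixel(c, brgt, cont) for c in px) for px in row]
        for row in pixels
    ]
```

With `cont=1` every channel value is doubled and with `brgt=20` every value is shifted up by 20, clipped at 255 in both cases.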

> We also perform bilateral filtering and denoising, but these are way more
> specific. We know that viewers like Mirador already support contrast and
> brightness, and thought those could make a good addition to a future
> version of the Image API, probably something that might depend on the
> implementation. The questions are, is there any plan for supporting further
> image processing tasks as part of the specification? Would it be useful for
> anybody to have a more generalizable plugin framework to extend what the
> Image API can do?

I would really like to have an extension mechanism in IIIF for optional
image processing operations. Those may be implementation-specific but
maybe we could find some common ground for many implementations :-)

Cheers
Robert

David Newbury

Dec 19, 2016, 12:44:30 PM
to iiif-d...@googlegroups.com
The way that I've thought about this sort of process is as a functional transform, or as a pipeline step—you take a IIIF image, you process it, and you return either another IIIF image (with a new URL) or a new manifest.  

Some of the things that we're talking about here (facial recognition, palette detection, OCR, content analysis)  don't modify the image pixels, they just add contextual information to it.   To me, that means that you provide it an image (or canvas?), and it returns a manifest with annotations.  There's definitely a use case here.

For each type of transformation that does modify pixels (brightness, contrast, posterization, gamma correction, filtering, sharpening, blur, etc), I worry that the existing image API will be quickly overwhelmed.  There are hundreds of parameters that you could tweak to adjust pixels, and a general-purpose image-manipulation API seems potentially out of scope.  I could see, however, a service that can take a IIIF Image URL and a set of properties and return a new IIIF Image API URL to the new, processed image.  

I guess I see both of these as potentially massive API surfaces, and I think that rather than try and add everything to the API, coming up with patterns for additional applications within the ecosystem that augment the existing APIs seems like a better idea than baking them all in.  Otherwise, we're talking about writing "ImageMagick As A Service", and if you've ever looked at that documentation, you'd know how horrible that could be.
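
One way to picture the "IIIF Image URL plus properties in, new IIIF Image API URL out" service mentioned above is the sketch below. Everything in it is hypothetical: the function names, the idea of hashing the request into an identifier, and the service base URL are illustration only, not part of any IIIF specification.

```python
import hashlib
import json


def derived_identifier(source_image_url, properties):
    """Derive a stable identifier for 'this source processed with these properties'."""
    # Canonical JSON (sorted keys) so the same request always maps to the
    # same identifier, regardless of the order the properties were given in.
    payload = json.dumps(
        {"source": source_image_url, "properties": properties},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def processed_image_base_uri(service_base, source_image_url, properties):
    """Return the base URI under which the processed image would be served."""
    identifier = derived_identifier(source_image_url, properties)
    return f"{service_base.rstrip('/')}/{identifier}"
```

A client would then treat the returned base URI like any other Image API endpoint, which keeps the processing parameters out of the Image API URL syntax itself.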

Javier de la Rosa

Dec 19, 2016, 6:45:04 PM
to IIIF Discuss
Hi,

Thanks for the replies. I looked at the grayscale and bitonal qualities, but by color reduction I meant posterization. While bitonal is a nice feature, we found CLAHE (contrast-limited adaptive histogram equalization) to be more useful. Again, this is all for our use case: we basically turn images of maps into edges and nodes. I understand and agree that "ImageMagick as a service" is out of scope, but it seems rather arbitrary to draw the line at bitonal or grayscale. Why not allow Otsu threshold binarization, for example? Although I'm sure there is a rationale behind it :)
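
For readers unfamiliar with the two operations named above, here is a small plain-Python sketch of posterization and Otsu thresholding on 8-bit grayscale values. It is a textbook-style illustration, not the project's actual pipeline; the function names are hypothetical and CLAHE is not shown.

```python
def posterize(value, levels):
    """Quantize one 8-bit value to `levels` evenly spaced output values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)
    # Snap to the nearest of the evenly spaced levels between 0 and 255.
    return int(round(round(value / step) * step))


def otsu_threshold(gray_values):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Values <= t form one class and values > t the other.
    """
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of values at or below the candidate threshold
    sum_bg = 0    # sum of those values
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_values, threshold):
    """Map each value to 0 or 255 depending on the threshold."""
    return [255 if v > threshold else 0 for v in gray_values]
```

Unlike a fixed cut-off, the Otsu threshold is computed from each image's own histogram, so two images requested with the same parameters can be binarized at different levels.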

There is definitely a difference between processes that take in an image and produce another, and processes that add metadata of some sort, and even this simple distinction is a useful thing to bear in mind. In the absence of image processing support, maybe a general framework or a recommended way of extending the API would make a good addition to the API itself: things like the format of URLs, the format of parameters, the format of output, how to pass in matrices, etc. That way I could plug in digilib or any other image processing library. What about a possible Transformation API?

Cheers.


PS: digilib seems superb, thanks!



Jack Reed

Dec 20, 2016, 4:48:14 PM
to iiif-d...@googlegroups.com

I think the idea behind standardized image processing services is a good one. It sounds like several IIIF adopters are already doing processing in custom one-off ways. What if there was at least a common framework to provide these services? I also see a vendor opportunity here to provide IIIF processing services. A corresponding standard in the geo world is the Web Processing Service (WPS): http://www.opengeospatial.org/standards/wps

 

I could see IIIF servers enabling this type of functionality and workflow:

 

Clip and process a region of an image. What about an OCR process on a region? Or even a shape recognition service?

 

We see this type of proprietary API popping up (https://cloud.google.com/vision/ anyone?). Does it make sense to collaborate here as a community (on software at least?) or are proprietary one-offs more appropriate in this regard?

 

I see this as problematic as an extension or enhancement of the Image API. But what about an ImageProcessing API?

 

-Jack

Companjen, B.A.

Dec 21, 2016, 10:44:07 AM
to iiif-d...@googlegroups.com

I'm probably not seeing the whole picture, but here are some ideas.

 

What about the Profile Description in the Image API? http://iiif.io/api/image/2.1/#profile-description

It allows features that are not part of the compliance document to be specified as URIs.

Or related services? http://iiif.io/api/image/2.1/#related-services

This could allow a client to discover processing servers and how to get processed images from the Image API server.

 

If a client wants to process IIIF images, well, it just needs to speak the Image API, get the image and process it.
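
A minimal sketch of that client-side route, assuming the Image API 2.1 URI pattern `{base}/{region}/{size}/{rotation}/{quality}.{format}`: the client builds a request URL, downloads the bytes, and hands them to whatever local processing it likes. The function names and the example base URI are hypothetical.

```python
from urllib.request import urlopen


def image_request_url(base_uri, region="full", size="full",
                      rotation="0", quality="default", fmt="jpg"):
    """Build an Image API request URL from its five parameters."""
    return f"{base_uri.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


def fetch_image_bytes(url, timeout=30):
    """Download the image; processing then happens entirely on the client."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Example: request a 512-pixel-wide grayscale rendering of a region.
url = image_request_url(
    "https://example.org/iiif/abc123",  # hypothetical image base URI
    region="0,0,1024,1024",
    size="512,",
    quality="gray",
)
```

The trade-off is that every client repeats the processing itself, which is the situation the thread is trying to improve on.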

 

Regards,

 

Ben

Robert Sanderson

Dec 21, 2016, 11:18:52 AM
to iiif-d...@googlegroups.com

Thanks, Ben, for bringing up services and the profile pattern, and in particular "supports".  And thanks to Javier for bringing up the question on the list :)

To expand a little, the Image API spec says:

> To allow for extensions, this specification does not define the server behavior when it receives requests that do not match either the base URI or one of the described URI syntaxes below.
(Section 2)

> URIs MAY be added to the supports list of a profile to cover features not defined in this specification.
(Section 5.3)

Together, these give a way to arbitrarily extend the API and advertise the availability of the extension. The easiest way to do that is via query parameters tacked on the end of the existing URI structure, unless the extension's transformation fits into one of the existing parameter slots. For example, a specific algorithm for color reduction could go into the quality slot, as it falls into the same bucket as bitonal, gray and color, but brightness seems orthogonal to the current parameters.

So you could, for example, do:

And then add http://some.domain/iiif/extensions/brightness to the list of supported features.  At the URI of the extension you should have human readable documentation for the feature, so that developers who see the feature can find out how to use it.
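
A hypothetical sketch of this pattern, not taken from the original message: the extension rides along as a query parameter on an otherwise ordinary Image API request, and the extension URI is listed under `supports` in the profile of the image's info.json. The parameter name `brightness`, its value format, and the helper names are assumptions.

```python
from urllib.parse import urlencode

# Extension URI as given in the message above.
EXTENSION_URI = "http://some.domain/iiif/extensions/brightness"


def extended_request_url(image_request_url, **extension_params):
    """Append extension parameters as a query string to an Image API URL."""
    if not extension_params:
        return image_request_url
    # Sort the parameters so the same request always yields the same URL.
    query = urlencode(sorted(extension_params.items()))
    return f"{image_request_url}?{query}"


def profile_with_extension(compliance_uri, extension_uri):
    """Build an Image API 2.x style profile list advertising the extension."""
    return [compliance_uri, {"supports": [extension_uri]}]


request = extended_request_url(
    "https://example.org/iiif/abc123/full/full/0/default.jpg",  # hypothetical
    brightness=1.2,  # assumed parameter name and value format
)
profile = profile_with_extension(
    "http://iiif.io/api/image/2/level2.json", EXTENSION_URI
)
```

A client that does not recognize the extension URI can ignore it and keep using the standard parameters.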

Services are a possible way to do extensions as well, but I think David Newbury's palette detection service is a good example of their intent.  The result isn't an image, it's information that can be calculated about the image.  We don't have a "registry" of extensions (as there aren't any known yet!) but we do have a short list of known services that we should update:



One consideration for whether a parameter makes sense to add to the spec is whether the results of the parameter are actually predictable to the point of validating whether the server did the right thing or not.

For example, we do not specify the algorithm to reduce a color image to bitonal.  It could be a threshold, it could be dithering, etc. However the results are able to be validated by the simple test of "is every pixel either white or black?".  The concern with contrast, brightness and even [actually especially!] compression, is that the results between two implementations could be very different and both still claim compliance.
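
The "is every pixel either white or black?" test is simple enough to write down directly. A minimal sketch, assuming the decoded image is available as rows of (r, g, b) tuples; the function name is hypothetical.

```python
BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def is_bitonal(pixels):
    """Return True if every pixel is pure black or pure white.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    return all(px in (BLACK, WHITE) for row in pixels for px in row)
```

No comparably simple check exists for brightness or contrast, because there is no single expected output to compare against.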

Not to say that's the only consideration, and if there were several independent implementations of a feature in both client and server, that would certainly trump any validation concerns.  


Hope that helps!

Rob






--
Rob Sanderson
Semantic Architect
The Getty Trust
Los Angeles, CA 90049