img_as_float


Johannes Schönberger

Feb 17, 2016, 4:21:12 PM
to scikit...@googlegroups.com
Hi everyone,

There has been a little discussion in #1945 about our range conversion and I think that we should try to avoid `img_as_float` in our functions as much as possible, unless really necessary for the correctness of the algorithm. In most cases, there is no reason to tinker around with the range of values of the input image. Some examples:

skimage.filter.gaussian_filter
skimage.transform.warp et al.
skimage.filter.sobel et al.
skimage.restoration.denoise_tv_chambolle
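
To make the concern concrete, here is a minimal NumPy-only sketch of the kind of range conversion under discussion. It assumes that unsigned integer input is divided by the maximum value of its dtype, which is how `img_as_float` is documented to treat such input. The helper name `to_unit_float` is hypothetical and is not part of scikit-image.

```python
import numpy as np


def to_unit_float(image):
    """Hypothetical stand-in for a range conversion (not the skimage code).

    Unsigned integers are rescaled to [0, 1]; everything else is only
    cast to float64.
    """
    image = np.asarray(image)
    if image.dtype.kind == "u":
        return image.astype(np.float64) / np.iinfo(image.dtype).max
    return image.astype(np.float64)


# Pretend these are physical measurements stored as 16-bit integers.
measurements = np.array([[0, 1000], [30000, 65535]], dtype=np.uint16)
converted = to_unit_float(measurements)

print(converted.min(), converted.max())  # the values now lie in [0, 1]
```

After the conversion the original units are gone, so a function that does this internally returns results on a different scale than the caller passed in.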

What is your opinion?

Best, Johannes

Robin Wilson

Feb 17, 2016, 5:30:30 PM
to scikit-image
Hi,

For what it's worth, I agree with you entirely. I process satellite imagery with scikit-image, and usually want to keep the values exactly how they are (as they represent actual physical measurements in SI units) - and so converting to an arbitrary range always causes me problems.

(I'm probably not representative of the typical skimage user though - so don't take my opinion too seriously!)

Robin

Johannes Schönberger

Feb 17, 2016, 5:39:57 PM
to scikit...@googlegroups.com
Thanks for your feedback!

> (I'm probably not representative of the typical skimage user though - so don't take my opinion too seriously!)

I am not sure there is a typical skimage user... at least we don't have a good picture of who is using the library at this point. Maybe we should run a survey sometime...

Michael Aye

Feb 17, 2016, 6:36:57 PM
to scikit-image
I agree with Robin that changing the range of values silently (if that's still the case) is a problem for users like us, whose pixel values are actual physical measurements and not just the RGB values of a photograph.
One quick fix would be to be 'quite loud' about it and provide user feedback when it happens.
I would welcome an effort to remove this where possible.
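
The 'quite loud' option could look roughly like the sketch below, which uses the standard `warnings` module. The function name `rescale_loudly` and the message text are hypothetical; this is an illustration of the idea, not something scikit-image provides.

```python
import warnings

import numpy as np


def rescale_loudly(image):
    """Hypothetical helper: rescale unsigned integers to [0, 1] and say so."""
    image = np.asarray(image)
    if image.dtype.kind != "u":
        # Nothing is rescaled here, so there is nothing to announce.
        return image.astype(np.float64)
    warnings.warn(
        f"rescaling {image.dtype} input to the range [0, 1]",
        UserWarning,
        stacklevel=2,
    )
    return image.astype(np.float64) / np.iinfo(image.dtype).max


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = rescale_loudly(np.array([0, 128, 255], dtype=np.uint8))

print(len(caught), caught[0].message)
```

A warning keeps the current behavior but makes the change of range visible, and users who do want the rescaling can filter it out.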

Michael

Stéfan van der Walt

Feb 17, 2016, 8:07:14 PM
to scikit-image
On 17 February 2016 at 15:36, Michael Aye <kmicha...@gmail.com> wrote:
> I agree with Robin that changing the range of values silently (if that's
> still the case) is a problem for users like us, whose pixel values are
> actual physical measurements and not just the RGB values of a
> photograph.
> One quick fix would be to be 'quite loud' about it and provide user feedback
> when it happens.
> I would welcome an effort to remove this where possible.

I am not opposed to supporting range preservation, but we do have to
think a bit about the implications:

- what do you do when the data type of the values changes?
- what do you do when your operation, due to e.g. rounding issues,
pushes values outside the input range?
- what do you do when you need to know the full potential range of the data?

The ``preserve_range`` flag has allowed us to do whatever we do
normally, unless the user gave explicit permission to change data
types, ranges, etc. It also serves as a nice tag for "I, the
developer, thought about this issue".
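
As a rough illustration of how such a flag behaves, here is a NumPy-only sketch. The function `box_blur_rows` is hypothetical and is not a scikit-image function; the default branch assumes unsigned integers are rescaled by the maximum of their dtype.

```python
import numpy as np


def box_blur_rows(image, preserve_range=False):
    """Hypothetical 3-tap mean filter along rows, illustrating the flag only."""
    image = np.asarray(image)
    if preserve_range:
        work = image.astype(np.float64)  # keep the caller's value range
    elif image.dtype.kind == "u":
        # Default: rescale unsigned integers to [0, 1].
        work = image.astype(np.float64) / np.iinfo(image.dtype).max
    else:
        work = image.astype(np.float64)
    padded = np.pad(work, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0


data = np.array([[10, 20, 30, 40]], dtype=np.uint16)
print(box_blur_rows(data, preserve_range=True))  # same units as the input
print(box_blur_rows(data))                       # rescaled to [0, 1]
```

Note that even with `preserve_range=True` the result is floating point: the range is preserved, the dtype is not.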

Stéfan

Michael Aye

Feb 18, 2016, 6:00:04 PM
to scikit-image
> I am not opposed to supporting range preservation, but we do have to
> think a bit about the implications:
>
> - what do you do when the data type of the values changes?

What are the situations where they *have* to change?

> - what do you do when your operation, due to e.g. rounding issues,
> pushes values outside the input range?

Return a new object instead of changing the original, maybe?

> - what do you do when you need to know the full potential range of the data?

I don't understand. Do you mean the full potential range per data type? Isn't that defined by the data type of the input image?

> The ``preserve_range`` flag has allowed us to do whatever we do
> normally, unless the user gave explicit permission to change data
> types, ranges, etc. It also serves as a nice tag for "I, the
> developer, thought about this issue".

It's quite cool that this is offered, but I guess the question is which default is best, and why.
Which default setting would confuse new (and old) users the least?

Michael

Josh Warner

Feb 18, 2016, 7:10:07 PM
to scikit-image
Sometimes the input dtype needs to change, at least along the way. As just one example:
  • uint8 or uint16 inputs with a chain of calculations, including transformations or exposure tweaks. In this instance, all intermediate calculations should be carried out with full floating-point precision. If forced back into their originating dtype at each step, the result would have terrible compounded error. 
Returning to the original dtype at the end would be reasonable, but you only want to do this once. Because of our functional approach (vs. VTK's pipelining or similar), there is no way for us to know which step is the final one. So - if desired - the user needs to handle this, because from such functions we'll always return the higher precision.
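
The compounded error is easy to reproduce with plain NumPy. The sketch below applies an arbitrary chain of gains whose product is 1.0, once with a cast back to uint8 after every step and once in float64 with a single conversion at the end. The gains and pixel values are made up for illustration.

```python
import numpy as np

steps = [0.5, 0.5, 4.0]  # an arbitrary chain of gain changes; net gain is 1.0
pixels = np.array([7, 101, 255], dtype=np.uint8)

# Variant A: cast back to uint8 after every step (the cast truncates).
per_step = pixels.copy()
for gain in steps:
    per_step = (per_step * gain).astype(np.uint8)

# Variant B: stay in float64 and convert once at the end.
in_float = pixels.astype(np.float64)
for gain in steps:
    in_float = in_float * gain
once = np.round(in_float).astype(np.uint8)

print(per_step)  # differs from the input although the net gain is 1.0
print(once)      # matches the input
```

Only the caller knows which step is the last one, which is why the single conversion in variant B has to happen in user code.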

We always return a new object, unless the function explicitly operates on the input. When this is possible it is enabled by a standard `out=None` kwarg like in numpy/scipy.

One of the biggest things the "float images are on range [0, 1]" convention saves us from is worrying about aliasing. At all. We just do calculations, and it doesn't matter if the input image gets squared a few times along the way. Try a few simple numpy operations on a uint8 array and see how quickly the results stop being what you expect. Now, we can relax this and still be mostly OK because float64 is big. But concerns like this are a huge potential maintenance headache. I think what Stéfan means by "full potential range" is that you have to plan calculations in advance, checking the maximum potential range of every intermediate step against your dtype.
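
A small NumPy sketch of the uint8 experiment suggested above. Integer array arithmetic wraps around modulo 256 without any warning, while the same operation on values normalized to [0, 1] stays in range.

```python
import numpy as np

a = np.array([200], dtype=np.uint8)

wrapped_sum = a + a        # 400 does not fit in uint8
wrapped_square = a * a     # neither does 40000

unit = a.astype(np.float64) / 255.0
unit_square = unit * unit  # squaring a value in [0, 1] stays in [0, 1]

print(wrapped_sum, wrapped_square, unit_square)
```

Here the sum comes out as 144 and the square as 64, neither of which resembles the mathematically expected value.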

Certain exposure calculations are explicitly defined with normalized images on the range [0, 1], because they heavily use exponential functions. An input with a greater range must be handled carefully by any such function. This is the greatest danger in simply removing the normalization step from the package, IMO. A lot of things will break, and depending on the algorithm the fix may vary.
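
A power-law (gamma) adjustment is one example of why the [0, 1] assumption matters. The sketch below is a hypothetical helper, not the scikit-image implementation; the `scale` parameter is an assumption introduced here to show one possible way of handling unnormalized input.

```python
import numpy as np


def gamma_sketch(image, gamma, scale=1.0):
    """Hypothetical power-law adjustment; `scale` is the assumed maximum value."""
    image = np.asarray(image, dtype=np.float64)
    return scale * (image / scale) ** gamma


unit = np.array([0.0, 0.5, 1.0])
raw = np.array([0.0, 500.0, 1000.0])

print(gamma_sketch(unit, 2.0))                # stays within [0, 1]
print(gamma_sketch(raw, 2.0))                 # range grows to [0, 1e6]
print(gamma_sketch(raw, 2.0, scale=1000.0))   # range [0, 1000] is kept
```

Without normalization the function needs to be told, or has to guess, what the maximum of the data is, and that choice differs from algorithm to algorithm.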

Perhaps that helps pull back the curtain a little...

Josh

Stéfan van der Walt

Apr 29, 2016, 4:05:43 AM
to scikit...@googlegroups.com
One more comment on this issue:

a) *Most* of the time we want to use floating point numbers so that we can do operations like image / 2 without worrying about rounding.
b) Some users have very large images and *must* use ubytes to fit their data into memory.

I therefore see a fairly fundamental clash in requirements that cannot easily be removed.

As things stand, however, we probably already cause headaches for the users in (b), because many of our intermediate stages convert images to floating point format.  So if we had to settle on a single format, floating point is what I'd go for.
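
The memory side of this clash can be quantified with a short NumPy sketch. The image size is a made-up example.

```python
import numpy as np

shape = (1024, 1024)  # hypothetical single-channel image size

as_uint8 = np.zeros(shape, dtype=np.uint8)
as_float64 = as_uint8.astype(np.float64)

print(as_uint8.nbytes // 2**20, "MiB as uint8")
print(as_float64.nbytes // 2**20, "MiB as float64")
```

Every intermediate float64 copy costs eight times the memory of the uint8 original, which is what hurts users whose images barely fit in memory to begin with.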

But then, of course, you open the lovely can of worms of reading in your uint64 stored images (which sometimes happen to store values between 0 and 255 only), and having no idea why you are dealing with some crazy floating point number.
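
A sketch of how those "crazy" numbers arise, assuming the conversion divides by the maximum of the storage dtype. The alternative shown needs outside knowledge of the real bit depth, which the array itself does not carry.

```python
import numpy as np

# 8-bit data that happens to be stored in a 64-bit container.
stored = np.array([0, 128, 255], dtype=np.uint64)

# Assumed conversion rule: divide by the maximum of the storage dtype.
by_dtype = stored.astype(np.float64) / float(np.iinfo(stored.dtype).max)

# Alternative that relies on knowing the real bit depth of the data.
by_bit_depth = stored.astype(np.float64) / 255.0

print(by_dtype.max())      # on the order of 1e-17
print(by_bit_depth.max())  # 1.0
```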

So, let's keep thinking about this one!

Stéfan
