function names and their signatures

Almar Klein

May 23, 2012, 4:28:45 AM
to ima...@googlegroups.com

Let's take a moment to think about what names our (small) API should expose, and what their signatures should be. I'll start with an overview of the names and sigs for some popular software. Note that most allow additional arguments.


Matlab: 

imread(filename,fmt)

imwrite(m, filename, fmt)


PIL: 

open(filename, mode=None)

im.save(filename, format=None) # Method of the Image class


visvis: 

imread(filename, format=None)

imwrite(filename, array, format=None)


mahotas: 

imread(filename, as_grey=False, formatstr=None)

imsave(filename, array, formatstr=None)


tifffile: 

imread(filename)

imsave(filename, array)


I would prefer to use names that match up as much as possible with existing packages. Matching existing Python packages is probably more important than matching Matlab. Interestingly, in all Python "write" functions the filename comes before the array to save, while in Matlab the array comes first. For reading, "imread" clearly wins, and the signature is no problem. For writing, both "imsave" and "imwrite" are used. I think I like imsave, because that avoids confusion with Matlab's "imwrite", which takes its arguments in the opposite order. (I wish I'd thought of that when I made visvis.imwrite.)


It has been argued that we could just use "read" and "write" (or "save"). But I prefer the "im" prefix. Not only for compatibility with existing packages, but also because it allows a user to do "from imageio import imread, imsave".


I think it's probably a good idea to allow the user to explicitly specify the format. Especially when we are going to introduce new specific formats, there may come a day when two formats use the same extension, and some formats (e.g. DICOM) don't usually stick to one extension. Therefore, my proposal is:


imread(filename, format=None, **kwargs)

imsave(filename, array, format=None, **kwargs)


The (optional) keyword arguments are passed to the plugins. For FreeImage formats, they will mostly be converted to FreeImage flags.
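
To make the proposed signatures concrete, here is a minimal sketch. It is not imageio code: the registry `_PLUGINS`, the helper `_select`, and the toy plain-text PGM plugin are all hypothetical stand-ins, and images are plain lists of rows rather than arrays. It only illustrates that an explicit `format` overrides the file extension and that extra keyword arguments are forwarded to the plugin.

```python
import os

# Hypothetical toy plugin for plain-text PGM ("P2"); images are lists of rows.
# Comment lines inside the PGM file are not handled in this sketch.
def _pgm_read(filename, **kwargs):
    with open(filename) as f:
        tokens = f.read().split()
    if not tokens or tokens[0] != "P2":
        raise ValueError("not a plain PGM file")
    width, height = int(tokens[1]), int(tokens[2])
    values = [int(t) for t in tokens[4:]]  # tokens[3] is the max gray value
    return [values[r * width:(r + 1) * width] for r in range(height)]

def _pgm_write(filename, array, maxval=255, **kwargs):
    with open(filename, "w") as f:
        f.write("P2\n%d %d\n%d\n" % (len(array[0]), len(array), maxval))
        for row in array:
            f.write(" ".join(str(v) for v in row) + "\n")

# Hypothetical registry: format name -> (read function, write function).
_PLUGINS = {"pgm": (_pgm_read, _pgm_write)}

def _select(filename, format):
    # An explicit format wins; otherwise fall back to the file extension.
    key = format if format is not None else os.path.splitext(filename)[1][1:]
    key = key.lower()
    if key not in _PLUGINS:
        raise ValueError("no plugin for format %r" % key)
    return _PLUGINS[key]

def imread(filename, format=None, **kwargs):
    reader, _ = _select(filename, format)
    return reader(filename, **kwargs)  # kwargs go to the plugin

def imsave(filename, array, format=None, **kwargs):
    _, writer = _select(filename, format)
    writer(filename, array, **kwargs)  # kwargs go to the plugin
```

With this, imsave("scan.dat", img, format="pgm", maxval=3) works even though the extension says nothing about the format, while imread("scan.dat") without a format raises a ValueError.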


(This post is about functionality for reading/writing 2D images. For animated/multipage images (and maybe volume data) I think I'd like to introduce separate functions. But let's discuss that in another thread.)

Almar Klein

May 30, 2012, 6:26:50 AM
to ima...@googlegroups.com
For reading/writing images:
If there are no comments, let's go with imread(filename, format=None) and imsave(filename, img, format=None). 
For the record, this is also how Zach originally wrote it (until I broke it because I was used to using imwrite).

Almar Klein

May 30, 2012, 7:07:59 AM
to ima...@googlegroups.com
Hi,

Some formats can store multiple images (e.g. in TIFF, GIF, ICO), and some formats even  support volumes. Let's think about how to expose this to the user.
I think it's best to expose separate functions for this. In theory you could check if a file has multiple images and then return a list instead of an array. But "explicit is better than implicit".

Some suggestions for multiple images: imsread, imread_multiple, imreadm.
For volumes: volread

I like 'imsread', although the 's' can be easy to miss, which might lead to confusion. 

  Almar

Thouis (Ray) Jones

May 30, 2012, 7:47:08 AM
to ima...@googlegroups.com
On Wed, May 30, 2012 at 1:07 PM, Almar Klein <almar...@gmail.com> wrote:
> I think it's best to expose separate functions for this. In theory you could
> check if a file has multiple images and then return a list instead of an
> array. But "explicit is better than implicit".

I think there are arguments in favor of not using different functions,
but rather handling multiple-image formats using keyword arguments to
fetch particular subsets of the data (or even only a single image at a
time). In particular, I think it allows developers using the library
to use fewer code paths, as there can be a single place in the code
where images are read, and the code leading up to it builds up the
keyword arguments as necessary.

For efficiency, it might be worth separating reading/writing image
into separate phases: opening the file and reading/writing the data,
similar to how h5py wraps HDF5 files or matplotlib handles multipage
PDFs. It might be limiting to require all of the data be ready in
order to write a multi-image format.
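
A rough sketch of what such a two-phase write could look like, under loud assumptions: `SeriesWriter` and `iter_frames` are hypothetical names, and the "format" is just one JSON-encoded frame per line, chosen only so the example runs with the standard library. The point is that frames are appended one at a time, so the full series never has to be in memory.

```python
import json

class SeriesWriter:
    """Hypothetical writer: open the file once, then append frames one by one."""

    def __init__(self, filename):
        self._file = open(filename, "w")
        self.count = 0

    def append(self, frame):
        # Each frame is written immediately and can then be discarded by the caller.
        self._file.write(json.dumps(frame) + "\n")
        self.count += 1

    def close(self):
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False  # do not swallow exceptions

def iter_frames(filename):
    """Read the frames back lazily, one per iteration."""
    with open(filename) as f:
        for line in f:
            yield json.loads(line)
```

Usage would be along the lines of "with SeriesWriter(path) as w: w.append(frame)" inside whatever loop produces the frames.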

Ray Jones

Almar Klein

May 30, 2012, 4:28:44 PM
to ima...@googlegroups.com
Thanks for the response.

 
> I think there are arguments in favor of not using different functions,
> but rather handling multiple-image formats using keyword arguments to
> fetch particular subsets of the data (or even only a single image at a
> time).  In particular, I think it allows developers using the library
> to use fewer code paths, as there can be a single place in the code
> where images are read, and the code leading up to it builds up the
> keyword arguments as necessary.

I am in favor of reducing the number of code paths. And part of me likes the idea of just having two powerful functions (imread and imsave). But in that "single place where images are read" the code for reading one image will be quite different from reading multiple images (see e.g. the code for the freeimage wrapper that we have so far, or think about fiddling with the meta information for animated gif files). The code path leading up to that point is actually quite small; it's basically a couple of checks such as whether the file exists; things we can more or less combine by using the same subroutines.

But I think the biggest argument in favor of using different functions for different tasks is explicitness: if the user calls imread, he either gets a single image or an exception is raised. He does not have to check whether the image is maybe 3D or actually a list of images. I know that most users will know what kind of image to expect. I'm just worried about the size of the WTF for the user who doesn't.

Maybe we can use a hybrid approach, where we expose multiple functions to the user, which all call the same function and use a keyword argument to change the intended behavior, so we can maximize the length of the common code path.
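
A minimal sketch of that hybrid idea, assuming a hypothetical shared function `_read` and a stand-in decoder `_decode_frames` (the toy "image file" here is simply a JSON list of frames). The public names imread and imread_multiple are thin wrappers that differ only in the keyword they pass.

```python
import json

def _decode_frames(filename):
    # Stand-in for real decoding: the toy file holds a JSON list of frames.
    with open(filename) as f:
        return json.load(f)

def _read(filename, multiple=False):
    """Shared code path; the keyword selects the intended behavior."""
    frames = _decode_frames(filename)
    if multiple:
        return frames
    if len(frames) != 1:
        # Explicit failure instead of silently returning a list.
        raise ValueError("expected a single image, found %d" % len(frames))
    return frames[0]

def imread(filename):
    return _read(filename, multiple=False)

def imread_multiple(filename):
    return _read(filename, multiple=True)
```

The user-facing contract stays explicit (imread returns one image or raises), while the checks leading up to the actual decoding live in one place.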

 
> For efficiency, it might be worth separating reading/writing image
> into separate phases: opening the file and reading/writing the data,
> similar to how h5py wraps HDF5 files or matplotlib handles multipage
> PDFs.  It might be limiting to require all of the data be ready in
> order to write a multi-image format.

I like that idea a lot. I've actually run into memory errors sometimes (on 32 bit) when I was trying to write a series of HD quality screenshots of a rendering.

So how would that work in practice? I call function_to_read_series(filename), and instead of returning a list of images it returns a Reader object of some sort? Maybe even an iterator?

cheers,
  Almar

Thouis (Ray) Jones

May 31, 2012, 9:01:41 AM
to ima...@googlegroups.com
On Wed, May 30, 2012 at 10:28 PM, Almar Klein <almar...@gmail.com> wrote:
> Maybe we can use a hybrid approach, where we expose multiple functions to
> the user, which all call the same function and use a keyword argument to
> change the intended behavior, so we can maximize the length of the common
> code path.

This seems like a reasonable approach, especially combined with good
defaults for common cases (so it just works for single
grayscale/RGB/RGBA images, but might throw an exception if handed a
multipage tiff, a volume dataset, or something else where the user's
intent is not obvious).

>> For efficiency, it might be worth separating reading/writing image
>> into separate phases: opening the file and reading/writing the data,
>> similar to how h5py wraps HDF5 files or matplotlib handles multipage
>> PDFs.  It might be limiting to require all of the data be ready in
>> order to write a multi-image format.
>
> I like that idea a lot. I've actually run into memory errors sometime (on 32
> bit) when I was trying to write a series of HD quality screenshots of a
> rendering.
>
> So how would that work in practice? I call
> function_to_read_series(filename), and instead of returning a list of images
> it returns a Reader object of some sort? Maybe even an iterator?

How about an object with number_of_images(), read_info(img_num) and
read_image(img_num) methods (just to start with something)? The
module's default read_image could just be to load image number 0.
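
Just to make that starting point tangible, a small sketch with the three methods. The class name `ImageFile` and the in-memory `pages` argument are hypothetical; a real implementation would decode lazily from the file instead.

```python
class ImageFile:
    """Hypothetical file object exposing the three suggested methods."""

    def __init__(self, pages):
        # pages: list of (info_dict, pixel_rows) pairs standing in for
        # whatever a real decoder would find in the file.
        self._pages = list(pages)

    def number_of_images(self):
        return len(self._pages)

    def read_info(self, img_num):
        # Return a copy so callers cannot mutate the stored metadata.
        return dict(self._pages[img_num][0])

    def read_image(self, img_num):
        return self._pages[img_num][1]

def read_image(image_file):
    """Module-level default: load image number 0."""
    return image_file.read_image(0)
```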

I'm not sure of the best way to structure this. Some examples of
other projects that have solved this are pylibtiff (for TIFF images),
and Bioformats (for tons of image formats). The HOWTO for using
bioformats is worth looking at:
http://git.openmicroscopy.org/?p=bioformats.git;a=blob_plain;f=components/bio-formats/doc/using-bioformats.txt;hb=HEAD

Bioformats is written for the bio-image world with an emphasis on 2D
images, possibly with timelapse and/or Z stacks, and so the TZ
coordinates (and series) become first class parts of the interface.
I'd prefer a solution that avoids this sort of data-defined interface.

There are some valid questions that are brought up by it, though. What
should a library do when asked to "read an image" from each of the
following:
- a TIFF file with a single RGB image?
- a TIFF file with 3 Z-stacks?
- a TIFF file with 3 different series, each of a single grayscale image?
- a TIFF file with three channels, but not really RGB (i.e., three
different cell stains)?
- a TIFF file with 3 different grayscale timepoints?

Some of these would reasonably return a MxNx3 array, some might return
a MxN grayscale image. I'm sure there are other random formats where
there's another way to separate images rather than
channel/Z/time/series. RGB vs. grayscale is a continual pain in the
ass as well, as sometimes 3-channel data is stored in RGB, even though
it's not actual RGB (just some random 3-channel image).

Ray

Almar Klein

Jun 1, 2012, 5:56:58 AM
to ima...@googlegroups.com
> How about an object with number_of_images(), read_info(img_num) and
> read_image(img_num) methods (just to start with something)?  The
> module's default read_image could just be to load image number 0.

Yes, something like that looks great. Although I'd prefer to make it look like an iterator, so the user can use the object in a for-loop, and do len(readerObject) to get the number of images.

 
> Bioformats is written for the bio-image world with an emphasis on 2D
> images, possibly with timelapse and/or Z stacks, and so the TZ
> coordinates (and series) become first class parts of the interface.
> I'd prefer a solution that avoids this sort of data-defined interface.

Yes, imageio should be as agnostic as possible. But we should try to expose some kind of API that plugins can use to expose even a complex data format. What about allowing this readerObject to be multi-dimensional, like a numpy array? A simple time series would then get a shape of (n,), but a bioformat with a timelapse of Z-stacks would get shape (n,m). The API to retrieve an image would thus become readerObject.read_image(i,j,..), or maybe even readerObject[i,j].
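
A sketch of such a multi-dimensional reader, with several assumptions: the class name `Reader`, the in-memory flat list of images, and the choice that len() and iteration cover all images in row-major order are mine for illustration, not something settled in this thread.

```python
import itertools

class Reader:
    """Hypothetical reader whose images are addressed like an n-d array."""

    def __init__(self, images, shape):
        # images: flat list in row-major order; shape: e.g. (n,) or (n, m).
        total = 1
        for size in shape:
            total *= size
        if total != len(images):
            raise ValueError("shape does not match the number of images")
        self._images = list(images)
        self.shape = tuple(shape)

    def __len__(self):
        # Assumption: the total number of images, not just the first axis.
        return len(self._images)

    def read_image(self, *indices):
        if len(indices) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        flat = 0
        for index, size in zip(indices, self.shape):
            if not 0 <= index < size:
                raise IndexError("index out of range")
            flat = flat * size + index  # row-major flattening
        return self._images[flat]

    def __getitem__(self, index):
        # reader[i] and reader[i, j] both end up in read_image().
        if not isinstance(index, tuple):
            index = (index,)
        return self.read_image(*index)

    def __iter__(self):
        for indices in itertools.product(*(range(s) for s in self.shape)):
            yield self.read_image(*indices)
```

A plain time series would be Reader(frames, (n,)) and used in a for-loop; a timelapse of Z-stacks would be Reader(frames, (n, m)) and indexed as reader[t, z].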

 
> There are some valid question that are brought up by it, though.  What
> should a library do when asked to "read an image" from each of the
> following:
> - a TIFF file with a single RGB image?
> - a TIFF file with 3 Z-stacks?
> - a TIFF file with 3 different series, each of a single grayscale image?
> - a TIFF file with three channels, but not really RGB (i.e., three
> different cell stains)?
> - a TIFF file with 3 different grayscale timepoints?
>
> Some of these would reasonably return a MxNx3 array, some might return
> a MxN grayscale image.  I'm sure there are other random formats where
> there's another way to separate images rather than
> channel/Z/time/series.

I think that internally 3 Z-stacks would be stored the same as 3 timepoints; both as three pages of a grayscale image. And how the one with the 3 channels is stored depends on how the person who wrote the file did it. Can we not just return the images as they are stored in the file, and leave it up to the user how he/she interprets them? But I might be missing a point; I don't know TIFF that well.

Just to get this straight, I thought that TIFF only provided a way to store multiple pages, or does TIFF provide some mechanism to store a timeseries of Z-stacks (i.e. a series of series)?

 
> RGB vs. grayscale is a continual pain in the
> ass, as well, a sometimes 3-channel data is stored in RGB, even though
> it's not actual RGB (just some random 3-channel image).

 But imageio does not need to know whether it's RGB or some other multi-channel image. It just returns an NxMx3 numpy array. Again it's up to the user to interpret it correctly, right?

  Almar

Thouis Jones

Jun 1, 2012, 6:45:42 AM
to ima...@googlegroups.com
On Fri, Jun 1, 2012 at 11:56 AM, Almar Klein <almar...@gmail.com> wrote:
>> How about an object with number_of_images(), read_info(img_num) and
>> read_image(img_num) methods (just to start with something)?  The
>> module's default read_image could just be to load image number 0.
>
>
> Yes, something like that looks great. Although I'd prefer to make it look
> like an iterator, so the user can use the object in a for-loop, and do
> len(readerObject) to get the number of images.

As long as it's possible to get the metadata about the images, I think
that would be fine.

>> There are some valid question that are brought up by it, though.  What
>> should a library do when asked to "read an image" from each of the
>> following:
>> - a TIFF file with a single RGB image?
>> - a TIFF file with 3 Z-stacks?
>> - a TIFF file with 3 different series, each of a single grayscale image?
>> - a TIFF file with three channels, but not really RGB (i.e., three
>> different cell stains)?
>> - a TIFF file with 3 different grayscale timepoints?
>>
>>
>>
>> Some of these would reasonably return a MxNx3 array, some might return
>>
>> a MxN grayscale image.  I'm sure there are other random formats where
>> there's another way to separate images rather than
>> channel/Z/time/series.
>
>
> I think that internally 3 Z-stacks would be stored the same as 3 timepoints;
> both as three pages of a grayscale image. And how the one with the 3
> channels is stored depends on how the person who wrote the file did it. Can
> we not just return how the images are stored in the file, and its up to the
> user how he/she interprets it? But I might miss a point; I don't know TIFF
> that well.
>
> Just to get this straight, I though that TIFF only provided a way to store
> muliple pages, or does TIFF provide some mechanism to store a timeseries of
> Z-stacks (i.e. a series of series)?

TIFF, through private tags, can store all kinds of things. A common
format in biological imaging is the LSM format, which can add all
kinds of extra dimensions. For an example, search for LSMInfo on the
pylibtiff page: http://code.google.com/p/pylibtiff/

Perhaps this is something that imageio doesn't need to deal with, but
there does need to be *some* way to get at the metadata, I think.

>>
>> RGB vs. grayscale is a continual pain in the
>> ass, as well, a sometimes 3-channel data is stored in RGB, even though
>> it's not actual RGB (just some random 3-channel image).
>
>
>  But imageio does not need to know whether it's RGB or some other
> multi-channel image. It just returns an NxMx3 numpy array. Again it's up to
> the user to interpret it correctly, right?

Probably. This requires the user to be able to fetch the metadata
needed to do so, though. Also, there's nothing that says that TIFF
has to store every image the same size. So some could be NxM and some
could be something else.

The RGB vs. three Z-stacks is made more ambiguous by some strangeness
in TIFF, in that an RGB image can be stored all together (as RGB) or
as separate grayscale images. So I think it's sometimes difficult to
do something correct for get_image(image_index=0). Is that the R
channel of an RGB image, or all the RGB data together?

Maybe all of this can be sidestepped in the interest of simplicity,
but I think it's important to balance the simplicity against the power
of whatever you end up with. TIFF is just an egregiously difficult
case, but one that many scientists have to deal with.

See here for more TIFF fun:
http://blogs.mathworks.com/steve/2007/09/16/multipage-tiffs/

Ray Jones

Almar Klein

Jun 1, 2012, 7:24:16 AM
to ima...@googlegroups.com
>>> How about an object with number_of_images(), read_info(img_num) and
>>> read_image(img_num) methods (just to start with something)?  The
>>> module's default read_image could just be to load image number 0.
>>
>> Yes, something like that looks great. Although I'd prefer to make it look
>> like an iterator, so the user can use the object in a for-loop, and do
>> len(readerObject) to get the number of images.
>
> As long as it's possible to get the metadata about the images, I think
> that would be fine.

Yeah, that's a good point. But we also want to make getting (and setting!) the metadata easy for single images. I'll open up a new thread for this...
 
 
> TIFF, through private tags, can store all kinds of things.  A common
> format in biological imaging is the LSM format, which can add all
> kinds of extra dimensions.  For an example, search for LSMInfo on the
> pylibtiff page: http://code.google.com/p/pylibtiff/
>
> Perhaps this is something that imageio doesn't need to deal with, but
> there does need to be *some* way to get at the metadata, I think.

So, if I understand correctly, TIFF allows one to put multiple images in a file, and to store arbitrary tags (metadata) with it. LSM is a format that "defines" some tag names to give them meaning and thereby more explicit structure to the file.

If that is the case, then the plugin that deals with TIFF can be relatively agnostic about what the data mean, while the LSM plugin (if we get one) would use the meta information to give meaning to the data and use it to structure what it returns. Does that make sense?


> The RGB vs. three Z-stacks is made more ambiguous by some strangeness
> in TIFF, in that an RGB image can be stored all together (as RGB) or
> as separate grayscale images.  So I think it's sometimes difficult to
> do something correct for get_image(image_index=0).  Is that the R
> channel of an RGB image, or all the RGB data together?

Yuk. I suppose the best solution would be to combine them to RGB if the user used imread() and give the separate images if multiple_imread() (or whatever we call it) was used. 
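
For the "combine them" half of that, a tiny sketch with plain nested lists standing in for arrays. The helper name `combine_planes` is hypothetical; it only shows the interleaving step, not how a TIFF plugin would detect that the planes belong together.

```python
def combine_planes(planes):
    """Interleave equally sized 2D planes into rows of per-pixel tuples."""
    height = len(planes[0])
    width = len(planes[0][0])
    for plane in planes:
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("all planes must have the same size")
    return [
        [tuple(plane[y][x] for plane in planes) for x in range(width)]
        for y in range(height)
    ]
```

A single-image read would hand back combine_planes([r, g, b]), while the multiple-image read would hand back the list [r, g, b] untouched.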

 
> Maybe all of this can be sidestepped in the interest of simplicity,
> but I think it's important to balance the simplicity against the power
> of whatever you end up with.  TIFF is just an egregiously difficult
> case, but one that many scientists have to deal with.

Yes, we should offer simplicity for the simple cases, and allow more power for the more complex cases. But in a way that keeps our lives easy :)  The hard part should be for the person that implements the "plugin" to read a certain format.

So how does one use for instance the pylibtiff library? How is the structure of the data exposed to the user?
Thanks, that helped my understanding of TIFF some more. 

  Almar

Almar Klein

Jun 6, 2012, 5:53:35 AM
to ima...@googlegroups.com
 
Ok, let me propose the following. Please let me know if this sounds right to you...

  • There shall be 4 functions for reading: read(), imread(), mimread() (the M is for multiple) and volread().
  • The function read(filename, format=None, expect=None, **kwargs) returns a reader object. Using the expect keyword the user can specify whether he expects an image, multiple images or a volume (or other?).
  • The reader has a read_array(*indices) method and a read_meta(*indices) method to read the image data and meta info.
  • The reader object has a __len__ method and can be used as an iterator, which will simply yield all images/volumes.
  • The reader object may have additional methods. For instance, LSM may implement reader.stacks() to iterate over the z-stacks, or reader.timepoints() to iterate over the timepoints, or expose whatever API works for the specific format. Naturally, we should try and keep the APIs of such "exotic" formats similar, but we can work out the details as they are introduced in imageio.
  • imread() does something like read(filename, format, expect=IMAGE).read_array(0)
  • mimread() does something like [im for im in read(filename, format, expect=MULTI_IMAGE)]
  • A similar story applies to save(), imsave(), mimsave(), and volsave().
So each plugin for imageio consists of a Reader and a Writer class, which will be instantiated and returned by read() and save(). The "expect" argument can be used by the read() function to check whether the selected/detected plugin supports what is expected. In some cases it can also be used by the plugin to yield the right data: for instance, for an RGB TIFF where the color planes are stored separately, or in DICOM, where specifying VOLUME will combine the multiple images into a volume for you.
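
The proposal can be sketched roughly as follows. This is an illustration under stated assumptions, not the eventual implementation: the constant values, the single toy `Reader` class, the JSON-based "file format" (a list of pages, each with "meta" and "data"), and the rule that expect=IMAGE raises on a multi-page file are all made up for the example; format selection is left out entirely.

```python
import json

# Hypothetical values for the "expect" argument.
IMAGE, MULTI_IMAGE, VOLUME = "image", "multi_image", "volume"

class Reader:
    """Toy reader for a JSON file holding a list of {"meta", "data"} pages."""

    def __init__(self, filename, expect=None, **kwargs):
        with open(filename) as f:
            self._pages = json.load(f)
        # Assumption: asking for a single image from a multi-page file fails.
        if expect == IMAGE and len(self._pages) != 1:
            raise ValueError("file contains %d images" % len(self._pages))

    def __len__(self):
        return len(self._pages)

    def __iter__(self):
        # Iterating simply yields all images.
        return (self.read_array(i) for i in range(len(self)))

    def read_array(self, *indices):
        (index,) = indices  # this toy format is one-dimensional
        return self._pages[index]["data"]

    def read_meta(self, *indices):
        (index,) = indices
        return self._pages[index]["meta"]

def read(filename, format=None, expect=None, **kwargs):
    # A real version would pick a plugin based on format/filename here.
    return Reader(filename, expect, **kwargs)

def imread(filename, format=None, **kwargs):
    return read(filename, format, expect=IMAGE, **kwargs).read_array(0)

def mimread(filename, format=None, **kwargs):
    return [im for im in read(filename, format, expect=MULTI_IMAGE, **kwargs)]
```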

There should also be an easy way for the user to get access to the documentation of a plugin, to see which keyword arguments can be used, and in some cases what the reader/writer API looks like for a specific format. 

If this sounds like a good way to start, I'll start coding on an initial version.

  Almar
 

Almar Klein

Jul 5, 2012, 5:50:01 PM
to ima...@googlegroups.com
I did some coding and I implemented something along these lines. The framework is there and the FreeImage plugin makes use of it (although only for single images). Will update the documentation shortly to make it easier to get an overview, and to see how the code should be used and how plugins should be implemented.

  Almar

Almar Klein

Jul 6, 2012, 6:43:19 AM
to ima...@googlegroups.com
Docs are up as well: imageio.readthedocs.org

Comments are most welcome.

  Almar
 

Almar Klein

Jul 17, 2012, 6:42:00 AM
to ima...@googlegroups.com
Hi,

Imageio is now available on PyPI. Let me give a brief overview of the current status with regard to the functions:

The module functions.py defines the main interface for the user. Currently there are imread, imsave, read, save, and help. The latter is a convenience function that prints the docs for a given format. To give an idea of the implementation, here is an extract of imread and read:


def imread(filename, format=None, **kwargs):
    reader = read(filename, format, base.EXPECT_IM, **kwargs)
    with reader:
        return reader.read_data(0)

def read(filename, format=None, expect=None, **kwargs):
    
    # <Test filename>
    
    # Create request object
    request = base.Request(filename, expect, **kwargs)
    
    # Get format
    if format is not None:
        format = formats[format]
    else:
        format = formats.search_read_format(request)
    if format is None:
        raise ValueError('Could not find a format to read the specified file.')
    
    # Return its reader object
    return format.read(request)


The classes for Request, Reader, Writer and Format are defined in base.py (the average user needs to know nothing of these). With a little over 500 lines it fully implements the plugin system, including docstrings and some niceties. The API for the reader and writer classes is still a bit vague. I'll probably have to implement a few formats myself to see what works well and what does not.

In any case, my feeling is that the API for the functions can be considered "stable", and people should be able to start using imageio.

  Almar
