applying arbitrary functions to data arrays (scipy.ndimage)

Noah Brenowitz

Oct 29, 2016, 11:05:30 PM
to xarray
Hello,

I am wondering if there is some nice decorator or DataArray method that can be used to apply arbitrary shape-preserving transformations to the data. For example, scipy.ndimage is loaded with potentially useful functions for dealing with xarray DataArrays. I often apply something like a gaussian_filter to xarray data, but my current method for doing this is ugly:

    tempblur = temperature.copy()
    tempblur.values = gaussian_filter(tempblur, 2.0)

Editing arrays in place like this can be confusing, especially if these two lines of code get separated somehow. I could of course implement a decorator to handle this, but then I would have to apply the decorator to every function I plan to use. A better alternative by my lights would be something like

    tempblur = temperature.apply(lambda x: gaussian_filter(x, 2.0))

I have looked in the documentation, but I cannot seem to find a suggested workflow.
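
For reference, the decorator approach mentioned above could look roughly like the sketch below. The name `keeps_labels` is made up for illustration and is not part of xarray, and the sketch assumes an xarray version where `DataArray.copy` accepts a `data` argument. It only works for functions that return an array of the same shape as their input.

```python
import functools

import numpy as np
import xarray as xr
from scipy.ndimage import gaussian_filter


def keeps_labels(func):
    """Hypothetical helper: wrap a shape-preserving array function so that
    it takes a DataArray and returns a new DataArray with the same labels."""

    @functools.wraps(func)
    def wrapper(da, *args, **kwargs):
        # Run the wrapped function on the raw NumPy values.
        result = func(da.values, *args, **kwargs)
        # copy(data=...) keeps dims, coords and attrs and swaps in the new
        # values; the original DataArray is left untouched.
        return da.copy(data=result)

    return wrapper


temperature = xr.DataArray(
    np.arange(10.0), dims="y", coords={"y": np.arange(10)}, attrs={"units": "K"}
)

blur = keeps_labels(gaussian_filter)
tempblur = blur(temperature, 2.0)
```

This still has the drawback described above: each function has to be wrapped before use.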

On another note, I think it would be really cool if xarray implemented its own wrapper around scipy.ndimage which could handle interpolating the data to a regular grid, calling scipy.ndimage functions, and keeping track of attributes and dimensions. Then something like "temperature.gaussian_filter(2.0)" would be possible. 

Stephan Hoyer

Oct 30, 2016, 2:58:36 AM
to xarray
I agree, this sort of functionality would be useful. In fact, Dataset.apply can do exactly what you envision, but for some reason we never got around to adding it to DataArray:

In [11]: ds = xr.Dataset({'x': ('y', np.arange(10.0))})

In [12]: ds.apply(gaussian_filter, sigma=2.0)
Out[12]:
<xarray.Dataset>
Dimensions:  (y: 10)
Coordinates:
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    x        (y) float64 1.162 1.537 2.21 3.067 4.014 4.986 5.933 6.79 7.463 ...

If you'd like to put together a PR adding this for DataArray, that would be very welcome.

More generally, I've been working on machinery for applying arbitrary vectorized functions on unlabeled arrays to xarray objects (see https://github.com/pydata/xarray/pull/964). The main remaining challenge is figuring out a user-friendly API, especially for the more complicated cases where shapes can change. I would really appreciate feedback on my approach -- comments on the pull request would be very welcome.
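
As a rough illustration of what applying an unlabeled-array function to an xarray object can look like, here is a sketch that assumes a later xarray release providing `xr.apply_ufunc`; it is not necessarily the API proposed in the pull request. Because the filter mixes values along `y`, that dimension is declared as a core dimension on both input and output.

```python
import numpy as np
import xarray as xr
from scipy.ndimage import gaussian_filter

temperature = xr.DataArray(
    np.arange(10.0), dims="y", coords={"y": np.arange(10)}
)

# gaussian_filter receives the raw NumPy array; apply_ufunc re-attaches
# the dimension names to the result.
tempblur = xr.apply_ufunc(
    gaussian_filter,
    temperature,
    kwargs={"sigma": 2.0},
    input_core_dims=[["y"]],   # the function consumes the whole "y" axis
    output_core_dims=[["y"]],  # and returns an axis of the same length
)
```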

For ndimage wrappers, I'm reluctant to add a multitude of additional built-in methods, but this might be a good use case for the extensible accessor interface (http://xarray.pydata.org/en/stable/internals.html#extending-xarray). Calling these methods could then look like `temperature.ndimage.gaussian_filter(2.0)`.
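
A minimal sketch of such an accessor, using `xr.register_dataarray_accessor`. The class name `NDImageAccessor` and the choice to wrap only `gaussian_filter` are assumptions made for illustration; the sketch also assumes `DataArray.copy` accepts a `data` argument and does no regridding.

```python
import numpy as np
import xarray as xr
from scipy import ndimage


@xr.register_dataarray_accessor("ndimage")
class NDImageAccessor:
    """Hypothetical accessor exposing one scipy.ndimage function."""

    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def gaussian_filter(self, sigma, **kwargs):
        # Filter the raw values, then rebuild a DataArray that keeps the
        # original dims, coords and attrs.
        filtered = ndimage.gaussian_filter(self._obj.values, sigma, **kwargs)
        return self._obj.copy(data=filtered)


temperature = xr.DataArray(np.arange(10.0), dims="y", attrs={"units": "K"})
tempblur = temperature.ndimage.gaussian_filter(2.0)
```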

Cheers,
Stephan

