masking based on shapely geometries/paths

874 views
Skip to first unread message

Niall Robinson

unread,
Apr 30, 2014, 9:57:32 AM4/30/14
to scitoo...@googlegroups.com
Hi all/Phil :D

Is there a way to mask of a cube based on shapely geometries? I'd like to mask of all data points that aren't in the US. I get the feeling I should be able to do this by downloading the natural earth geometry for the us but I can't figure it out.

Thanks
Niall

Andrew Dawson

unread,
Apr 30, 2014, 10:13:01 AM4/30/14
to scitoo...@googlegroups.com
 
I suppose you should be able to use the same technique here, generate area weights and convert them to in or out using some threshold, and use this result as a mask for the cube. Our previous discussion may be useful: https://groups.google.com/forum/#!searchin/scitools-iris/mask/scitools-iris/4v4nA0nLG3w/Y6gCaBbfPgEJ.

Niall Robinson

unread,
Apr 30, 2014, 11:16:46 AM4/30/14
to scitoo...@googlegroups.com
speaking to someone here at the MO, they suggested looping over all the cells, and doing something along the lines of doing some kind of

point([lat, lon]).within(us_geo_thing)

I'm not sure of the syntax, I'll post back if I figure it out.

Niall Robinson

unread,
Apr 30, 2014, 11:44:34 AM4/30/14
to scitoo...@googlegroups.com
Here's the solution

# Mask off the seas.
import cartopy.io.shapereader as shpreader
from cartopy.mpl.patch import geos_to_path
from shapely.geometry import Point

shpfilename
= shpreader.natural_earth(resolution='110m',
                                      category
='cultural',
                                      name
='admin_0_countries')
reader
= shpreader.Reader(shpfilename)
countries
= reader.records()
us_multipoly
, = [country.geometry for country in countries
                 
if country.attributes['name'] == 'United States']
main_us_geom
= sorted(us_multipoly.geoms, key=lambda geom: geom.area)[-1]

def get_mask(cube, geom):
    mask
= np.ones(cube.data.shape)
    p
= -1
   
for i in np.ndindex(cube.data.shape):
       
if i[0] != p:
           
print i[0],
            p
= i[0]
        this_cube
= cube[i]
        this_lat
= this_cube.coord('latitude').points[0]
        this_lon
= this_cube.coord('longitude').points[0] - 360
        this_point
= Point(this_lon, this_lat)
        mask
[i] = this_point.within(geom)
       
   
return mask

ScottHosking

unread,
Jan 4, 2016, 11:36:15 AM1/4/16
to Iris
Hi, I've been trying to do something similar to this, but instead mask off all data points outside of the Atlantic Ocean sector.  Has anyone done this with ocean sectors before?

Phil Elson

unread,
Jan 9, 2016, 10:05:48 AM1/9/16
to Iris
I'm aware of a MO team which has developed this capability in-house, but nothing which has been released as OSS. I also have a branch of iris which adds such a capability to iris - though I didn't finish making it merge-worthy at this point (I can dig that out if you're interested in taking it on). Also, Filipe wrote a blog post about the manual way of doing this: https://ocefpaf.github.io/python4oceanographers/blog/2015/08/17/shapely_in_polygon/.

There is a whole field of science that is opened up with this functionality, so definitely think this is something that Iris should have built-in.

--
You received this message because you are subscribed to the Google Groups "Iris" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scitools-iri...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Filipe Pires Alvarenga Fernandes

unread,
Jan 10, 2016, 10:17:15 AM1/10/16
to Iris
I also have a branch of iris which adds such a capability to iris - though I didn't finish making it merge-worthy at this point (I can dig that out if you're interested in taking it on).

That would be awesome!
 
Also, Filipe wrote a blog post about the manual way of doing this: https://ocefpaf.github.io/python4oceanographers/blog/2015/08/17/shapely_in_polygon/.

This kind of operation is very slow with shapely.  Matplolib can be much faster:


(Not sure how "correct" that notebook is.  It is old and incomplete.)

-Filipe

Phil Elson

unread,
Jan 10, 2016, 10:24:21 AM1/10/16
to Filipe Pires Alvarenga Fernandes, Iris
If you do end up rolling your own, you will quickly find it is painfully slow unless you apply some optimisations. The first you will find is to "prepare" your geometry. Doing so makes a huge difference to the run time. If you're doing masking type operations (which we are all describing) there is also some built-in functionality within Shapely (I added it a few years ago now) which vectorises the masking operation to squeeze out every last drop of performance. 

My original notebook is available at http://nbviewer.ipython.org/gist/pelson/9785576.
Fillipe, feel free to write a super-awesome blog post about shapely.vectorized ;)

--

Filipe Pires Alvarenga Fernandes

unread,
Jan 10, 2016, 2:49:21 PM1/10/16
to Phil Elson, Iris
If you do end up rolling your own, you will quickly find it is painfully slow unless you apply some optimisations. The first you will find is to "prepare" your geometry. Doing so makes a huge difference to the run time. If you're doing masking type operations (which we are all describing) there is also some built-in functionality within Shapely (I added it a few years ago now) which vectorises the masking operation to squeeze out every last drop of performance. 

I see that you added this code back in 2014!  The shapely documentation does not mention this functionality :-(
 
My original notebook is available at http://nbviewer.ipython.org/gist/pelson/9785576
Fillipe, feel free to write a super-awesome blog post about shapely.vectorized ;)

Awesome!  Will do soon. Thanks for sharing that notebook!!

Meanwhile I added a quick test using my example (I have 9,765,000 points there):


I am getting 7.19 s with shapely.vectorized, which is much faster than the 3 min I got without it, but still a little behind matplotlib's 4.88 s.

PS: I do have projects where I added matplotlib as a dependency only for this functionality.  I guess that removing one heavy dependency is worth the small performance loss.

Stephan Hoyer

unread,
Jan 10, 2016, 7:12:14 PM1/10/16
to Filipe Pires Alvarenga Fernandes, Phil Elson, Iris
rasterio (via GDAL) has very high performance for rasterizing shapes. In my experience, it's much faster than shapely.

Here's an example using xray -- I imagine it would be pretty easy to adapt to Iris:

Cheers,
Stephan

Filipe Pires Alvarenga Fernandes

unread,
Jan 11, 2016, 7:40:32 AM1/11/16
to Stephan Hoyer, Phil Elson, Iris
rasterio (via GDAL) has very high performance for rasterizing shapes. In my experience, it's much faster than shapely.

Thanks Stephan! 
 
Here's an example using xray -- I imagine it would be pretty easy to adapt to Iris:

I believe I've seen that before. I just tried it but I could not make it work for my geometries.  I will try again and see if the rasterizing path is faster than matplotlib Path.

Cheers,
Stephan

Thanks again,

-Filipe 
Reply all
Reply to author
Forward
0 new messages