[postgis-users] Querying Multiple Rasters


Jayson Gallardo

Jul 23, 2013, 1:43:47 PM
to postgi...@lists.osgeo.org
I've looked and looked, but I have not been able to find an answer to my question. I have downloaded elevation data for the state of Arkansas (in the form of multiple tiles), and used raster2pgsql to upload it into a single table:

raster2pgsql -I -C -e -F -t 50x50 -l 2,4 n*/grdn* public.dem_elevation | psql -U postgres -d testdb -h localhost -p 5432

I did this because I didn't know how to pull the data if they were in separate tables. Now, however, I would like to add elevation data for other areas. I tried to just add it to the current table, but that required dropping the constraints, which for such a huge amount of data seems to take a long time (I let it run for 24+ hours and it didn't finish). So, my question is: if I load all my rasters as individual tables, how could I run something similar to this query on them all (from a Python script)?

SELECT ST_AsGDALRaster(ST_CLIP(ST_Union(rast), ST_GeomFromText(WKT,900913)),'GTiff') FROM "dem_elevation" WHERE ST_Intersects(rast, ST_Transform(ST_GeomFromText(WKT,900913),4269))

My goal, if it's not obvious, is to clip elevation data and export it to a GTiff format and perform some operations on that raster data. Eventually, I would like to put the whole continental US elevation data into my database, so I need to be able to do so, while still being able to query them based on an area of interest the user selects from a map. I started working with PostGIS and Mapserver last month, so please forgive my ignorance on such topics. Thanks in advance

Bborie Park

Jul 23, 2013, 1:57:55 PM
to PostGIS Users Discussion
You may not need to drop all the constraints when adding additional data to the table. What you most likely will need to drop is the maximum extent constraint. Assuming the input rasters have the same scale, skew, and SRID as those already in the table, you don't need to drop the corresponding constraints.

If you're going to do the continental US at a fine resolution (e.g. 1 meter), you do NOT want to put all the rasters in one table. You'll want to use a partitioned table structure and should consider a bigger tile size (depending on your hardware).

-bborie


_______________________________________________
postgis-users mailing list
postgi...@lists.osgeo.org
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users


Jayson Gallardo

Jul 23, 2013, 2:05:09 PM
to PostGIS Users Discussion
Thanks for responding. Could you outline how I would go about setting up a partitioned table structure? My only concern with tile size is processing time. Most of my queries will involve areas of less than 1 mi^2, and I would clip the data to that shape. I just don't know where to start! There aren't many resources online or in print that deal with PostGIS rasters in detail.

Bborie Park

Jul 23, 2013, 2:14:35 PM
to PostGIS Users Discussion
Can you describe your elevation dataset? Is it USGS NED? At what resolution (10 meter? 3 meter?)?

As for table partitioning...


You'll probably partition spatially, though an easy solution is to have a table for each input raster file.

-bborie



Jayson Gallardo

Jul 23, 2013, 2:56:11 PM
to PostGIS Users Discussion

Yes, it's USGS NED. I initially went with one table per input tile, but I didn't know how to join (or union) them together for my query.

Bborie Park

Jul 23, 2013, 3:33:42 PM
to PostGIS Users Discussion
I use the USGS NED 10 meter for California with one table for each input raster. In the partitioned table scheme, data tables inherit from a template (parent) table. Queries run on the parent table access the inherited tables.

-bborie

Jayson Gallardo

Jul 23, 2013, 3:51:17 PM
to PostGIS Users Discussion
So based on the link you provided, and what else I've gathered, I first create a parent table:
CREATE TABLE dem_elevation
(
  rid integer NOT NULL PRIMARY KEY,
  rast raster
);
Then I run raster2pgsql on all the downloaded elevation data, sending each input tile to its own table, e.g. dem_elevation_n36w091. Then I alter each table to inherit from the parent:
ALTER TABLE dem_elevation_n36w091 INHERIT dem_elevation;

raster2pgsql takes care of setting the constraints for each table. Then I can just query the parent table dem_elevation to get what I need?

Bborie Park

Jul 23, 2013, 3:53:35 PM
to PostGIS Users Discussion
I'd suggest adding constraints after the fact through SQL instead of letting raster2pgsql do it.
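A minimal sketch of how that post-load step might look, generating one AddRasterConstraints() call per child table. The schema and tile IDs below are hypothetical; AddRasterConstraints() is the PostGIS function that raster2pgsql's -C flag would otherwise invoke at load time.

```python
# Sketch: generate the SQL to add raster constraints after a bulk import.
# The schema and tile IDs are hypothetical; AddRasterConstraints() is the
# PostGIS helper that applies the standard raster constraints to a table.

def constraint_sql(schema, table, column="rast"):
    """Build one AddRasterConstraints() call for a child table."""
    return ("SELECT AddRasterConstraints('%s'::name, '%s'::name, '%s'::name);"
            % (schema, table, column))

tiles = ["n36w091", "n36w092"]  # hypothetical tile IDs
statements = [constraint_sql("public", "dem_elevation_%s" % t) for t in tiles]
for stmt in statements:
    print(stmt)
```

Running these statements once all the imports have finished keeps the (slow) constraint checks out of the load loop.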


-bborie

Jayson Gallardo

Jul 23, 2013, 4:02:53 PM
to PostGIS Users Discussion
Okay, is there a specific reason why? As your link states: "the raster2pgsql loader uses this function to register raster tables". Are you saying I should specify constraints that will be similar across all tables?

Bborie Park

Jul 23, 2013, 4:05:54 PM
to PostGIS Users Discussion
No. I'm suggesting doing it later because it takes time and it separates the operations: get everything imported first, then add the constraints.

Having said that, you can do it all at once if you want; it's just a preference, depending on the volume of import data.

-bborie

Jayson Gallardo

Jul 23, 2013, 4:10:22 PM
to PostGIS Users Discussion
Oh, okay. Yeah, you're right about it taking time. I wrote a Python script to generate the raster2pgsql call with the appropriate table name, so I can just let it run while I do other things. I really appreciate your help on this. I googled your name and I see you're a pretty busy person, so I'm glad you're taking the time to answer my questions.
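Such a generator might be sketched as follows. The paths, database name, and tile-naming pattern are hypothetical; the -C flag is deliberately omitted so constraints can be added after all the imports, per the advice earlier in the thread.

```python
# Sketch of a raster2pgsql command generator, one table per NED tile.
# Paths, the database name, and the tile-name pattern are hypothetical.
# The -C flag is omitted so constraints can be added after all imports.
import os
import re

def load_command(raster_path, db="testdb"):
    """Build a raster2pgsql | psql pipeline for one input tile."""
    tile = re.search(r"n\d+w\d+", os.path.basename(raster_path)).group(0)
    table = "public.dem_elevation_%s" % tile
    return ("raster2pgsql -I -e -F -t 50x50 %s %s | psql -U postgres -d %s"
            % (raster_path, table, db))

cmd = load_command("/data/ned/grdn36w091.tif")
print(cmd)
```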

Bborie Park

Jul 23, 2013, 4:14:08 PM
to PostGIS Users Discussion
I'm just glad to help. Feel free to post your experience, feedback, issues, and/or wishes to the mailing list.

-bborie

Jayson Gallardo

Jul 30, 2013, 10:20:46 AM
to PostGIS Users Discussion
Quick follow-up question to my situation... I recently loaded 3 m resolution NED for Iowa. I have it loaded as one table per source tile, inheriting from the same parent table as the Arkansas NED. Ever since, however, my database seems to be running pretty slowly. I've run a full vacuum on the data, and there are constraints on each table.

How can I be sure that when I query the parent table it's not scanning every single child table?

Jayson Gallardo

Jul 30, 2013, 10:49:06 AM
to PostGIS Users Discussion
So, I used EXPLAIN on my SELECT statement, and whether constraint_exclusion is on or off, it reports the same number of rows in the query plan. Is there something I need to do with my table constraints so that the planner doesn't check every table I have loaded?

Bborie Park

Jul 30, 2013, 12:12:27 PM
to PostGIS Users Discussion
Jayson,

Can you share one of the queries? Also, what check constraints are you using?

-bborie

Jayson Gallardo

Jul 30, 2013, 12:25:38 PM
to PostGIS Users Discussion
Here's the constraints:
  CONSTRAINT dem_elevation_n33w092_pkey PRIMARY KEY (rid ),
  CONSTRAINT enforce_height_rast CHECK (st_height(rast) = 100),
  CONSTRAINT enforce_max_extent_rast CHECK (st_coveredby(st_convexhull(rast), '...truncated...'::geometry)),
  CONSTRAINT enforce_num_bands_rast CHECK (st_numbands(rast) = 1),
  CONSTRAINT enforce_out_db_rast CHECK (_raster_constraint_out_db(rast) = '{f}'::boolean[]),
  CONSTRAINT enforce_pixel_types_rast CHECK (_raster_constraint_pixel_types(rast) = '{32BF}'::text[]),
  CONSTRAINT enforce_same_alignment_rast CHECK (st_samealignment(rast, '...truncated...'::raster)),
  CONSTRAINT enforce_scalex_rast CHECK (st_scalex(rast)::numeric(16,10) = 0.000092592592593::numeric(16,10)),
  CONSTRAINT enforce_scaley_rast CHECK (st_scaley(rast)::numeric(16,10) = (-0.000092592592593)::numeric(16,10)),
  CONSTRAINT enforce_srid_rast CHECK (st_srid(rast) = 4269),
  CONSTRAINT enforce_width_rast CHECK (st_width(rast) = 100)

and my Python script:
import sys

wkt = sys.argv[1]  # polygon shape in WKT format
raster_type = 'GTiff'
table_name = 'dem_elevation'
map_srs = 900913
table_srs = 4269
sql_text = ('SELECT ST_AsGDALRaster(ST_Clip(ST_Union(rast), '
            "ST_GeomFromText('%s', %i)), '%s') "
            'FROM "%s" '
            "WHERE ST_Intersects(rast, ST_Transform(ST_GeomFromText('%s', %i), %i))"
            % (wkt, map_srs, raster_type, table_name, wkt, map_srs, table_srs))

Bborie Park

Jul 30, 2013, 12:31:16 PM
to PostGIS Users Discussion
Are you able to transform the WKT before passing it to the SQL? Constraint exclusion (and thus partitioning) only works on constant values, not values that need processing, e.g. ST_Transform(ST_GeomFromText('%s', %i), %i).

-bborie

Jayson Gallardo

Jul 30, 2013, 12:35:11 PM
to PostGIS Users Discussion
I suppose I could do that in my script. How should I go about that? My process is as follows: 
  • User selects area of interest on a map (openlayers)
  • User clicks submit, and python script is called with the WKT passed as an argument
  • Python script queries the database, which then outputs the raster
  • Raster is processed through a library
  • Processed raster is displayed as an overlay on the map

Bborie Park

Jul 30, 2013, 12:39:39 PM
to PostGIS Users Discussion
The quick and dirty approach is to have a query before that query that transforms the WKT.

Something like "SELECT ST_AsEWKT(ST_Transform(...))"
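In Python terms, that two-step flow might be sketched like this. The WKT, SRIDs, and table name are placeholders; in a real script the first query would be executed against the database and its EWKT result fed into the second.

```python
# Sketch of the "transform first" approach: query 1 turns the map-SRS WKT
# into a constant geometry in the table's SRID; query 2 then uses that
# constant so the planner can exclude non-matching partitions.
# All values below are hypothetical placeholders.

def transform_query(wkt, map_srs=900913, table_srs=4269):
    """Query 1: ask PostGIS to transform the WKT into the table's SRID."""
    return ("SELECT ST_AsEWKT(ST_Transform(ST_GeomFromText('%s', %d), %d))"
            % (wkt, map_srs, table_srs))

def raster_query(ewkt, table="dem_elevation"):
    """Query 2: clip and export using the already-transformed geometry."""
    return ("SELECT ST_AsGDALRaster(ST_Clip(ST_Union(rast), "
            "ST_GeomFromEWKT('%s')), 'GTiff') "
            'FROM "%s" '
            "WHERE ST_Intersects(rast, ST_GeomFromEWKT('%s'))"
            % (ewkt, table, ewkt))

step1 = transform_query("POLYGON((0 0,1 0,1 1,0 1,0 0))")
step2 = raster_query("SRID=4269;POLYGON((0 0,1 0,1 1,0 1,0 0))")
```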

-bborie

Jayson Gallardo

Jul 30, 2013, 12:42:56 PM
to PostGIS Users Discussion
I was reading the page on partitioning, and the very last line says "Partitioning using these techniques will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions." I'm already up to ~400 tables in this partitioning scheme just for Arkansas and Iowa... Is this a good idea? Would there be a better way to do the entire continental US?

Bborie Park

Jul 30, 2013, 12:54:02 PM
to PostGIS Users Discussion
3m NED data doesn't exist for the continental US (at least from USGS). But if you were to do so, you could consider a different scheme...

1. All NED files are stored as out-db rasters
2. Each table is for one state, though in some situations you may want more than one table per state (e.g. Texas, California).

That should help keep the number of partitions to a minimum and reduce the size of each partition.
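As a rough sketch, loading one state's tiles as out-db rasters into a single table might look like this. The state name, paths, database, and tile size are hypothetical; -R is raster2pgsql's out-of-database flag, which registers the files while leaving the pixel data on disk.

```python
# Sketch: build a raster2pgsql command that loads one state's NED tiles
# as out-db rasters into a single per-state table. The -R flag keeps the
# pixel data in the source files. State, paths, and tile size are
# hypothetical.

def state_load_command(state, file_pattern, db="testdb"):
    table = "public.dem_elevation_%s" % state
    return ("raster2pgsql -I -R -F -t 100x100 %s %s | psql -U postgres -d %s"
            % (file_pattern, table, db))

cmd = state_load_command("arkansas", "/data/ned/ar/grdn*.tif")
print(cmd)
```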

-bborie

Jayson Gallardo

Jul 30, 2013, 12:59:32 PM
to PostGIS Users Discussion
What if I create subparents for each state, and set an extent constraint on each subparent? Would that help? Or would the query still check the constraint for each child of each subparent?

Bborie Park

Jul 30, 2013, 1:01:43 PM
to PostGIS Users Discussion
That should work if you're querying against the subparent instead of the parent. You'll need to test though...

-bborie

Jayson Gallardo

Jul 30, 2013, 1:06:13 PM
to PostGIS Users Discussion
Thanks for the help again. There will be extensive testing before we go into production. Right now, we're just trying to get a prototype up and running. I might need to look into querying a web service for clipped NED data...

Rémi Cura

Jul 31, 2013, 3:43:39 AM
to PostGIS Users Discussion
Hey
Sorry to bother you with technical details, but your query being slow may come from hardware, the OS, Postgres, PostGIS, bad partitioning, a bad query, a bad index, ...
So can you give more details, please?

What are your versions (OS, Postgres, PostGIS, GDAL)?
What is the on-disk size of a child table (see the table statistics)?
How many rows are there in a child table?
I'm guessing you have an index on the envelope of each row.

What does an EXPLAIN ANALYZE of your query say?


Quick checklist:

Hardware
[_] Do you have two separate hard disks (one for the system, one for data)? Is your disk the bottleneck?
OS
[_] What is your system doing while computing a query (look at processes, memory usage, disk usage)?
[_] Is your OS caching working?
[_] If Linux, have you tuned your kernel's maximum memory settings?

Postgres
[_] Have you tuned postgresql.conf (the defaults are very conservative: a must-do)?
PostGIS
[_] ...no idea yet...
Partitioning
[_] Are your constraints on the child tables restrictive enough to avoid querying lots of child tables?
[_] Are your constraints used by the query planner (EXPLAIN ANALYZE)?
If you want to be sure, you can make the check explicit for test purposes by adding to the WHERE clause: AND st_coveredby(st_convexhull(rast), '...truncated...'::geometry) = TRUE
Query
[_] I'm guessing you test your query in pgAdmin before using Python: you could start from a minimal query (find rasters in a clipping zone by querying a child table directly), then make it grow (find rasters through the parent table, then compute the new raster from the previous result, etc.), watching how the processing time grows along the way.
You should also know the computational price of ST_AsGDALRaster(ST_Clip(ST_Union(rast), ...)): that is, separate the time needed to find the rasters intersecting your clipping zone from the time needed to compute the union and run ST_AsGDALRaster.

[_] As suggested, if you want to avoid sending a separate query first to compute your clipping geometry, you could use a CTE (i.e. a WITH statement) or a subquery. This way you avoid computing it twice (wkt, map_srs, table_srs, table_name, and raster_type below are placeholders):
WITH my_clipping_geom AS (
  SELECT ST_Transform(ST_GeomFromText(wkt, map_srs), table_srs) AS geom
)
SELECT ST_AsGDALRaster(ST_Clip(ST_Union(rast), my_clipping_geom.geom), raster_type)
FROM table_name, my_clipping_geom
WHERE ST_Intersects(rast, my_clipping_geom.geom)

Index
[_] You use ST_Intersects, therefore you need an index to speed it up.
I'm guessing you have one on the bounding box of each raster (each row); is it actually used (EXPLAIN ANALYZE: look for an index scan)?

Don't lose hope: the whole system can be severely sped up by tuning, and small changes may have big effects.

Cheers,
Rémi-C


2013/7/30 Jayson Gallardo <jayson...@gmail.com>

Rémi Cura

Jul 31, 2013, 3:51:05 AM
to PostGIS Users Discussion
Sorry for the double post. One last thing about your query: you do ST_Clip(ST_Union(...)). You could try ST_Union(ST_Clip(...)) instead. If you have a lot of big rasters it may be faster (??).
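A sketch of what that reordered query might look like. The table name and geometry SQL are placeholders mirroring the thread's script; whether this is actually faster depends on tile size and count, so it's worth benchmarking both forms.

```python
# Sketch of the suggestion: clip each intersecting tile first, then
# union the smaller clipped pieces. Table name and geometry SQL are
# hypothetical placeholders.

def clip_then_union(table, geom_sql):
    return ("SELECT ST_AsGDALRaster(ST_Union(ST_Clip(rast, %s)), 'GTiff') "
            'FROM "%s" WHERE ST_Intersects(rast, %s)'
            % (geom_sql, table, geom_sql))

geom = "ST_GeomFromText('POLYGON((0 0,1 0,1 1,0 1,0 0))', 4269)"
q = clip_then_union("dem_elevation", geom)
print(q)
```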

Cheers,
Rémi-C

2013/7/31 Rémi Cura <remi...@gmail.com>

Jayson Gallardo

Aug 5, 2013, 12:13:43 PM
to PostGIS Users Discussion
Thanks for your input. I'll take a look at your suggestions. I would like to say that I have done some tuning on Postgres. Our idea of downloading continental US NED data may be the wrong way to go anyway. Also, forgive me if I seem ignorant of some aspects; I'm a non-traditional college student working for the USDA on campus as a programmer, and I'm learning everything as I go. I hope to get to the point where I'm on level footing with you all.

Jayson Gallardo

Jun 2, 2014, 4:12:44 PM
to PostGIS Users Discussion
I hate to bump this thread again, but I left this issue to work on more pertinent things, and I'm finally able to come back to it. I tried playing around with ST_Buffer and clipping the resulting larger raster, but I still get one or more borders consisting of NODATA values. This is what I have right now:

sql_text_asciigrid = ('SELECT ST_AsGDALRaster(ST_Clip(ST_Union(rast), '
                      "ST_Transform(ST_Buffer(ST_GeomFromText('%s', %i), 1), %i)), "
                      "'%s', ARRAY['FORCE_CELLSIZE=YES']) "
                      'FROM "%s" '
                      "WHERE ST_Intersects(rast, ST_Transform(ST_Buffer(ST_GeomFromText('%s', %i), 10), %i))"
                      % (wkt, map_srs, table_srs, raster_type, table_name, wkt, map_srs, table_srs))

Is this something to do with ST_Clip and geometries?

Thanks in advance, 
Jayson


On Wed, Jan 8, 2014 at 9:39 AM, Jayson Gallardo <jayson...@gmail.com> wrote:
Sorry to bring up an old thread, but the following question is related to my previous discussion. The queries generated by my app return rasters with one or two of the borders containing nothing but NODATA values. See the attached files for an example. The query generated for the sample is:

SELECT ST_AsGDALRaster(ST_CLIP(ST_Union(rast), ST_GeomFromText('POLYGON((-10401496.01812 5106173.6751207,-10401119.641642 5106173.6751207,-10401119.641642 5106363.1552702,-10401496.01812 5106363.1552702,-10401496.01812 5106173.6751207))',3857)),'AAIGrid',ARRAY['FORCE_CELLSIZE=YES']) FROM "dem_elevation" WHERE ST_Intersects(rast, ST_Transform(ST_GeomFromText('POLYGON((-10401496.01812 5106173.6751207,-10401119.641642 5106173.6751207,-10401119.641642 5106363.1552702,-10401496.01812 5106363.1552702,-10401496.01812 5106173.6751207))',3857),4269))

I'm thinking it may have something to do with the partitioning, but I can't really be sure. Furthermore, I don't think it has anything to do with the size of the tiles in the table, because if I shrink the example area above, I still get a border of NODATA values.

Thank you in advance for your assistance!