Batch Spatial Join


Joshua Griffiths

Jul 22, 2015, 6:42:03 AM
to GIS In Ecology Forum
Hello forum,

I have a question regarding spatial joins in ArcGIS. Does anyone have experience with batch spatial joining?

I need to join monthly SST + Chlr-a data to my effort segment tracks based on both time and spatial location.

My strategy so far has been to divide my data into year/month chunks (e.g. Jul 2003, Sep 2007), with the aim of spatially joining each chunk to the corresponding ASCII file of SST and Chlr-a covering my study area.

My question is this: is there a rapid way to batch join the data? I was able to use a modified Python script I found online to rapidly slice my data into monthly chunks, and am wondering if there is a similar approach for a spatial join, possibly using the Python window or built-in ArcGIS tools? Has anyone tackled something like this before?

I basically have two folders:
- one with the monthly data, all named 'Apr_01.shp' etc.
- the other with .asc files, all named 'apr_01.asc' etc. (separate versions of this folder for both SST and Chlr-a).

I would love to be able to join them in an automated fashion, because the thought of doing 300+ joins is frightening.
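
One way to avoid pairing 300+ files by hand is to let a script match them by name. The following is a minimal sketch in plain Python, assuming the two folders described above and that each shapefile has a same-named .asc file differing only in letter case. The function name and folder paths are hypothetical.

```python
from pathlib import Path


def pair_by_stem(shp_dir, asc_dir):
    """Match 'Apr_01.shp' to 'apr_01.asc' by lower-cased file stem.

    Returns (pairs, unmatched): pairs is a list of (shp_path, asc_path),
    unmatched is a list of shapefiles with no same-named .asc file.
    """
    # Index the .asc files by lower-cased stem, e.g. 'apr_01'.
    asc_by_stem = {
        p.stem.lower(): p
        for p in Path(asc_dir).iterdir()
        if p.suffix.lower() == ".asc"
    }
    pairs, unmatched = [], []
    for shp in sorted(Path(shp_dir).iterdir()):
        if shp.suffix.lower() != ".shp":
            continue  # skip .dbf, .shx, .prj and other sidecar files
        asc = asc_by_stem.get(shp.stem.lower())
        if asc is None:
            unmatched.append(shp)
        else:
            pairs.append((shp, asc))
    return pairs, unmatched
```

Calling `pair_by_stem("effort_chunks", "sst_asc")` (hypothetical folder names) would give the list of pairs to feed to whichever join tool is used, plus any months that have no matching grid.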

Any ideas appreciated!

Thanks

Josh

P.S. There are about 300 lines of data in each monthly data chunk from my effort segment shapefiles, and I only want the closest point to each effort segment from the asc file. The asc files have about 35000 gridded data points of monthly SST or Chlr-a.
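
The "closest point" requirement can be illustrated outside ArcGIS. The sketch below is a brute-force nearest-neighbour search in plain Python. It assumes both datasets are in the same projected coordinate system (so straight-line distance is meaningful) and that each effort segment is represented by a single point such as its midpoint. The function names and data layout are hypothetical.

```python
import math


def nearest_value(segment_xy, grid_points):
    """Return (x, y, value, distance) of the grid point closest to segment_xy.

    segment_xy  -- (x, y) of one effort segment (e.g. its midpoint)
    grid_points -- iterable of (x, y, value) tuples from one monthly grid
    Returns None if grid_points is empty.
    """
    sx, sy = segment_xy
    best = None
    for gx, gy, value in grid_points:
        dist = math.hypot(gx - sx, gy - sy)
        if best is None or dist < best[3]:
            best = (gx, gy, value, dist)
    return best


def join_month(segments, grid_points):
    """Attach the nearest grid value to every segment of one month.

    segments -- dict mapping segment id -> (x, y)
    """
    grid_points = list(grid_points)  # materialise so it can be scanned repeatedly
    return {seg_id: nearest_value(xy, grid_points)
            for seg_id, xy in segments.items()}
```

With about 300 segments and 35000 grid points per month this is roughly 10.5 million distance calculations per month; a spatial index would be the usual next step if that proves too slow.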




GIS in Ecology

Jul 22, 2015, 7:14:56 AM
to GIS In Ecology Forum, 1998.joshu...@gmail.com
Hi Josh,
 
Thanks for your post. This is an interesting topic. In general, GIS software doesn't do well with linking things based on space and time. However, there are ways round this.
 
The approach I'm going to outline below assumes that all your raster data layers of environmental variables overlay each other exactly (i.e. they have the same cell size and the edges of their cells match up exactly; this is critical to it actually working). In addition, it assumes that you don't have any missing values for any grid cells in your environmental raster data layers (this is also critical). If you are familiar with it, you can use ArcGIS's ModelBuilder module to automate this process.
 
The process is as follows:
 
1. Turn each of your raster data layers for SST into a point data layer (RASTER TO POINT tool). To the attribute table of this point data layer, add a field called month (use the ADD FIELD tool) and fill it with the value for the Month of the SST data. Repeat this to add a field for the year value for the SST data.
 
2. Use the EXTRACT MULTI VALUES TO POINTS tool to link the Chlorophyll data for the same month and year to the attribute table of the SST point data layer.
 
3. Now use the ADD XY COORDINATES tool to add X and Y coordinates to this attribute table.
 
4. Repeat this for all your other raster data layers.
 
5. Use the MERGE tool to merge the attribute tables of all your point data layers together in a new data layer.
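
Steps 1 to 5 amount to building one long table with a row per grid cell per month. The following is a rough sketch of that logic in plain Python rather than with the ArcGIS tools named above. It assumes each month's SST and chlorophyll grids have already been read into dictionaries keyed by cell-centre coordinates; the data structures and names are hypothetical.

```python
def build_env_table(monthly_grids):
    """Merge per-month SST and chlorophyll grids into one long table.

    monthly_grids -- dict mapping (year, month) -> (sst, chl), where sst and
                     chl are dicts mapping (x, y) cell centres to values.
    Returns a list of row dicts with keys: year, month, x, y, sst, chl.
    """
    rows = []
    for (year, month), (sst, chl) in sorted(monthly_grids.items()):
        # The method relies on the grids overlaying each other exactly,
        # so fail loudly if the sets of cell centres differ.
        if sst.keys() != chl.keys():
            raise ValueError(
                f"SST and chlorophyll grids differ for {year}-{month:02d}")
        for (x, y) in sorted(sst):
            rows.append({"year": year, "month": month, "x": x, "y": y,
                         "sst": sst[(x, y)], "chl": chl[(x, y)]})
    return rows
```

The explicit check mirrors the assumption stated earlier: if the two grids do not share the same cell centres, the merged table cannot be built reliably.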
 
You now have a big table where each line represents the month, year, SST and chlorophyll-a for a specific point in space, marked by the coordinates of the central point of each grid cell. Next, you need some way to join these data to your survey effort. You could simply do a spatial join, but this would link the values for multiple months and years to your survey effort rather than those for the specific month and year. To get round this, you need to add the coordinates of the nearest environmental point to each bit of survey track. To do this, take the first point data layer you created, and use a spatial join to join the information in its attribute table to the attribute table of your survey segments based on the nearest point. Delete the SST field, the chlorophyll field, the month field and the year field. This will leave just the fields with the coordinates for the centre of the nearest raster grid cell. You are now ready to join your data together.
 
Unfortunately, ArcGIS doesn't really do joins particularly well, and your best bet is to export the attribute tables of your merged point data and your survey effort with the coordinates added to it and import them into a database programme, such as Microsoft Access, and use a query to join the two tables together using month, year, and the coordinates of the nearest central point for the raster grid cells. 
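
The query described here can be tried out in any relational database. The sketch below uses Python's built-in sqlite3 module in place of Microsoft Access, with made-up table names, column names and values, purely to show the shape of a join on month, year and the coordinates of the nearest cell centre.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE env (year INT, month INT, x REAL, y REAL, sst REAL, chl REAL);
    CREATE TABLE effort (seg_id TEXT, year INT, month INT,
                         near_x REAL, near_y REAL);
""")

# Illustrative values only: the same grid cell in two different months.
conn.executemany("INSERT INTO env VALUES (?, ?, ?, ?, ?, ?)", [
    (2003, 7, -5.25, 56.75, 13.9, 0.82),
    (2007, 9, -5.25, 56.75, 14.6, 0.47),
])
conn.executemany("INSERT INTO effort VALUES (?, ?, ?, ?, ?)", [
    ("seg_001", 2003, 7, -5.25, 56.75),
    ("seg_002", 2007, 9, -5.25, 56.75),
])

# Join on time (year, month) and space (nearest cell-centre coordinates).
joined = conn.execute("""
    SELECT e.seg_id, e.year, e.month, v.sst, v.chl
    FROM effort AS e
    JOIN env AS v
      ON v.year = e.year AND v.month = e.month
     AND v.x = e.near_x AND v.y = e.near_y
    ORDER BY e.seg_id
""").fetchall()

for row in joined:
    print(row)
```

Exact equality on coordinates only works if both tables carry identical values, so rounding both sides to a fixed number of decimal places before joining is a sensible safeguard.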
 
As I said, much of this can be automated using ModelBuilder, and once you have the tool built, you can use it repeatedly to do the same process over and over again.
 
Now, this is just the way that I would do it (and indeed have done it in the past), and there are likely to be many other ways to do this, and other people may well have better suggestions. In particular, I would suggest checking out the Marine Geospatial Ecology Tools (MGET) toolkit to see if there's anything within it which will help you find a solution to what you are wanting to do.
 
I hope this helps, if not just let me know.
 
All the best,
 
Colin

Joshua Griffiths

Jul 27, 2015, 11:37:44 AM
to GIS In Ecology Forum, cdma...@gisinecology.com
Hi Colin,

Thanks for the advice. In the end I used a sort of composite method using elements of your suggestion mixed with a few other techniques. Now I have the technique refined, I managed to do all the joining in a couple of hours, with the main delay waiting for ArcGIS to process the work at each stage. I think your technique would definitely work, but as I was already a certain distance down an alternate approach, I persevered with that.

I thought I would post how I got around this problem, in case any other visitors to the forum might like to use this method.

I used Python to split my large layer file into month/year chunks, using a method similar to the top answer here:
http://gis.stackexchange.com/questions/9998/exporting-feature-class-into-multiple-feature-classes-based-on-field-values-usin
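
The splitting step boils down to grouping records by the month and year of a date field. The following is a minimal sketch of that grouping in plain Python; it is not the script from the linked answer. It assumes each record carries a `datetime.date` under a hypothetical `date` key, and that chunk names follow the 'Apr_01' pattern with a two-digit year.

```python
from collections import defaultdict
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def chunk_name(d):
    """Build a name like 'Apr_01' from a date (assumed naming convention)."""
    return f"{MONTHS[d.month - 1]}_{d.year % 100:02d}"


def split_by_month(records, date_key="date"):
    """Group records into a dict of lists keyed by month/year chunk name."""
    chunks = defaultdict(list)
    for rec in records:
        chunks[chunk_name(rec[date_key])].append(rec)
    return dict(chunks)


records = [
    {"id": 1, "date": date(2001, 4, 3)},
    {"id": 2, "date": date(2001, 4, 20)},
    {"id": 3, "date": date(2007, 9, 1)},
]
chunks = split_by_month(records)
print(sorted(chunks))
```

Each list in `chunks` would then be written out as its own file, named after its key.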

For the rest of the method I used the 'Batch' function for each tool, for each environmental raster file, in the following order:

1. Make XY Event Layer.
2. Save Layer.
3. Project (to convert from geographic to my project projection).
4. Spatial Join with the relevant data chunk.

I found that, using an Excel spreadsheet, I could prepare lists of all the files (layers, tables etc.), which is very easy with 'Bulk Rename Utility' (a free download), and then paste them into the batch processing window in ArcGIS rather than tediously adding each file individually. This works perfectly if you first use the '+' button to add plenty of rows. This saved me a lot of time, as adding everything individually would have been a nightmare. I kept the file names consistent so I could easily paste everything into the 'batch' window for each tool at each step of the process.

So that's basically it. I think there must be an easier method in Python, but I'm pretty amateur at that, and this way was relatively easy, with the main delay being the processing time for each step.
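
For anyone who wants to try the Python route, the four-step chain could in principle be scripted with arcpy. The following is an untested sketch, not the method actually used here. It assumes the environmental files are tables with hypothetical `x` and `y` fields, swaps 'Save Layer' for Copy Features to persist the event layer, and takes the geoprocessing module as an argument so the logic can be exercised without ArcGIS installed. All paths, field names and spatial references are placeholders.

```python
import os


def join_chunk(gp, env_table, effort_shp, out_dir, geographic_sr, project_sr,
               x_field="x", y_field="y"):
    """Run the four-step chain for one monthly chunk; return the output path.

    gp is the geoprocessing module (pass `arcpy` when running inside ArcGIS).
    """
    stem = os.path.splitext(os.path.basename(effort_shp))[0]
    layer = stem + "_xy"  # name of the temporary event layer
    points = os.path.join(out_dir, stem + "_pts.shp")
    projected = os.path.join(out_dir, stem + "_prj.shp")
    joined = os.path.join(out_dir, stem + "_join.shp")

    # 1. Make XY Event Layer from the environmental table.
    gp.MakeXYEventLayer_management(env_table, x_field, y_field, layer,
                                   geographic_sr)
    # 2. Persist the layer (Copy Features used here instead of Save Layer).
    gp.CopyFeatures_management(layer, points)
    # 3. Project from geographic coordinates to the project projection.
    gp.Project_management(points, projected, project_sr)
    # 4. Spatial Join: closest environmental point to each effort segment.
    gp.SpatialJoin_analysis(effort_shp, projected, joined,
                            match_option="CLOSEST")
    return joined


def join_all(gp, pairs, out_dir, geographic_sr, project_sr):
    """pairs is a list of (effort_shp, env_table) paths, one per month."""
    return [join_chunk(gp, env, shp, out_dir, geographic_sr, project_sr)
            for shp, env in pairs]
```

Inside ArcGIS this might be called as `join_all(arcpy, pairs, out_dir, arcpy.SpatialReference(4326), project_sr)`, where 4326 (WGS 84) is only a guess at the geographic coordinate system in use. The returned outputs could then be combined with `arcpy.Merge_management`.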

I then used your suggestion of 'Merge' to put all the files back together, with all the environmental data attached.

I know this might work better using ModelBuilder, but I could not get it to work properly, and decided against investing too much time in figuring out where I was going wrong!

Anyway, as I say, a slightly laborious approach, but just wanted to share as it might be a solution for others too!

Josh




GIS in Ecology

Jul 27, 2015, 11:58:43 AM
to GIS In Ecology Forum, 1998.joshu...@gmail.com
Hi Josh,
 
Glad to hear you got this sorted and that the advice provided proved useful in helping you work this out.
 
Thanks also for posting the solution you ended up using. This will, I'm sure, be very useful for others who encounter similar problems in the future.
 
All the best,
 
Colin