Preparing WorldClim data using R raster package


Julie

unread,
Apr 11, 2012, 5:31:18 PM4/11/12
to Maxent
Hi,
I'm trying to process the WorldClim data layers for import into Maxent
using the raster package in R. I'm looking for confirmation that my
workflow is okay (e.g. that it won't introduce errors into the data)
and for help to move past the basic R code that I've figured out to
date. Here is what I have:

My study area spans four WorldClim tiles. So far I have downloaded
the generic grid files for these tiles from the WorldClim site and
have used DIVA-GIS to import to grid (e.g. to turn them into .GRD
files that can be read by the raster package). I think I know how to
define the projection for each file, merge all the tiles for a given
variable together, and clip to an exact study area.

e.g.

#create a raster for each file
bio1_a<-raster(file.choose())
bio1_b<-raster(file.choose())
bio1_c<-raster(file.choose())
bio1_d<-raster(file.choose())

# define projection of each
projection(bio1_a)<-"+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
#etc etc

#create one big raster for the whole extent
bio1<-merge(bio1_a, bio1_b, bio1_c, bio1_d)

#clip to just Washington
bio1_WA<-crop(bio1, WashingtonBoundary)

...but I'm trying to write a loop so that I don't have to manually
import all four tiles for all environmental variables and do all of
these steps over and over again...

Can anyone make some general suggestions for how to proceed? Any
code to process all the files quickly would be greatly appreciated!

Thanks from a newbie!

John B

unread,
Apr 12, 2012, 2:39:40 AM4/12/12
to max...@googlegroups.com
Hi Julie,

You can read .bil and ESRI ASCII files directly with R, so no need to play with DIVA-GIS.

Try something along the lines of the following (I've also attached the code in case encoding errors arise):

# install.packages('rgdal') # install rgdal if not already installed
library(raster)
# set the working directory to the path that contains your .bil files
setwd('path/to/bils')
# create a list of .bil files that exist in the wd
files <- list.files(pattern='\\.bil$')
# vars is a vector of bioclim variable numbers
vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))
# for each element of vars, read its tiles and merge them into one raster.
# grids will be a list of rasters, each of which is the merged tiles for a BC var.
grids <- lapply(vars, function(x) {
  patt <- paste('bio', x, '_', sep='')
  tiles <- files[grep(patt, files, fixed=TRUE)]
  do.call(merge, lapply(tiles, raster))
})
# give the list elements names
names(grids) <- paste('bio', vars, sep='')
# combine all list elements into a stack
s <- stack(grids)
# quick plot to make sure nothing went drastically wrong
plot(s)
# crop to your study extent
s.crop <- crop(s, WashingtonBoundary)
plot(s.crop)
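The regular expression used to build `vars` can be sanity-checked on its own with a few made-up filenames (base R only; the filenames below are hypothetical examples of the WorldClim tile naming scheme):

```r
# Hypothetical tile filenames of the form bio<var>_<tile>.bil
files <- c('bio1_16.bil', 'bio1_17.bil', 'bio12_16.bil', 'bio12_17.bil')

# '\\1' keeps only the digits captured between 'bio' and '_'
vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))
vars  # 1 12
```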
read_merge_crop_worldclim_tiles.R

Julie

unread,
Apr 14, 2012, 6:48:55 PM4/14/12
to Maxent
Hi John,

Thank you so much for this! I tested it on a couple of my variables
and it worked perfectly. I added

> projection(s)<-"+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"

prior to cropping, just to set the projection (although I'm not sure
this was necessary).

Thanks again!




ping yang

unread,
Oct 24, 2013, 5:00:20 PM10/24/13
to max...@googlegroups.com
Hi John,

I am asking whether there is a way to merge the daily data into annual layers ahead of time (which would avoid reading 365 files into memory for each year).
When I run stack() for one year, it takes almost 15 minutes. Is there a faster way to do it?

Thanks,

Ping

John Baumgartner

unread,
Oct 26, 2013, 7:12:50 PM10/26/13
to maxent

Hi Ping,

You might want to look into storing your data in a database, and then accessing the db from R. That might speed things up.
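Another option, sketched below under the assumption that the daily grids are already in a stack, is to collapse them with raster's stackApply() before any further processing (the toy data here are made up stand-ins for real daily grids):

```r
library(raster)

# Toy stand-in for a year of daily grids: 12 small random layers
daily <- stack(replicate(12, raster(matrix(runif(100), nrow = 10))))

# Aggregate all layers into a single annual mean in one call;
# 'indices' assigns each layer to a group (here, one group for all layers)
annual_mean <- stackApply(daily, indices = rep(1, nlayers(daily)), fun = mean)
```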

Good luck,
John

--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+un...@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at http://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/groups/opt_out.

Jenna Hamlin

unread,
Nov 19, 2013, 2:23:19 PM11/19/13
to max...@googlegroups.com
Hi,

Following up on the post about WorldClim data spanning multiple tiles for Maxent.

Using the code from John B. (thanks!), I am able to combine .bil files from multiple tiles. I then change the extent, crop to my study site location, and plot my cropped files, which look good.
What I want to do next is use writeRaster to generate a .grd file for each BioClim variable, but I am unsure how to do that. Specifically, I want to generate bio1.grd, bio2.grd etc., as I don't want to use all 19 variables when I run Maxent.

 Thanks in advance!
Jenna

Bill Sutton

unread,
Nov 19, 2013, 5:53:24 PM11/19/13
to max...@googlegroups.com
Jenna,

I went through this same process about 3 weeks ago and tried to document the steps I used to create the 19 bioclim variables from cropped tmax, tmin, and prec worldclim data. I am sure the R code could have been written much more elegantly, but I made it work for what I needed. If you plan on using the biovars function (in the dismo package) and your spatial area is large, this process is going to take quite a while; it took about 2 days to create the data I needed for each future climate scenario. Attached is the Word document. If anyone sees any errors in the process, please let me know. A lot of this came from piecing together code from multiple sources.

Bill





--
Bill Sutton, Ph.D.
Post-doctoral Research Associate
Clemson University
School of Agricultural, Forest and Environmental Sciences
261 Lehotsky Hall
Phone: (256) 520-7347
 
Step by step process for extracting bioclim data using R.docx

Jenna Hamlin

unread,
Nov 20, 2013, 1:01:00 PM11/20/13
to max...@googlegroups.com
Hi Bill,

Thanks for attaching the document. It seems like a bit more work, but that may be necessary for future climate predictions. The code that I have attached is for current climate. If I need to add anything in order to generate these .grd files for use with Maxent, comments are more than welcome!

Thanks!
Jenna


get_envir_layers.R

John Baumgartner

unread,
Nov 20, 2013, 9:19:32 PM11/20/13
to maxent

Hi Jenna,

Did you manage to get R to write out the .grd files ok? Note that Maxent should also work fine with bil grids.

- John

Jenna Hamlin

unread,
Nov 21, 2013, 5:01:51 PM11/21/13
to max...@googlegroups.com
John,

I did get R to write out the .grd files; good to know that Maxent works fine with bil grids too. I am quickly teaching myself SDM in R, so lots of things to learn :)
Thanks!

Amy Bennett

unread,
Nov 26, 2013, 3:54:45 PM11/26/13
to max...@googlegroups.com
Hi, I was wondering if someone could help me.

I'm trying to pull out the data (Bioclim, Precip, and Ts) for a set of sites for which I have longitudes and latitudes. How do I do this?
Thanks

Jenna Hamlin

unread,
Nov 26, 2013, 4:13:47 PM11/26/13
to max...@googlegroups.com
Actually, John, maybe you could clarify or point me somewhere that might be helpful.
I came upon your post about WorldClim data spanning multiple files for Maxent and asked how to generate bio1.grd, bio2.grd etc., as I don't want to use all 19 variables when I run Maxent.

I do not understand how to combine .bil files spanning three different WorldClim tiles and then have R analyze only 5 of the 19 BioClim variables once they have been combined. This is why I thought I needed .grd files, because then I could read them back into R.


Thanks!
Jenna

John Baumgartner

unread,
Nov 27, 2013, 8:22:32 PM11/27/13
to max...@googlegroups.com
If you followed the code that you posted in this thread, I think you should have a rasterstack (the object, s) that has a layer for each bioclim variable.

If you think that is not the case, please reply with the result of:

print(s)

Are you trying to run Maxent via dismo? If so, you can pass the maxent function a subset of the rasterstack, e.g. s[[c(1, 3, 6)]], to use as predictors.

If you're using the standalone maxent app, then you will need to write out the merged rasters again, in which case you can do as you've done, writing out .grd files, or similarly write out .bil.

The trick is, as you've noticed, to use the bylayer=TRUE argument.
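As a rough sketch of that step (using a toy stack in place of the merged stack from earlier in the thread, since the real one depends on your files), writing one native-format file per layer might look like:

```r
library(raster)

# Toy stand-in for the merged bioclim stack
s <- stack(replicate(3, raster(matrix(runif(100), nrow = 10))))
names(s) <- c('bio1', 'bio2', 'bio3')

# With bylayer=TRUE, a vector of filenames writes one file per layer:
# bio1.grd, bio2.grd, bio3.grd (each .grd header has a matching .gri data file)
writeRaster(s, filename = paste0(names(s), '.grd'),
            bylayer = TRUE, overwrite = TRUE)
```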

Hope that helps,
John

Francisco Rodriguez Sanchez

unread,
Dec 2, 2013, 7:13:27 AM12/2/13
to max...@googlegroups.com
Hi Amy,

Have a look at the dismo tutorial:
https://web.archive.org/web/20130323135103/http://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf
(archived version, I couldn't find current vignette on CRAN). That's
well explained there (particularly chapter 4)

Hope it helps

Paco



--
Dr Francisco Rodriguez-Sanchez
Forest Ecology and Conservation Group
Department of Plant Sciences
University of Cambridge
Downing Street
Cambridge CB2 3EA
United Kingdom
http://sites.google.com/site/rodriguezsanchezf

Jenna Hamlin

unread,
Dec 2, 2013, 9:37:54 AM12/2/13
to max...@googlegroups.com
Hi John,

Yes, I am using the package dismo to run Maxent.
Silly me :) yes, that makes sense, using s[[c(1, 3, 6)]].

Thanks!!

Matthew Miller

unread,
Dec 2, 2013, 7:50:28 AM12/2/13
to max...@googlegroups.com
I had this issue too, and I know it can be done in R, but I found it was really easy just to install DIVA-GIS and its climate files and do it that way. It's got a command to do it, and it's really easy.


Nuno Sá

unread,
Feb 7, 2014, 8:32:02 AM2/7/14
to max...@googlegroups.com
Hello, here is some sample code that prepares data to be run externally in Maxent (if you prefer using the GUI).

In this case it fetches two tiles of WorldClim data, merges them (left first, right last), and outputs them as ASCII files to a folder which you can easily load into Maxent.

It also reads a shapefile and saves it as a CSV that you can prepare in any spreadsheet before running your Maxent model (not really a necessary step in many situations, but it's useful to know you can easily convert between shp and csv while maintaining coordinates).

Hope it helps.

The geosphere library is not needed in this case.


--------------------------------------
#libraries

library(rgdal) #projection handling
library(raster) #raster operations

library(maptools) #Mapping related
library(geosphere) #Geographic operations related


#creating an output.dir
outdir <- "C:\\Invasive_Modelling\\Centaurea\\Wclimasciii"


#getting spain extent shapefile

es.shp <- readShapePoly("C:\\Invasive_Modelling\\Centaurea\\ESP_adm\\ESP_adm1_selection_dissolve.shp")



#getting presence points shapefile to output csv format

solti.shp <- readShapePoints("C:\\Invasive_Modelling\\Centaurea\\Solstitialis\\Solstitialis_v01_clip_es.shp")
calci.shp <- readShapePoints("C:\\Invasive_Modelling\\Centaurea\\Calcitrapa\\calcitrapa_v01_clip_es.shp")

#convert to data.frame
solti.df <- as.data.frame(solti.shp)
calci.df <- as.data.frame(calci.shp)

write.csv(solti.df,"C:\\Invasive_Modelling\\Centaurea\\solti_pp.csv")
write.csv(calci.df,"C:\\Invasive_Modelling\\Centaurea\\calci_pp.csv")

#fetching through tiles (15 and 16) http://www.worldclim.org/tiles.php

#Fetching minimum monthly temperature at 30 arc second
l <- getData('worldclim', var='tmin', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='tmin', res=0.5, lon=2, lat=40) # right tile

#visualization
par(mfrow=c(1,2))
plot(l)
plot(r)

#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)

#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)

#resetting names to original
names(m.c)<-names(l)

#converting from degrees C x 10 to degrees C
m.c <- m.c*0.1


writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)


#------------------------- fetching maximum monthly temperature -----------------------------------------


l <- getData('worldclim', var='tmax', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='tmax', res=0.5, lon=2, lat=40) # right tile
#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)
#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)
#resetting names to original
names(m.c)<-names(l)
#converting from degrees C x 10 to degrees C
m.c <- m.c*0.1
writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)


#------------------------- fetching mean monthly temperature -----------------------------------------


l <- getData('worldclim', var='tmean', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='tmean', res=0.5, lon=2, lat=40) # right tile
#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)
#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)
#resetting names to original
names(m.c)<-names(l)
#converting from degrees C x 10 to degrees C
m.c <- m.c*0.1
writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)


#------------------------- fetching average monthly precipitation -----------------------------------------


l <- getData('worldclim', var='prec', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='prec', res=0.5, lon=2, lat=40) # right tile
#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)
#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)
#resetting names to original
names(m.c)<-names(l)
#precipitation is already in mm, so no rescaling is needed
writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)

#------------- fetching bioclimatic variables -----------------------------------
l <- getData('worldclim', var='bio', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='bio', res=0.5, lon=2, lat=40) # right tile
#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)
#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)
#resetting names to original
names(m.c)<-names(l)
#note: the temperature-based bioclim variables are stored as degrees C x 10,
#but the precipitation-based ones (bio12-bio19) are in mm; rescaling the whole
#stack by 0.1 would corrupt those, so rescale only the temperature layers
m.c[[1:11]] <- m.c[[1:11]] * 0.1
writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)

#------------- fetching altitude -----------------------------------

l <- getData('worldclim', var='alt', res=0.5, lon=-2, lat=40) #left tile
r <- getData('worldclim', var='alt', res=0.5, lon=2, lat=40) # right tile
#cropping to shapefile extent  (This crops to an extent and not a shapefile - to do so: https://stat.ethz.ch/pipermail/r-sig-geo/2012-June/015274.html)
l.c <- crop(l,es.shp)
r.c <- crop(r,es.shp)
#merging final raster
m.c <- merge(l.c,r.c)
plot(m.c)
#resetting names to original
names(m.c)<-names(l)
#altitude is already in metres, so no rescaling is needed
writeRaster(m.c,"C:\\Invasive_Modelling\\Centaurea\\Wclimasciii\\i",format="ascii",bylayer=TRUE,suffix="names",overwrite=TRUE)



---------------------------------------

Nuno Sá

unread,
Feb 7, 2014, 8:41:51 AM2/7/14
to max...@googlegroups.com
Sorry for the quick follow-up:

You can easily georeference the output using the previously known coordinates, and export your image to TIFF format in R as well:

From the above example:

calci.avg <- raster("C:\\Invasive_Modelling\\Centaurea\\Experimental\\Model_01\\c.calcitrapa_avg.ASC")
projection(calci.avg) <- projection(m.c)
writeRaster(calci.avg,"C:\\Invasive_Modelling\\Centaurea\\Experimental\\Model_01\\TIFFS\\Model_01_calcitrapa_avg.tif")

Matthew B

unread,
Feb 12, 2014, 11:41:01 AM2/12/14
to max...@googlegroups.com
Hi, I know that this is a pretty old thread, but I am wondering if anyone has had any luck running a loop over individual annual models for the last ~50 years of annual data, such that the final result could eventually be converted into some sort of animation.

Kirk Wythers

unread,
Feb 20, 2014, 1:16:25 PM2/20/14
to max...@googlegroups.com


On Thursday, April 12, 2012 1:39:40 AM UTC-5, John Baumgartner wrote:
John, I am looking at your code here and have a question on line 8. 
"vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))"

I actually want to read in all the worldclim bil files (including alt, tmax, tmin, etc.), so I was going to simply modify your code by substituting "tmax", "tmin", "tmean" etc. for the "bio" in your regular expression. However, when I run line 8 (with the bio in place), I get back an empty numeric vector "vars". I was expecting a vector of the bioclim variable numbers 1-19. I am getting a warning, "NAs introduced by coercion", but I just don't see what is going wrong here. My vector "files" from above is correct. It contains all 64 elements ("alt.bil", "bio1.bil", "bio2.bil"...).

Here is the actual code I'm running (exactly like yours with the exception of the path):

# set the working directory to the path that contains your .bil files
setwd('~/r/worldclim_10m/')
# create a list of .bil files that exist in the wd
files <- list.files(pattern='\\.bil$')
# vars is a vector of bioclim variable numbers
vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))

Can you see what I don't see? 
 

John Baumgartner

unread,
Feb 20, 2014, 4:21:48 PM2/20/14
to max...@googlegroups.com
Hi Kirk,

The code I posted at the beginning of this thread was designed to merge tiled WorldClim data (Julie had, for example, 4 tiles for bio1, which she wanted to merge into a single raster). Your 10 minute grids aren't tiled, so you don't have to worry about this.

Reading the entire directory of 10 minute .bil files into R should be as simple as:

files <- list.files(pattern='\\.bil$', full.names=TRUE) 
s <- stack(files)

You can then manipulate this rasterStack as required (e.g. clipping with crop), or plot it with: plot(s)

Let me know if I didn't understand your situation.

Cheers,
John

Kirk Wythers

unread,
Feb 21, 2014, 1:09:58 PM2/21/14
to max...@googlegroups.com
Thanks John. That helped. I'm trying to make the transition to using R for jobs I used to do with MATLAB or GRASS GIS, and I'm finding the syntax rather different. My goal is a simple one: combine two worldclim layers (alt and tmax, say, into a raster stack), and then pull out longitudinal bands of tmax for a particular latitude below a particular elevation. One issue I'm still banging my head on is how to (in MATLAB terms) "index into the matrix". Think of a raster as a 2-dimensional matrix, and a raster stack as a 3-dimensional matrix. I am struggling to figure out (and I have read the raster vignette http://cran.r-project.org/web/packages/raster/vignettes/Raster.pdf) how to specify a particular cell in the matrix. For example, if I want to grab the 1st row in column 1 of a 2-dimensional MATLAB matrix, I would use the syntax [1:1], all rows in column 2 would look like [:2], and the cell at row 78, column 132 would be [78:132]. What does the R syntax look like to do this?

Thank you for the assist.

Best,

Kirk

John Baumgartner

unread,
Feb 21, 2014, 6:25:13 PM2/21/14
to max...@googlegroups.com
No worries Kirk.

This is a handy resource for MATLAB users coming to R (or vice versa): http://mathesaurus.sourceforge.net/octave-r.html.

Matrices of >2 dimensions are referred to as arrays (a 2D matrix is also an array, but matrix is reserved for 2D arrays), and to index arrays in R, we use the notation: 

A[x, y, z] 

where A is the array, and x, y and z are the desired indices along three axes. If you want to return all elements along a particular axis, then you can leave that index blank, e.g.:

A[, 1:3, 1:5] 

which would return all rows (blank first dimension), the first three columns, and the first five slices of the third dimension.

This indexing can be generalised to an arbitrary number of array dimensions.

So the R equivalents of your examples are:

[1:1] -> [1, 1]
[:2] -> [, 2]
[78:132] -> [78, 132]
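Those mappings can be checked directly at the R console (base R only):

```r
# A 2 x 3 x 4 array filled column-major with 1..24
A <- array(1:24, dim = c(2, 3, 4))

A[1, 1, 1]          # first row, first column, first slice -> 1
A[, 2, 1]           # all rows of column 2, slice 1 -> c(3, 4)
dim(A[, 1:3, 1:2])  # all rows, 3 columns, 2 slices -> c(2, 3, 2)
```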

If you're not already aware of Stack Overflow, head over to www.stackoverflow.com and take a look. To search amongst questions tagged "R", include "[r]" in your search.

- John

John Baumgartner

unread,
Feb 21, 2014, 6:27:46 PM2/21/14
to max...@googlegroups.com
Also take a look at ?rowFromY in the raster package, which might help with your plan to "pull out longitudinal bands of tmax for a particular latitude below a particular elevation)".

Kirk Wythers

unread,
Feb 22, 2014, 9:22:11 AM2/22/14
to max...@googlegroups.com
Thank you John. Those resources will help immensely. 

Kirk Wythers

unread,
Feb 25, 2014, 2:30:48 PM2/25/14
to max...@googlegroups.com
John,

Again, thanks for the assist. I was able to pull together a pretty slick script that handles the import, sets the CRS, aggregates the resolution up to 1°, creates a stack of the two files, and then queries by latitude. The only piece I am still stuck on is specifying a logical operator in the last position (layer). Here is the business end for one site:

s.altbio5.1[rowFromY(s.altbio5.1, 68.63), , ]

What you see here expands out from the format [ , , ],

where s.altbio5.1 is the stack to query, rowFromY() feeds the row number for lat 68.63 to the row position, the column position is empty because we want all columns, and the layer position is empty because everything I've tried has not worked. In theory, it seems like I should be able to put a logical condition there (something like as.logical()): if the cell value from layer 1 is < 1000, then return both layers.

As it is right now, the thing works fine; it returns all values from both layers. I'm just getting picky!


If you have any ideas, I'd love to hear them.

Kirk

John Baumgartner

unread,
Feb 25, 2014, 4:27:08 PM2/25/14
to maxent
getValues might be more appropriate here. Here's a simple example:

# Create some fake data (a raster stack with 10 rows, 10 columns and 2 layers)
s <- stack(replicate(2, raster(matrix(sample(3000, 100, replace=T), nc=10))))

# return, for each layer, the values of cells in the specified row
v <- getValues(s, rowFromY(s, 0.3))

This returns all values of cells on the row at lat 0.3, for all layers. The result is a 10 x 2 matrix (since we had 10 columns and 2 layers).

We can now use basic subsetting operations, which can include logical statements.

v[v[, 1] < 1000, ]

This states that we only want to return the rows of v where v[, 1] (i.e. the first column of v) is less than 1000.

To do it in one step (without the intermediate object v) would yield slightly messier code, and R will still perform the intermediate operation anyway:

getValues(s, rowFromY(s, 0.3))[getValues(s[[1]], rowFromY(s, 0.3), 1) < 1000, ]

Ryan Kok

unread,
May 21, 2014, 5:37:43 AM5/21/14
to max...@googlegroups.com
Hi John,

I need some help please. I have downloaded past climate models as well as future models with all the BioClim variables. I need to clip these files to Madagascar, all to the same resolution and extent (e.g. bio_1, bio_5, bio_6, bio_12, bio_13, bio_14). How can I load the variables and clip them to Madagascar? Could you please provide code that would allow me to do so, if possible, and ideally a way to do them all at the same time.

The past variables are .bil files (to be converted to .asc files) and the future ones are .asc files.

I am new to R, and trying to work out the correct code is kind of difficult.

Any help or information would be greatly appreciated.

Kind regards

Ryan

DE LA VEGA Gerardo Jose

unread,
May 22, 2014, 2:39:54 PM5/22/14
to max...@googlegroups.com
Hi Ryan,
you can use the "crop" function with the extent of Madagascar. I'll show you how to do it...

for example:

#raster() just needs the filename; pattern= and full.names= belong to list.files()
Bio17<-raster("D:/..../bio17.bil")
Bio18<-raster("D:/..../bio18.bil")
Bio19<-raster("D:/..../bio19.bil")

#crop
crop<-c(-125,-32,-56,40)###### use madagascar limits!!!! 

Bio17<-crop(Bio17,crop)
Bio18<-crop(Bio18,crop)
Bio19<-crop(Bio19,crop)

or use the "?crop" examples:

r <- raster(nrow=45, ncol=90)
r[] <- 1:ncell(r)
e <- extent(-160, 10, 30, 60)
rc <- crop(r, e)

# crop Raster* with Spatial* object
b <- as(extent(6, 6.4, 49.75, 50), 'SpatialPolygons')
crs(b) <- crs(r)
rb <- crop(r, b)

# crop a SpatialPolygon* object with another one
if (require(rgdal) & require(rgeos)) {
  p <- shapefile(system.file("external/lux.shp", package="raster"))
  pb <- crop(p, b)
}


Gerardo



Samuel Veloz

unread,
May 22, 2014, 2:58:25 PM5/22/14
to max...@googlegroups.com
Umm, I think you should be careful with the code below. You are naming an object "crop" in one step, which I think could then mask the function crop from the raster package. You should always avoid giving objects the same name as a function.

Note "extend" is also a useful raster function to use to get the extents lined up perfectly.

Sam



Ryan Kok

unread,
May 22, 2014, 3:08:10 PM5/22/14
to max...@googlegroups.com

Thank you so much for your help. I will give it a try and hope it works.

Sam, I am not sure what you are suggesting. Yes, I think I might get confused if they are named the same...

I know this might be a stupid question, but what's the best code to start off by loading your bioclim variables into R? And then following that by clipping them to the same resolution and extent?

To get the extent values for Madagascar, would you use Google Earth?

Your help is truly appreciated.

Kind regards

Ryan


DE LA VEGA Gerardo Jose

unread,
May 22, 2014, 3:21:35 PM5/22/14
to max...@googlegroups.com
Ryan,
1) Sam is right, just change the name...
2) I used my code for the rasters and it worked very well; obviously there are other ways...
3) extent values for Madagascar: look them up wherever you like; here is one option:
extent.mad<-c(42,55,-27,-11)
4) It is better to download all the rasters at the same resolution.
G.

Ryan Kok

unread,
May 22, 2014, 3:29:02 PM5/22/14
to max...@googlegroups.com

That's perfect. Thanks so much. I will give it a try and i will let you know if it works or if I have any problems.

Ryan

John Baumgartner

unread,
May 22, 2014, 5:35:01 PM5/22/14
to maxent

I find the easiest way to import all your raster data is to pass a vector of file names to stack(). Assuming you want to read in all of the .grd files from within a particular directory (for example, c:/mydir), you can then use list.files to create that vector:

s <- stack(list.files('c:/mydir', patt='\\.grd$', full.names=TRUE))

If there are grd files of interest in subdirectories within c:/mydir, just add recursive=TRUE to the list.files call.

This requires that all grids are the same extent and resolution.

You can then clip the whole stack with crop.

Ryan Kok

unread,
May 26, 2014, 6:25:16 AM5/26/14
to max...@googlegroups.com
Hi guys,

I have finally been able to test the code you have so kindly given me. I seem to have a problem/error message that keeps popping up no matter how I change my work. Could you please advise me or help me correct where I am going wrong?

Code i am using:

Bio1<-raster(("E:/MSc 2014/R/ccsmbio1.bil"), pattern="grd", full.names=T)
Bio2<-raster(("E:/MSc 2014/R/ccsmbio2.bil"), pattern="grd", full.names=T)
Bio3<-raster(("E:/MSc 2014/R/ccsmbio3.bil"), pattern="grd", full.names=T)

Error message:

> Bio1<-raster(("E:/MSc 2014/R/ccsmbio1.bil"), pattern="grd", full.names=T)
Error in .local(.Object, ...) : 
  `E:\MSc 2014\R\ccsmbio1.bil' not recognised as a supported file format.


Error in .rasterObjectFromFile(x, band = band, objecttype = "RasterLayer",  : 
  Cannot create a RasterLayer object from this file.

I get the same message doing the stack code provided.

s <- stack(list.files('E:/MSc 2014/R/', patt='\\.grd$', full.names=TRUE))

Can anyone help me please.

Kind regards

Ryan
Ryan Kok
MSc (Biological Science) Student
School of Life Sciences
University of KwaZulu-Natal
Westville

Cel: 0725077868
Email:  ryan...@gmail.com

John Baumgartner

unread,
May 26, 2014, 6:34:13 AM5/26/14
to max...@googlegroups.com
Hi Ryan,

Small typo. Try the following:

s <- stack(list.files('E:/MSc 2014/R/', patt='\\.bil$', full.names=TRUE))

Cheers,
John

Ryan Kok

unread,
May 26, 2014, 6:41:25 AM5/26/14
to max...@googlegroups.com
Hi John,

Thanks I will try it now.

After trying it, I am still getting the same error message.

As follows:

> s <- stack(list.files('E:/MSc 2014/R/', patt='\\.bil$', full.names=TRUE))
Error in .local(.Object, ...) : 
  `E:\MSc 2014\R\ccsmbio1.bil' not recognised as a supported file format.


Error in .rasterObjectFromFile(x, band = band, objecttype = "RasterLayer",  : 
  Cannot create a RasterLayer object from this file.

I don't have a clue what it could be. 

Ryan


John Baumgartner

unread,
May 26, 2014, 6:58:19 AM5/26/14
to max...@googlegroups.com
It's hard to say what the problem is without having the file. 

Where did you get ccsmbio1.bil from?

Ryan Kok

unread,
May 26, 2014, 7:03:32 AM5/26/14
to max...@googlegroups.com
I would send you the files, but they are about 70 MB each. I got them from the WorldClim website http://www.worldclim.org/past

I am using this data:

Last glacial maximum (LGM; ~21,000 years BP)

source: Paleoclimate Modelling Intercomparison Project Phase II (PMIP2). Use conditions: please register with the PMIP2 project if you want to use these data or want access to the original variables; we only provide data derived from the PMIP2 database.
2.5 arc-minutes
CCSM: bio
MIROC: bio


CCSM is the first model whose data I am trying to crop and convert to .asc files so I can continue modelling for my project.

Is there perhaps a way to convert the .bil files to .asc? At the moment I am using future data supplied as .asc files, and that code is working.
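A conversion along those lines can be done directly with the raster package. A minimal sketch, reusing the input filename from the error messages above; the output path is hypothetical, and this has not been tested against the actual PMIP2 files:

```r
library(raster)  # rgdal must also be installed for .bil support
# input path taken from the error messages above; output name is hypothetical
r <- raster("E:/MSc 2014/R/ccsmbio1.bil")
# write an ESRI ASCII grid, the format Maxent reads
writeRaster(r, "E:/MSc 2014/R/ccsmbio1.asc", format = "ascii", overwrite = TRUE)
```

If raster cannot read the .bil in the first place, the conversion will of course fail at the same point, so fixing the read error comes first.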


Hira Fatima

unread,
May 28, 2014, 1:00:30 AM
to max...@googlegroups.com
Hi everyone,
I am using the bioclim variables and was wondering: the current scenario covers 1950-2000 and the future projections are for 2020, 2050, and 2080, so how am I supposed to prepare data for the year 2014? I'm sorry if it's a stupid question. ^^
Hira

Michael Cheek

unread,
Jun 5, 2014, 8:23:44 AM
to max...@googlegroups.com
Hello John

Can the .bil files covering all tiles be in one folder? Will R be able to tell precipitation data and temperature data apart?

Regards

Michael

John Baumgartner

unread,
Jun 6, 2014, 7:54:55 PM
to maxent

Yes, as long as filenames are prefixed by "bio" and the variable number. If that's not the case you might need to change the pattern in this line:

vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))
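To see what that pattern extracts, here is a small base-R example with made-up filenames following the bio&lt;var&gt;_&lt;tile&gt;.bil convention (the tile numbers are hypothetical):

```r
# made-up filenames; only the variable number between "bio" and "_" matters
files <- c("bio1_22.bil", "bio1_23.bil", "bio12_22.bil", "bio12_23.bil")
# keep the captured digits after "bio", then de-duplicate and sort numerically
vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))
vars  # 1 12
```

If your filenames use a different separator or prefix, the capture group and the literal text around it need to change to match.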

--

Ryan Kok

unread,
Jul 10, 2014, 8:10:53 AM
to max...@googlegroups.com
Hi there,

Can someone please help me? I need to calculate the area of a region on a map that I obtained by applying a certain threshold. Could someone help me with code to calculate that area?

Also, I need to find how much this area has shifted from past to present to future. Could you help me with code for that, as I don't know where to start?



John Baumgartner

unread,
Jul 10, 2014, 8:27:46 AM
to maxent

Hi Ryan,

It's best to start a new thread for new questions.

Assuming you want area in units of "cells", you can do something like:

library(raster)
r <- raster('c:/path/to/prediction.asc')
sum(values(r > thr), na.rm=T)

where thr is your threshold value. The inequality returns a logical grid (coerced to numeric, with FALSE as 0 and TRUE as 1). The values function returns the vector of values in the raster object's cells, and sum adds these ones and zeros to give the number of cells that exceed the threshold. Use >= if you prefer.

Correct the path to your prediction grid first.

John Baumgartner

unread,
Jul 10, 2014, 8:29:56 AM
to maxent

Bear in mind, though, that if your prediction grids are not in an equal area projection, then cell areas will vary across the landscape (particularly problematic if your study area spans a large latitudinal range).
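For unprojected (longlat) grids, one way to account for that variation is raster's area() function, which returns each cell's area in km². A sketch, with the path and threshold as placeholders:

```r
library(raster)
r <- raster('c:/path/to/prediction.asc')  # placeholder path
thr <- 0.5                                # hypothetical threshold
above <- r > thr                          # 0/1 grid of cells exceeding thr
# area(r) returns a raster of per-cell areas in km^2 for a lon/lat grid,
# so summing area over the above-threshold cells gives total area in km^2
km2 <- cellStats(area(r) * above, sum)
```

For a grid already in an equal-area projection, multiplying the cell count by the (constant) cell area is simpler.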

Zach Culumber

unread,
Oct 17, 2014, 12:35:05 PM
to max...@googlegroups.com
Hi John, or anyone else who might be able to provide some insight,

I realize this thread is old, but thought I would post my question here since it pertains to this same topic and to code that was posted above in this thread.

I'm new to R and new to ENM.  I'm trying to use some code that was previously posted to do the same thing (I think) as the original thread starter.  I was hoping someone could explain where I am going wrong so that I can learn my way through this. 

Essentially, I have Bioclim layers for all 19 variables and two different tiles (tiles #22 & #23). I need to stack and merge all 19 variables across the two tiles and crop everything to my study region. I'm using the code pasted below, which is the same code that John Baumgartner posted on 4/12/12 in this thread.

This code gives me an error (pasted below) when I run lines 12-17, starting with grids. It ran successfully when I used only two .bil files (bio1_22 & bio1_23) and even produced plots with no problem. When I tried to run it with bio1_22 through bio9_22, just to make sure it worked before running all 38 .bil files, it gave me the error. I tried to play around with line 8, starting with vars, where it says bio([0-9], to adjust the numbers, but that didn't fix the problem.

Does anyone have insight into what is going wrong here? Please let me know if you need any other info, and apologies if the answer is painfully obvious (it's not to me!).

Thank you!

 

"Error in as.data.frame.default(x) :

  cannot coerce class "structure("RasterLayer", package = "raster")" to a data.frame"

 

CODE (originally from John Baumgartner on 4/12/12):

# install.packages('rgdal') # install rgdal if not already installed
library(raster)
# set the working directory to the path that contains your .bil files
setwd('C:/Users/zwc/Documents/Rdir')
# create a list of .bil files that exist in the wd
files <- list.files(pattern='\\.bil$')
# vars is a vector of bioclim variable numbers
vars <- sort(unique(as.numeric(gsub('^bio([0-9]+)_.*', '\\1', files))))
# for each of vars, create raster object for each tile and merge
# (this is a bit messy, but all I could think of for now...)
# grids will be a list of rasters, each of which is the merged tiles for a BC var.
grids <- sapply(vars, function(x) {
  patt <- paste('bio', x, '_', sep='')
  tiles <- files[grep(patt, files)]
  merged <- eval(parse(text=paste('merge(', toString(paste('raster(', tiles, ')',
                sep='"')), ')', sep='')))
})
# give the list elements names
names(grids) <- paste('bio', vars, sep='')
# combine all list elements into a stack
s <- stack(grids)
# quick plot to make sure nothing went drastically wrong
plot(s)
# crop to your study extent
s.crop <- crop(s, WashingtonBoundary)   # This last line will obviously change for my boundaries, but I haven't made it that far yet!

Zach Culumber

unread,
Oct 20, 2014, 1:52:55 PM
to max...@googlegroups.com

Just to update... I did not find a fix to the error I was getting, but a friend sent me some code that I modified to get out what I needed.  This is just basic code and someone could probably easily add a loop in here so that you won't have to manually do all these.  This code at least allowed me to just change the input and output filenames, and create the 19 merged and cropped raster layers that I needed for the 19 bioclim variables.  This also outputs them in .asc format so they are ready for use in Maxent.

 

library(raster)
library(rgdal)
setwd("C:/Users/zwc/Documents/Rdir2")
# r22 is a stack of the bioclim rasters for tile 22
# (note: list.files takes a regular expression, not a glob)
r22 <- stack(list.files(pattern="_22\\.bil$", full.names=TRUE))
# r23 is a stack of the bioclim rasters for tile 23
r23 <- stack(list.files(pattern="_23\\.bil$", full.names=TRUE))
# merge the two stacks
r2223 <- merge(r22, r23)
# specify lon/lat to crop the data: xmin, xmax, ymin, ymax
bbox <- as(extent(-102.5, -85, 13, 28), 'SpatialPolygons')
z <- crop(r2223, bbox)
plot(z) # check that the rasters are merged and cropped to the correct area
z2 <- writeRaster(z, filename="bio1.asc", format="ascii", overwrite=TRUE) # change name for each bioclim variable
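The loop mentioned above could look something like the following sketch, assuming the same bio&lt;var&gt;_&lt;tile&gt;.bil filenames and crop extent (untested against the actual tiles):

```r
library(raster)
library(rgdal)
# same crop extent as above: xmin, xmax, ymin, ymax
bbox <- as(extent(-102.5, -85, 13, 28), 'SpatialPolygons')
for (i in 1:19) {
  # assumes tiles 22 and 23 exist for every bioclim variable
  tiles <- lapply(sprintf("bio%d_%d.bil", i, c(22, 23)), raster)
  merged <- do.call(merge, tiles)
  cropped <- crop(merged, bbox)
  writeRaster(cropped, filename = sprintf("bio%d.asc", i),
              format = "ascii", overwrite = TRUE)
}
```

do.call(merge, tiles) passes every tile raster to merge at once, so the same loop works for more than two tiles if the vector of tile numbers is extended.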

John Baumgartner

unread,
Oct 22, 2014, 3:09:32 AM
to maxent
Sorry for the slow reply, Zach. 

It's hard to say what the problem was, but to debug it you could take a look at the output of:

lapply(vars, function(x) {
  patt <- paste('bio', x, '_', sep='')
  files[grep(patt, files)]
})


Maybe there's something odd about one of the variables - perhaps one of them only has a single tile?


Glad you have a workaround, anyway.


Cheers,

John



J Tan

unread,
Dec 19, 2015, 9:24:21 AM
to Maxent
Hi John,

I'm looking to import WorldClim variables in ASCII format into R to perform a PCA to reduce the number of variables before I begin a Maxent run. What would be the workflow for this particular task?

thanks,

Jonathan

Jamie M. Kass

unread,
Dec 20, 2015, 12:19:37 AM
to Maxent
Without writing code for you, it would be this:

1. Load ASCII files into R with raster() command from raster package.
2. Get the cell values of all rasters with the getValues() command.
3. Run a PCA on these cell values with prcomp() or comparable command.

However, you'd end up with PCs of your original WorldClim variables, and interpreting them is difficult. As long as you're okay with that, it's fine. Another thing to ask yourself is: why remove variables at all? Maxent is a machine learning method that performs variable selection already -- it doesn't use all the variables you put in. If you experiment with different settings and cross-validation, you can find a well-fit model even when using all 19 bioclim variables (because a stricter model will toss many of them out).
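Those three steps might be sketched as follows (the directory is a placeholder; getValues() is raster's accessor for cell values):

```r
library(raster)
# placeholder directory containing the ASCII grids
s <- stack(list.files('C:/worldclim_asc', pattern='\\.asc$', full.names=TRUE))
vals <- getValues(s)   # matrix: one row per cell, one column per layer
vals <- na.omit(vals)  # drop cells that are NA in any layer
pca <- prcomp(vals, scale. = TRUE)  # scale: bioclim variables have mixed units
summary(pca)   # variance explained by each component
pca$rotation   # loadings of each variable on each PC
```

Scaling matters here because the bioclim variables mix temperature and precipitation units; without scale. = TRUE the high-variance layers would dominate the PCs.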

Jamie Kass
PhD Student
City College, NYC

John Baumgartner

unread,
Dec 20, 2015, 5:08:23 PM
to maxent
Yes, Maxent can cope with correlated predictors, and moderates complexity with regularisation. But if interpretation of the fitted model (e.g. response curves) is important, it's useful to reduce predictors to a less correlated, ecologically meaningful set. Considering ecological relevance when choosing amongst multiple correlated predictors is particularly important when projecting models to other times/places, where relationships between variables may not hold.
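As a first step toward such a reduced set, pairwise correlations between layers can be computed with raster's layerStats(). A sketch, where s is a hypothetical RasterStack of candidate predictors and 0.8 is an arbitrary cutoff:

```r
library(raster)
# s: hypothetical RasterStack of candidate predictors
cors <- layerStats(s, 'pearson', na.rm = TRUE)$`pearson correlation coefficient`
# list variable pairs whose absolute correlation exceeds the cutoff
high <- which(abs(cors) > 0.8 & upper.tri(cors), arr.ind = TRUE)
data.frame(var1 = rownames(cors)[high[, 1]],
           var2 = colnames(cors)[high[, 2]],
           r = cors[high])
```

From each highly correlated pair, the ecologically more meaningful member can then be kept by hand.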

It seems to me that Jonathan is interested in selecting variables according to their loadings, rather than using the PCs themselves as predictors. There are examples of the latter, though (e.g. Pliscoff et al. (2014). Effects of alternative sets of climatic predictors on species distribution models and associated estimates of extinction risk: a test with plants in an arid environment. Ecological Modelling, 288, 166-177; Cruz-Cárdenas et al. (2014). Potential species distribution modeling and the use of principal component analysis as predictor variables. Revista Mexicana de Biodiversidad, 85(1), 189-199), and others have proposed that Bioclim PCs be added to the suite of variables (Kriticos et al. (2014). Extending the suite of bioclim variables: a proposed registry system and case study using principal components analysis. Methods in Ecology and Evolution, 5(9), 956-960). I'm with Jamie on this, though - interpretation is challenging.

- John



