Error Applying Model to new climate data

virginia...@gmail.com

Jun 26, 2014, 12:23:14 AM

to vistrai...@googlegroups.com

Hello!

I am trying to apply a set of models developed using historical data to future climate conditions. I followed the directions in the tutorial for setting this up.

I am getting an error when the workflow reaches the MDS Builder module that is connected to the Apply Model modules.

Here is the error message:

  File "C:\VisTrails_SAHM\vistrails\core\modules\vistrails_module.py", line 400, in update
    self.compute()
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\init.py", line 965, in compute
    subfolder, runname = utils.get_previous_run_info(MDSParams['fieldData'])
KeyError: 'fieldData'

Do I need to connect the field data file used for the historical climate to the MDS Builder for the applied/future climate conditions? Or is there something else I can do to address this error?

Thank you! I'm happy to provide more information.

Virginia Seamster

Talbert, Colin

Jun 26, 2014, 9:32:14 AM

to virginia...@gmail.com, vistrai...@googlegroups.com

Hello Virginia,

You shouldn't need an input field data file in this case, so that error is a bug.

You have two options for fixing it.

1) You won't get this error if you connect an outputName module to the MDSBuilder module that's throwing the error. If you don't put anything in for run_name or subfolder_name it won't alter the name or subfolder of your outputs.

2) You can replace line 965 in sahm\init.py with the line below. Make sure the indentation matches what was there already. Restart VisTrails and your workflow should work.

subfolder, runname = utils.get_previous_run_info(MDSParams.get('fieldData', ''))

This fix will go out in the next minor release. Thanks for reporting it.
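The difference between the original line and the replacement can be sketched in a few lines of Python. This is a minimal illustration, not SAHM code: `mds_params` and its contents are hypothetical stand-ins for the `MDSParams` dictionary when no field data file is connected.

```python
# Hypothetical stand-in for MDSParams when no field data file is connected.
mds_params = {"templateLayer": "template.tif"}

# Square-bracket lookup raises KeyError when the key is absent.
try:
    field_data = mds_params["fieldData"]
except KeyError as err:
    missing_key = err.args[0]  # the name of the key that was not found

# dict.get returns the supplied default instead of raising.
field_data = mds_params.get("fieldData", "")

# Two single quotes ('') and two double quotes ("") are the same empty string.
same_literal = ('' == "")
```

Either spelling of the empty-string default behaves the same; what matters is that the quotes are balanced.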

Colin

Colin Talbert

GIS Analyst and Developer

US Geological Survey

talb...@usgs.gov

USGS Fort Collins Science Center

2150 Centre Ave. Bldg. C

Fort Collins, CO 80526

(970) 226-9425

USGS North Central Climate Science Center

(970) 492-4283

Work schedule:

Monday - 7:00 - 3:00 (NC CSC)

Tuesday - 7:00 - 3:00 (NC CSC)

Wednesday - 7:00 - 3:00 (FORT)

Thursday - 7:00 - 5:00 (NC CSC)

Friday - 7:00 - 5:00 (FORT)


virginia...@gmail.com

Jun 26, 2014, 11:24:45 AM

to vistrai...@googlegroups.com, virginia...@gmail.com

Hi Colin:

Thank you so much for your email and recommended fixes! I tried changing the Python code, but VisTrails then stopped recognizing and loading the SAHM package. I undid the change to the code and instead added the OutputName module as you described. That worked: the MDSBuilder module completed successfully.

Now I'm getting an error in the first ApplyModel module, after it starts processing EvaluateNewData.r and launches it asynchronously:
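A hand edit to a package file can stop the package from loading if the edited file no longer parses, for example because of changed indentation or an unbalanced quote. Whether that is what happened here is an assumption; the thread does not say. Below is a sketch of a check that could be run before restarting VisTrails, using only Python's built-in `compile`. The function name `syntax_problem` and the sample sources are hypothetical.

```python
def syntax_problem(source, filename="<edited file>"):
    """Return None if the source parses, else a short description of the error."""
    try:
        compile(source, filename, "exec")  # compiles only; nothing is executed
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return "{0} at line {1}".format(type(err).__name__, err.lineno)
    return None

# Hypothetical sample sources: one valid, one over-indented, one with a stray quote.
good = "def compute(params):\n    return params.get('fieldData', '')\n"
bad_indent = "def compute(params):\n    name = 'x'\n      return name\n"
bad_quote = "value = params.get('fieldData', \")\n"

results = [syntax_problem(s) for s in (good, bad_indent, bad_quote)]
```

For a real file, the same function could be called on the file's contents, with the path passed as `filename`.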

An error was encountered in the R script for this module.
The R error message is below:
Error in read.ma(out, hl = hl, include = include, evalNew = TRUE):
  Response column has only one unique value
Calls: EvaluateNewData -> read.ma
Execution halted
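
The message says the response column passed to `read.ma` held a single value. One way that can happen is a dataset with only presences and no background or absence rows, though that is a guess at the cause and the thread does not confirm it. Below is a sketch of a pre-check on CSV text. The column name `responseBinary` and the sample rows are assumptions for illustration, not the confirmed MDS layout.

```python
import csv
import io

def unique_responses(csv_text, column):
    """Return the set of distinct values found in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}

# Hypothetical presence-only data: every response is 1.
presence_only = "X,Y,responseBinary\n1.0,2.0,1\n3.0,4.0,1\n"
# Hypothetical data with a background point coded as 0.
with_background = "X,Y,responseBinary\n1.0,2.0,1\n3.0,4.0,0\n"

only_one = len(unique_responses(presence_only, "responseBinary")) < 2
has_two = len(unique_responses(with_background, "responseBinary")) >= 2
```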

Any ideas are much appreciated!

Virginia

virginia...@gmail.com

Jul 3, 2014, 8:07:20 PM

to vistrai...@googlegroups.com, virginia...@gmail.com

Hi Colin and Marian:

Per Marian's recommendation I installed the latest version (uploaded June 10th) of the VisTrails SAHM package. I copied the tutorial module for "Applying a model to a new region" to a new .vt file and was able to run it without errors. I then modified the workflow to suit my data in a stepwise fashion, running the model after almost every change: I removed ModelEvaluationandSplit; added BackgroundSurfaceGenerator; added background points and unchecked "ignoreNonOverlap" in PARC; changed the paths in "FieldData", "TemplateLayer", and "PredictorListFile"; changed the "OutputName" to match my data; and added the Maxent module. I also made the change to the code in the init.py file that Colin recommended, but put in a second double quote (see below) per the recommendation of my contact in IT here at NMSU.

subfolder, runname = utils.get_previous_run_info(MDSParams.get('fieldData', ""))

There is one error I've gotten somewhat consistently when I run the modified tutorial workflow using my data. It occurred for the second of the two species I tried, and went away for that species when I turned off MESS map generation in all ApplyModel modules. The error is generated when the workflow reaches the ApplyModel modules and is as follows:

"R Processing launched asynchronously EvaluateNewData.r at: 07/03/2014 11:06
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An error was encountered in the R script for this module.
The R error message is below:
Traceback (most recent call last):
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\pySAHM\runRModel.py", line 161, in <module>
    main(sys.argv[1:])
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\pySAHM\runRModel.py", line 66, in main
    mosaicTiledOutputs(outDir)
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\pySAHM\runRModel.py", line 99, in mosaicTiledOutputs
    NDValue = getNDVal(onlyfiles[0])
IndexError: list index out of range"

It seems that the program generates the binary and probability plots just fine but runs into a problem when it gets to the residual plots. It generates the tiles for the MESS and MOD plots but never mosaics them together...and there is nothing in the ResidTiff folders.
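
The IndexError is what Python raises when `onlyfiles[0]` is evaluated on an empty list, which fits the empty ResidTiff folders. Below is a sketch of a guard that skips an empty folder instead of indexing into it. `first_tile_or_none` is a hypothetical helper, not the actual `runRModel.py` code, and filtering on a `.tif` extension is an assumption.

```python
import os
import tempfile

def first_tile_or_none(folder):
    """Return the path of the first .tif file in folder, or None if there is none."""
    tiles = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(".tif")
    )
    if not tiles:  # indexing tiles[0] here would raise IndexError
        return None
    return os.path.join(folder, tiles[0])

# Demonstrate with a temporary folder that starts out empty.
with tempfile.TemporaryDirectory() as folder:
    empty_result = first_tile_or_none(folder)
    open(os.path.join(folder, "tile_1.tif"), "w").close()
    found_result = os.path.basename(first_tile_or_none(folder))
```

A caller could then skip the mosaic step for any folder where the helper returns `None`.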

I've tried running the workflow with data for two different species (all presence data, no absences). The workflow completed successfully and generated MESS maps for one species but not the other. It produced no residual plots, which makes sense since there is no new field data for the ApplyModel modules. The input predictor files are the same in both cases. However, the two runs differ in their field datasets, template layers, layers selected in the CovariateCorrelationandSelection module, and numbers of background points, since the first species (the one that completed successfully) has a very restricted range.

A second error, which I got less consistently, came from the GLM module. The workflow completed successfully when I removed the GLM module, when I ran the GLM module for a species that has more presence data and a wider range, or when I limited the background points for a species with a small presence dataset and restricted range. Here is the error message:

"producing prediction maps...
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
An error was encountered in the R script for this module.
The R error message is below:
Error in train.dat[, k] : incorrect number of dimensions
Calls: FitModels -> proc.tiff -> parRaster -> sort
Execution halted"

When I look in the "stdOut" text file, it looks like the algorithm calculates the AIC values for all 10 folds and then errors out when producing prediction maps. There are no binary, probability, or MESS maps in the GLM folder.

Any thoughts on getting the workflow to run consistently for both datasets, with MESS map generation turned on for the ApplyModel modules and with the GLM module included, would be great. I'm happy to provide further details.

Thank you!! Hope you have a great 4th!!

Virginia

virginia...@gmail.com

Jul 21, 2014, 4:32:20 PM

to vistrai...@googlegroups.com, virginia...@gmail.com

Just to keep this posting up to date: thank you, Colin, for the revised .py file that addresses the first error (the IndexError) I was getting. It has the ApplyModel module skip the ResidTiff folder when that folder has been generated and there are problems deleting it, since residual maps aren't necessary when applying a model to new climate data.

Virginia
