ApplyModel error: string index out of range


Michelle Fink

Nov 3, 2015, 6:09:01 PM
to VisTrails SAHM
And I'm back! :-P

I cannot for the life of me get the ApplyModel module to run. I have followed the tutorial as well as made every little tweak I can think of, but every time, I get this error:

errorTrace: Traceback (most recent call last):
  File "C:\VisTrails_SAHM\vistrails\core\modules\vistrails_module.py", line 578, in update
    self.compute()
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\init.py", line 586, in compute
    if orig_lines[1][orig_lines[0].index(orig_covariate)] == 1 and \
IndexError: string index out of range


I am trying to apply a model to projected future climate data. From the error message, it looks like it is choking on some unexpected difference between the two MDS files? On the off chance I am on the right track, I've attached two files: the first shows the first 3 lines of the original MDS, and the second is the new MDS.

Any help is greatly appreciated. I've banged my head on this for a couple of days now & really need to make some progress. Thank you!!

--Michelle


OriginalMDS_top3lines.csv
MergedDataset_1.csv

Talbert, Colin

Nov 6, 2015, 7:08:15 PM
to Michelle Fink, VisTrails SAHM
Hello Michelle,

Thanks for hanging in there. Sure enough, there was a bug in my code (heavy sigh... we do our best). Would you mind opening your C:\VisTrails_SAHM\vistrails\packages\sahm\init.py in any text editor and swapping out line 538 with the following? (Make sure you include the trailing backslash and keep the leading spacing the same.)

 if orig_lines[1].split(",")[orig_lines[0].split(",").index(orig_covariate)] == "1" and \
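
For anyone following along, here is a minimal sketch of why the original line raises IndexError and what the split-based version changes. The header and flag rows below are made-up examples, not the attached files, and the strip() calls are an addition for this illustration. Note also that values read from a text file are strings, so the original comparison against the integer 1 could never be true, which is why the replacement compares against "1".

```python
# Made-up header and include-flag rows; real MDS files have more columns.
header = "X,Y,responseBinary,bio_1,bio_12\n"
use_row = "0,0,0,1,1\n"
covariate = "bio_12"

# str.index() on the raw line returns a character offset, not a column number.
char_offset = header.index(covariate)

try:
    use_row[char_offset]
    failed = False
except IndexError:
    # The flag row is much shorter than the header, so the offset overruns it.
    failed = True

# Splitting both rows on commas lines each flag up with its column name.
# strip() is added here so the last column does not carry the newline.
names = header.strip().split(",")
flags = use_row.strip().split(",")
flag = flags[names.index(covariate)]

print(failed, flag)
```

Without the strip(), the last column name would still carry its line ending, so looking it up by its bare name in the split list would fail with a ValueError instead.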


I'll be around next week but in a meeting all day Monday if you have any questions or that doesn't work.  Send me an email and I'll make sure to get you up and running.

Cheers,
Colin




Colin Talbert
GIS Analyst and Developer
US Geological Survey


USGS Fort Collins Science Center 
2150 Centre Ave. Bldg. C
Fort Collins, CO 80526

USGS North Central Climate Science Center

Work schedule:
Monday       - 7:00 - 3:00  (FORT)
Tuesday       -  7:00 - 3:00  (NC CSC)
Wednesday  -  7:00 - 3:00 (FORT)
Thursday      -  7:00 - 5:00 (FORT)
Friday          -  7:00 - 5:00 (FORT)


Michelle Fink

Nov 7, 2015, 12:51:38 PM
to VisTrails SAHM
Thanks Colin! I will try the fix first thing Monday. Due to my usual crazy timelines, I had to resort to calling the appropriate R module via commandline to get my models, which worked well enough, but is a tad tedious! :-) I appreciate the trouble-shooting!

--Michelle

Michelle Fink

Nov 10, 2015, 12:30:23 PM
to VisTrails SAHM
Colin,

That suggested fix wasn't quite right either. I'm not sure how to debug within VisTrails except by inserting print statements, but after some more trial and error, I got this code to run (my changes are in bold, and this is the compute() function of the ApplyModel() class in init.py):

    def compute(self):
        #  if the supplied mds has rows, observations then
        #  pass r code the flag to produce metrics
        mdsfname = utils.get_relative_path(self.force_get_input('mdsFile'), self)
        workspace = utils.get_relative_path(self.force_get_input('modelWorkspace'), self)
        skip_list = ['X', 'Y', 'responseBinary', '', 'Split', 'EvalSplit', 'Weights', 'Split\n', 'EvalSplit\n', 'Weights\n']

        mdsfile = open(mdsfname, "r")
        lines = mdsfile.readlines()

        if len(lines) > 3:
            #  we have rows R will need to recreate metrics.
            self.args = 'pmt=TRUE '
        else:
            self.args = 'pmt=FALSE '

        if len(lines) == 3:
            #  we're applying this model to a new area
            #  make sure all the covariates in the original model are in the new csv
            #  if not tack on the original values.
            orig_mds = utils.get_mdsfname(workspace)
            orig_mdsfile = open(orig_mds, "r")
            orig_lines = orig_mdsfile.readlines()
            orig_covariates = [item for item in orig_lines[0][0:].split(",") if item not in skip_list]
            orig_use = [item for item in orig_lines[1][0:].split(",") if item in ['0', '1']]

            missing_covariates = []
            new_covariates = [item for item in lines[0][0:].split(",") if item not in skip_list]
            for orig_covariate in orig_covariates:
                i = orig_covariates.index(orig_covariate)
                if orig_use[i] == "1" and \
                    new_covariates.count(orig_covariate) == 0:
                    missing_covariates.append(orig_covariate)
            if len(missing_covariates) > 0:
                msg = 'One or more of the covariates used in the original model are not specified in the apply model mds file\n'
                msg += 'Specifically the following covariates were not found:'
                msg += '\n\t'.join(missing_covariates)

                #  pass the message along so the error reports which covariates are missing
                raise RuntimeError(msg)

        Model.compute(self)


Talbert, Colin

Nov 10, 2015, 3:31:44 PM
to Michelle Fink, VisTrails SAHM
Hello Michelle,

First off, congratulations on being bold enough to monkey with the code and get it to work!

I like moving the items to skip into a separate list for reuse. Applying the strip() function in the list comprehension takes care of the newlines and makes the code more cross-platform. The original [3:] after the split removes the first three columns, so I trimmed a few items from your skip list. (FYI, the [0:] in your code specifies a slice from element zero to the end, so it is redundant and could be removed.)

I also think your calculating of the covariate use before the loop is cleaner and easier to understand.
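
As a rough illustration of the points above (strip() for line endings, [3:] to drop the leading columns, and working out covariate use before the comparison), the check might be sketched as below. This is not the committed code; the function names, the SKIP list contents, and the sample rows are assumptions made up for this example.

```python
# Hypothetical names for non-covariate columns that can follow the first three.
SKIP = ["Split", "EvalSplit", "Weights", ""]


def parse_row(line):
    """Strip the line ending, split on commas, and drop the first three columns."""
    return [item.strip() for item in line.strip().split(",")][3:]


def find_missing_covariates(orig_lines, new_lines):
    """Return covariates flagged "1" in the original MDS but absent from the new one."""
    orig_names = parse_row(orig_lines[0])
    orig_use = parse_row(orig_lines[1])
    new_names = set(name for name in parse_row(new_lines[0]) if name not in SKIP)
    # zip keeps each include flag paired with its own column name.
    return [name for name, use in zip(orig_names, orig_use)
            if name not in SKIP and use == "1" and name not in new_names]


# Made-up rows; the Windows line endings show that strip() handles them too.
orig = ["X,Y,responseBinary,bio_1,bio_12,Split\r\n", "0,0,0,1,1,0\r\n", "\r\n"]
new = ["X,Y,responseBinary,bio_1\n", "0,0,0,1\n", "\n"]
print(find_missing_covariates(orig, new))
```

Pairing names and flags with zip after the same slice keeps the two lists aligned without a separate index lookup inside the loop.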

At any rate, thanks for sharing your code changes. I've added them with a few modifications to our code base (https://github.com/ColinTalbert/sahm/commit/af9e7b953b496cfce28e962eeb756d0257a71249). Yours is also the first user-contributed code added to SAHM, so you definitely get a prize.

Cheers,
Colin



