ApplyModel error: string index out of range

Michelle Fink

Nov 3, 2015, 6:09:01 PM

to VisTrails SAHM

And I'm back! :-P

I cannot for the life of me get the ApplyModel module to run. I have followed the tutorial as well as made every little tweak I can think of, but every time, I get this error:

errorTrace: Traceback (most recent call last):
  File "C:\VisTrails_SAHM\vistrails\core\modules\vistrails_module.py", line 578, in update
    self.compute()
  File "C:\VisTrails_SAHM\vistrails\packages\sahm\init.py", line 586, in compute
    if orig_lines[1][orig_lines[0].index(orig_covariate)] == 1 and \
IndexError: string index out of range

I am trying to apply a model to projected future climate data. From the error message, it looks like it is choking on some unexpected difference between the 2 MDS files? I've attached 2 files, one showing first 3 lines of original MDS, second is the new MDS, on the off chance I am on the right track.

Any help is greatly appreciated. I've banged my head on this for a couple of days now & really need to make some progress. Thank you!!

--Michelle

OriginalMDS_top3lines.csv

MergedDataset_1.csv

Talbert, Colin

Nov 6, 2015, 7:08:15 PM

to Michelle Fink, VisTrails SAHM

Hello Michelle,

Thanks for hanging in there. Sure enough, there was a bug in my code (heavy sigh... we do our best). Would you mind opening your C:\VisTrails_SAHM\vistrails\packages\sahm\init.py in any text editor and swapping out line 538 with the following? (Make sure you include the trailing backslash and keep the leading spacing the same.)

if orig_lines[1].split(",")[orig_lines[0].split(",").index(orig_covariate)] == "1" and \
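For anyone following along, here is a minimal sketch of why the original line raises the IndexError and what the split(",") version changes. The header and flag rows are made-up sample data, not taken from the attached MDS files.

```python
# Made-up two-line sample in the MDS layout: a header row, then 0/1 use flags.
orig_lines = [
    "X,Y,responseBinary,bio_1,bio_2\n",
    "0,0,0,1,0\n",
]
orig_covariate = "bio_1"

# Original approach: str.index returns a character offset into the header text,
# which is usually larger than the length of the much shorter flag row.
char_offset = orig_lines[0].index(orig_covariate)

try:
    orig_lines[1][char_offset]
    original_error = None
except IndexError as exc:
    original_error = str(exc)

# Suggested approach: split both rows so the index is a column position.
header = orig_lines[0].split(",")
flags = orig_lines[1].split(",")
column = header.index(orig_covariate)
is_used = flags[column] == "1"
```

Note that the flag is compared to the string "1" rather than the integer 1, since everything read from the file is text.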

I'll be around next week but in a meeting all day Monday if you have any questions or that doesn't work. Send me an email and I'll make sure to get you up and running.

Cheers,

Colin

Colin Talbert

GIS Analyst and Developer

US Geological Survey

talb...@usgs.gov

USGS Fort Collins Science Center

2150 Centre Ave. Bldg. C

Fort Collins, CO 80526

(970) 226-9425

USGS North Central Climate Science Center

(970) 492-4283

Work schedule:

Monday - 7:00 - 3:00 (FORT)

Tuesday - 7:00 - 3:00 (NC CSC)

Wednesday - 7:00 - 3:00 (FORT)

Thursday - 7:00 - 5:00 (FORT)

Friday - 7:00 - 5:00 (FORT)


Michelle Fink

Nov 7, 2015, 12:51:38 PM

to VisTrails SAHM

Thanks Colin! I will try the fix first thing Monday. Due to my usual crazy timelines, I had to resort to calling the appropriate R module via the command line to get my models, which worked well enough, but is a tad tedious! :-) I appreciate the troubleshooting!

--Michelle

Michelle Fink

Nov 10, 2015, 12:30:23 PM

to VisTrails SAHM

Colin,

That suggested fix wasn't quite right either. I'm not sure how to debug within VisTrails except by inserting print statements, but after some more trial and error, I got this code to run (my changes are in bold, and this is the compute() function of the ApplyModel class in init.py):

    def compute(self):
        #  if the supplied mds has rows, observations then
        #  pass r code the flag to produce metrics
        mdsfname = utils.get_relative_path(self.force_get_input('mdsFile'), self)
        workspace = utils.get_relative_path(self.force_get_input('modelWorkspace'), self)
        skip_list = ['X', 'Y', 'responseBinary', '', 'Split', 'EvalSplit', 'Weights', 'Split\n', 'EvalSplit\n', 'Weights\n']

        mdsfile = open(mdsfname, "r")
        lines = mdsfile.readlines()

        if len(lines) > 3:
            #  we have rows R will need to recreate metrics.
            self.args = 'pmt=TRUE '
        else:
            self.args = 'pmt=FALSE '

        if len(lines) == 3:
            #  we're applying this model to a new area
            #  make sure all the covariates in the original model are in the new csv
            #  if not tack on the original values.
            orig_mds = utils.get_mdsfname(workspace)
            orig_mdsfile = open(orig_mds, "r")
            orig_lines = orig_mdsfile.readlines()
            orig_covariates = [item for item in orig_lines[0][0:].split(",") if item not in skip_list]
            orig_use = [item for item in orig_lines[1][0:].split(",") if item in ['0', '1']]
            missing_covariates = []
            new_covariates = [item for item in lines[0][0:].split(",") if item not in skip_list]
            for orig_covariate in orig_covariates:
                i = orig_covariates.index(orig_covariate)
                if orig_use[i] == "1" and \
                    new_covariates.count(orig_covariate) == 0:
                    missing_covariates.append(orig_covariate)
            if len(missing_covariates) > 0:
                msg = 'One or more of the covariates used in the original model are not specified in the apply model mds file\n'
                msg += 'Specifically the following covariates were not found:\n\t'
                msg += '\n\t'.join(missing_covariates)

                #  pass the message along so the user can see which covariates are missing
                raise RuntimeError(msg)

        Model.compute(self)

Talbert, Colin

Nov 10, 2015, 3:31:44 PM

to Michelle Fink, VisTrails SAHM

Hello Michelle,

First off, congratulations on being bold enough to monkey with the code and get it to work!

I like moving the items to skip into a separate list for reuse. In my version, a strip() call inside the list comprehension takes care of the newlines and makes the code more cross-platform, and the original [3:] slice after the split removes the first three columns, so I trimmed a few items from your skip list. (FYI, the [0:] in your code specifies a slice from element zero to the end, so it is redundant and could be removed.)

I also think calculating the covariate use before the loop is cleaner and easier to understand.
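Roughly, the check described above could take the following shape. This is a sketch only, not the code in the commit linked below: the function name find_missing_covariates, the sample rows, and the assumption that the first three columns are X, Y, and the response are all illustrative.

```python
def find_missing_covariates(orig_lines, new_lines,
                            skip_list=("Split", "EvalSplit", "Weights", "")):
    """Return covariates used by the original model but absent from the new MDS."""
    # strip() drops the trailing newline (\n or \r\n) left on the last column;
    # [3:] drops the first three columns (assumed here: X, Y, response).
    orig_names = [item.strip() for item in orig_lines[0].split(",")][3:]
    orig_use = [item.strip() for item in orig_lines[1].split(",")][3:]
    new_names = set(item.strip() for item in new_lines[0].split(",")[3:])

    missing = []
    # Pair each covariate name with its 0/1 use flag by column position.
    for name, use in zip(orig_names, orig_use):
        if name in skip_list:
            continue
        if use == "1" and name not in new_names:
            missing.append(name)
    return missing


# Hypothetical sample rows, with Windows line endings on purpose.
orig_lines = ["X,Y,responseBinary,bio_1,bio_2,bio_3\r\n", "0,0,0,1,0,1\r\n"]
new_lines = ["X,Y,responseBinary,bio_1,bio_2\r\n", "0,0,0,1,1\r\n"]

missing = find_missing_covariates(orig_lines, new_lines)
```

Pairing names and flags with zip avoids the repeated index() lookup, which would return the first match if a name ever appeared twice.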

At any rate, thanks for sharing your code changes. I've added them with a few modifications to our code base (https://github.com/ColinTalbert/sahm/commit/af9e7b953b496cfce28e962eeb756d0257a71249). Yours is also the first user-contributed code added to SAHM, so you definitely get a prize.

Cheers,

Colin

