Fail to load the mod file when running an RRM Model

aymericp...@u.northwestern.edu

Jul 17, 2017, 3:37:53 PM

to Biogeme

Hello Everyone !

I'm trying to run an RRM model with PythonBiogeme.

I based my code on this mixture model example, available on the official website: http://biogeme.epfl.ch/examples/swissmetro/python/07discreteMixture.py

The only changes I made were:

- Rename the ASC

- Remove the lines:

"# Utility functions

#If the person has a GA (season ticket) her incremental cost is actually 0

#rather than the cost value gathered from the

# network data.

SM_COST = SM_CO * ( GA == 0 )

TRAIN_COST = TRAIN_CO * ( GA == 0 )

# For numerical reasons, it is good practice to scale the data so

# that the values of the parameters are around 1.0.

# A previous estimation with the unscaled data has generated

# parameters around -0.01 for both cost and time. Therefore, time and

# cost are multiplied by 0.01.

TRAIN_TT_SCALED = DefineVariable('TRAIN_TT_SCALED', TRAIN_TT / 100.0)

TRAIN_COST_SCALED = DefineVariable('TRAIN_COST_SCALED', TRAIN_COST / 100)

SM_TT_SCALED = DefineVariable('SM_TT_SCALED', SM_TT / 100.0)

SM_COST_SCALED = DefineVariable('SM_COST_SCALED', SM_COST / 100)

CAR_TT_SCALED = DefineVariable('CAR_TT_SCALED', CAR_TT / 100)

CAR_CO_SCALED = DefineVariable('CAR_CO_SCALED', CAR_CO / 100)

"

- Replace the variable names in the utility functions with the ones from my data set (to start, I just ran a model with two variables: cost and time)

- Replace the lines of code

"# Associate the availability conditions with the alternatives

CAR_AV_SP = DefineVariable('CAR_AV_SP',CAR_AV * ( SP != 0 ))

TRAIN_AV_SP = DefineVariable('TRAIN_AV_SP',TRAIN_AV * ( SP != 0 ))

av = {1: TRAIN_AV_SP,

2: SM_AV,

3: CAR_AV_SP}

"

with:

"

one = 1

av = {1: one ,

2: one ,

3: one }

"

- Replace the variable CHOICE with CHOICE_CS (to be consistent with the names in my data set)

- Remove the exclusion condition

Even with these few changes, my model didn't run. I got this error message:

Warning: [16:03:43]bioMain.cc:108 pythonbiogeme rrm3.py df.dat

[16:03:43]bioParameters.cc:399 Parameter documentation generated: pythonparam.html

[16:03:43]bioModelParser.cc:109 rrm3.py exists

Warning: [16:03:43]bioModelParser.cc:142 Error: Failed to load rrm3

Warning: [16:03:43]bioMain.cc:169 Error: Failed to load rrm3

I have no idea where the problem might come from.

I ran the original file (the website example with the Swissmetro data set) without any problem, so the PythonBiogeme installation itself is fine.

Any help would be much appreciated :)

If I haven't provided enough detail about my code, please let me know.

Thanks in advance!

Best,

Aymeric

michel.b...@epfl.ch

Jul 21, 2017, 10:16:00 AM

to Biogeme

How exactly did you remove the exclusion condition? If the statement is still there but not defined, that may be the reason.

Unfortunately, I have no control over the core of Python, which is why the error messages are not always easy to understand.
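A "Failed to load" message gives no hint about which line of the model file is at fault. One way to narrow it down, independent of Biogeme, is to ask the standard Python parser whether the file is syntactically valid. The sketch below is only an illustration: `find_syntax_error` and `check_model_file` are hypothetical helper names, not part of PythonBiogeme, and the check catches syntax errors only (a missing import or an undefined name would still pass).

```python
import ast


def find_syntax_error(source, filename="<model>"):
    """Return None if the source parses as Python, else a readable message."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as err:
        # lineno and msg come from the standard SyntaxError attributes.
        return f"{err.filename}:{err.lineno}: {err.msg}"
    return None


def check_model_file(path):
    """Read a model file from disk and run the syntax check on it."""
    with open(path, encoding="utf-8") as handle:
        return find_syntax_error(handle.read(), filename=path)
```

For example, `check_model_file("rrm3.py")` returns `None` when the file parses, or a string with the file name and line number otherwise.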

Michel

calvin...@gmail.com

Jun 1, 2018, 7:46:46 AM

to Biogeme

Dear Prof. Michel and Aymeric,

I came across the same problem when loading my modification of the latent-variable part of B2.

I removed the exclusion condition by deleting the entire statement.

My code is as follows:

###IMPORT NECESSARY MODULES TO RUN BIOGEME

from biogeme import *

__rowId__=Variable('__rowId__')

pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5=Variable('pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5')

from headers import *

from loglikelihood import *

from statistics import *

### Variables

male = DefineVariable('male',gender == 1)

### Coefficients

coef_intercept = Beta('coef_intercept',0.0,-1000,1000,0 )

coef_male = Beta('coef_male',0.0,-1000,1000,0 )

coef_hand_carry = Beta('coef_hand_carry',0.0,-1000,1000,0 )

coef_tsainexp = Beta('coef_tsainexp',0.0,-1000,1000,0 )

coef_tsaoutexp = Beta('coef_tsaoutexp',0.0,-1000,1000,0 )

coef_colleague = Beta('coef_colleague',0.0,-1000,1000,0 )

coef_income = Beta('coef_income',0.0,-1000,1000,0 )

coef_education = Beta('coef_education',0.0,-1000,1000,0 )

coef_age = Beta('coef_age',0.0,-1000,1000,0 )

### Latent variable: structural equation

PBC = \

coef_intercept +\

coef_male * male +\

coef_hand_carry * hand_carry +\

coef_tsainexp * tsainexp +\

coef_tsaoutexp * tsaoutexp +\

coef_colleague * colleague +\

coef_income * income +\

coef_education * education +\

coef_age * age

### Measurement equations

INTER_pbc2 = Beta('INTER_pbc2',0.0,-10000,10000,0 )

INTER_pbc5 = Beta('INTER_pbc5',0.0,-10000,10000,0 )

B_pbc2_F1 = Beta('B_Envir02_F1',-0.431461,-10000,10000,0 )

B_pbc5_F1 = Beta('B_Envir05_F1',0.565903,-10000,10000,0 )

MODEL_pbc2 = INTER_pbc2 + B_pbc2_F1 * PBC

MODEL_pbc5 = INTER_pbc5 + B_pbc5_F1 * PBC

SIGMA_STAR_pbc2 = Beta('SIGMA_STAR_pbc2',1,-10000,10000,0 )

SIGMA_STAR_pbc5 = Beta('SIGMA_STAR_pbc5',10.0,-10000,10000,0 )

delta_1 = Beta('delta_1',1,0,10,0 )

delta_2 = Beta('delta_2',3.0,0,10,0 )

tau_1 = -delta_1 - delta_2

tau_2 = -delta_1

tau_3 = delta_1

tau_4 = delta_1 + delta_2

pbc2_tau_1 = (tau_1-MODEL_pbc2) / SIGMA_STAR_pbc2

pbc2_tau_2 = (tau_2-MODEL_pbc2) / SIGMA_STAR_pbc2

pbc2_tau_3 = (tau_3-MODEL_pbc2) / SIGMA_STAR_pbc2

pbc2_tau_4 = (tau_4-MODEL_pbc2) / SIGMA_STAR_pbc2

Indpbc2 = {

1: bioNormalCdf(pbc2_tau_1),

2: bioNormalCdf(pbc2_tau_2)-bioNormalCdf(pbc2_tau_1),

3: bioNormalCdf(pbc2_tau_3)-bioNormalCdf(pbc2_tau_2),

4: bioNormalCdf(pbc2_tau_4)-bioNormalCdf(pbc2_tau_3),

5: 1-bioNormalCdf(pbc2_tau_4),

6: 1.0,

-1: 1.0,

-2: 1.0

}

P_pbc2 = Elem(Indpbc2, pbc2)

pbc5_tau_1 = (tau_1-MODEL_pbc5) / SIGMA_STAR_pbc5

pbc5_tau_2 = (tau_2-MODEL_pbc5) / SIGMA_STAR_pbc5

pbc5_tau_3 = (tau_3-MODEL_pbc5) / SIGMA_STAR_pbc5

pbc5_tau_4 = (tau_4-MODEL_pbc5) / SIGMA_STAR_pbc5

Indpbc5 = {

1: bioNormalCdf(pbc5_tau_1),

2: bioNormalCdf(pbc5_tau_2)-bioNormalCdf(pbc5_tau_1),

3: bioNormalCdf(pbc5_tau_3)-bioNormalCdf(pbc5_tau_2),

4: bioNormalCdf(pbc5_tau_4)-bioNormalCdf(pbc5_tau_3),

5: 1-bioNormalCdf(pbc5_tau_4),

6: 1.0,

-1: 1.0,

-2: 1.0

}

P_pbc5 = Elem(Indpbc5, pbc5)

loglike = log(P_pbc2) + \

log(P_pbc5)

# Defines an iterator on the data

rowIterator('obsIter')

BIOGEME_OBJECT.ESTIMATE = Sum(loglike,'obsIter')

Is there something I am missing?

Thanks in advance!

Sincerely

Calvin

Bierlaire Michel

Jun 1, 2018, 7:50:20 AM

to calvin...@gmail.com, Bierlaire Michel, Biogeme

I think the reason is that the first row of your data file contains commas. As a result, Biogeme considers the string "pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5" to be the name of a single variable.

You should use spaces or tabs as separators.
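As an illustration of the separator change, here is a minimal sketch that rewrites comma-separated text as tab-separated text using Python's standard `csv` module. The function name `commas_to_tabs` is hypothetical, and the data rows in the sample are made up; only the column names come from the thread.

```python
import csv
import io


def commas_to_tabs(csv_text):
    """Convert comma-separated text to tab-separated text, row by row."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Strip stray spaces around each field before writing it back.
        writer.writerow([field.strip() for field in row])
    return out.getvalue()


# Made-up sample: a header line followed by two observations.
sample = "pastexp,tsainexp,gender\n1,0,1\n0,1,2\n"
print(commas_to_tabs(sample))
```

To convert a real file, read it into a string, pass it through the function, and write the result to a new file, so that the original data stays untouched.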

I suggest that you make sure to estimate a simple logit model first, before working with complex models.

On 1 Jun 2018, at 12:28, calvin...@gmail.com wrote:

pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5=Variable('pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5')

calvin...@gmail.com

Jun 6, 2018, 3:17:58 AM

to Biogeme

Dear Prof. Bierlaire,

Thank you very much for your reply.
I will definitely try simpler models first.

Meanwhile, I changed my code to:

###IMPORT NECESSARY MODULES TO RUN BIOGEME

from biogeme import *

pastexp = Variable('pastexp')

tsainexp = Variable('tsainexp')

tsaoutexp = Variable('tsaoutexp')

hand_carry = Variable('hand_carry')

colleague = Variable('colleague')

gender = Variable('gender')

income = Variable('income')

age = Variable('age')

education = Variable('education')

pbc1 = Variable('pbc1')

pbc2 = Variable('pbc2')

pbc5 = Variable('pbc5')

from headers import *

from loglikelihood import *

from statistics import *

#prepare csv file

biopreparedata logit.csv

However, the same error message came up again.

On Tuesday, July 18, 2017 at 3:37:53 AM UTC+8, aymericp...@u.northwestern.edu wrote:

Bierlaire Michel

Jun 7, 2018, 5:27:11 AM

to tsai calvin, Bierlaire Michel, Biogeme

Hi Calvin,

You simply forgot to import the package defining the distributions, in order to use normalcdf:

from distributions import *
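The measurement equations in the model posted earlier in this thread form an ordered probit: each response category gets the probability mass between two thresholds. As a sanity check that does not depend on Biogeme at all, the same probabilities can be computed in plain Python. This is only a sketch: `normal_cdf` is written here with `math.erf` and is not a Biogeme function, `ordered_probit_probs` is a hypothetical helper, and the input `model_value = 0.0` is arbitrary.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via math.erf (plain Python, not Biogeme)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(model_value, sigma, delta_1, delta_2):
    """Probabilities of responses 1..5 for one indicator."""
    # Symmetric thresholds, built as in the posted model.
    taus = [-delta_1 - delta_2, -delta_1, delta_1, delta_1 + delta_2]
    cdfs = [normal_cdf((tau - model_value) / sigma) for tau in taus]
    return {
        1: cdfs[0],
        2: cdfs[1] - cdfs[0],
        3: cdfs[2] - cdfs[1],
        4: cdfs[3] - cdfs[2],
        5: 1.0 - cdfs[3],
    }


# delta_1 = 1, delta_2 = 3 and sigma = 1 are starting values from the thread;
# model_value = 0.0 is an arbitrary illustrative input.
probs = ordered_probit_probs(0.0, 1.0, 1.0, 3.0)
print(probs)
print(sum(probs.values()))
```

The five probabilities should be non-negative and sum to one; if they do not for some parameter values (for instance a negative scale), the log-likelihood will be undefined for those values.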

Michel

On 7 Jun 2018, at 08:43, tsai calvin <calvin...@gmail.com> wrote:

Dear Prof. Bierlaire,

Thank you so much for your help.

The model file and CSV are attached.

Best Wishes

Sincerely

Calvin

2018-06-06 15:53 GMT+08:00 Bierlaire Michel <michel.b...@epfl.ch>:

Send me the model file and the data file (zipped) and I’ll have a look.

Michel


<model.py><biogeme_logit.csv>

Bierlaire Michel

Jun 8, 2018, 12:42:30 PM

to tsai calvin, Bierlaire Michel, Biogeme

On 8 Jun 2018, at 11:03, tsai calvin <calvin...@gmail.com> wrote:

Dear Prof. Bierlaire,

That was indeed the bug.

The code worked very well after importing distributions.

However, my problem now is that after including another latent variable (two latent variables in total),

Biogeme reports that the trust region radius is too small. Is there anything I can do to adjust this?

It is not necessarily a problem if the norm of the gradient is small enough. These models are notoriously difficult to estimate. Moreover, numerical integration may generate numerical issues.

I had thought about Monte Carlo.

The problem with that is that my latent variables are alternative-specific rather than generic.

Can I still use Monte Carlo in that case?

Sure. You’ll need to investigate whether you have enough draws. Make sure to use the results of the estimation with numerical integration as starting points for the Monte Carlo estimation.

If you have more latent variables, you may want to consider Bayesian estimation instead of maximum likelihood. I would recommend using Stan for that: http://mc-stan.org/
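To illustrate what "enough draws" means, here is a small sketch in plain Python, unrelated to Biogeme's own simulation code. It approximates the expectation of Phi(a + b * omega) for a standard normal omega by Monte Carlo and compares it with the closed form Phi(a / sqrt(1 + b^2)), which holds for this particular integrand. The values of `a` and `b`, the seed, and the helper names are all arbitrary choices made for the example.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def monte_carlo_estimate(a, b, n_draws, seed):
    """Approximate E[Phi(a + b * omega)], omega ~ N(0, 1), by simulation."""
    rng = random.Random(seed)
    total = sum(normal_cdf(a + b * rng.gauss(0.0, 1.0)) for _ in range(n_draws))
    return total / n_draws


a, b = 0.5, 1.2  # arbitrary illustrative values
exact = normal_cdf(a / math.sqrt(1.0 + b * b))  # closed form for this integrand
for n_draws in (100, 1000, 10000, 100000):
    estimate = monte_carlo_estimate(a, b, n_draws, seed=2018)
    # Columns: number of draws, estimate, absolute error against the closed form.
    print(n_draws, round(estimate, 4), round(abs(estimate - exact), 4))
```

In a real model there is no closed form to compare against, so the usual check is to re-estimate with more draws (or different seeds) and see whether the results stay stable.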