Fail to load the mod file when running an RRM Model


aymericp...@u.northwestern.edu

Jul 17, 2017, 3:37:53 PM
to Biogeme
Hello Everyone !

I'm trying to run an RRM model with PythonBiogeme.

I based my code on this discrete mixture model example, available on the official website: http://biogeme.epfl.ch/examples/swissmetro/python/07discreteMixture.py

The only changes I made were:
- Rename the ASC
- Remove the lines:
"# Utility functions

#If the person has a GA (season ticket) her incremental cost is actually 0 
#rather than the cost value gathered from the
# network data. 
SM_COST =  SM_CO   * (  GA   ==  0  ) 
TRAIN_COST =  TRAIN_CO   * (  GA   ==  0  )

# For numerical reasons, it is good practice to scale the data so
# that the values of the parameters are around 1.0. 
# A previous estimation with the unscaled data has generated
# parameters around -0.01 for both cost and time. Therefore, time and
# cost are multiplied by 0.01.
TRAIN_TT_SCALED = DefineVariable('TRAIN_TT_SCALED', TRAIN_TT / 100.0)
TRAIN_COST_SCALED = DefineVariable('TRAIN_COST_SCALED', TRAIN_COST / 100)
SM_TT_SCALED = DefineVariable('SM_TT_SCALED', SM_TT / 100.0)
SM_COST_SCALED = DefineVariable('SM_COST_SCALED', SM_COST / 100)
CAR_TT_SCALED = DefineVariable('CAR_TT_SCALED', CAR_TT / 100)
CAR_CO_SCALED = DefineVariable('CAR_CO_SCALED', CAR_CO / 100)
"
- Replace the variable names in the utility functions with those from my data set (to start, I ran a model with only two variables: cost and time)
- Replace the lines of code
"# Associate the availability conditions with the alternatives

CAR_AV_SP =  DefineVariable('CAR_AV_SP',CAR_AV  * (  SP   !=  0  ))
TRAIN_AV_SP =  DefineVariable('TRAIN_AV_SP',TRAIN_AV  * (  SP   !=  0  ))

av = {1: TRAIN_AV_SP,
      2: SM_AV,
      3: CAR_AV_SP}
"
with:
"
one = 1

av = {1: one ,
      2: one ,
      3: one }
"
- Replace the variable CHOICE with CHOICE_CS (to match the names in my data set)
- Remove the inclusion condition

Even with these few changes, my model didn't run. I got this error message:
Warning: [16:03:43]bioMain.cc:108  pythonbiogeme rrm3.py df.dat
[16:03:43]bioParameters.cc:399  Parameter documentation generated: pythonparam.html
[16:03:43]bioModelParser.cc:109  rrm3.py exists
Warning: [16:03:43]bioModelParser.cc:142  Error: Failed to load rrm3
Warning: [16:03:43]bioMain.cc:169  Error: Failed to load rrm3

I have no idea where the problem might come from.
I ran the original file (the example from the website, with the Swissmetro data set) without any problem, so the PythonBiogeme installation itself seems to be fine.

Any help would be much appreciated :)
If I didn't provide enough details about my code, please let me know.

Thanks in advance!

Best,

Aymeric

michel.b...@epfl.ch

Jul 21, 2017, 10:16:00 AM
to Biogeme
How exactly did you remove the exclusion condition? If the statement is still there but the condition it uses is no longer defined, that may be the reason. 
Unfortunately, I have no control over the core of Python, which is why the error messages are not always easy to understand. 
Michel
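
Editor's note: the two failure modes that typically hide behind a bare "Failed to load" message can be reproduced outside Biogeme. The following is a minimal sketch, assuming the model file is loaded as ordinary Python: a leftover statement that refers to a deleted name raises a NameError, and a dangling operator raises a SyntaxError. The helper `load_error`, the stub class and the sample strings are hypothetical illustrations, not part of PythonBiogeme; `BIOGEME_OBJECT.EXCLUDE` is written the way I recall it from the Swissmetro example files.

```python
def load_error(source, filename="model.py"):
    """Compile and run model-like source; return an error description or None."""
    try:
        code = compile(source, filename, "exec")  # syntax errors surface here
        exec(code, {})                            # undefined names surface here
    except SyntaxError as err:
        return f"SyntaxError at line {err.lineno}: {err.msg}"
    except Exception as err:
        return f"{type(err).__name__}: {err}"
    return None


# Hypothetical stand-in for the object that PythonBiogeme normally provides.
STUB = "class _Stub:\n    pass\nBIOGEME_OBJECT = _Stub()\n"

# Case 1: the assignment is still there, but the definition of `exclude` was deleted.
leftover = STUB + "BIOGEME_OBJECT.EXCLUDE = exclude\n"

# Case 2: an expression ends with a dangling operator.
dangling = "x = 1 + \\\n    2 +\n"

# Case 3: both the definition and the assignment are present.
clean = STUB + "exclude = 0\nBIOGEME_OBJECT.EXCLUDE = exclude\n"

for label, src in [("leftover", leftover), ("dangling", dangling), ("clean", clean)]:
    print(label, "->", load_error(src))
```

For a real model file, only the syntax half of this check is likely to work standalone (for example `python -m py_compile rrm3.py`), because the Biogeme imports may not be available to a plain Python interpreter.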

calvin...@gmail.com

Jun 1, 2018, 7:46:46 AM
to Biogeme
Dear Prof. Michel and Aymeric,

I came across the same problem while loading my modification of B2 of the latent part.
I removed the exclusion condition by deleting the entire statement.

My code is as follows:

###IMPORT NECESSARY MODULES TO RUN BIOGEME
from biogeme import *
__rowId__=Variable('__rowId__')
pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5=Variable('pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5')
from headers import *
from loglikelihood import *
from statistics import *

### Variables

male = DefineVariable('male',gender == 1)

### Coefficients
coef_intercept = Beta('coef_intercept',0.0,-1000,1000,0 )
coef_male = Beta('coef_male',0.0,-1000,1000,0 )
coef_hand_carry = Beta('coef_hand_carry',0.0,-1000,1000,0 )
coef_tsainexp = Beta('coef_tsainexp',0.0,-1000,1000,0 )
coef_tsaoutexp = Beta('coef_tsaoutexp',0.0,-1000,1000,0 )
coef_colleague = Beta('coef_colleague',0.0,-1000,1000,0 )
coef_income = Beta('coef_income',0.0,-1000,1000,0 )
coef_education = Beta('coef_education',0.0,-1000,1000,0 )
coef_age = Beta('coef_age',0.0,-1000,1000,0 )


### Latent variable: structural equation
PBC = \
coef_intercept +\
coef_male * male +\
coef_hand_carry * hand_carry +\
coef_tsainexp * tsainexp +\
coef_tsaoutexp * tsaoutexp +\
coef_colleague * colleague +\
coef_income * income +\
coef_education * education +\
coef_age * age


### Measurement equations

INTER_pbc2 = Beta('INTER_pbc2',0.0,-10000,10000,0 )
INTER_pbc5 = Beta('INTER_pbc5',0.0,-10000,10000,0 )


B_pbc2_F1 = Beta('B_Envir02_F1',-0.431461,-10000,10000,0 )
B_pbc5_F1 = Beta('B_Envir05_F1',0.565903,-10000,10000,0 )



MODEL_pbc2 = INTER_pbc2 + B_pbc2_F1 * PBC
MODEL_pbc5 = INTER_pbc5 + B_pbc5_F1 * PBC


SIGMA_STAR_pbc2 = Beta('SIGMA_STAR_pbc2',1,-10000,10000,0 )
SIGMA_STAR_pbc5 = Beta('SIGMA_STAR_pbc5',10.0,-10000,10000,0 )

delta_1 = Beta('delta_1',1,0,10,0 )
delta_2 = Beta('delta_2',3.0,0,10,0 )
tau_1 = -delta_1 - delta_2
tau_2 = -delta_1 
tau_3 = delta_1
tau_4 = delta_1 + delta_2


pbc2_tau_1 = (tau_1-MODEL_pbc2) / SIGMA_STAR_pbc2
pbc2_tau_2 = (tau_2-MODEL_pbc2) / SIGMA_STAR_pbc2
pbc2_tau_3 = (tau_3-MODEL_pbc2) / SIGMA_STAR_pbc2
pbc2_tau_4 = (tau_4-MODEL_pbc2) / SIGMA_STAR_pbc2
Indpbc2 = {
    1: bioNormalCdf(pbc2_tau_1),
    2: bioNormalCdf(pbc2_tau_2)-bioNormalCdf(pbc2_tau_1),
    3: bioNormalCdf(pbc2_tau_3)-bioNormalCdf(pbc2_tau_2),
    4: bioNormalCdf(pbc2_tau_4)-bioNormalCdf(pbc2_tau_3),
    5: 1-bioNormalCdf(pbc2_tau_4),
    6: 1.0,
    -1: 1.0,
    -2: 1.0
}

P_pbc2 = Elem(Indpbc2, pbc2)

pbc5_tau_1 = (tau_1-MODEL_pbc5) / SIGMA_STAR_pbc5
pbc5_tau_2 = (tau_2-MODEL_pbc5) / SIGMA_STAR_pbc5
pbc5_tau_3 = (tau_3-MODEL_pbc5) / SIGMA_STAR_pbc5
pbc5_tau_4 = (tau_4-MODEL_pbc5) / SIGMA_STAR_pbc5
Indpbc5 = {
    1: bioNormalCdf(pbc5_tau_1),
    2: bioNormalCdf(pbc5_tau_2)-bioNormalCdf(pbc5_tau_1),
    3: bioNormalCdf(pbc5_tau_3)-bioNormalCdf(pbc5_tau_2),
    4: bioNormalCdf(pbc5_tau_4)-bioNormalCdf(pbc5_tau_3),
    5: 1-bioNormalCdf(pbc5_tau_4),
    6: 1.0,
    -1: 1.0,
    -2: 1.0
}

P_pbc5 = Elem(Indpbc5, pbc5)


loglike = log(P_pbc2) + \
          log(P_pbc5)
         



# Defines an iterator on the data
rowIterator('obsIter') 

BIOGEME_OBJECT.ESTIMATE = Sum(loglike,'obsIter')

Is there something I am missing?

Thanks in advance!

Sincerely

Calvin
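
Editor's note: the measurement equations in the code above follow an ordered probit pattern, with four symmetric thresholds built from `delta_1` and `delta_2` and five response categories. Here is a plain-Python sketch of what those dictionaries compute, assuming `bioNormalCdf` is the standard normal CDF (an assumption based on its name). The function names and the value 0.3 for the latent model are hypothetical; the other numbers are the starting values from the post.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(model_value, sigma, delta_1, delta_2):
    """Probabilities of the five ordered categories, as in Indpbc2 / Indpbc5."""
    # Symmetric thresholds tau_1 .. tau_4, as defined in the post.
    taus = [-delta_1 - delta_2, -delta_1, delta_1, delta_1 + delta_2]
    cdfs = [normal_cdf((tau - model_value) / sigma) for tau in taus]
    # Category k gets the probability mass between consecutive thresholds.
    bounds = [0.0] + cdfs + [1.0]
    return {k + 1: bounds[k + 1] - bounds[k] for k in range(5)}


# Hypothetical latent model value 0.3, with sigma=1, delta_1=1, delta_2=3.
probs = ordered_probit_probs(0.3, 1.0, 1.0, 3.0)
for category, p in probs.items():
    print(category, round(p, 4))
print("sum:", sum(probs.values()))
```

The five probabilities should sum to one up to rounding, which makes this a quick sanity check on the threshold definitions before estimating anything.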

Bierlaire Michel

Jun 1, 2018, 7:50:20 AM
to calvin...@gmail.com, Bierlaire Michel, Biogeme
I think the reason is that the first row of your data file contains commas. As a result, Biogeme considers the string "pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5" to be the name of a single variable. 
You should use spaces or tabs as separators. 

I suggest that you make sure to estimate a simple logit model first, before working with complex models. 
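
Editor's note: the separator change suggested above can be done with Python's standard `csv` module. This is a minimal sketch that works on in-memory text; the function name `commas_to_tabs` and the sample rows are made up for illustration, with column names borrowed from the post.

```python
import csv
import io


def commas_to_tabs(text):
    """Rewrite comma-separated text as tab-separated text."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Made-up sample with three of the column names from the post.
sample = "pastexp,tsainexp,gender\n1,0,1\n0,1,2\n"
converted = commas_to_tabs(sample)
print(converted)

# After conversion, each header field is a separate name.
header = converted.splitlines()[0].split("\t")
print(header)
```

For a real file, the same reader and writer can be attached to files opened with `newline=""`. Any header file generated from the old comma-separated data would presumably need to be regenerated afterwards.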



On 1 Jun 2018, at 12:28, calvin...@gmail.com wrote:

pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5=Variable('pastexp,tsainexp,tsaoutexp,hand_carry,colleague,gender,income,age,education,pbc1,pbc2,pbc5')


calvin...@gmail.com

Jun 6, 2018, 3:17:58 AM
to Biogeme
Dear Prof. Bierlaire:

Thank you very much for your reply.
I will definitely try simpler models first.

Meanwhile, I changed my code to:

###IMPORT NECESSARY MODULES TO RUN BIOGEME
from biogeme import *
pastexp = Variable('pastexp')
tsainexp = Variable('tsainexp')
tsaoutexp = Variable('tsaoutexp')
hand_carry = Variable('hand_carry')
colleague = Variable('colleague')
gender = Variable('gender')
income = Variable('income')
age = Variable('age')
education = Variable('education')
pbc1 = Variable('pbc1')
pbc2 = Variable('pbc2')
pbc5 = Variable('pbc5')

from headers import *
from loglikelihood import *
from statistics import *

#prepare csv file
biopreparedata logit.csv
However, the same error message came up again.

On Tuesday, July 18, 2017 at 3:37:53 AM UTC+8, aymericp...@u.northwestern.edu wrote:

Bierlaire Michel

Jun 7, 2018, 5:27:11 AM
to tsai calvin, Bierlaire Michel, Biogeme
Hi Calvin,

You simply forgot to import the package defining the distributions, in order to use normalcdf:
from distributions import *

Michel


On 7 Jun 2018, at 08:43, tsai calvin <calvin...@gmail.com> wrote:

Dear Prof. Bierlaire:

Thank you so much for your help.
The model file and CSV are attached.

Best Wishes
Sincerely
Calvin 


2018-06-06 15:53 GMT+08:00 Bierlaire Michel <michel.b...@epfl.ch>:
Send me the model file and the data file (zipped) and I’ll have a look. 
Michel






<model.py><biogeme_logit.csv>

Bierlaire Michel

Jun 8, 2018, 12:42:30 PM
to tsai calvin, Bierlaire Michel, Biogeme
On 8 Jun 2018, at 11:03, tsai calvin <calvin...@gmail.com> wrote:

Dear Prof. Bierlaire:

That was indeed the bug.
The code worked very well after importing distributions.

However, my problem now is that, after including another latent variable (two latent variables in total),
Biogeme reports that the trust region radius is too small. Is there anything I can do to adjust this?

It is not necessarily a problem if the norm of the gradient is small enough. These models are notoriously difficult to estimate. Moreover, numerical integration may generate numerical issues. 

I had thought about Monte Carlo.
The problem with that is that my latent variables are alternative-specific, not generic.
Can I still use Monte Carlo in that case?

Sure. You’ll need to investigate if you have enough draws. Make sure to use the results of the estimation with numerical integration as starting points for the Monte Carlo estimation. 

If you have more latent variables, you may want to consider Bayesian estimation instead of maximum likelihood. I would recommend using Stan for that: http://mc-stan.org/

Michel
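
Editor's note: the advice to check whether there are enough draws can be illustrated on a toy integral with a known answer. The sketch below, which is not Biogeme code, approximates E[Φ(a + bω)] for ω ~ N(0, 1) by simulation and compares it with the closed form Φ(a / √(1 + b²)). The values of `a` and `b`, the seed and the function names are arbitrary assumptions made for illustration.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exact_value(a, b):
    """Closed form of E[Phi(a + b*omega)] for omega ~ N(0, 1)."""
    return normal_cdf(a / math.sqrt(1.0 + b * b))


def monte_carlo_value(a, b, n_draws, seed=0):
    """Approximate the same expectation by averaging over random draws."""
    rng = random.Random(seed)
    total = sum(normal_cdf(a + b * rng.gauss(0.0, 1.0)) for _ in range(n_draws))
    return total / n_draws


a, b = 0.5, 1.2  # arbitrary illustrative values
for n_draws in (100, 1000, 10000, 100000):
    error = abs(monte_carlo_value(a, b, n_draws) - exact_value(a, b))
    print(n_draws, error)
```

In a real model there is no closed form to compare against, so the usual check is to re-estimate with more draws and see whether the estimates and the log likelihood stop changing.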



Sorry, I know that is quite a lot of questions.

Best Wishes
Sincerely
Calvin