Coding ICLV model with panel data by PandasBiogeme


tran viet Yen

Mar 1, 2019, 8:04:33 AM
to Biogeme

Dear prof. Bierlaire,

Could you please give me some advice on how to code the ICLV model with panel data using PandasBiogeme? I am posting this question because the syntax of PandasBiogeme differs from that of PythonBiogeme in this respect.

In PandasBiogeme, I used an approach like this:

# panel setting
SIGMA = Beta('SIGMA',1,0,50,0)
errorcomp = SIGMA * bioDraws('errorcomp','NORMAL')

# latent variable EC
omega_EC = bioDraws('omega_EC','NORMAL')
sigma_s_EC = Beta('sigma_s_EC',1,0,50,0)

#indicator
IndEC1 = {
    1: bioNormalCdf(EC1_tau_1_EC),
    2: bioNormalCdf(EC1_tau_2_EC)-bioNormalCdf(EC1_tau_1_EC),
    3: bioNormalCdf(EC1_tau_3_EC)-bioNormalCdf(EC1_tau_2_EC),
    4: bioNormalCdf(EC1_tau_4_EC)-bioNormalCdf(EC1_tau_3_EC),
    5: 1-bioNormalCdf(EC1_tau_4_EC)
}
P_EC1 = Elem(IndEC1, EC1)
(I have verified that the measurement model itself has no errors by estimating a model containing only the socioeconomic variables, EC, and its indicators.)

# estimation
obsprob = models.nested(V,AV,nests,Choice1)
P_indicator = P_EC1 * P_EC2
condprobIndiv = PanelLikelihoodTrajectory(obsprob) * P_indicator
logprob = log(MonteCarlo(condprobIndiv ))

Then this error happened: “RuntimeError: src/bioExprLiteral.cc:111: Biogeme exception: Null pointer: row index”

If I change the code like this:

P_indicator = P_EC1 * P_EC2
obsprob = models.nested(V,AV,nests,Choice1)
obsprob1 = obsprob * P_indicator
condprobIndiv = PanelLikelihoodTrajectory(obsprob1)
logprob = log(MonteCarlo(condprobIndiv))

Then another error happened:
“RuntimeError: src/bioExprLog.cc:51: Biogeme exception: Current values of the literals:
Cannot take the log of a negative number <log(Montecarlo(PanelLikelihoodTrajectory((exp(Logit[Choice...”

In short, I am stuck on two types of error: the "row index" error and the "log of a negative number" error.

In fact, I do not understand why the program reports that “MonteCarlo(condprobIndiv)” yields a negative value, because all the elements of condprobIndiv are probabilities and must therefore be positive.
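
To make the structure of this likelihood concrete, here is a minimal hand-written sketch in plain Python, independent of Biogeme. It assumes a binary logit with one individual-level error component and a single indicator probability that depends on the draw; the names `choice_prob`, `simulated_panel_likelihood`, `indicator_prob`, and all numeric values are illustrative assumptions, not taken from the model above. In exact arithmetic the result is an average of products of probabilities, so it is strictly positive, which suggests that a non-positive value comes from a numerical or specification problem rather than from the formula itself.

```python
import math
import random


def choice_prob(chosen, x, beta, error):
    """Binary logit probability of the observed choice (1 or 0)."""
    v = beta * x + error  # utility difference between the two alternatives
    p1 = 1.0 / (1.0 + math.exp(-v))
    return p1 if chosen == 1 else 1.0 - p1


def simulated_panel_likelihood(choices, xs, beta, sigma, indicator_prob, draws):
    """Average over draws of (product over observations) * indicator probability."""
    total = 0.0
    for xi in draws:
        trajectory = 1.0
        for chosen, x in zip(choices, xs):
            # The same draw xi is shared by all observations of the individual.
            trajectory *= choice_prob(chosen, x, beta, sigma * xi)
        total += trajectory * indicator_prob(xi)
    return total / len(draws)


random.seed(0)
draws = [random.gauss(0.0, 1.0) for _ in range(500)]

# Hypothetical individual with four observations.
choices = [1, 0, 1, 1]
xs = [0.5, -1.2, 0.3, 2.0]


def indicator_prob(xi):
    """Hypothetical indicator probability: a logistic function of the draw."""
    return 1.0 / (1.0 + math.exp(-(0.4 + 0.8 * xi)))


lik = simulated_panel_likelihood(
    choices, xs, beta=1.0, sigma=1.5, indicator_prob=indicator_prob, draws=draws
)
loglik = math.log(lik)
print(lik, loglik)
```

Every term in the average lies between 0 and 1, so `lik` does as well and its logarithm is defined.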

Is the "row index" error connected with the function PanelLikelihoodTrajectory?

I also could not find any example that combines panel data and latent variables (with categorical indicators) in the same model.

Thank you very much for reading this.

Best regards,

Tran Viet Yen.

tran viet Yen

Mar 3, 2019, 1:53:12 AM
to Biogeme
Dear all,

After handling the correlation in the panel data manually, my problem was solved. However, is this a limitation of the function PanelLikelihoodTrajectory, namely that it cannot deal with draws for multiple random variables in Monte Carlo simulation?

Tran Viet Yen.
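
The message does not say how the panel correlation was handled manually. One possibility, sketched here purely as an assumption, is to reshape the data from one row per observation to one row per individual, so that the product over an individual's observations can be written out explicitly. The column names `ID`, `Choice1`, and `x`, and the helper `long_to_wide`, are hypothetical.

```python
from collections import defaultdict


def long_to_wide(rows, id_key, value_keys):
    """Group long-format rows by individual and number the repeated columns."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[id_key]].append(row)

    wide = []
    for ident, observations in grouped.items():
        record = {id_key: ident}
        for t, row in enumerate(observations, start=1):
            for key in value_keys:
                record[f"{key}_{t}"] = row[key]
        wide.append(record)
    return wide


# Hypothetical long-format data: two individuals, two observations each.
long_rows = [
    {"ID": 1, "Choice1": 2, "x": 0.5},
    {"ID": 1, "Choice1": 1, "x": 1.5},
    {"ID": 2, "Choice1": 3, "x": 0.1},
    {"ID": 2, "Choice1": 3, "x": 0.7},
]

wide_rows = long_to_wide(long_rows, "ID", ["Choice1", "x"])
print(wide_rows[0])
```

With one row per individual, the conditional likelihood can be written as the product of one choice probability per `Choice1_t` column times the indicator probabilities, and integrated once per row, so that all observations of an individual share the same draws.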

On Friday, March 1, 2019 at 22:04:33 UTC+9, tran viet Yen wrote:

tran viet Yen

Mar 13, 2019, 4:40:11 AM
to Matthew Wigginton Conway, Biogeme
Hi Matthew,

Thank you for sharing your solution to the log-of-a-negative-number issue. In addition, I think that when we use many indicators and more observations, the likelihood function becomes much smaller, which causes this error. In fact, when I added a constant to the likelihood function, in the form "logprob = log(MonteCarlo(condprobIndiv) + 0.1)", the error disappeared. I think that, theoretically, adding a constant (0.1 in this example) to the likelihood function will not alter its shape, so the modified likelihood function should yield the same estimates as the likelihood function without this constant. Do you think this is a possible solution?

Regards,

Yen.
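
Whether a constant inside the logarithm leaves the estimates unchanged can be checked on a toy problem. The sketch below uses a hypothetical Bernoulli sample (8 successes, 2 failures) rather than the ICLV model, and a simple grid search instead of an optimizer. Adding a constant to the total log-likelihood would not move its maximum, but adding it to each observation's likelihood before taking the log does, at least in this example.

```python
import math

# Hypothetical sample: 8 successes and 2 failures.
ys = [1] * 8 + [0] * 2


def loglik(p, shift=0.0):
    """Sum over observations of log(likelihood contribution + shift)."""
    return sum(math.log((p if y == 1 else 1.0 - p) + shift) for y in ys)


grid = [k / 100 for k in range(1, 100)]  # candidate values 0.01, ..., 0.99

p_plain = max(grid, key=loglik)
p_shifted = max(grid, key=lambda p: loglik(p, shift=0.1))

print(p_plain, p_shifted)
```

In this toy case the maximizer moves from 0.80 to 0.86 once 0.1 is added inside the log, so the two objectives do not share the same estimate.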

On Tue, Mar 12, 2019 at 12:48 AM Matthew Wigginton Conway <ma...@indicatrix.org> wrote:
Hi Tran,

I ran into the same "cannot take the log of a negative number" error. I was able to solve it by scaling my variables to have similar means and standard errors. I had one variable with a much larger scale than the others, and somewhere it was causing numerical issues, with a number very close to zero being rounded to a negative number.

Matt
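
The kind of numerical issue described here can be reproduced without Biogeme. In the sketch below, a product of many small probabilities underflows to exactly 0.0 in double precision, so taking its logarithm would fail, while the same quantity computed on the log scale stays finite. The probabilities and the helpers `product` and `log_mean_exp` are illustrative assumptions; whether Biogeme's error message is triggered in exactly this way is not confirmed by the thread.

```python
import math


def product(values):
    """Plain running product of a list of floats."""
    result = 1.0
    for v in values:
        result *= v
    return result


def log_mean_exp(log_values):
    """Stable log of the mean of exp(log_values) (log-sum-exp trick)."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))


# Hypothetical case: 400 probabilities per draw, 3 draws.
probs_per_draw = [[0.10] * 400, [0.12] * 400, [0.08] * 400]

# Naive approach: multiply the probabilities, then average over draws.
naive_products = [product(p) for p in probs_per_draw]
naive_average = sum(naive_products) / len(naive_products)
# naive_average is 0.0 here, so math.log(naive_average) would raise ValueError.

# Log-scale approach: sum the logs per draw, then combine with log-mean-exp.
log_products = [sum(math.log(q) for q in p) for p in probs_per_draw]
stable_loglik = log_mean_exp(log_products)

print(naive_average, stable_loglik)
```

Rescaling variables, as suggested above, attacks the same problem from the other side: it keeps the individual probabilities away from the extremes in the first place.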

--
You received this message because you are subscribed to the Google Groups "Biogeme" group.
To unsubscribe from this group and stop receiving emails from it, send an email to biogeme+u...@googlegroups.com.
To post to this group, send email to bio...@googlegroups.com.
Visit this group at https://groups.google.com/group/biogeme.
For more options, visit https://groups.google.com/d/optout.


--
Trần Việt Yên
PhD Student
Civil and Environmental Engineering Department, Graduate school of Engineering, Nagoya University, Japan.

Kuldeep Kavta

Apr 6, 2020, 2:55:09 AM
to Biogeme
Dear Tran viet Yen,

I am doing very similar modeling to yours: an ICLV model with 8 latent variables, Monte Carlo integration, and panel data. I am getting the error "src/bioExprPanelTrajectory.cc:96: Biogeme exception: Error for data entry 864:" when I apply the likelihood equation in the following way:

"
V = {1: V0, 2: V1}
av = {1: 1, 2: 1}
prob = models.logit(V, av, mode)
condlike = (
    P_att1 * P_att2 * P_att4 * P_att5 * P_att6 * P_att7
    * P_att8 * P_att9 * P_att11 * P_att12 * P_att13
    * P_PVQ5 * P_PVQ6 * P_PVQ1 * P_PVQ2 * P_PVQ3 * P_PVQ4
    * P_PVQ8 * P_PVQ9
    * prob
)
indcondlike = PanelLikelihoodTrajectory(condlike)
logprob = log(MonteCarlo(indcondlike))

"
I tried changing the initial values of variables, but it is still not working. How did you get through this?

Thanks,
Kuldeep 

Michel Bierlaire

Apr 7, 2020, 2:49:52 AM
to Kuldeep Kavta, Michel Bierlaire, Biogeme
These models are complicated to estimate. In general, you want to give a high initial value to the sigmas of these expressions. Indeed, if a sigma is too small, the probability of the response is close to one, causing numerical problems.

I would also suggest building the model gradually. Start with no latent variable and estimate the model. Then include the first LV, starting the estimation from the previous estimates, and so on.
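
The effect of a small sigma can be illustrated with a generic ordered probit measurement equation, written here in plain Python rather than Biogeme. The thresholds, the prediction, and the function `category_probs` are hypothetical; the sketch only assumes that each threshold is compared with the model prediction and divided by sigma before entering the normal CDF.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written with erfc to keep accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def category_probs(prediction, sigma, thresholds):
    """Ordered probit probabilities of the len(thresholds) + 1 response categories."""
    cdfs = [normal_cdf((tau - prediction) / sigma) for tau in thresholds]
    bounds = [0.0] + cdfs + [1.0]
    return [hi - lo for lo, hi in zip(bounds[:-1], bounds[1:])]


thresholds = [-1.0, -0.3, 0.3, 1.0]  # hypothetical thresholds, 5-point scale

small = category_probs(prediction=0.0, sigma=0.01, thresholds=thresholds)
large = category_probs(prediction=0.0, sigma=2.0, thresholds=thresholds)

print(small)
print(large)
```

With sigma = 0.01 the middle category receives a probability indistinguishable from one and the other categories are numerically zero, so any respondent who answered differently contributes an almost-zero likelihood. With sigma = 2 all five categories keep a workable probability, which is one way to read the advice to start from a high value.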



tran viet Yen

Apr 7, 2020, 5:24:15 AM
to Bierlaire Michel, Kuldeep Kavta, Biogeme
Hi Kuldeep,

As Prof. Bierlaire mentioned, I also suggest that you estimate your model step by step, as in a hierarchy, so that you can work up to the most complex model. This is a very useful practice. You may also want to consider whether using so many latent variables in one model is theoretically reasonable.

Regards,

Yen.


Kuldeep Kavta

Apr 8, 2020, 12:32:26 PM
to Biogeme
Thank you, Professor Bierlaire and Tran Viet Yen,

I was able to run the model by arranging the data as a panel, but without adding "PanelLikelihoodTrajectory(condlike)" to the likelihood equation. I am now trying to run it again with that function included.
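
Leaving out the trajectory product can change what is being estimated, not only how it is computed. As a generic illustration (plain Python, hypothetical probabilities, and no claim about Biogeme internals), the sketch below compares integrating each observation separately with integrating the product over one individual's observations.

```python
import math

# Hypothetical probabilities of the observed choices for one individual:
# probs[r][t] is the probability of observation t under draw r.
probs = [
    [0.9, 0.8, 0.9],  # draw 1
    [0.2, 0.3, 0.1],  # draw 2
]
n_draws = len(probs)
n_obs = len(probs[0])

# Observations treated as independent: integrate each one on its own.
loglik_independent = sum(
    math.log(sum(probs[r][t] for r in range(n_draws)) / n_draws)
    for t in range(n_obs)
)

# Panel treatment: the same draw applies to the whole trajectory.
trajectories = []
for row in probs:
    trajectory = 1.0
    for p in row:
        trajectory *= p
    trajectories.append(trajectory)
loglik_panel = math.log(sum(trajectories) / n_draws)

print(loglik_independent, loglik_panel)
```

The two values differ (about -1.89 versus -1.12 here), so a model that runs without the trajectory term is not necessarily the same model as the one with it.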

Regarding the theoretical justification for a large number of latent variables, the idea is to empirically test the "Value, Attribute and behavior" theory, which is used in psychological research. This gives the model a hierarchical structure that requires more LVs. But the point is well taken that the theoretical basis should be strong if the number of LVs is large. For more on this, you can refer to the following work where this has been attempted before.

Thank you for the suggestions and help.



Michel Bierlaire

Apr 9, 2020, 4:02:03 AM
to Kuldeep Kavta, Michel Bierlaire, Biogeme
Another avenue would be to use Bayesian estimation. 
See the work by Ricardo Daziano.
