Re: latent class model for panel data


Michel Bierlaire

May 22, 2024, 11:29:02 AM
to marie...@gmail.com, Michel Bierlaire, Biogeme


> On 22 May 2024, at 14:40, marie...@gmail.com wrote:
>
> Hi there, I have tried to post to the google group but it didn't work
>
> I am making a latent class model for panel data. I have estimated the mixed logit model. I am trying to follow the description here:
> https://biogeme.epfl.ch/sphinx/auto_examples/swissmetro/plot_b16panel_discrete_socio_eco.html#
>
> I have several things that I do not understand.
> • What are ASC_CAR_S_class0 and ASC_CAR_S_class1?

The alternative specific constants of the models in each class.

> And what are CLASS_CTE and CLASS_INC? Do you need one of these for every SE factor that we include?

The parameter of the class membership model. CLASS_CTE is the intercept, and CLASS_INC the coefficient of income.
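
To make the role of these two parameters concrete, here is a minimal sketch in plain Python (not Biogeme code) of a two-class membership model. The parameter values and the income value are made up for illustration, and the sketch assumes that class 1 is the reference class with utility 0.

```python
import math


def class_membership(class_cte, class_inc, income):
    """Return (P(class 0), P(class 1)) for a binary logit membership model.

    Class 1 is taken as the reference class with utility 0
    (an assumption made for this illustration).
    """
    w = class_cte + class_inc * income  # utility of belonging to class 0
    p0 = math.exp(w) / (math.exp(w) + 1.0)
    return p0, 1.0 - p0


# Hypothetical values, for illustration only
p0, p1 = class_membership(class_cte=0.5, class_inc=-0.2, income=2.0)
print(f"P(class 0) = {p0:.3f}, P(class 1) = {p1:.3f}")
```

Each additional socio-economic variable entering the membership function would add one coefficient of the same kind as CLASS_INC.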

> • Does something like this count as two or three latent classes:
> • W = CLASS_CTE + CLASS_INC * INCOME + CLASS_Gender*GENDER
> • I want to calculate the class memberships.

But you simulate the logprob...

I suggest that you first try a simple latent class model, not mixed, to figure out how it works. Once it gives sensible results, you can start distributing some parameters.

> I have tried to follow this:
> https://biogeme.epfl.ch/sphinx/auto_examples/swissmetro/plot_b13panel_simul.html
>
>
> simulated_loglike = logprob.getValue_c(
> database=flat_database,
> betas=results.getBetaValues(),
> numberOfDraws=250,
> aggregation=True,
> prepareIds=True,
> )
>
> numerator = MonteCarlo(ACS2_param[0] * probIndiv)
> denominator = MonteCarlo(probIndiv)
>
> simulate = {
> 'Numerator': numerator,
> 'Denominator': denominator,
> }
>
>
> biosim = bio.BIOGEME(flat_database, simulate)
> biosim.modelName = 'panel_flat_latent_class_individual_sim'
> class_simulation = biosim.simulate(theBetaValues=results.getBetaValues())
> class_simulation.describe()
>
> But this gives me 100% class membership. It is also only for one parameter. How do I fix it?

Michel Bierlaire
Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
EPFL - Ecole Polytechnique Fédérale de Lausanne
http://transp-or.epfl.ch
http://people.epfl.ch/michel.bierlaire

Marie Duper

May 23, 2024, 5:11:29 AM
to Biogeme
Thank you for your response.

Why is it wrong to simulate the logprob? That is what is done here


Is this the right way to get the percentage in each class?

Marie Duper

May 24, 2024, 10:43:11 AM
to Biogeme
Thank you very much for your response. I have it working for a 2-class panel model. I want to extend it to three classes:

# Associate utility functions with the numbering of alternatives
V = [
    [{1: Alt1[i][t], 2: Alt2[i][t]} for t in range(9)]
    for i in range(NUMBER_OF_CLASSES)
]
# Associate the availability conditions with the alternatives

av = {1: 1, 2: 1, 3:1}

#obsprob = [loglogit(V[t], av, Variable(f'{t+1}_Chosen_Alternative')) for t in range(9)]
#condprobIndiv = exp(bioMultSum(obsprob))
prob = [
    exp(
        bioMultSum(
            [loglogit(V[i][t], av, Variable(f'{t+1}_Chosen_Alternative')) for t in range(9)]
        )
    )
    for i in range(NUMBER_OF_CLASSES)
]

#W = CLASS_1 + Class_Gender*Gender + Class_Race*Race + Class_Hispanic*Hispanic + Class_Age*Age + Class_Income*Income + Class_Residence*Residence + Class_Child*Child + Class_Assisted_Living*Assisted_Living + Class_State*State + Class_Vaccine*Vaccine + Class_Chronic*Chronic + Class_Vulnerable_contact*Vulnerable_contact + Class_Health_Insurance*Health_Insurance + Class_News*News + Class_Political*Political + Class_Self_employed*Self_employed + Class_Remote*Remote + Class_Education*Education + Class_Pregnant*Pregnant + Class_Vehicle*Vehicle
W = CLASS_1 + CLASS_2 + Class_Gender*Gender + Class_Age*Age + Class_Income*Income

PROB_class0 = logit({0: W, 1: 0, 2:0 }, None, 0)
PROB_class1 = logit({0: W, 1: 0, 2:0 }, None, 1)
PROB_class2 = logit({0: W, 1: 0, 2:0 }, None, 2)

#[loglogit(V[t], av, Variable(f'{t+1}_Chosen_Alternative')) for t in range(9)]

probIndiv = PROB_class0 * prob[0] + PROB_class1 * prob[1] + PROB_class2 * prob[2]


logprob = log(MonteCarlo(probIndiv))
the_biogeme = bio.BIOGEME(flat_database, logprob, parameter_file='few_draws.toml', numberOfDraws=250)
the_biogeme.modelName = 'panel_flat_latent_class_3'


But I get this response

Incompatible list of alternatives in logit expression. Id(s) used for availabilities and not for utilities: 3

I don't understand which part is incorrect.
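
A likely reading of the message: each utility dictionary in V has keys 1 and 2 only, while av also lists alternative 3, so the two sets of ids do not match. The plain-Python sketch below (not Biogeme code) reproduces that check. Its second part illustrates a separate point: in the code as posted, classes 1 and 2 both have membership utility 0, so they are equally probable by construction, whereas a three-class membership model would normally use two distinct utility functions plus a reference class. The function name and the numeric values are hypothetical.

```python
import math

# Utilities are defined for alternatives 1 and 2 only, as in the post above
utility_ids = {1, 2}
# Availability dictionary as posted: it also lists alternative 3
av = {1: 1, 2: 1, 3: 1}

extra_ids = sorted(set(av) - utility_ids)
print("Ids in availabilities but not in utilities:", extra_ids)

# Restrict the availabilities to the alternatives that have a utility
av_fixed = {alt: flag for alt, flag in av.items() if alt in utility_ids}
print("Consistent availabilities:", av_fixed)


def three_class_membership(w0, w1):
    """Multinomial logit over three classes; class 2 is the reference (utility 0)."""
    exps = [math.exp(w0), math.exp(w1), 1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical membership utilities, for illustration only
print(three_class_membership(0.3, -0.1))
```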

Michel Bierlaire

May 29, 2024, 2:25:44 AM
to marie...@gmail.com, Michel Bierlaire, Biogeme


> On 22 May 2024, at 18:12, Marie Duper <marie...@gmail.com> wrote:
>
> Thank you for your response.
>
> Why is it wrong to simulate the logprob?

Nothing is wrong. But you said "I want to calculate the class memberships." Then you should simulate the class membership, not the loglikelihood.
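
One way to read this advice is as an application of Bayes' rule: the probability that an individual belongs to a class, given the observed choices, is the class membership probability times the likelihood of the choices in that class, divided by the sum of that product over all classes. The sketch below is plain Python with made-up numbers, not Biogeme code. Mapping it onto the expressions from earlier in the thread (for instance PROB_class0 * prob[0] / probIndiv) is an assumption, not something stated in this thread.

```python
def posterior_class_probabilities(prior, conditional_likelihood):
    """Bayes' rule: P(class s | choices) = prior_s * L_s / sum_k(prior_k * L_k)."""
    joint = [p * lik for p, lik in zip(prior, conditional_likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical numbers for one individual, for illustration only
prior = [0.6, 0.4]           # output of the class membership model
likelihood = [0.002, 0.010]  # probability of the observed choice sequence in each class
posterior = posterior_class_probabilities(prior, likelihood)
print(posterior)
```

Averaging such posterior probabilities over individuals would give an estimate of the share of the sample in each class.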