Estimating a conditional logit model in biogeme


Uma Parasuram

Feb 16, 2024, 8:01:40 AM
to Biogeme
Hi Professor and the Biogeme Forum, 

I have a panel data set with multiple entries per respondent; it is data from a discrete choice experiment. There are three alternatives, and each person was shown 6 different choice sets. I want to estimate a conditional logit model, but I am only able to find the code for a multinomial logit model. How can I implement a conditional logit model that is conditional on "id"? In STATA we can use group(id). It would be helpful to know how I can do the same in Biogeme.

I tried the following code:

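# Imports assumed by the rest of the script (Biogeme 3.2.x naming)
import biogeme.database as db
import biogeme.biogeme as bio
from biogeme import models
from biogeme.expressions import Beta, PanelLikelihoodTrajectory, log

# dat is the pandas DataFrame in wide format (one row per choice set).
# PM_1, ..., P_3, ID and CHOICE are assumed to be defined elsewhere as
# Biogeme Variable objects for the corresponding columns.
database = db.Database("Kernza", dat)
database.panel("id")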

# Parameters of the RUM Model
B_PM2 = Beta('organic',0, -10, 10, 0)
B_FT2 = Beta('Kernza',0, -10, 10, 0)
B_C2  = Beta('just_water', 0, -10, 10, 0)
B_P2  = Beta('price', 0, -10, 10, 0)
B_ID =  Beta('id2',  0, -10, 10, 0)

# Utility / regret functions
# RUM class
V1_1 = B_PM2 * PM_1 + B_FT2 * FT_1 + B_C2 * C_1 + B_P2 * P_1 + B_ID * ID
V2_1 = B_PM2 * PM_2 + B_FT2 * FT_2 + B_C2 * C_2 + B_P2 * P_2 + B_ID * ID
V3_1 = B_PM2 * PM_3 + B_FT2 * FT_3 + B_C2 * C_3 + B_P2 * P_3 + B_ID * ID

# Associate utility functions with the numbering of alternatives
V2 = {1: V1_1,
     2: V2_1,
     3: V3_1
     }

# Associate the availability conditions with the alternatives
one =  1

av2 = {1: one,
      2: one,
      3: one}
# The choice model is a logit, with availability conditions
prob1 = models.logit(V2, av2, CHOICE)

#Conditional probability for the sequence of choices of an individual
ProbIndiv = PanelLikelihoodTrajectory(prob1)

# Define the likelihood function for the estimation
#prob = probClass1 * ProbIndiv_1 + probClass2 * ProbIndiv_2

# Create Biogeme object
biogeme = bio.BIOGEME(database,log(ProbIndiv))

# Name biogeme object to identify each repetition
biogeme.modelName = "RUM"

results = biogeme.estimate()

# Print the estimation statistics
print(results.short_summary())

# Get the model parameters in a pandas table and  print it
beta_hat_MNL = results.getEstimatedParameters()
statistics_MNL = results.getGeneralStatistics()
print(beta_hat_MNL)

However, the output differs from STATA's. One difference I noticed is that STATA requires the data in long format (6 * 3 = 18 rows per person), whereas Biogeme needs it in wide format (6 rows per person).

My question is thus: how can I estimate a conditional logit model for this panel data?

Thank you!
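
One thing that may contribute to the mismatch (an observation about the posted code, not something confirmed later in the thread): the term B_ID * ID enters all three utilities identically. In a logit model, a term that is the same for every alternative cancels out of the choice probabilities, so its coefficient cannot be identified. A minimal sketch with made-up utilities, using only the Python standard library:

```python
import math

def logit_probs(utilities):
    """Return logit choice probabilities for a list of utilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exp_u = [math.exp(u - m) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Made-up systematic utilities for the three alternatives (illustration only).
base = [0.4, -0.2, 1.1]

# Add the same respondent-level term (like b_id * id) to every alternative.
# Both values are hypothetical.
b_id, respondent_id = 0.05, 37
shifted = [v + b_id * respondent_id for v in base]

p_base = logit_probs(base)
p_shifted = logit_probs(shifted)

# The two sets of probabilities coincide, so the likelihood carries
# no information about b_id.
same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(p_base, p_shifted))
print(same)
```

If that is what is happening here, dropping the term (or interacting ID-level variables with alternative-specific attributes) would be the usual remedy.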

Michel Bierlaire

Feb 16, 2024, 8:23:51 AM
to parasu...@gmail.com, Michel Bierlaire, Biogeme
Biogeme can deal with panel data. Here is an example: https://biogeme.epfl.ch/sphinx/auto_examples/swissmetro/plot_b12panel.html#sphx-glr-auto-examples-swissmetro-plot-b12panel-py

It also allows for generating a “flat” version of the data: https://biogeme.epfl.ch/sphinx/database.html#biogeme.database.Database.generateFlatPanelDataframe

Note that, if you are estimating a logit involving only fixed parameters, you do not need to bother about the panel nature of the data.

Michel Bierlaire
Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
EPFL - Ecole Polytechnique Fédérale de Lausanne
http://transp-or.epfl.ch
http://people.epfl.ch/michel.bierlaire
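
On the remark that a logit with only fixed parameters does not need the panel structure: the log of a product of per-choice probabilities equals the sum of their logs, so grouping rows by respondent leaves the log likelihood unchanged. The sketch below illustrates this with a single hypothetical price coefficient and made-up data; it does not use Biogeme.

```python
import math

def logit_prob(utilities, chosen):
    """Logit probability of the chosen alternative (index into utilities)."""
    m = max(utilities)
    exp_u = [math.exp(u - m) for u in utilities]
    return exp_u[chosen] / sum(exp_u)

beta_price = -0.8  # hypothetical fixed coefficient

# Made-up panel: respondent -> list of (prices of 3 alternatives, chosen index).
panel = {
    "r1": [([1.0, 2.0, 0.0], 0), ([1.5, 1.0, 0.0], 1), ([2.0, 2.5, 0.0], 2)],
    "r2": [([1.0, 2.0, 0.0], 2), ([1.5, 1.0, 0.0], 2), ([2.0, 2.5, 0.0], 0)],
}

def prob(prices, chosen):
    return logit_prob([beta_price * p for p in prices], chosen)

# (a) Ignore the panel structure: one log-probability per row.
ll_rows = sum(
    math.log(prob(prices, chosen))
    for tasks in panel.values()
    for prices, chosen in tasks
)

# (b) Panel version: one contribution per respondent, the log of the
#     product of that respondent's choice probabilities.
ll_panel = sum(
    math.log(math.prod(prob(prices, chosen) for prices, chosen in tasks))
    for tasks in panel.values()
)

print(math.isclose(ll_rows, ll_panel))
```

The equality breaks once a coefficient is random and integrated out, because the product over a respondent's choices then has to be taken inside the integral.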

Uma Parasuram

Feb 17, 2024, 11:54:42 AM
to Biogeme
Hello Professor,
Thank you for your reply. This was very helpful. 
After running the mixed logit model, I also want to run a latent class model. So I have two follow-up questions:
1. Is it possible to add demographic variables to the latent class model in Biogeme? If so, how?
2. After running the latent class model, is it possible to predict the probability that each individual belongs to each class?

Looking forward to guidance on the same. 

Thank you
Kind Regards
Uma

Michel Bierlaire

Feb 17, 2024, 12:46:36 PM
to parasu...@gmail.com, Michel Bierlaire, Biogeme


> On 17 Feb 2024, at 15:55, Uma Parasuram <parasu...@gmail.com> wrote:
>
> Hello Professor,
> Thank you for your reply. This was very helpful.
> After running the mixed logit model I also want to run a Latent Class Model. So I have two follow up questions:
> 1. Is it possible to add demographic variables to the Latent Class model in Biogeme?

Yes.


> If so how?

https://biogeme.epfl.ch/sphinx/auto_examples/swissmetro/plot_b16panel_discrete_socio_eco.html#sphx-glr-auto-examples-swissmetro-plot-b16panel-discrete-socio-eco-py

> 2. After running the Latent class model, is it possible to predict the probabilities of which individual belongs to which class?

Yes. Simply apply the estimated class membership model to each individual.
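
As a rough illustration of "applying the class membership model", assume a two-class model in which membership in class 1 is a binary logit in age and gender, with class 2 as the reference. The coefficient names and values below are hypothetical placeholders for estimated parameters, not taken from the linked example.

```python
import math

# Hypothetical estimates of the class membership model
# (class 2 is the reference class, with utility fixed to 0).
estimates = {"asc_class1": -0.3, "b_age_class1": 0.02, "b_female_class1": 0.5}

def membership_probs(age, female, est):
    """Return {class: probability} for one respondent."""
    w1 = (
        est["asc_class1"]
        + est["b_age_class1"] * age
        + est["b_female_class1"] * female
    )
    p1 = 1.0 / (1.0 + math.exp(-w1))
    return {1: p1, 2: 1.0 - p1}

# Made-up respondents with their demographics.
respondents = {101: {"age": 25, "female": 1}, 102: {"age": 60, "female": 0}}

all_probs = {
    rid: membership_probs(x["age"], x["female"], estimates)
    for rid, x in respondents.items()
}
for rid, probs in all_probs.items():
    print(rid, round(probs[1], 3), round(probs[2], 3))
```

These are membership probabilities given demographics only; they do not yet use the choices each respondent actually made.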

Uma Parasuram

Feb 18, 2024, 12:27:22 PM
to Michel Bierlaire, Biogeme
Thank you again!
When I tried to run the Latent class model I get an error message after this code:
the_biogeme = bio.BIOGEME(flat_database, logprob, parameter_file='few_draws_RUM_LCA.toml')
the_biogeme.modelName = 'kernza_RUM_LCA_socio_eco'

ERROR: 
Variable 1_ID not found in the database.
Variable 1_ID not found in the database.
Variable 1_PM_3 not found in the database.
Variable 1_FT_3 not found in the database.
Variable 1_C_3 not found in the database.
Variable 1_P_3 not found in the database.
..... and so on

In the example provided, you import flat_database from the swissmetro_panel module. What should I do if I don't have a module but just a flattened database built from my original panel data dataframe?

I tried to define each of these variables, but because their names start with a number (e.g. 1_ID) I am unable to define them.
Please let me know how to proceed with this.

Thank you. 
Regards
Uma

--
PhD student 
Department of Applied Economics
University of Minnesota 
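
On names such as 1_ID: a Python identifier cannot begin with a digit, but the column name is only a string, so it can be kept in a dictionary or bound to a differently named Python variable (in Biogeme, something like ID_1 = Variable('1_ID')). The sketch below assumes the flattened columns follow the "<t>_<name>" pattern suggested by the error message. The list of base names and the list of available columns are hypothetical. It uses only the standard library and also shows a quick way to list which expected columns are absent.

```python
N_TASKS = 6  # six choice sets per respondent, as stated in the first post
BASE_NAMES = ["ID", "CHOICE", "PM_3", "FT_3", "C_3", "P_3"]  # hypothetical subset

# A name starting with a digit is not a valid Python identifier...
print("1_ID".isidentifier())  # False
print("ID_1".isidentifier())  # True

# ...but it is a perfectly valid string, so keep the names in a dictionary
# keyed by (task number, base name).
flat_names = {
    (t, base): f"{t}_{base}"
    for t in range(1, N_TASKS + 1)
    for base in BASE_NAMES
}

# Hypothetical columns actually present in the flattened dataframe
# (with pandas this would be list(flat_df.columns)).
available = ["1_ID", "1_CHOICE", "1_PM_3", "2_ID"]

# Expected columns that are absent; these would trigger "not found" errors.
missing = sorted(name for name in flat_names.values() if name not in available)
print(len(missing), missing[:3])
```

If many names show up as missing, the dataframe passed to the database is probably not the flattened one, or its columns follow a different naming pattern.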

Uma Parasuram

Feb 23, 2024, 3:03:18 AM
to Biogeme
Hi Professor,
Thank you for your feedback. 

I ran the latent class model to get the individual class membership probabilities. I did this by looping over the code so that it estimates a latent class model separately for each "id". But the values are either very close to 0 or very close to 1. Furthermore, when I count the number of people who fall into each class, the proportions differ from the class shares I get from the latent class model estimated on the whole sample.
Could you please advise me on this?

Looking forward to hearing back. 

Thank you
Best,
Uma
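
A possible reading of these symptoms (not confirmed in the thread): re-estimating the model for each id leaves only six observations per model, which is not how membership probabilities are usually obtained. A common alternative is to estimate the model once on the whole sample and then compute each respondent's posterior membership probabilities with Bayes' rule, combining the class shares with the likelihood of that respondent's choice sequence under each class. The sketch below uses made-up numbers and hypothetical names throughout.

```python
def posterior_membership(prior, seq_likelihood):
    """Bayes' rule: P(class s | choices) is proportional to
    prior[s] * L(choices | class s)."""
    joint = [p * l for p, l in zip(prior, seq_likelihood)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical class shares from ONE model estimated on the whole sample.
prior = [0.6, 0.4]

# Made-up likelihoods of each respondent's full sequence of six choices
# under class 1 and class 2.
seq_likelihoods = {
    "r1": [0.020, 0.0004],
    "r2": [0.0003, 0.015],
    "r3": [0.004, 0.005],
}

posteriors = {
    rid: posterior_membership(prior, lik) for rid, lik in seq_likelihoods.items()
}

# Two ways of summarising class sizes in the sample.
n = len(posteriors)
mean_posterior_class1 = sum(p[0] for p in posteriors.values()) / n
hard_share_class1 = sum(p[0] > 0.5 for p in posteriors.values()) / n

for rid, p in posteriors.items():
    print(rid, [round(x, 3) for x in p])
print(round(mean_posterior_class1, 3), round(hard_share_class1, 3))
```

Posterior probabilities often sit near 0 or 1 because the product over six choices discriminates sharply between classes. Shares obtained by counting hard assignments need not match the estimated class shares; the average of the posterior probabilities is generally the more comparable quantity.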