Modeling a multinomial logit model with aggregate panel data


Mariana Ss

Mar 30, 2022, 2:15:17 AM
to Biogeme
Dear Prof. Michel Bierlaire,

I am trying to build a discrete choice model (a multinomial logit model) using aggregate daily sales data. Each row contains the observed sales, the corresponding day, the SKU, and its characteristics. That is, I have a set of products and their daily demand, but no information about individual consumers, and I want to understand consumer choice among these products.
If the data were neither aggregate nor panel data, I think the model would look like this:

#######
ASC = {}
ASC[0] = 0
for i in range(1, number_of_alternatives):
    ASC[i] = Beta('ASC' + str(i), 0, None, None, 0)

B_class   = Beta('B_class',0,None, None,0)
B_subclass = Beta('B_subclass',0,None, None,0)
B_Brand = Beta('B_Brand',0,None, None,0)
B_Price = Beta('B_Price',0,None, None,0)

# Utilities
V = {}
for i in range(number_of_alternatives):
    V[i] = ASC[i] + B_class * CLASS + B_subclass * SUBCLASS + B_Brand * BRAND + B_Price * PRICE

logprob = models.loglogit(V, None, database.variables['Choice'])
biogeme = bio.BIOGEME(database, logprob)
biogeme.modelName = 'MNL'
results = biogeme.estimate()
pandasResults = results.getEstimatedParameters()
#######

However, with panel and aggregate data, how should I adapt the model or the data?

Thank you very much!
Best regards,
Mariana

Bierlaire Michel

Mar 30, 2022, 2:21:44 AM
to mar99...@gmail.com, Bierlaire Michel, Biogeme

On 29 Mar 2022, at 23:31, Mariana Ss <mar99...@gmail.com> wrote:

Dear Prof. Michel Bierlaire,

I am trying to build a discrete choice model (a multinomial logit model) using aggregate daily sales data. Each row contains the observed sales, the corresponding day, the SKU, and its characteristics. That is, I have a set of products and their daily demand, but no information about individual consumers, and I want to understand consumer choice among these products.
If the data were neither aggregate nor panel data, I think the model would look like this:

#######
ASC = {}
ASC[0] = 0
for i in range(1, number_of_alternatives):
    ASC[i] = Beta('ASC' + str(i), 0, None, None, 0)

B_class   = Beta('B_class',0,None, None,0)
B_subclass = Beta('B_subclass',0,None, None,0)
B_Brand = Beta('B_Brand',0,None, None,0)
B_Price = Beta('B_Price',0,None, None,0)

# Utilities
V = {}
for i in range(number_of_alternatives):
    V[i] = ASC[i] + B_class * CLASS + B_subclass * SUBCLASS + B_Brand * BRAND + B_Price * PRICE

logprob = models.loglogit(V, None, database.variables['Choice'])

Do not use "database.variables". You need to use a Biogeme expression: biogeme.expressions.Variable('Choice')


biogeme = bio.BIOGEME(database, logprob)
biogeme.modelName = 'MNL'
results = biogeme.estimate()
pandasResults = results.getEstimatedParameters()
#######

However, with panel and aggregate data, how should I adapt the model or the data?

If you have aggregate data, the choice variable is replaced by the quantity of each alternative. And the contribution to the log likelihood is something like

logprob = Nbr_alt_1 * models.loglogit(V, None, 1) + Nbr_alt_2 * models.loglogit(V, None, 2) … etc
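
To make the weighting explicit, here is a minimal sketch in plain Python (it does not use Biogeme). It assumes one row per day with a count of units sold for each alternative, and the utilities and sales figures are made up for illustration. The function names are hypothetical. It computes the contribution of one row as the sum over alternatives of the count times the log of the logit probability, which is the same value you would get by repeating each individual sale as its own observation.

```python
import math


def logit_log_probs(utilities):
    """Return log P(alt) for every alternative under a multinomial logit."""
    max_v = max(utilities.values())  # subtract the max for numerical stability
    log_denominator = max_v + math.log(
        sum(math.exp(v - max_v) for v in utilities.values())
    )
    return {alt: v - log_denominator for alt, v in utilities.items()}


def aggregate_log_likelihood(utilities, counts):
    """Contribution of one aggregate row: sum of count * log P(alt)."""
    log_probs = logit_log_probs(utilities)
    return sum(counts[alt] * log_probs[alt] for alt in utilities)


# Hypothetical day: three products with made-up utilities and unit sales.
day_utilities = {1: 0.0, 2: 0.5, 3: -0.2}
day_sales = {1: 10, 2: 25, 3: 5}
day_loglik = aggregate_log_likelihood(day_utilities, day_sales)
```

The full log likelihood would be the sum of such contributions over all days.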

If you have panel data, you need to introduce the trajectory. The syntax depends on the format of the data. If the data for one individual is spread over several rows, you need to declare the database as panel data and use the PanelLikelihoodTrajectory expression. See the examples.
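
As a rough illustration of what the trajectory means, here is a plain Python sketch (not Biogeme code, and not its implementation). It assumes the rows of one individual share an individual-specific term, here a hypothetical shift added to the utility of alternative 1. For each value of that term, the probabilities of the individual's observations are multiplied together. The products are then averaged over the values, and the log of that average is the individual's contribution. The data, the draws, and the function names are all made up.

```python
import math
from collections import defaultdict


def logit_prob(utilities, chosen):
    """Multinomial logit probability of the chosen alternative."""
    max_v = max(utilities.values())  # subtract the max for numerical stability
    denominator = sum(math.exp(v - max_v) for v in utilities.values())
    return math.exp(utilities[chosen] - max_v) / denominator


def panel_log_likelihood(rows, draws):
    """Sum over individuals of the log of the averaged trajectory probability.

    rows  : list of (individual_id, utilities, chosen), one per observation
    draws : values of a hypothetical individual-specific term added to the
            utility of alternative 1, held fixed across an individual's rows
    """
    by_individual = defaultdict(list)
    for individual_id, utilities, chosen in rows:
        by_individual[individual_id].append((utilities, chosen))

    total = 0.0
    for observations in by_individual.values():
        simulated = 0.0
        for draw in draws:
            # Product over all observations of the same individual.
            trajectory_prob = 1.0
            for utilities, chosen in observations:
                shifted = dict(utilities)
                shifted[1] = shifted[1] + draw
                trajectory_prob *= logit_prob(shifted, chosen)
            simulated += trajectory_prob
        total += math.log(simulated / len(draws))
    return total


# Made-up data: two individuals, two alternatives.
rows = [
    ("a", {1: 0.2, 2: 0.0}, 1),
    ("a", {1: -0.1, 2: 0.0}, 1),
    ("b", {1: 0.3, 2: 0.0}, 2),
]
draws = [-0.5, 0.0, 0.5]  # stand-in for draws of a random term
loglik = panel_log_likelihood(rows, draws)
```

With a single draw equal to zero, this reduces to the ordinary sum of log probabilities over rows, which is why the trajectory only matters once something is shared across the rows of an individual.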


