1.8E+308 stds with loglikelihoodregression models

Chris ten Dam

Jun 12, 2025, 8:26:16 AM

to Biogeme

Dear Prof. Bierlaire,

I appear to have a numerical issue with the estimation of the standard deviation of sigma for ll.loglikelihoodregression models.

My questions are as follows:

1. Do you have any idea what the root cause of the issue described below is?

2. Do you have any tips for preventing such numerical issues?

3. Would you recommend ignoring the issue by e.g. ignoring the weird standard deviation, only using robust standard deviations, and/or only using the scipy optimization algorithm?

The issue:

I have built a model of monetary expenditures which combines an estimation of the total household budget and of the fraction of this budget for each expenditure category.

In addition, I have built separate (simple) models of these energy, car, and housing expenditures using only single ll.loglikelihoodregression functions.

However, in all cases, the (non-robust) standard error of sigma(s) is 1.8E+308.

Also, the correlations reported in the Biogeme .html file are all 1.8E+308 values, without exception. The covariances seem more reasonable.

Non-convergence warnings occur with the standard optimization algorithms, with the notable exception of scipy.

Interestingly, the issue appears purely numerical:

· The estimated values of sigma do make sense

· All robust standard errors and t-scores seem logical

· The estimated coefficients make sense

· For the simple models (of one expenditure at a time) these coefficients are exactly equal to the coefficients estimated with an sklearn Ordinary Least Squares model

The issue persists when:

· Estimating models with ASCs only

· Using log(ll.likelihoodregression())

· Removing the entries with the lowest simulated (log)likelihoods

The expenditures have been BoxCox transformed previously and vary from ca. 10 to 60.

These expenditures have been derived from different datasets (e.g. the EnergyCosts are from utility companies, whereas CarCosts are computed using odometer counts from obligatory checkups).

All independent variables have been scaled and centered.

All sigmas are freely estimated, with lower bounds specified (but not reached).

The variance-covariance values range in absolute value from 10**-10 to 10**-2. There are negative elements. I understand this can cause 1.8E+308 stds for sigma? But then I still do not understand what the root cause of this issue might be and whether there is any cause for concern.

The Biogeme version is 3.2.14, but apparently there were some issues with the installation of this latest version in the microdata environment by the IT people.

I cannot share the exact data or code due to the remote microdata environment, but the issue already occurs with the following simple setup:

Costs_sigma = Beta('Costs_sigma', 1, None, 0)

Costs_modeled = ASC + coef1*var1 + coef2*var2 + coef3*var3 + coef4*var4

loglike = ll.loglikelihoodregression(CostData, Costs_modeled, Costs_sigma)

Biogeme = bio.BIOGEME(database, loglike)
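As a sanity check on what the non-robust standard error of sigma should look like: for a linear model with Gaussian errors it has the closed form sigma/sqrt(2n), because the cross-derivatives between sigma and the coefficients vanish at the least-squares solution. The sketch below uses plain NumPy on synthetic data and compares that closed form with a finite-difference second derivative of the log-likelihood. The data, the sample size, and the helper name gaussian_loglike are illustrative assumptions; this is not the Biogeme implementation.

```python
import numpy as np

def gaussian_loglike(resid, sigma):
    """Log-likelihood of residuals under N(0, sigma^2)."""
    n = resid.size
    return (-n * np.log(sigma)
            - 0.5 * n * np.log(2 * np.pi)
            - np.sum(resid ** 2) / (2 * sigma ** 2))

# Synthetic data (assumption): one regressor, true sigma = 5
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 30.0 + 2.0 * x + rng.normal(scale=5.0, size=n)

# Ordinary least squares gives the maximum likelihood coefficients
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Maximum likelihood estimate of sigma (divides by n, not n - k)
sigma_hat = np.sqrt(np.mean(resid ** 2))

# Central finite difference for the second derivative w.r.t. sigma
h = 1e-4 * sigma_hat
d2 = (gaussian_loglike(resid, sigma_hat + h)
      - 2 * gaussian_loglike(resid, sigma_hat)
      + gaussian_loglike(resid, sigma_hat - h)) / h ** 2

se_numeric = np.sqrt(-1.0 / d2)            # from the curvature
se_analytic = sigma_hat / np.sqrt(2 * n)   # closed form

print(sigma_hat, se_numeric, se_analytic)
```

If the reported non-robust standard error of sigma is far from sigma/sqrt(2n) while the coefficients match OLS, that points to a problem in how the second derivatives are computed or inverted rather than in the model itself.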

Michel Bierlaire

Jun 13, 2025, 3:52:20 AM

to chrisdjie...@gmail.com, Michel Bierlaire, Biogeme

Note that if your model is essentially a regression model, Biogeme may not be the most appropriate tool for estimating it. Biogeme uses regression for hybrid choice models with latent variables.

Have you tried estimating the model using scikit-learn or a similar package? That might help you better understand the source of the numerical issue.

The value 1.8E+308 typically appears when Biogeme attempts to compute the square root of a negative number—often indicating that the Rao-Cramer matrix is not positive definite.
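To make this concrete, here is a minimal NumPy sketch of how a negative diagonal entry in a variance-covariance matrix can turn into 1.8E+308, which is the largest finite double-precision number. The matrix values and the function names std_errors and is_positive_definite are hypothetical, and the substitution rule is an assumption about the reporting behaviour, not code taken from Biogeme.

```python
import numpy as np

def std_errors(var_covar):
    """Standard errors from the diagonal of a variance-covariance matrix.

    Negative variances have no real square root; here they are replaced
    by the largest finite float (an assumed reporting convention).
    """
    diag = np.diag(var_covar)
    largest = np.finfo(float).max
    safe = np.where(diag >= 0, diag, 0.0)  # avoid sqrt of negatives
    return np.where(diag >= 0, np.sqrt(safe), largest)

def is_positive_definite(matrix):
    """True if all eigenvalues of the symmetrized matrix are positive."""
    sym = (matrix + matrix.T) / 2
    return bool(np.all(np.linalg.eigvalsh(sym) > 0))

# Hypothetical matrix: the second parameter has a tiny negative variance
vc = np.array([[1e-4, 2e-6],
               [2e-6, -1e-10]])

print(np.finfo(float).max)       # 1.7976931348623157e+308
print(std_errors(vc))
print(is_positive_definite(vc))
```

Running the same eigenvalue check on the estimated matrix shows whether it is indefinite and how close to zero the offending eigenvalues are.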

As an alternative, you might also consider estimating the standard errors using bootstrapping.
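A rough sketch of the bootstrapping idea for a plain regression, written with NumPy only: resample the observations with replacement, re-estimate sigma on each resample, and take the spread of those estimates as the standard error. The synthetic data, the number of replications, and the function names fit_sigma and bootstrap_se_sigma are assumptions for illustration; this does not use Biogeme's own bootstrap facilities.

```python
import numpy as np

def fit_sigma(X, y):
    """Maximum likelihood estimate of sigma for a linear Gaussian model."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return np.sqrt(np.mean(resid ** 2))

def bootstrap_se_sigma(X, y, n_boot=500, seed=0):
    """Bootstrap standard error of sigma by resampling rows."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        draws[b] = fit_sigma(X[idx], y[idx])
    return draws.std(ddof=1)

# Synthetic data (assumption): intercept plus two centered regressors
rng = np.random.default_rng(42)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([30.0, 2.0, -1.0]) + rng.normal(scale=5.0, size=n)

sigma_hat = fit_sigma(X, y)
se_boot = bootstrap_se_sigma(X, y)
print(sigma_hat, se_boot)
```

Because the bootstrap never inverts a matrix of second derivatives, it still gives a usable standard error when that matrix is not positive definite.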

What type of issues do you have with installing 3.2.14?

Michel Bierlaire
Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
EPFL - Ecole Polytechnique Fédérale de Lausanne
http://transp-or.epfl.ch
http://people.epfl.ch/michel.bierlaire

Michel Bierlaire

Jun 14, 2025, 3:27:27 AM

to Chris ten Dam, Michel Bierlaire, Biogeme

> On 14 Jun 2025, at 09:16, Chris ten Dam <chrisdjie...@gmail.com> wrote:
>
> Dear Professor Bierlaire,
>
> Thank you for the rapid reply.
>
> The aim is to build a modelling framework of people's budget (continuous) and the division of this budget over different expenditure categories (Multinomial/Nested Logit).
> For diagnostic purposes, I also built separate models of the continuous expenditure per category using both biogeme and scikit learn.
> After centering all independent variables, the estimated coefficients were the exact same between both packages.

Very good.

> However, Biogeme's std of sigma and the correlations in the HTML file were all 1.8E+308.
>
> What does it mean that the Rao-Cramer matrix is not positive definite?

It means that the model is not fully determined, or that there are numerical issues.

> As in: which underlying problem does it indicate?
> Is this something I can ignore by using the robust standard errors instead?

If the robust standard error is OK, that's indeed the best thing to do.
Bootstrapping is also a possibility.

>
> The issue with installing biogeme 3.2.14 was due to the biogeme optimization 0.10.0 package.
> We have to give the IT guys a list of all packages that we need and we did not remove this (redundant) one at first.
> I could ask them to install biogeme 3.2.13 instead if you think that might fix the issue?

No, it will not change anything.

>
> Best regards,
> Chris
