dfbetas

Jo Lu

Jul 18, 2023, 7:09:07 AM

to pystatsmodels

Hello *.*,

I am wondering why computing the dfbeta and dfbetas influence statistics takes so long. The reason seems to be that the implementation uses results from a leave-one-out regression loop, which requires N auxiliary regressions (where N is the number of observations in the data set). The classic monograph by Belsley et al. shows that these statistics can be computed very quickly, without running any auxiliary OLS regressions.
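To make the shortcut concrete, here is a minimal NumPy sketch of the textbook closed-form identities for OLS, checked against an explicit leave-one-out loop. It assumes a full-rank design matrix, and the function names `dfbetas_closed_form` and `dfbetas_loo` are hypothetical; this is not the statsmodels code.

```python
import numpy as np


def dfbetas_closed_form(X, y):
    """Return (dfbeta, dfbetas) for OLS without refitting the model n times."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i = x_i' (X'X)^{-1} x_i, the diagonal of the hat matrix.
    hat = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    # DFBETA_i = beta - beta_(i) = (X'X)^{-1} x_i e_i / (1 - h_i)
    dfbeta = (X @ xtx_inv) * (resid / (1.0 - hat))[:, None]
    # Deleted residual variance s_(i)^2, obtained from the full-sample fit.
    ssr = resid @ resid
    s2_del = (ssr - resid**2 / (1.0 - hat)) / (n - p - 1)
    # DFBETAS_ij = DFBETA_ij / (s_(i) * sqrt([(X'X)^{-1}]_jj))
    scale = np.sqrt(s2_del)[:, None] * np.sqrt(np.diag(xtx_inv))[None, :]
    return dfbeta, dfbeta / scale


def dfbetas_loo(X, y):
    """Reference version: one auxiliary regression per observation."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    dfbeta = np.empty((n, p))
    s_del = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        r = y[keep] - X[keep] @ beta_i
        dfbeta[i] = beta - beta_i
        s_del[i] = np.sqrt(r @ r / (n - 1 - p))
    return dfbeta, dfbeta / (s_del[:, None] * np.sqrt(np.diag(xtx_inv)))


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

fast = dfbetas_closed_form(X, y)[1]
slow = dfbetas_loo(X, y)[1]
print(np.allclose(fast, slow))
```

The closed-form version needs one matrix inverse and a few vectorized products, while the loop solves N separate least-squares problems.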

Why are the algorithms presented in the book not implemented?

best regards,

Johannes

josef...@gmail.com

Jul 25, 2023, 4:03:51 AM

to pystat...@googlegroups.com

Hi,

I don’t know why I never added the version without the leave-one-observation-out (looo) loop.

See, e.g., https://github.com/statsmodels/statsmodels/issues/4740

For all the MLEInfluence classes, which I added more recently, I used the one-step approximation to avoid the looo loop. In recent years, I have mainly been working on outlier influence for GLM and discrete models.
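As an illustration of the general idea behind a one-step approximation, here is a minimal NumPy sketch for logistic regression. It takes a single Newton step from the full-sample estimate after dropping each observation, which by the Sherman-Morrison identity needs no refitting, and compares the result with exact refits. All function names are hypothetical, the data are simulated, and this is not the MLEInfluence code.

```python
import numpy as np


def fit_logit(X, y, n_iter=50):
    """Fit a logistic regression by Newton-Raphson; returns the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


def one_step_dparams(X, y, beta):
    """Approximate beta - beta_(i) for every i using only the full-sample fit."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    info_inv = np.linalg.inv(X.T @ (X * w[:, None]))
    # Generalized leverage h_i = w_i * x_i' (X'WX)^{-1} x_i
    hat = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
    # One Newton step for the objective with observation i removed.
    return (X @ info_inv) * ((y - p) / (1.0 - hat))[:, None]


def exact_dparams(X, y, beta):
    """Reference version: refit the model once per deleted observation."""
    n = X.shape[0]
    out = np.empty_like(X, dtype=float)
    for i in range(n):
        keep = np.arange(n) != i
        out[i] = beta - fit_logit(X[keep], y[keep])
    return out


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
prob = 1.0 / (1.0 + np.exp(-X @ np.array([0.5, 1.0, -1.0])))
y = (rng.random(n) < prob).astype(float)

beta_hat = fit_logit(X, y)
approx = one_step_dparams(X, y, beta_hat)
exact = exact_dparams(X, y, beta_hat)
print(np.max(np.abs(approx - exact)))
```

Unlike the OLS case, the result is an approximation rather than an identity, so the printed maximum difference is small but not zero.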

(I’m currently on vacation and cannot look at the details)

Josef

--
You received this message because you are subscribed to the Google Groups "pystatsmodels" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pystatsmodels+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pystatsmodels/d31d1255-cd48-44fc-80e9-eba2e8e1a79fn%40googlegroups.com.

josef...@gmail.com

Jul 25, 2023, 4:19:11 AM

to pystat...@googlegroups.com

Aside:

I initially wrote the OLS outlier influence code based on the SAS documentation and only read BKW (Belsley, Kuh and Welsch) later.

Jo Lu

Mar 22, 2024, 8:48:37 AM

to pystatsmodels

I know the relevant sections of the book by Belsley, Kuh and Welsch and could write a computationally efficient implementation of the dfbetas statistics.

Is there anybody who would review my code? Unfortunately, I don't understand the details of how the statsmodels team is organized. Where should I put the extended scripts?

Thank you and best regards.

josef...@gmail.com

Mar 22, 2024, 10:26:32 AM

to pystat...@googlegroups.com

Hi Johannes,

It would be great if you could open a pull request with the changes.

Otherwise, you can add the relevant code to the issue that I opened https://github.com/statsmodels/statsmodels/issues/9009

I should be able to review it relatively soon. The improvements should go in before the next release, now that we know where to find the explicit formulas.

Thanks,

Josef

To view this discussion on the web visit https://groups.google.com/d/msgid/pystatsmodels/5b31678d-69dc-4fff-8c8f-0c8117b3a803n%40googlegroups.com.

Jo Lu

Mar 24, 2024, 6:44:57 PM

to pystatsmodels

OK. I will be on holiday next week, but I will implement the changes after that.