Possible Bug in statsmodels.stats.libqsturng.qsturng._psturng

Amelia Taylor

Feb 23, 2020, 3:46:28 PM
to pystatsmodels
I'm not sure where to ask this; should I open a GitHub issue instead?

The issue is that Satterthwaite's degrees of freedom computation generally returns a float, so it is easy to have 1 < v < 2 (where I'm using v for degrees of freedom, as in the codebase). It turns out that `psturng` raises an error when 1 < v < 2, because there is a line in `_psturng` that tests if v == 1 and, in the else branch, passes a p-value of 0.1 to `_qsturng`, which errors out since v < 2. I'm not sure of the math here, but my sense is that the v == 1 test should be 1 <= v < 2.
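To make the fractional degrees of freedom concrete, here is a minimal sketch. It assumes the two-sample Welch-Satterthwaite formula, which is one common form of Satterthwaite's approximation; the function name and the sample variances are hypothetical and chosen only for illustration.

```python
def welch_satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for two samples with unequal variances.

    var1, var2 are sample variances; n1, n2 are sample sizes (each >= 2).
    """
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical inputs: two groups of size 2 with very different variances.
v = welch_satterthwaite_df(1.0, 2, 100.0, 2)
print(v)            # a float strictly between 1 and 2
print(1 < v < 2)
```

With this formula the result always lies between min(n1 - 1, n2 - 1) and n1 + n2 - 2, so a value in 1 < v < 2 requires at least one group with only two observations.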

Does anyone know the math here well enough to know what that line should be, or how we should handle 1 < v < 2? It seems that if v = 1 works and v >= 2 works, it should be mathematically possible to run the code for 1 < v < 2. And is there somewhere else I should post this question?

Also, I'm happy to take on making a PR for this bugfix once I have a better sense of whether the mathematically correct thing to do is just to change that one line to 1 <= v < 2.
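The control flow described above can be sketched as a toy model. This is not the libqsturng source: `toy_qsturng` and `toy_dispatch` are hypothetical stand-ins built only from the description in this message, and the p = 0.9 used in the v == 1 branch is an assumption. It only shows why a float such as v = 1.02 slips past an equality test, and says nothing about whether the widened test is mathematically right.

```python
def toy_qsturng(p, v):
    """Hypothetical stand-in for `_qsturng`: rejects small p when v < 2."""
    if p < 0.9 and v < 2:
        raise ValueError("v must be >= 2 when p < 0.9")
    return 0.0  # placeholder; the real routine returns a quantile


def toy_dispatch(v, patched=False):
    """Return the p that would be passed on for degrees of freedom v."""
    # Original test per the message: v == 1. Proposed test: 1 <= v < 2.
    low_df = (1 <= v < 2) if patched else (v == 1)
    p = 0.9 if low_df else 0.1  # p = 0.9 is an assumed value for this toy
    toy_qsturng(p, v)
    return p


for v in (1, 1.02, 2.5):
    try:
        print(v, "original ->", toy_dispatch(v))
    except ValueError as err:
        print(v, "original -> ValueError:", err)
    print(v, "patched  ->", toy_dispatch(v, patched=True))
```

In this toy, only v = 1.02 fails under the original test; whether sending that case down the v == 1 path gives sensible probabilities is the open question.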

josef...@gmail.com

Feb 23, 2020, 4:17:01 PM
to pystatsmodels
Open an issue on GitHub for this kind of thing. GitHub issues are easier to keep track of for future reference.
And add an example that errors.

What I would do is change the code to allow this case and check whether the results make sense, and/or whether it then breaks somewhere else.
If that case is not correctly handled in the current approximation of the distribution, then we would have to come up with an extension or alternative.

libqsturng was taken from another package. My guess is that it is not maintained anymore.
Then, we have to figure it out on our own.

I never went through the details of libqsturng

to the issue, IIUC: 
Isn't  1 < v < 2 a very small sample, i.e. we have effectively less than two observations left?

Josef

originalb...@gmail.com

Feb 14, 2021, 1:25:32 PM
to pystatsmodels
Josef,

Noting this here because I'm not sure what pops up on your radar faster: I finally had a chance to dig into this issue more and just posted a new comment on the issue on GitHub.

Also, I recently found a different bug. It is similar but distinct, so I posted an additional issue.

I'm alerting you because I have time this week to put up PRs to fix both of these bugs if we agree on the solutions enough for me to proceed (I can also just put up the PRs and we can discuss there too).

I look forward to your thoughts on these.  I'm much more confident that the solution I propose to the new bug issue is mathematically correct than the second one where I'm only confident based on the current implementation.  

Thanks!
Amelia

originalb...@gmail.com

Feb 15, 2021, 2:23:37 PM
to pystatsmodels

FYI, something seems very off. My understanding is that `ptukey` in R and `psturng` in libqsturng should return the same values. However, in R, ptukey(12, 10, 2.02) = 0.9270956 and psturng(12, 10, 2.02) = 0.06583197, which is close to, but not the same as, 1 - 0.9270956 = 0.0729044 (I discovered this while trying to make a test for Issue #7324 to put that fix up as a PR).

I might be missing something, but as far as I can tell these should return the same values. I'm using this documentation for R and comparing that to the docstring in qsturng_.py.

I did a spot check on the values in the `make_tbl.py` file, so those seem right. I'm not sure where this goes sideways. I am seriously hoping this is me misunderstanding something and not a much bigger bug than the ones I found previously.

- Amelia

josef...@gmail.com

Feb 15, 2021, 2:28:33 PM
to pystatsmodels
On Mon, Feb 15, 2021 at 2:23 PM originalb...@gmail.com <originalb...@gmail.com> wrote:

FYI, something seems very off. My understanding is that `ptukey` in R and `psturng` in libqsturng should return the same values. However, in R, ptukey(12, 10, 2.02) = 0.9270956 and psturng(12, 10, 2.02) = 0.06583197, which is close to, but not the same as, 1 - 0.9270956 = 0.0729044 (I discovered this while trying to make a test for Issue #7324 to put that fix up as a PR).

Quick check: I get the following in R v. 3.6.1, with only the copula and evd packages loaded.

> ptukey(12, 10, 2.02)
[1] 0.9341745548005985
>  1 - ptukey(12, 10, 2.02)
[1] 0.06582544519940148

 

originalb...@gmail.com

Feb 15, 2021, 2:50:10 PM
to pystatsmodels
Thank you. That's good news. I'll check my R version (likely it is old). Also, digging into the docs a bit more, it is the case that `psturng` should be `1-ptukey`. This helps a lot. I'm digging into the other issue a bit more and will have a PR up for #7324 today.
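As a quick arithmetic check of the `1-ptukey` relationship, the numbers quoted in this thread can be compared directly. This sketch only reuses the values posted above; it does not call R or statsmodels, and the variable names are illustrative.

```python
# Values copied from the messages in this thread.
ptukey_r361 = 0.9341745548005985   # ptukey(12, 10, 2.02) from R 3.6.1
ptukey_first = 0.9270956           # value from the first R run reported above
psturng_value = 0.06583197         # psturng(12, 10, 2.02) from libqsturng

upper_tail = 1 - ptukey_r361
diff_r361 = abs(psturng_value - upper_tail)
diff_first = abs(psturng_value - (1 - ptukey_first))

print(upper_tail)   # about 0.0658254
print(diff_r361)    # on the order of 1e-6
print(diff_first)   # on the order of 1e-3
```

So against the R 3.6.1 result the two agree to roughly five decimal places, which is plausible for an approximation, while the first R value is off by about 0.007.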

josef...@gmail.com

Feb 15, 2021, 3:01:29 PM
to pystatsmodels
On Mon, Feb 15, 2021 at 2:50 PM originalb...@gmail.com <originalb...@gmail.com> wrote:
Thank you. That's good news. I'll check my R version (likely it is old). Also, digging into the docs a bit more, it is the case that `psturng` should be `1-ptukey`. This helps a lot. I'm digging into the other issue a bit more and will have a PR up for #7324 today.

Good. We can continue in issues and PRs.

I might not be able to look at details that I don't remember right away, because my head is still stuck in copulas and Bernstein polynomials.

(It's annoying when I write formulas in code or code comments and don't add a reference. I need to hunt for the right reference and pdf file again.)

Josef


 