I expect that somebody is doing a study of the changes.
--
http://www.standards.com/; See Howard Kaikow's web site.
"David" <metheg...@yahoo.ie> wrote in message
news:82ac4f89.04031...@posting.google.com...
"Jody Goldberg" <jo...@gnome.org> wrote in message
news:slrnc5mf2...@athlon.thegoldbergs.ca...
In the report, McCullough reruns some tests done by Knuesel on
Microsoft Excel 97 in 1998. The conclusion is that Microsoft have done
almost nothing to improve their statistical functions (I can't argue
with that) while "Gnumeric has largely fixed its flaws" (unfortunately
I do disagree with this although it's hard to say what "largely" is
supposed to mean).
I haven't got an executable copy of gnumeric to give specific
examples, but here are a few problems which can be detected from a
quick look at the code in "mathfunc.c".
pgeom
-----
R_DT_Cval(powgnum(1 - p, x + 1)) which, if not log_p, simplifies to
(lower_tail ? (1 - (powgnum(1 - p, x + 1))) : (powgnum(1 - p, x + 1)))
So there are 2 potential "1-" disasters waiting to happen: "1 - p"
loses accuracy when p is tiny, and "1 - powgnum(...)" cancels when the
power is close to 1.
For a cure see pexp. I think the calculations there avoid all "1-"
problems.
phyper
------
term = lfastchoose(NR, xr) + lfastchoose(NB, xb) - lfastchoose(N, n);
uses the log of the gamma function, which makes it inaccurate for large
values - just what dhyper seeks to avoid!
It also uses the R_DT_val macro, so there are "1-" problems as well.
This algorithm is a disaster (it's slow and inaccurate) and needs to be
replaced.
pcauchy
-------
uses R_DT_val macro so there are "1-" problems
For cure see pexp.
pgamma
------
uses normal approx if shape parameter, alph > 1000 so not very
accurate.
uses R_D_val(1 - sum) macro so there are "1-" problems if sum is close
to 1, or if sum is small and it "logs" it.
The "1-" problem where sum is close to 1 is a real problem, not a
careless bug which can be easily removed.
e.g. if x = 1, alph = 10^(-n) and lower_tail is false, then we lose n
figures of accuracy in the answer.
pbeta
------
works with x only, and not with 1-x as well. Callers should choose
whether to call beta(x,p,q,lower_tail_option,log_p_option) or
beta(accurate version of 1-x,q,p,!lower_tail_option,log_p_option), but
they don't bother, so they don't yield results as accurate as they might.
pbeta_raw "takes forever (or ends wrongly) when (one or) both p & q
are huge" (from FIXME comment)
I don't know the origins of the algorithm well enough to say
definitely, "it can't be accurate for very small p and q", but I would
be surprised if it were accurate there.
pt
--
not accurate for large degrees of freedom.
pf
--
not accurate for large degrees of freedom. Not accurate elsewhere
because pbeta is not accurate.
Ian Smith
Well, the only place where they considered that there was a continuing
flaw was in the fact that Gnumeric was using a "true" random number
generator rather than using a pseudo-RNG.
For all of the tests that they did where they had found Gnumeric
wanting in version 0.67, they found that 1.1.2 had resolved the flaws
that they had found.
It looks as though it still prefers /dev/urandom; I'm not sure how to
explicitly get at the pseudo-RNG...
But when the Gnumeric developers had fixed nearly all of the flaws
that they had found, the conclusion "Gnumeric has largely fixed its
flaws" seems not too "out there," even if it is poor grammar.
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/spreadsheets.html
If a mute swears, does his mother wash his hands with soap?
The comment is related only to the flaws he mentioned to us after
reviewing the early development version.
> I haven't got an executable copy of gnumeric to give specific
> examples, but here are a few problems which can be detected from a
> quick look at the code in "mathfunc.c".
Thanks. This is exactly the sort of information we're looking for.
I've forwarded your write-up to the list. Depending on the
magnitude of the changes we may even be able to backport the fixes
to the 1.2.x stable tree for release later this month.