[PATCH] g-timer.boot: avoid duplication in statistics calculation and simplify code

7 views
Skip to first unread message

Qian Yun

unread,
Nov 4, 2023, 10:28:43 PM11/4/23
to fricas-devel
I introduce a new variable "roundingError", its valued was mistakenly
added to the 'other' class, later it was added to "total" again,
causing duplication. So I change it to only add to
the 'other' named stat.

The comparison "if n >= 0.01" is wrong, since we take 2 digits
accuracy, so it should compare against 0.005, see "significantStat".
Also "roundStat" is totally useless because it rounds to the third
digit. We can rely on "FORMAT" to do the rounding.

I also modified "makeStatString" to simplify the code a bit.

- Qian

diff --git a/src/interp/g-timer.boot b/src/interp/g-timer.boot
index 2b6e2473..689d614c 100644
--- a/src/interp/g-timer.boot
+++ b/src/interp/g-timer.boot
@@ -41,56 +41,46 @@ makeLongStatStringByProperty _
total := 0
str := '""
otherStatTotal := GET('other, property)
+ roundingError := 0
for [name,class,:ab] in listofnames repeat
- name = 'other => 'iterate
cl := first LASSOC(class, listofclasses)
n := GET(name, property)
PUT(cl, classproperty, n + GET(cl, classproperty))
total := total + n
- if n >= 0.01
- then timestr := normalizeStatAndStringify n
- else
- timestr := '""
- otherStatTotal := otherStatTotal + n
- str := makeStatString(str, timestr, name, flag)
- PUT('other, property, otherStatTotal)
- if otherStatTotal > 0 then
- timestr := normalizeStatAndStringify otherStatTotal
- str := makeStatString(str, timestr, 'other, flag)
- total := total + otherStatTotal
- cl := first LASSOC('other, listofnames)
- cl := first LASSOC(cl, listofclasses)
- PUT(cl, classproperty, otherStatTotal + GET(cl, classproperty))
+ name = 'other => 'iterate
+ if significantStat n then
+ str := makeStatString(str, n, name, flag)
+ else
+ roundingError := roundingError + n
+ str := makeStatString(str, otherStatTotal + roundingError, 'other, flag)
if flag ~= 'long then
total := 0
str := '""
for [class,name,:ab] in listofclasses repeat
n := GET(name, classproperty)
- n = 0.0 or n = 0 => 'iterate
total := total + n
- timestr := normalizeStatAndStringify n
- str := makeStatString(str,timestr,ab,flag)
+ str := makeStatString(str, n, ab, flag)
total := STRCONC(normalizeStatAndStringify total,'" ", units)
str = '"" => total
STRCONC(str, '" = ", total)

normalizeStatAndStringify t ==
FLOATP t =>
- t := roundStat t
- t = 0.0 => '"0"
- FORMAT(nil,'"~,2F",t)
+ significantStat t => FORMAT(nil, '"~,2F", t)
+ '"0"
INTEGERP t => FORMAT(nil, '"~:d", t)
STRINGIMAGE t

-roundStat t ==
- not FLOATP t => t
- (TRUNCATE (0.5 + t * 1000.0)) / 1000.0
+-- check if argument is significant enough to be printed.
+-- current printing accuracy is 2 digits after decimal point.
+significantStat t == t >= 0.005

makeStatString(oldstr,time,abb,flag) ==
- time = '"" => oldstr
+ not significantStat time => oldstr
opening := (flag = 'long => '"("; '" (")
- oldstr = '"" => STRCONC(time,opening,abb,'")")
- STRCONC(oldstr,'" + ",time,opening,abb,'")")
+ timestr := normalizeStatAndStringify time
+ oldstr = '"" => STRCONC(timestr, opening, abb, '")")
+ STRCONC(oldstr, '" + ", timestr, opening, abb, '")")

peekTimedName() == IFCAR $timedNameStack
g-timer.patch

Qian Yun

unread,
Nov 7, 2023, 7:12:19 AM11/7/23
to fricas-devel
Or we can take one step further -- don't consider roundingError at all.

- Qian

Waldek Hebisch

unread,
Nov 13, 2023, 5:42:29 PM11/13/23
to fricas...@googlegroups.com
On Sun, Nov 05, 2023 at 10:28:38AM +0800, Qian Yun wrote:
> I introduce a new variable "roundingError", its valued was mistakenly
> added to the 'other' class, later it was added to "total" again,
> causing duplication. So I change it to only add to
> the 'other' named stat.
>
> The comparison "if n >= 0.01" is wrong, since we take 2 digits
> accuracy, so it should compare against 0.005, see "significantStat".
> Also "roundStat" is totally useless because it rounds to the third
> digit. We can rely on "FORMAT" to do the rounding.

I suspect that intent of "if n >= 0.01" is different than you think.
Namely, small statistics are likely to have significant error
so it makes sense to supress printing of info for small classes.
I other words, it make sense to have treshold which is bigger
than treshold for rounding to 0.

AFAICS original code is first computing long statistics and if non-long
statistics are desired it throws out computed result and starts again.
It would make sense to have common code which just computes desired
statistics.

Also, before your recent changes long format skipped "other" class
when corresponding time would print as 0 (or maybe was small). Now
it is:

Time: 0.02(evaluation) + 0.00(other) = 0.03 sec

One more thing: did you try to get more accurate timing? In different
context I have found useful microsecond timings: al least on Linux it
is possible to measure real time taken be a routine with microsecond
accuracy. I am not sure how much accuracy one can get from Lisp
routines, but IIRC Clozure CL reports time with microsecond resolution.
> --
> You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to fricas-devel...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/fricas-devel/15923354-2e48-443a-8efc-2ec836994726%40gmail.com.
--
Waldek Hebisch

Qian Yun

unread,
Nov 14, 2023, 4:47:29 AM11/14/23
to fricas...@googlegroups.com


On 11/14/23 06:42, Waldek Hebisch wrote:
> On Sun, Nov 05, 2023 at 10:28:38AM +0800, Qian Yun wrote:
>> I introduce a new variable "roundingError", its valued was mistakenly
>> added to the 'other' class, later it was added to "total" again,
>> causing duplication. So I change it to only add to
>> the 'other' named stat.
>>
>> The comparison "if n >= 0.01" is wrong, since we take 2 digits
>> accuracy, so it should compare against 0.005, see "significantStat".
>> Also "roundStat" is totally useless because it rounds to the third
>> digit. We can rely on "FORMAT" to do the rounding.
>
> I suspect that intent of "if n >= 0.01" is different than you think.
> Namely, small statistics are likely to have significant error
> so it makes sense to supress printing of info for small classes.
> I other words, it make sense to have treshold which is bigger
> than treshold for rounding to 0.

How about we do no rounding at all, and print 3 digits (or more, see
following) after decimal point, and let user to decide if the last
digit is to be trusted.

> AFAICS original code is first computing long statistics and if non-long
> statistics are desired it throws out computed result and starts again.
> It would make sense to have common code which just computes desired
> statistics.

I can do this improvement.

> Also, before your recent changes long format skipped "other" class
> when corresponding time would print as 0 (or maybe was small). Now
> it is:
>
> Time: 0.02(evaluation) + 0.00(other) = 0.03 sec

This happened before as well (when the number is less than 0.005):

Time: 0.00(O) = 0.00 sec

> One more thing: did you try to get more accurate timing? In different
> context I have found useful microsecond timings: al least on Linux it
> is possible to measure real time taken be a routine with microsecond
> accuracy. I am not sure how much accuracy one can get from Lisp
> routines, but IIRC Clozure CL reports time with microsecond resolution.

The timing accuracy is determined by $timerTicksPerSecond which is
INTERNAL-TIME-UNITS-PER-SECOND in Lisp:

GCL: 100
CLISP: 1000000 (windows: 10^7)
ECL: 1000000
SBCL: 1000000 (x86: 1000)
CMUCL: 100
CCL: 1000000 (x86: 1000)
ABCL: 1000

So that's probably why we were printing 2 digits in the past.
(And Lisps have lower accuracy due to 32-bit limitation.)
If that's the case, then the whole rounding logic did nothing
in the GCL era.

Shall we make it configurable how many digits to be printed?

- Qian

Waldek Hebisch

unread,
Nov 14, 2023, 9:08:32 AM11/14/23
to fricas...@googlegroups.com
On Tue, Nov 14, 2023 at 05:47:24PM +0800, Qian Yun wrote:
>
>
> On 11/14/23 06:42, Waldek Hebisch wrote:
> > On Sun, Nov 05, 2023 at 10:28:38AM +0800, Qian Yun wrote:
> > > I introduce a new variable "roundingError", its valued was mistakenly
> > > added to the 'other' class, later it was added to "total" again,
> > > causing duplication. So I change it to only add to
> > > the 'other' named stat.
> > >
> > > The comparison "if n >= 0.01" is wrong, since we take 2 digits
> > > accuracy, so it should compare against 0.005, see "significantStat".
> > > Also "roundStat" is totally useless because it rounds to the third
> > > digit. We can rely on "FORMAT" to do the rounding.
> >
> > I suspect that intent of "if n >= 0.01" is different than you think.
> > Namely, small statistics are likely to have significant error
> > so it makes sense to supress printing of info for small classes.
> > I other words, it make sense to have treshold which is bigger
> > than treshold for rounding to 0.
>
> How about we do no rounding at all, and print 3 digits (or more, see
> following) after decimal point, and let user to decide if the last
> digit is to be trusted.

Well, as you noted FORMAT will do rounding to specified precision.
_If_ small values have significant error, then it makes sense to
supress printing of such values (except for total): printout
is less cluttered and user that cares about specific class will need
longer time to get meaningful result.

> > AFAICS original code is first computing long statistics and if non-long
> > statistics are desired it throws out computed result and starts again.
> > It would make sense to have common code which just computes desired
> > statistics.
>
> I can do this improvement.
>
> > Also, before your recent changes long format skipped "other" class
> > when corresponding time would print as 0 (or maybe was small). Now
> > it is:
> >
> > Time: 0.02(evaluation) + 0.00(other) = 0.03 sec
>
> This happened before as well (when the number is less than 0.005):
>
> Time: 0.00(O) = 0.00 sec

Maybe this happened sometime. But I obtained:

Time: 0.02(E) = 0.02 sec

from old code and result with "0.00(other)" using new code. I think
we should supress printing of "0.00" time except for total. Old code tried
to replace "0.00" by "0" but as you show it failed at least in some
cases.

> > One more thing: did you try to get more accurate timing? In different
> > context I have found useful microsecond timings: al least on Linux it
> > is possible to measure real time taken be a routine with microsecond
> > accuracy. I am not sure how much accuracy one can get from Lisp
> > routines, but IIRC Clozure CL reports time with microsecond resolution.
>
> The timing accuracy is determined by $timerTicksPerSecond which is
> INTERNAL-TIME-UNITS-PER-SECOND in Lisp:
>
> GCL: 100
> CLISP: 1000000 (windows: 10^7)
> ECL: 1000000
> SBCL: 1000000 (x86: 1000)
> CMUCL: 100
> CCL: 1000000 (x86: 1000)
> ABCL: 1000

Well, that is resolution of the result. But accuracy may
be lower and question is if we get accuracy significantly better
than 2 digits.

> So that's probably why we were printing 2 digits in the past.
> (And Lisps have lower accuracy due to 32-bit limitation.)
> If that's the case, then the whole rounding logic did nothing
> in the GCL era.
>
> Shall we make it configurable how many digits to be printed?

Probably. Even if accuracy is low it make sense to have somewhat
higher resolution of the result simply to avoid extra error due
to rounding. And if system can deliver better accuracy it is
silly to loose accuracy by rounding to 2 digits. OTOH people
will mostly look at time when it is long and for long times
2 digits after dot are enough.

--
Waldek Hebisch

Qian Yun

unread,
Nov 14, 2023, 10:00:59 AM11/14/23
to fricas...@googlegroups.com
I have done all the changes you requested, please review again.

- Qian
g-timer-v2.patch

Waldek Hebisch

unread,
Nov 14, 2023, 10:43:59 AM11/14/23
to fricas...@googlegroups.com
On Tue, Nov 14, 2023 at 11:00:54PM +0800, Qian Yun wrote:
> I have done all the changes you requested, please review again.

OK, please commit.

--
Waldek Hebisch
Reply all
Reply to author
Forward
0 new messages