benchmark timings with gforth 0.7.3

none albert

Jun 7, 2022, 12:50:29 PM

I have a benchmark that repeats the infamous Byte sieve benchmark 10000
times.

The timings with mpeforth, swiftforth, lina, optimised-lina and gforth-fast
are reasonably reproducible, say within 10 percent at most, mostly better.
E.g.
time 2>&1 nice -20 gforth-fast ./sieve10k.frt
gives 3.3 seconds on my 64-bit AMD at 4 GHz, every time.

However
time 2>&1 nice -20 gforth ./sieve10k.frt
gives 6.5 seconds the first time and then, e.g., 4.2 seconds the second time.

What makes gforth 0.7.3 behave differently?

groetjes Albert

P.S. A typical test output is
lina plain
4.90user 0.00system 0:04.91elapsed 99%CPU (0avgtext+0avgdata 1348maxresident)k
0inputs+0outputs (0major+87minor)pagefaults 0swaps
gforth plain
5.97user 0.00system 0:05.97elapsed 99%CPU (0avgtext+0avgdata 3104maxresident)k
0inputs+0outputs (0major+399minor)pagefaults 0swaps
gforth fast
3.33user 0.00system 0:03.34elapsed 99%CPU (0avgtext+0avgdata 3068maxresident)k
0inputs+0outputs (0major+342minor)pagefaults 0swaps
lina optimised
$Revision: 1.21 $
0.88user 0.00system 0:00.88elapsed 99%CPU (0avgtext+0avgdata 1348maxresident)k
0inputs+0outputs (0major+84minor)pagefaults 0swaps
swiftforth
0.88user 0.00system 0:00.88elapsed 99%CPU (0avgtext+0avgdata 1876maxresident)k
0inputs+0outputs (0major+216minor)pagefaults 0swaps
mpeforth
0.69user 0.00system 0:00.69elapsed 100%CPU (0avgtext+0avgdata 1780maxresident)k
0inputs+0outputs (0major+2128minor)pagefaults 0swaps
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Anton Ertl

Jun 8, 2022, 8:12:40 AM

albert@cherry.(none) (albert) writes:
>I have a benchmark with the infamous byte benchmark repeated 10000
>times.
>
>The timings with mpeforth,swiftforth,lina and optimised-lina and gforth-fast
>are reasonably reproducible, say at most 10 percent, mostly better.
>E.g.
>time 2>&1 nice -20 gforth-fast ./sieve10k.frt
>give 3.3 seconds on my AMD 64 bit 4Ghz, all the time.
>
>However
>time 2>&1 nice -20 gforth ./sieve10k.frt
>gives 6.5 seconds and then the second time e.g. 4.2 seconds.
>
>What makes gforth 0.7.3 behave differently?

Nothing particular to gforth-0.7.3 that I can think of.

One thing that I can think of, but that would affect everything, is
whether you are using a CPU with SMT (aka Hyperthreading). If another
thread runs on the same core, this tends to slow down your thread while
still seeming to take 100% CPU time (unlike classic OS time-multiplexing
of CPUs, where you get a longer elapsed time, but roughly the same user
and system time for running the same job while competing with another
job for CPU resources).
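
To see whether two hardware threads share a core, a minimal Python sketch follows. It assumes Linux, where the kernel lists the sibling threads of each CPU under /sys/devices/system/cpu. The function names are made up for illustration, and the script only reports the topology; it does not show what was running on the sibling during a benchmark.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a Linux CPU list such as "0,8" or "0-3,8" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def smt_siblings(cpu=0):
    """Return the hardware threads sharing a core with `cpu`, or None if unknown."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    try:
        return parse_cpu_list(path.read_text())
    except (OSError, ValueError):
        return None  # not Linux, or the file is missing or unreadable


if __name__ == "__main__":
    siblings = smt_siblings(0)
    if siblings is None:
        print("could not read the sibling list")
    elif len(siblings) > 1:
        print("cpu0 shares its core with:", siblings)
    else:
        print("cpu0 has its core to itself (SMT off or absent)")
```

If the list has more than one entry, a busy job on the sibling could slow the benchmark without showing up in its user or elapsed time.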

>gforth plain
>5.97user 0.00system 0:05.97elapsed 99%CPU (0avgtext+0avgdata 3104maxresident)k
>0inputs+0outputs (0major+399minor)pagefaults 0swaps

user=elapsed means that gforth ran exclusively on the thread.
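
The user=elapsed check can be done mechanically. The Python sketch below parses one summary line in the format shown in the quoted output and computes (user + system) / elapsed. The regular expression and function names are illustrative and only cover this default output format.

```python
import re

# Matches the start of a summary line like the ones quoted above.
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)user\s+(?P<sys>[\d.]+)system\s+"
    r"(?P<elapsed>[\d:.]+)elapsed"
)


def to_seconds(clock):
    """Convert "0:05.97" or "1:02:03" style clock text to seconds."""
    seconds = 0.0
    for field in clock.split(":"):
        seconds = seconds * 60 + float(field)
    return seconds


def cpu_share(line):
    """Return (user + system) / elapsed for one summary line."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("not a recognised time summary line")
    busy = float(m["user"]) + float(m["sys"])
    return busy / to_seconds(m["elapsed"])


if __name__ == "__main__":
    line = ("5.97user 0.00system 0:05.97elapsed 99%CPU "
            "(0avgtext+0avgdata 3104maxresident)k")
    print(f"{cpu_share(line):.2f}")
```

A ratio close to 1 means the process was not waiting for the CPU. With SMT it still does not tell you whether the sibling thread was busy.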

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: http://www.euroforth.org/ef22/cfp.html

Marcel Hendrix

Jun 8, 2022, 9:20:19 AM

On Tuesday, June 7, 2022 at 6:50:29 PM UTC+2, none albert wrote:
> I have a benchmark with the infamous byte benchmark repeated 10000
> times.
>
> The timings with mpeforth,swiftforth,lina and optimised-lina and gforth-fast
> are reasonably reproducible, say at most 10 percent, mostly better.
> E.g.
> time 2>&1 nice -20 gforth-fast ./sieve10k.frt
> give 3.3 seconds on my AMD 64 bit 4Ghz, all the time.
>
> However
> time 2>&1 nice -20 gforth ./sieve10k.frt
> gives 6.5 seconds and then the second time e.g. 4.2 seconds.
>
> What makes gforth 0.7.3 behave differently?

[..]

I don't know about Gforth, but I have had problems with power saving
schemes on Windows. The typical (non-high-performance) scheme
leads to disks being sent to sleep (sometimes a delay of seconds). Even
when that is not the case, the performance is about 50% of what is
possible.
There is of course also a cache effect if there is not enough memory.

What happens if you run the test more than twice (say 10 times)?
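
A minimal Python sketch of such a repeated run follows. It assumes gforth is on the PATH and that sieve10k.frt sits in the current directory and exits by itself. The helper names are made up, and it measures wall-clock time only, not user time.

```python
import os
import shutil
import statistics
import subprocess
import time


def time_runs(cmd, runs=10):
    """Run `cmd` `runs` times and return the elapsed wall-clock seconds of each."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdin=subprocess.DEVNULL,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return min, median, max and the max/min ratio of a list of timings."""
    lo, hi = min(timings), max(timings)
    return {"min": lo, "median": statistics.median(timings),
            "max": hi, "spread": hi / lo}


if __name__ == "__main__":
    # Assumption: gforth is installed and the benchmark file is present.
    if shutil.which("gforth") and os.path.exists("./sieve10k.frt"):
        results = time_runs(["gforth", "./sieve10k.frt"], runs=10)
        for i, t in enumerate(results, 1):
            print(f"run {i:2d}: {t:.2f} s")
        print(summarize(results))
    else:
        print("gforth or sieve10k.frt not found; nothing measured")
```

If only the first run is slow, that points to a warm-up effect. If the slow runs are scattered, something else on the machine is more likely.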

-marcel

Anton Ertl

Jun 8, 2022, 10:33:54 AM

Marcel Hendrix <m...@iae.nl> writes:
>On Tuesday, June 7, 2022 at 6:50:29 PM UTC+2, none albert wrote:
>> I have a benchmark with the infamous byte benchmark repeated 10000
>> times.
>>
>> The timings with mpeforth,swiftforth,lina and optimised-lina and gforth-fast
>> are reasonably reproducible, say at most 10 percent, mostly better.
>> E.g.
>> time 2>&1 nice -20 gforth-fast ./sieve10k.frt
>> give 3.3 seconds on my AMD 64 bit 4Ghz, all the time.
>>
>> However
>> time 2>&1 nice -20 gforth ./sieve10k.frt
>> gives 6.5 seconds and then the second time e.g. 4.2 seconds.
>>
>> What makes gforth 0.7.3 behave differently?
>[..]
>
>I don't know about Gforth, but I have had problems with power saving
>schemes on Windows.

Good point. CPUs these days don't just run at a fixed 4GHz. Instead, the
clock rate depends on a number of factors, including how CPU-intensive the job is
(should not be a problem in this case), how many other cores are
loaded, the total power consumption, and the temperature of the CPU.
That's one reason why I like to measure cycles rather than seconds for
CPU-intensive stuff like this.
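
As a worked example of why seconds can mislead, the Python sketch below converts seconds to cycles. It then asks what clock rate would explain the 6.5 second run if it had used the same number of cycles as the 4.2 second run. That the 4.2 second run really ran at 4 GHz is an assumption, not something measured in this thread. The last function reads the kernel's current frequency on Linux systems that expose cpufreq. All names are illustrative.

```python
from pathlib import Path


def cycles(seconds, ghz):
    """Cycles spent in `seconds` at a constant clock of `ghz` GHz."""
    return seconds * ghz * 1e9


def implied_clock_ghz(fast_s, slow_s, fast_ghz):
    """Clock that would explain `slow_s` if both runs used the same cycles."""
    return fast_ghz * fast_s / slow_s


def current_khz(cpu=0):
    """Read the kernel's current frequency for `cpu` in kHz, or None."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq")
    try:
        return int(path.read_text())
    except (OSError, ValueError):
        return None  # no cpufreq support, or not Linux


if __name__ == "__main__":
    # Hypothetical: the 4.2 s run is taken to have run at the nominal 4 GHz.
    print(f"{cycles(4.2, 4.0):.3g} cycles")
    print(f"{implied_clock_ghz(4.2, 6.5, 4.0):.2f} GHz")
    print("scaling_cur_freq (kHz):", current_khz(0))
```

Under that assumption the slow run would correspond to a clock of roughly 2.6 GHz. A cycle count makes such a difference visible directly, whereas seconds alone cannot separate it from other causes.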

>There is of course also a cache effect if there is not enough memory.

The Byte sieve that he measured should easily be within the L1 cache.
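
A rough check of that claim, as a Python sketch: the classic Byte sieve uses 8190 as its size, so the flag array is about 8 KB. That sieve10k.frt uses this size is an assumption, since the file is not shown here. The 32 KiB L1 data cache figure is also only an assumed typical value; check your own CPU.

```python
SIZE = 8190            # size of the classic Byte sieve (assumed for sieve10k.frt)
L1D_BYTES = 32 * 1024  # assumed L1 data cache size; check your own CPU


def byte_sieve(size=SIZE):
    """One pass of the classic Byte sieve; returns the number of primes found."""
    flags = bytearray([1]) * (size + 1)  # one byte per odd number 3, 5, 7, ...
    count = 0
    for i in range(size + 1):
        if flags[i]:
            prime = i + i + 3
            for k in range(i + prime, size + 1, prime):
                flags[k] = 0  # strike out the odd multiples of prime
            count += 1
    return count


if __name__ == "__main__":
    print("primes found:", byte_sieve())
    print("flag bytes:", SIZE + 1, "of", L1D_BYTES, "assumed L1D bytes")
```

With these numbers the working set is about a quarter of the assumed cache, so a cache effect is an unlikely explanation for the variation.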