Forth Performance Question

Robert L.

Jul 26, 2017, 16:23:39
Marcel Hendrix wrote:

> ( Run on a 64-bit nc Forth, Windows 7 64bit, Intel Core i7 920, 2.67 GHz)
> FORTH> : foo 1 1000000 * drop ;
> FORTH> \ : test 1 100000000 ?DO foo LOOP ;
> : test cr timer-reset 100000000 1 ?DO foo LOOP .elapsed ;
> FORTH> test
> 0.219 seconds elapsed. ok

Let's multiply the number of iterations by 5.

(define-inline (foo) (fx* 1 1000000))

(define (test)
  (let loop ((n 500000000))
    (when (fx> n 0)
      (foo)
      (loop (fx- n 1)))))

(time (test))

After compiling with -O3 and running on a winXP laptop:

0.578s CPU time


Dividing by 5 yields 0.1156. Almost twice as fast as Forth.
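The normalization above can be checked with a short sketch. The variable names are illustrative, and the figures are the ones quoted in this post; since the two timings come from different machines, the ratio is only a rough indication.

```python
# Figures quoted in the post; the two runs were on different machines.
forth_seconds = 0.219    # about 100,000,000 iterations (Forth, Core i7 920)
scheme_seconds = 0.578   # 500,000,000 iterations (Scheme, winXP laptop)
scale = 5                # the Scheme loop runs 5x as many iterations

# Scale the Scheme time down to the Forth iteration count.
scheme_normalized = scheme_seconds / scale
ratio = forth_seconds / scheme_normalized

print(f"normalized Scheme time: {scheme_normalized:.4f} s")
print(f"Forth / Scheme ratio:   {ratio:.2f}x")
```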


--
[T]he broadcast media ... create a separate and caustic virtual reality, then
broadcast that ideologically driven reality into the homes of millions of
people.... theoccidentalobserver.net/authors/Connelly-Gaza2.html

Julian Fondren

Jul 26, 2017, 17:20:37
On Wednesday, July 26, 2017 at 3:23:39 PM UTC-5, Robert L. wrote:
> Marcel Hendrix wrote:
>
> > ( Run on a 64-bit nc Forth, Windows 7 64bit, Intel Core i7 920, 2.67 GHz)
> > FORTH> : foo 1 1000000 * drop ;
> > FORTH> \ : test 1 100000000 ?DO foo LOOP ;
> > : test cr timer-reset 100000000 1 ?DO foo LOOP .elapsed ;
> > FORTH> test
> > 0.219 seconds elapsed. ok
>
> Let's multiply the number of iterations by 5.
>
> (define-inline (foo) (fx* 1 1000000))
>
> (define (test)
>   (let loop ((n 500000000))
>     (when (fx> n 0)
>       (foo)
>       (loop (fx- n 1)))))
>
> (time (test))
>
> After compiling with -O3 and running on a winXP laptop:
>
> 0.578s CPU time
>
>
> Dividing by 5 yields 0.1156. Almost twice as fast as Forth.

Wow!

I'm convinced. I wish to use your language and join your cult. I am
currently selling all of my Forth stuff. Please answer these simple
questions so that I can know how to spend the money I make from said
sales:

1. What Scheme is that?

2. What command do you run to compile with it?

Thanks. I've wondered this -- very frequently -- in response to your
earlier postings using this Scheme which isn't Gauche or Chicken and
therefore is completely unidentifiable to me, but now that I share your
enthusiasm for Scheme I can see that the questions of an unwashed
unbeliever are really beneath attention.

Paul Rubin

Jul 27, 2017, 01:29:51
Julian Fondren <julian....@gmail.com> writes:
> 1. What Scheme is that?
> 2. What command do you run to compile with it?

It looks like Chicken to me. csc -O3 a.scm gets 0.344 sec cpu time on
an i5-3570S at 3.1 ghz. So it sounds like WJ's laptop is pretty speedy
for a machine from the Windows XP era.

With csc -O5 it takes no time at all (loop optimized completely away).

minf...@arcor.de

Jul 27, 2017, 03:15:38
Chicken translates to C. You might be admiring the C compiler's efficiency. ;-)

Anton Ertl

Jul 27, 2017, 04:49:03
"Robert L." <No_spamming@noWhere_7073.org> writes:
>Marcel Hendrix wrote:
>
>> ( Run on a 64-bit nc Forth, Windows 7 64bit, Intel Core i7 920, 2.67 GHz)
>> FORTH> : foo 1 1000000 * drop ;
>> FORTH> \ : test 1 100000000 ?DO foo LOOP ;
>> : test cr timer-reset 100000000 1 ?DO foo LOOP .elapsed ;
>> FORTH> test
>> 0.219 seconds elapsed. ok
>
>Let's multiply the number of iterations by 5.

Let's multiply them by 10:

: foo 1 1000000 * drop ;
: test 1000000000 1 ?DO foo LOOP ;
see foo
see test
bye

perf stat -e cycles -e instructions vfxlin "include xxx.fs"

shows:

VFX Forth for Linux IA32 Version: 4.72 [build 0555]
Including xxx.fs
FOO
( 080C0AB0 C3 ) NEXT,
( 1 bytes, 1 instructions )

TEST
...
( 080C0AF0 83042401 ) ADD [ESP], 01
( 080C0AF4 8344240401 ) ADD [ESP+04], 01
( 080C0AF9 71F5 ) JNO 080C0AF0
...

As you can see, the multiplication itself is optimized away, and what
we measure is only the loop overhead. This is a previously known
weakness of VFX: at least 5 cycles latency per iteration on recent
Intel CPUs, thanks to updating the loop counters in memory (instead of
1 cycle when keeping them in registers).

Timing:

5511896630 cycles
3010622182 instructions # 0.55 insns per cycle
1.379565513 seconds time elapsed

I.e., 3 instructions and 5.5 cycles per iteration.
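The per-iteration figures follow directly from the perf counters. A minimal sketch, assuming the counter totals are dominated by the loop (they also include startup and compilation of xxx.fs, which this ignores):

```python
# Counter values copied from the perf stat output for VFX Forth above.
cycles = 5_511_896_630
instructions = 3_010_622_182

# "1000000000 1 ?DO ... LOOP" runs (limit - start) times.
iterations = 1_000_000_000 - 1

cycles_per_iter = cycles / iterations
insns_per_iter = instructions / iterations
insns_per_cycle = instructions / cycles

print(f"cycles per iteration:       {cycles_per_iter:.2f}")
print(f"instructions per iteration: {insns_per_iter:.2f}")
print(f"instructions per cycle:     {insns_per_cycle:.2f}")
```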

iForth-5.1-mini does not show the disassembly with SEE, so I'll spare
you that. The timing results are:

5790620961 cycles
3414925175 instructions # 0.59 insns per cycle

1.455163427 seconds time elapsed

After subtracting the substantial startup overhead of iForth, these
are essentially the same results as for VFX. So iForth also performs
three instructions per iteration (i.e., it probably optimizes the
multiplication away completely), and also takes 5.5 cycles per
iteration (i.e., it also makes the mistake of keeping the loop counter
in memory).

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2017: http://euro.theforth.net/

François

Jul 27, 2017, 09:50:32
The contest of the one with the biggest ...

m...@iae.nl

Jul 27, 2017, 13:03:20
On Thursday, July 27, 2017 at 7:29:51 AM UTC+2, Paul Rubin wrote:
> Julian Fondren <julian....@gmail.com> writes:
> > 1. What Scheme is that?
> > 2. What command do you run to compile with it?
>
> It looks like Chicken to me. csc -O3 a.scm gets 0.344 sec cpu time on
> an i5-3570S at 3.1 ghz. So it sounds like WJ's laptop is pretty speedy
> for a machine from the Windows XP era.
[..]

I think this is more in line with the chicken.

: foo 1 #1000000 * drop ;
: test cr timer-reset #100000000 begin foo 1- dup 0= until drop .elapsed ;
FORTH> test test test test
0.083 seconds elapsed.
0.083 seconds elapsed.
0.084 seconds elapsed.
0.071 seconds elapsed. ok
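To put these timings next to the earlier ?DO ... LOOP figure of 0.219 seconds, they can be converted to approximate cycles per iteration. This is only a sketch: it assumes the runs were made on the Core i7 920 mentioned at the start of the thread, running at its nominal 2.67 GHz, which the post does not state. The helper name cycles_per_iteration is illustrative.

```python
from statistics import median

CLOCK_HZ = 2.67e9  # assumption: nominal clock of the Core i7 920, no turbo


def cycles_per_iteration(seconds, iterations, clock_hz=CLOCK_HZ):
    """Convert an elapsed time into approximate cycles per loop pass."""
    return seconds * clock_hz / iterations


# BEGIN ... UNTIL version: counts 100,000,000 down to 0.
begin_runs = [0.083, 0.083, 0.084, 0.071]
begin_cpi = cycles_per_iteration(median(begin_runs), 100_000_000)

# Earlier "100000000 1 ?DO ... LOOP" version: limit - start iterations.
do_loop_cpi = cycles_per_iteration(0.219, 100_000_000 - 1)

print(f"BEGIN/UNTIL: about {begin_cpi:.1f} cycles per iteration")
print(f"?DO/LOOP:    about {do_loop_cpi:.1f} cycles per iteration")
```

Under that assumption the counted loop costs several cycles per pass, while the BEGIN version, which keeps its counter on the stack, costs roughly two.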

-marcel

Julian Fondren

Jul 27, 2017, 16:35:19
On Thursday, July 27, 2017 at 12:29:51 AM UTC-5, Paul Rubin wrote:
> Julian Fondren <julian....@gmail.com> writes:
> > 1. What Scheme is that?
> > 2. What command do you run to compile with it?
>
> It looks like Chicken to me. csc -O3 a.scm gets 0.344 sec cpu time on
> an i5-3570S at 3.1 ghz. [...]
>
> With csc -O5 it takes no time at all (loop optimized completely away).

Ah, thanks. I guess I just didn't have the extensions he was using
when I tried Chicken on an earlier posting of his.

> an i5-3570S at 3.1 ghz. So it sounds like WJ's laptop is pretty speedy
> for a machine from the Windows XP era.

It's such an obvious trick that it's probably just laziness.

On a MacBook Pro (Intel Core i7, 2.2 GHz), with Chicken 4.12 and iForth,
each using the same number of iterations:

csc -O3: 0.063s CPU time, maximum live heap: 236.11 KiB
LOOP: 0.200 seconds elapsed.
BEGIN: 0.035 seconds elapsed.
LOCALS: 0.250 seconds elapsed.
FOR: 0.189 seconds elapsed.

With

: test3 ( -- )
  cr timer-reset #100000000 locals| i |
  BEGIN i WHILE foo -1 +to i REPEAT .elapsed ;

: test4 ( -- )
  cr timer-reset #100000000 1-
  FOR foo NEXT .elapsed ;
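For a quick comparison, the five timings above can be expressed relative to the fastest one. This is a rough sketch only: the Chicken figure is CPU time while the Forth figures are elapsed time, so the ratios are indicative at best.

```python
# Timings in seconds, copied from the list above.
timings = {
    "csc -O3": 0.063,  # CPU time
    "LOOP": 0.200,     # elapsed
    "BEGIN": 0.035,    # elapsed
    "LOCALS": 0.250,   # elapsed
    "FOR": 0.189,      # elapsed
}

fastest = min(timings, key=timings.get)
relative = {name: t / timings[fastest] for name, t in timings.items()}

# Print the variants from fastest to slowest.
for name, ratio in sorted(relative.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} {timings[name]:.3f} s  {ratio:.2f}x")
```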

Rod Pemberton

Jul 28, 2017, 01:06:59
Did you mean this Chicken?

https://esolangs.org/wiki/Chicken


Rod Pemberton
--
Liberals love to point out that vehicles contribute to climate change.
Conservatives should point out that living in skyscrapers does so too.

Julian Fondren

Jul 28, 2017, 04:33:50
I mean the one with the parentheses: https://www.call-cc.org/

It's a better implementation than the language deserves, really.

Rod Pemberton

Jul 28, 2017, 18:49:03
On Fri, 28 Jul 2017 01:33:49 -0700 (PDT)
Julian Fondren <julian....@gmail.com> wrote:

> On Friday, July 28, 2017 at 12:06:59 AM UTC-5, Rod Pemberton wrote:
> > On Wed, 26 Jul 2017 14:20:35 -0700 (PDT)
> > Julian Fondren <julian....@gmail.com> wrote:

> > > Thanks. I've wondered this -- very frequently -- in response to
> > > your earlier postings using this Scheme which isn't Gauche or
> > > Chicken and therefore is completely unidentifiable to me, but now
> > > that I share your enthusiasm for Scheme I can see that the
> > > questions of an unwashed unbeliever are really beneath
> > > attention.
> >
> > Did you mean this Chicken?
> >
> > https://esolangs.org/wiki/Chicken
> >
>
> I mean the one with the parentheses: https://www.call-cc.org/
>
> It's a better implementation than the language deserves, really.

Sigh, I don't know why everyone argues with me when I say every
language does or did compile to C. That's yet another one.

Julian Fondren

Jul 28, 2017, 19:57:56
On Friday, July 28, 2017 at 5:49:03 PM UTC-5, Rod Pemberton wrote:
> On Fri, 28 Jul 2017 01:33:49 -0700 (PDT)
> Julian Fondren <julian....@gmail.com> wrote:
>
> > On Friday, July 28, 2017 at 12:06:59 AM UTC-5, Rod Pemberton wrote:
> > > On Wed, 26 Jul 2017 14:20:35 -0700 (PDT)
> > > Julian Fondren <julian....@gmail.com> wrote:
>
> > > > Thanks. I've wondered this -- very frequently -- in response to
> > > > your earlier postings using this Scheme which isn't Gauche or
> > > > Chicken and therefore is completely unidentifiable to me, but now
> > > > that I share your enthusiasm for Scheme I can see that the
> > > > questions of an unwashed unbeliever are really beneath
> > > > attention.
> > >
> > > Did you mean this Chicken?
> > >
> > > https://esolangs.org/wiki/Chicken
> > >
> >
> > I mean the one with the parentheses: https://www.call-cc.org/
> >
> > It's a better implementation than the language deserves, really.
>
> Sigh, I don't know why everyone argues with me when I say every
> language does or did compile to C. That's yet another one.

Putting the truth of that observation aside, what's the point of it?
Suppose everyone accepts that C is the fundamental intermediate
language that nobody wants to dirty their hands with because it's an
unpleasant language to directly express anything with, and yet has a
lot of decent compilers, and is easy enough to understand (if you
don't buy into the nasal demon religion.) What changes with this
acceptance? Do people stop directly writing C at all, except when
learning it? Are languages with assemblers obliged to offer C
compilers instead?