Template Metaprogramming in Forth?

dream...@yahoo.com

unread,

Dec 8, 2008, 8:06:33 PM12/8/08

to

Through the use of templates, C++ has a powerful feature known as
"template metaprogramming". Basically, it allows for compile time
execution of code. Now we all know that Forth allows this and one
such trivial example would be to compute a constant such as factorial
(x) at compile time and use the constant result during run-time.

However, C++ style templates allow one to define an algorithm within a
template. For example, one could define a sort algorithm within a
template, pass the template the size of the input N (i.e. sort N
elements, say N=5 here) and the compiler would create an inline sort
algorithm that works only for a 5 element input. If loops are
converted to recursive template calls, the 5-sort would be inlined by
the compiler and would be highly efficient for sorting 10 elements.

Now of course this could also be done in Forth, but I don't think it
would be nearly as elegant. C++ allows you to specify the algorithm
directly within a template. I think a Forth approach would be litered
with [COMPILE]'s preceding every word. Maybe there's a better way.
Does anyone have experience generating a specialized algorithm at
compile time, and if so, was there a simple and elegant way to do it?

Here's a good basic reference on C++ template programming.

http://ubiety.uwaterloo.ca/~tveldhui/papers/Template-Metaprograms/meta-art.html

-- Greg

Helmar

unread,

Dec 9, 2008, 4:24:37 AM12/9/08

to

On Dec 8, 8:06 pm, dreamwa...@yahoo.com wrote:

> Now of course this could also be done in Forth, but I don't think it
> would be nearly as elegant.

Elegance is a question of your point of view.

Well first of all you have EVAL - and this can compile templates.
Depending on your compiler this might help to optimize.

I think FreeForth invented the "backticks". Very simply said (in 4p
dialect):

: 12+` 12` +` ;

This builds a compiler to add 12 to something. You can now say

: 24+` 12+` 12+` ;

and it builds a compiler to add 24. You also could say

: moon-24+` moon-phase @ if 24+` then ;

This would compile to add 24 only if moon-phase is true.
For me this looks elegant enough. Well, a good thing is an optimizing
code-generator in the back.

Regards,
-Helmar

Helmar

unread,

Dec 9, 2008, 4:44:31 AM12/9/08

to

I see I forgot something that might not be used to people that do not
know that "backticks".
You use moon-24+ without the backtick. Eg.

: test 1 moon-24+ ;

I also forgot that to notice that making "real" templates in the C++
sense you should have some context dependent words. Say for example
something like TO in Ans that is in 4p-terminology a "class-smart"
word, because it can be used for locals and VALUEs (two different
facts but only one word). In such cases you can make better use of the
"templates" in that you can use it for two different kind of things:
locals and VALUEs.

Regards,
-Helmar

Jeff M.

unread,

Dec 9, 2008, 11:11:45 AM12/9/08

to

On Dec 8, 7:06 pm, dreamwa...@yahoo.com wrote:
>
> Through the use of templates, C++ has a powerful feature known as
> "template metaprogramming". Basically, it allows for compile time
> execution of code. Now we all know that Forth allows this and one
> such trivial example would be to compute a constant such as factorial
> (x) at compile time and use the constant result during run-time.

C++ templates are neither special nor elegant. In fact, while they can
be nice, they can also be quite a hindrance to many [very large]
projects - especially those that are cross-platform. Not all compilers
handle them equally well, they can seriously (negatively) impact
compile times, and are notoriously difficult to debug. Trying to
combine them with other language features (like multiple inheritance)
and you might as well take a shotgun to many a compiler...

Lisp's macros and FORTH's compile STATE are infinitely more powerful
and more elegant. They don't require any special syntax, are much
easier to read, and can do far more. And, they allow you to extend the
language. Try adding new syntax to C++ using templates.

How about this (a real life example): a CRC is often used to fast
comparisons and hashing of strings. More importantly, in many
programs, 95-100% of all strings are actually known at compile-time.
Try creating a C++ template meta-program that computes the CRC of a
string at compile-time instead of runtime for efficiency. To make the
problem simple, we'll just use a simple 8-bit checksum instead of a
full CRC:

Lisp:

(defmacro crc (string)
(logand (reduce #'+ string :key #'char-code :initial-value 0) #xFF))

Forth (been a while, probably better ways):

: /CRC ( u c-addr offset -- u' c-addr ) OVER >R + C@ + R> ;
: CRC" ( ..." -- ) 0 [CHAR] " WORD COUNT 0 DO? I /CRC LOOP DROP 255
AND , ; IMMEDIATE

C++:

Not trying to be sarcastic, but I think far too many people are
smitten by C++ templates or C# delegates or many other "modern"
language features when in reality, they are very poor attempts to redo
what was already there and and much better the way it was.

Jeff M.

Helmar

unread,

Dec 9, 2008, 12:24:08 PM12/9/08

to

On Dec 9, 11:11 am, "Jeff M." <mass...@gmail.com> wrote:

Hi Jeff,

your writing is more a rant than something that could help the one
that asked the question. As I wrote in my previous posts this is a
question of what you think is "elegant".

The idea to do "generic" programming is interesting of course. You do
not see much special Forth constructs at the first view. I think that
Greg was in fact aware of the usual way and same time impressed by the
(say what you want) powerful way C++ has to offer.

Forth has not much to offer in this respect. But honest: it does not
need it. So try to tell the people why and what can be done. The
simple method of how Forth works does not show the people why it is
not needed.

You might have seen my second post where I pointed to "class-smart"
words. This was in a way a little bit sarcastic. Usually in Forth you
do not want to have such "class-smart" words. To make a good use of
the templates you need these words. Would have Forth much more of such
"class-smart" words, I would guess someone would had have the idea to
make a better template mechanism. This is just another idea how to
program. Even Forthers can get ideas from. And: it's not that Forth
could not do it. It's simple not "wanted". But be nice to people that
do have that idea - it is no "far away" idea.

-Helmar

Albert van der Horst

unread,

Dec 9, 2008, 1:52:50 PM12/9/08

to

In article <a3a2a68e-4590-46b6...@s16g2000vbp.googlegroups.com>,

<dream...@yahoo.com> wrote:
>Through the use of templates, C++ has a powerful feature known as
>"template metaprogramming". Basically, it allows for compile time
>execution of code. Now we all know that Forth allows this and one
>such trivial example would be to compute a constant such as factorial
>(x) at compile time and use the constant result during run-time.
>
>However, C++ style templates allow one to define an algorithm within a
>template. For example, one could define a sort algorithm within a
>template, pass the template the size of the input N (i.e. sort N
>elements, say N=5 here) and the compiler would create an inline sort
>algorithm that works only for a 5 element input. If loops are
>converted to recursive template calls, the 5-sort would be inlined by
>the compiler and would be highly efficient for sorting 10 elements.
>
>Now of course this could also be done in Forth, but I don't think it
>would be nearly as elegant. C++ allows you to specify the algorithm
>directly within a template. I think a Forth approach would be litered
>with [COMPILE]'s preceding every word. Maybe there's a better way.
>Does anyone have experience generating a specialized algorithm at
>compile time, and if so, was there a simple and elegant way to do it?

Been there before. What C++ buys you is type safety. Forth gives
up on type safety. Then EXECUTE is all you need, as exemplified
by

' my-< ' my-<--> -1000 1000 QSORT
(Or us :NONAME to make execution tokens.)
The QSORT is totally generic, the equivalent of a C++ template.
(For an implementation and examples you may have a look at
ciforth's library forth.lab )

>
>-- Greg

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

dream...@yahoo.com

unread,

Dec 9, 2008, 2:05:18 PM12/9/08

to

[Helmar: I think that Greg was in fact aware of the usual way and same
time impressed by the
(say what you want) powerful way C++ has to offer. ]

Well put, thank you. I don't really want to debate the "elegance" of C
++ templates. I've worked with both C++ and Forth for years so I am
quite aware of the not-so-nice aspects of templates, especially when
put in the hands of a programmer who uses it as their only hammer.
I'll just say that I think they should be used "wisely" and
"judiciously". It's just that in the case of meta-programming, the
use of C++ templates surprised me a bit and it appeared to be worth
exploring further.

I realize that "elegance" is rather subjective, although most people
can generally appreciate the beauty of high art or construction such
as a Cathedral. By the same token, we can frequently look at a sample
of code and agree that it is "ugly" or at least not well written (I
have no desire to enter a philosophical debate over this). When I
used the term "elegant" here I should have clarified my specific use
of it. Here I mean that ideally, the metaprogammed version of the
algorithm is close in appearance to the original non-metaprogrammed
version, perhaps they might even be identical. I recently saw some
sample code C++ template code where I was struck by the similiarity
between the two and that started me thinking about how it could be
done in an extensible language like Forth. At one level, one would
think that full control of the compiler would make this rather easy to
do in Forth, yet I was at a loss as to how to match this apparent
"closeness" with Forth without the use of explicit COMPILE's or (re)
defining a number of compiling words that are analogs of their non-
metaprogrammed counterparts. It could just be that I'm not
imaginative enough. I did find Helmar's example interesting, better
than what I had first imagined, but still I'm not quite sure it is
what I'm after.

As I see it, the grail would be to have a program T which takes as
input a program P along with a portion of it's input data D and
outputs a meta-programmed equivalent which is P'. P' of course is
specialized to it's input data, and would presumably run faster due to
this specialization. P would be a standard non-metaprogrammed code.
I believe this is generally referred to as "partial evaluation". I am
aware that there has been some work in this direction, one of the more
interesting examples I have come across is by Futamura which takes
essentially an interpreter as input and outputs a compiler.

-- Greg

dream...@yahoo.com

unread,

Dec 9, 2008, 2:32:45 PM12/9/08

to

On Dec 9, 1:52 pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:

> ' my-< ' my-<--> -1000 1000 QSORT
> (Or us :NONAME to make execution tokens.)
> The QSORT is totally generic, the equivalent of a C++ template.
> (For an implementation and examples you may have a look at
> ciforth's library forth.lab )

I took a quick look at forth.lab. I believe what you are showing
above is related to creating generics via templates, but does not go
far enough in terms of the notion of partial evaluation. What I'm
referring to is the ability to specify an algorithm as input and have
a specialized variant of the algorithm produced as the output (please
refer to the bubble sort example within the link in my original post
to see exactly what I am referring to).

For example, it would be more like this:

10 ' QSORT MakeSpecificQSORT QSORT10

Now we have "QSORT10" which is a specialized QSORT created at compile
time which only works for a 10 element input. Since it was created by
the compiler, there will be no run-time checks or "IS" style
indirection involving the size of the input.

The amazing aspect (to me at least) of the bubble sort example I
mention is that the compiler transforms the general bubble sort
algorithm into an in-line comparison sort for N=3.

-- Greg

Anton Ertl

unread,

Dec 9, 2008, 3:15:35 PM12/9/08

to

dream...@yahoo.com writes:
>Now of course this could also be done in Forth, but I don't think it
>would be nearly as elegant. C++ allows you to specify the algorithm
>directly within a template. I think a Forth approach would be litered
>with [COMPILE]'s preceding every word. Maybe there's a better way.
>Does anyone have experience generating a specialized algorithm at
>compile time, and if so, was there a simple and elegant way to do it?

Take a look at
<http://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Advanced-macros-Tutorial.html>

True, there are lots of POSTPONEs there (the modern equivalent of
[COMPILE]). If you want to get rid of that, there are two
possibilities:

1) Introduce colon definitions and POSTPONE them, e.g.:

: cvs1 tuck @ ;
: cvs2 * + swap cell+ ;
: compile-vmul-step ( compilation: n --; run-time: n1 addr1 -- n2 addr2 )
\ n2=n1+(addr1)*n, addr2=addr1+cell
POSTPONE cvs1 POSTPONE literal POSTPONE cvs2 ;

Of course, that relies on inlining by the compiler to get the same
efficiency as the original version, and I don't think it's more readable.

2) Use the ]] ... [[ construct that's available in Gforth (and I think
there's also an implementation of ]] ... [[ in Forth-94):

: compile-vmul-step ( compilation: n --; run-time: n1 addr1 -- n2 addr2 )
\ n2=n1+(addr1)*n, addr2=addr1+cell
]] tuck @ literal * + swap cell+ [[ ;

I guess option 2 is what you are looking for.

Can you show me how C++ template programming can do this? What if the
array A is only available at run-time?

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html

Helmar

unread,

Dec 9, 2008, 4:08:41 PM12/9/08

to

On Dec 9, 3:15 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> 2) Use the ]] ... [[ construct that's available in Gforth (and I think
> there's also an implementation of ]] ... [[ in Forth-94):

This could do the job (without numbers, that require more knowledge):

: [[ ;
: ]] begin >in @ ' ['] [[ <> while >in !
postpone postpone
repeat drop ; immediate

Regards,
-Helmar

dream...@yahoo.com

unread,

Dec 9, 2008, 5:54:37 PM12/9/08

to

On Dec 9, 3:15 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> Can you show me how C++ template programming can do this? What if the
> array A is only available at run-time?

If the array is only available at run-time then I don't see how it can
be done with C++ templates as C++ makes a pure distinction between
compile time and run-time. Forth doesn't impose this distinction so
it is more powerful in that respect. I suppose one could argue that
one could write C++ code to generate machine code at run-time and
execute it but that's extreme and far outside the bounds of a
discussion surrounding templates (in effect, one is writing their own
mini-compiler for the task at hand). I had recognized this point
earlier, but wanted to limit the scope to static analysis occurring
prior to run time, but you're right there's no need to stop there and
that is another aspect where Forth shines over languages which have a
distinct "compile" phase.

-- Greg

Albert van der Horst

unread,

Dec 9, 2008, 8:43:22 PM12/9/08

to

In article <9f3a3186-6498-48a7...@n10g2000yqm.googlegroups.com>,
<dream...@yahoo.com> wrote:
>On Dec 9, 1:52=A0pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
>wrote:
>> ' my-< =A0 ' my-<--> =A0-1000 1000 QSORT

>algorithm into an in-line comparison sort for N=3D3.

An optimizing Forth compiler can certainly do similar things with
: my-QSORT 'my-< 'my-<--> 1000 -1000 QSORT ;

Note that there is no MakeSpecificQSORT here, just the regular
folding optimisation of
'IETS EXECUTE --> IETS
that serves in order places. (There are some notes about this
in one of my forthlectures.)

dream...@yahoo.com

unread,

Dec 9, 2008, 9:18:59 PM12/9/08

to

On Dec 9, 8:43 pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:
> In article <9f3a3186-6498-48a7-b2cd-d123325df...@n10g2000yqm.googlegroups.com>,

>
> An optimizing Forth compiler can certainly do similar things with
> : my-QSORT 'my-< 'my-<--> 1000 -1000 QSORT ;

No doubt that an optimizing compiler would help here, but it wouldn't
create a specific
version of QSORT with a fixed input and that's really the point I'm
driving at.

-- Greg

Albert van der Horst

unread,

Dec 10, 2008, 1:26:13 PM12/10/08

to

In article <62711094-f1f1-471e...@d42g2000prb.googlegroups.com>,
<dream...@yahoo.com> wrote:
>On Dec 9, 8:43=A0pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
>wrote:
>> In article <9f3a3186-6498-48a7-b2cd-d123325df...@n10g2000yqm.googlegroups=

>.com>,
>>
>> An optimizing Forth compiler can certainly do similar things with
>> : my-QSORT 'my-< 'my-<--> 1000 -1000 QSORT ;
>
>No doubt that an optimizing compiler would help here, but it wouldn't
>create a specific
>version of QSORT with a fixed input and that's really the point I'm
>driving at.

I'm not sure.
Forth can keep about anything fixed that you would want to
keep fixed:

( upper-limit lower limit --)
: my-generic-QSORT 'my-< 'my-<--> 2SWAP QSORT ;

-1000 1000 my-generic-QSORT

( compare-xt swap-xt --)
: my-generic-QSORT 1000 -1000 QSORT ;

'my-< 'my-<--> my-generic-QSORT

The job that can be done by the optimizer is different,
of course. In the first example it has more room.

John Passaniti

unread,

Dec 10, 2008, 1:50:53 PM12/10/08

to

Albert van der Horst wrote:
> I'm not sure.
> Forth can keep about anything fixed that you would want to
> keep fixed:

I'm not sure, but I think the point being made is not that you can
inject constants into a call to a generic routine, but that you can
generate algorithms for specialized cases.

That is, if I have an array of four items, I can generate a sort routine
that is specialized for that specific array size that avoids all the
overhead of a traditional sort routine (iteration, partitioning,
possibly recursion, etc.) That routine would look like this (pseudocode):

if a[0] > a[1] then swap
if a[1] > a[2] then swap
if a[2] > a[3] then swap
if a[0] > a[1] then swap
if a[1] > a[2] then swap
if a[0] > a[1] then swap

The point is that despite the pain of C++'s templates (which can be
considerable), once you understand them, you can fairly easily
metaprogram optimized routines like this.

The same kind of thing can be done in Forth. One obvious way to do this
is to generate program text and then evaluate it, and this is a common
pattern in many scripting languages. Other methods (which involve
jumping in and out of the compiler) are more ugly.

This particular kind of metaprogramming (generating specialized
algorithms from generic routines) is something that seems alien to the
Forth world. Maybe it's done, but I haven't seen much code that does it.

There are other kinds of metaprogramming that also seem alien to Forth.
One that I do all the time is to express protocols (like serial
communications protocols) in terms of data structures. I then use those
data structures to dynamically build functions that are specialized to
implement the protocol.

dream...@yahoo.com

unread,

Dec 10, 2008, 2:03:34 PM12/10/08

to

On Dec 10, 1:50 pm, John Passaniti <n...@JapanIsShinto.com> wrote:
> Albert van der Horst wrote:
>
> > I'm not sure.
> > Forth can keep about anything fixed that you would want to
> > keep fixed:
>
> I'm not sure, but I think the point being made is not that you can
> inject constants into a call to a generic routine, but that you can
> generate algorithms for specialized cases.

Thanks John, this is exactly my point.

>
> That is, if I have an array of four items, I can generate a sort routine
> that is specialized for that specific array size that avoids all the
> overhead of a traditional sort routine (iteration, partitioning,
> possibly recursion, etc.) That routine would look like this (pseudocode):
>
> if a[0] > a[1] then swap
> if a[1] > a[2] then swap
> if a[2] > a[3] then swap
> if a[0] > a[1] then swap
> if a[1] > a[2] then swap
> if a[0] > a[1] then swap
>
> The point is that despite the pain of C++'s templates (which can be
> considerable), once you understand them, you can fairly easily
> metaprogram optimized routines like this.

Right, their is a similiar example in the link I provided in my
original post.

>
> The same kind of thing can be done in Forth. One obvious way to do this
> is to generate program text and then evaluate it, and this is a common
> pattern in many scripting languages. Other methods (which involve
> jumping in and out of the compiler) are more ugly.
>
> This particular kind of metaprogramming (generating specialized
> algorithms from generic routines) is something that seems alien to the
> Forth world. Maybe it's done, but I haven't seen much code that does it.
>
> There are other kinds of metaprogramming that also seem alien to Forth.
> One that I do all the time is to express protocols (like serial
> communications protocols) in terms of data structures. I then use those
> data structures to dynamically build functions that are specialized to
> implement the protocol.

Yes, it does seem alien to Forth, yet OTOH Forth excels at
manipulating the compiler. It could be that that a C++ style
templating mechanism would have to be layered ontop of Forth. I don't
know if that would be a reasonable thing to do or not, but I at least
find it interesting to think about.

-- Greg

Andrew Haley

unread,

Dec 11, 2008, 10:47:07 AM12/11/08

to

dream...@yahoo.com wrote:

> http://ubiety.uwaterloo.ca/~tveldhui/papers/Template-Metaprograms/meta-art.html

The trouble with the C++ way is that the language in which the
template metaprograms are written is different from the one in which
"normal" programs are written. Because of this, the size of the
program explodes to several times that of the non-metaprogrammed
version. The template version is about 50 lines long, the
non-template version about 15.

In Forth, as in LISP macros, it's the same language. Let's copy the
C++ example. Here's a simple example of a bubblesort in Forth:

7 constant size
create arr size cells allot
: th ( a n - a) cells + ;

: pass ( n)
1- 0 ?do arr i th 2@
2dup > if 2drop else swap arr i th 2! then
loop ;
: bubblesort 1 size do i pass -1 +loop ;

If we want an optimized bubblesort for a statically-allocated array we
can do the address calculation at compile time:

: ]" postpone s" postpone evaluate ; immediate
: pass { a size }
size 1- 0 ?do a i th ]" literal 2@"
]" 2dup > if 2drop else swap " a i th ]" literal 2! then"
loop ;
: [bubblesort] { a size } 1 size do a i pass -1 +loop ; immediate

( Standard Forth plus gforth locals.)

No recursion necessary, and it's an order of magnitude shorter than
the C++ version. It's less general-purpose than the C++ version, and
that's one reason it's so much shorter: it can only be used for an
array whose address and size is known at compile time. It's also IMO
a hell of a lot easier to get one's head around than the C++ version,
but that depends on one's experiance with the languages.

To see what it does:

: try [ arr 4 ] [bubblesort] ; ok
see try
: try
arr 2@ 2dup >
IF 2drop
ELSE swap arr 2!
THEN
5571856 2@ 2dup >
IF 2drop
ELSE swap 5571856 2!
THEN
5571864 2@ 2dup >
IF 2drop
ELSE swap 5571864 2!
THEN
arr 2@ 2dup >
IF 2drop
ELSE swap arr 2!
THEN
5571856 2@ 2dup >
IF 2drop
ELSE swap 5571856 2!
THEN
arr 2@ 2dup >
IF 2drop
ELSE swap arr 2!
THEN ; ok

Andrew.

Message has been deleted

dream...@yahoo.com

unread,

Dec 11, 2008, 11:13:26 PM12/11/08

to

On Dec 11, 10:47 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:

> The trouble with the C++ way is that the language in which the
> template metaprograms are written is different from the one in which
> "normal" programs are written. Because of this, the size of the
> program explodes to several times that of the non-metaprogrammed
> version. The template version is about 50 lines long, the
> non-template version about 15.

Isn't that due to the fact that a specialized in-line algorithm is
being created? I think the idea is to trade code space for
performance by taking advantage of the specialized case so I welcome
the expansion.

> No recursion necessary, and it's an order of magnitude shorter than
> the C++ version. It's less general-purpose than the C++ version, and
> that's one reason it's so much shorter: it can only be used for an
> array whose address and size is known at compile time. It's also IMO
> a hell of a lot easier to get one's head around than the C++ version,
> but that depends on one's experiance with the languages.

>...example...

Interesting example, and thanks for putting this together. This is
definitely along the lines of what I was after w/r/t Forth. My
observation is that the Forth example intersperses compile time words
within the base algorithm whereas the C++ template approach adds
template instantiation syntax to the base algorithm so neither
algorithm is "pure" in that sense.

-- Greg

Andrew Haley

unread,

Dec 12, 2008, 5:07:08 AM12/12/08

to

dream...@yahoo.com wrote:
> On Dec 11, 10:47?am, Andrew Haley <andre...@littlepinkcloud.invalid>

> wrote:
> > The trouble with the C++ way is that the language in which the
> > template metaprograms are written is different from the one in which

> > "normal" programs are written. ?Because of this, the size of the

> > program explodes to several times that of the non-metaprogrammed

> > version. ?The template version is about 50 lines long, the
> > non-template version about 15.

> Isn't that due to the fact that a specialized in-line algorithm is
> being created?

I don't think so. A lot of that C++ example is just noise that the
language requires.

> I think the idea is to trade code space for performance by taking
> advantage of the specialized case so I welcome the expansion.

I don't really know why you'd welcome it, but OK. :-)

> > No recursion necessary, and it's an order of magnitude shorter than

> > the C++ version. ?It's less general-purpose than the C++ version, and

> > that's one reason it's so much shorter: it can only be used for an

> > array whose address and size is known at compile time. ?It's also IMO

> > a hell of a lot easier to get one's head around than the C++ version,

> > but that depends on one's experience with the languages.
> >...example...

> Interesting example, and thanks for putting this together. This is
> definitely along the lines of what I was after w/r/t Forth. My
> observation is that the Forth example intersperses compile time
> words within the base algorithm whereas the C++ template approach
> adds template instantiation syntax to the base algorithm so neither
> algorithm is "pure" in that sense.

Sure. Forth is never pure -- about anything. :-)

Andrew.

John Passaniti

unread,

Dec 12, 2008, 11:36:16 AM12/12/08

to

Andrew Haley wrote:
>> I think the idea is to trade code space for performance by taking
>> advantage of the specialized case so I welcome the expansion.
>
> I don't really know why you'd welcome it, but OK. :-)

You two seem to be talking about two different things here. You are
talking about the amount of source code needed in C++ to do this. Greg
is talking about the amount of object code generated.

>> Interesting example, and thanks for putting this together. This is
>> definitely along the lines of what I was after w/r/t Forth. My
>> observation is that the Forth example intersperses compile time
>> words within the base algorithm whereas the C++ template approach
>> adds template instantiation syntax to the base algorithm so neither
>> algorithm is "pure" in that sense.
>
> Sure. Forth is never pure -- about anything. :-)

Which is why Lisp wins here.

Andrew Haley

unread,

Dec 12, 2008, 12:32:24 PM12/12/08

to

John Passaniti <nn...@japanisshinto.com> wrote:
> Andrew Haley wrote:
> >> I think the idea is to trade code space for performance by taking
> >> advantage of the specialized case so I welcome the expansion.
> >
> > I don't really know why you'd welcome it, but OK. :-)

> You two seem to be talking about two different things here. You are
> talking about the amount of source code needed in C++ to do this.
> Greg is talking about the amount of object code generated.

I was wondering about that possibility, but I was so explicit about
what I was talking about -- the size of the source -- that it didn't
seem possible.

> >> Interesting example, and thanks for putting this together. This is
> >> definitely along the lines of what I was after w/r/t Forth. My
> >> observation is that the Forth example intersperses compile time
> >> words within the base algorithm whereas the C++ template approach
> >> adds template instantiation syntax to the base algorithm so neither
> >> algorithm is "pure" in that sense.
> >
> > Sure. Forth is never pure -- about anything. :-)

> Which is why Lisp wins here.

Granted, LISP has a practical advantage in that code and data have the
same form. It might be that this means that a LISP version of this
program would be much smaller and neater than the Forth version.
However, no LISP example has been posted so this is just speculation.

Andrew.

Aleksej Saushev

unread,

Dec 12, 2008, 1:17:14 PM12/12/08

to

"Jeff M." <mas...@gmail.com> writes:

> Try creating a C++ template meta-program that computes the CRC of a
> string at compile-time instead of runtime for efficiency. To make the
> problem simple, we'll just use a simple 8-bit checksum instead of a
> full CRC:
>
> Lisp:
>
> (defmacro crc (string)
> (logand (reduce #'+ string :key #'char-code :initial-value 0) #xFF))
>
>
> Forth (been a while, probably better ways):
>
> : /CRC ( u c-addr offset -- u' c-addr ) OVER >R + C@ + R> ;
> : CRC" ( ..." -- ) 0 [CHAR] " WORD COUNT 0 DO? I /CRC LOOP DROP 255
> AND , ; IMMEDIATE

These don't calculate CRC. Please, don't call them so to avoid confusion.

You don't understand what you're doing by giving bad example.
I have to deal with the code, where CRC is used in a wrong way,
and I had to deal with the code where digest was used in wrong way in
not so distant past.

--
HE CE3OH...

Marcel Hendrix

unread,

Dec 12, 2008, 12:51:07 PM12/12/08

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes Re: Template Metaprogramming in Forth?
[..]

Or in iForth32:

$005C09C0 : try
$005C09C8 mov ebx, $005C03C4 dword-offset
$005C09CE cmp ebx, $005C03C0 dword-offset
$005C09D4 push $005C03C4 dword-offset
$005C09DA push $005C03C0 dword-offset
$005C09E0 jle $005C09ED offset NEAR
$005C09E6 pop ebx
$005C09E7 pop ecx
$005C09E8 jmp $005C09FB offset NEAR
$005C09ED pop ebx
$005C09EE pop ecx
$005C09EF mov $005C03C0 dword-offset, ecx
$005C09F5 mov $005C03C4 dword-offset, ebx
$005C09FB mov ebx, $005C03C8 dword-offset
$005C0A01 cmp ebx, $005C03C4 dword-offset
$005C0A07 push $005C03C8 dword-offset
$005C0A0D push $005C03C4 dword-offset
$005C0A13 jle $005C0A20 offset NEAR
$005C0A19 pop ebx
$005C0A1A pop ecx
$005C0A1B jmp $005C0A2E offset NEAR
$005C0A20 pop ebx
$005C0A21 pop ecx
$005C0A22 mov $005C03C4 dword-offset, ecx
$005C0A28 mov $005C03C8 dword-offset, ebx
$005C0A2E mov ebx, $005C03CC dword-offset
$005C0A34 cmp ebx, $005C03C8 dword-offset
$005C0A3A push $005C03CC dword-offset
$005C0A40 push $005C03C8 dword-offset
$005C0A46 jle $005C0A53 offset NEAR
$005C0A4C pop ebx
$005C0A4D pop ecx
$005C0A4E jmp $005C0A61 offset NEAR
$005C0A53 pop ebx
$005C0A54 pop ecx
$005C0A55 mov $005C03C8 dword-offset, ecx
$005C0A5B mov $005C03CC dword-offset, ebx
$005C0A61 mov ebx, $005C03C4 dword-offset
$005C0A67 cmp ebx, $005C03C0 dword-offset
$005C0A6D push $005C03C4 dword-offset
$005C0A73 push $005C03C0 dword-offset
$005C0A79 jle $005C0A86 offset NEAR
$005C0A7F pop ebx
$005C0A80 pop ecx
$005C0A81 jmp $005C0A94 offset NEAR
$005C0A86 pop ebx
$005C0A87 pop ecx
$005C0A88 mov $005C03C0 dword-offset, ecx
$005C0A8E mov $005C03C4 dword-offset, ebx
$005C0A94 mov ebx, $005C03C8 dword-offset
$005C0A9A cmp ebx, $005C03C4 dword-offset
$005C0AA0 push $005C03C8 dword-offset
$005C0AA6 push $005C03C4 dword-offset
$005C0AAC jle $005C0AB9 offset NEAR
$005C0AB2 pop ebx
$005C0AB3 pop ecx
$005C0AB4 jmp $005C0AC7 offset NEAR
$005C0AB9 pop ebx
$005C0ABA pop ecx
$005C0ABB mov $005C03C4 dword-offset, ecx
$005C0AC1 mov $005C03C8 dword-offset, ebx
$005C0AC7 mov ebx, $005C03C4 dword-offset
$005C0ACD cmp ebx, $005C03C0 dword-offset
$005C0AD3 push $005C03C4 dword-offset
$005C0AD9 push $005C03C0 dword-offset
$005C0ADF jle $005C0AEC offset NEAR
$005C0AE5 pop ebx
$005C0AE6 pop ecx
$005C0AE7 jmp $005C0AFA offset NEAR
$005C0AEC pop ebx
$005C0AED pop ecx
$005C0AEE mov $005C03C0 dword-offset, ecx
$005C0AF4 mov $005C03C4 dword-offset, ebx
$005C0AFA ;

-marcel

dream...@yahoo.com

unread,

Dec 12, 2008, 5:55:14 PM12/12/08

to

On Dec 12, 12:32 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:

> John Passaniti <n...@japanisshinto.com> wrote:
> > Andrew Haley wrote:
> > >> I think the idea is to trade code space for performance by taking
> > >> advantage of the specialized case so I welcome the expansion.
>
> > > I don't really know why you'd welcome it, but OK. :-)
> > You two seem to be talking about two different things here. You are
> > talking about the amount of source code needed in C++ to do this.
> > Greg is talking about the amount of object code generated.
>
> I was wondering about that possibility, but I was so explicit about
> what I was talking about -- the size of the source -- that it didn't
> seem possible.

I was responding to the following...

> On Dec 11, 10:47?am, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:
> > The trouble with the C++ way is that the language in which the
> > template metaprograms are written is different from the one in which
> > "normal" programs are written. ?Because of this, the size of the
> > program explodes to several times that of the non-metaprogrammed
> > version. ?The template version is about 50 lines long, the
> > non-template version about 15.
> Isn't that due to the fact that a specialized in-line algorithm is
> being created?

I think John is pointing out that "the size of the program" was meant
as source by you, whereas it was taken as object code by me (note
there was no mention of "source" in the above).

It's generally understood that template metaprogramming results in
object code bloat so I think my interpretation was reasonable.

-- Greg

Andrew Haley

unread,

Dec 13, 2008, 9:57:03 AM12/13/08

to

dream...@yahoo.com wrote:
> On Dec 12, 12:32?pm, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:

> I was responding to the following...

> > On Dec 11, 10:47?am, Andrew Haley <andre...@littlepinkcloud.invalid>
> > wrote:
> > > The trouble with the C++ way is that the language in which the
> > > template metaprograms are written is different from the one in which
> > > "normal" programs are written. ?Because of this, the size of the
> > > program explodes to several times that of the non-metaprogrammed
> > > version. ?The template version is about 50 lines long, the
> > > non-template version about 15.
> > Isn't that due to the fact that a specialized in-line algorithm is
> > being created?

> I think John is pointing out that "the size of the program" was
> meant as source by you, whereas it was taken as object code by me
> (note there was no mention of "source" in the above).

> It's generally understood that template metaprogramming results in
> object code bloat so I think my interpretation was reasonable.

I see. No, it was the source complexity of the template
metaprogrammed example I was complaining about. Given that the whole
purpose of this exercise is to generate a completely unrolled program,
any complaints about the size of the object code don't matter.

This has got me thinking about bubble sorting. It's usually a
horrible algorithm but it has the advantage here of having no
data-dependent loops, so the whole thing can trivially be unrolled.

I was wondering whether some other sorting algorithm might be a better
choice. There's an interesting section in Knuth Vol 3, Optimum
Sorting, which discusses special-purpose programs for N elements. For
example, five elements can be sorted using only seven comparisons,
whereas the bubblesort I posted uses ten.

On my first reading it seems that the size of an unrolled optimum
sorting algorithm would be >= N!, which makes the N^2 of bubble
sorting look quite parsimonious. Maybe for very small N the
compromise between the size of the unrolled program and the number of
comparisons makes bubblesort an attractive choice.

Andrew.

dream...@yahoo.com

unread,

Dec 13, 2008, 12:01:38 PM12/13/08

to

On Dec 13, 9:57 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:

> This has got me thinking about bubble sorting. It's usually a
> horrible algorithm but it has the advantage here of having no
> data-dependent loops, so the whole thing can trivially be unrolled.
>
> I was wondering whether some other sorting algorithm might be a better
> choice. There's an interesting section in Knuth Vol 3, Optimum
> Sorting, which discusses special-purpose programs for N elements. For
> example, five elements can be sorted using only seven comparisons,
> whereas the bubblesort I posted uses ten.
>
> On my first reading it seems that the size of an unrolled optimum
> sorting algorithm would be >= N!, which makes the N^2 of bubble
> sorting look quite parsimonious. Maybe for very small N the
> compromise between the size of the unrolled program and the number of
> comparisons makes bubblesort an attractive choice.

Finding an optimal comparison sort for arbitrary N is NP complete so
there is no general sorting algorithm that unrolls to an optimal
comparision sort (or if you can find one, you'll make computing
history).

For N = 16, the minimum number of comparisons was once thought to be
65, then 63, then 62, then finally 60 when Danny Hillis applied this
problem to his connection machine. A smaller solution, if possible,
has yet to be found (if you can find one, you'll also make computing
history, but in a far lesser way than the above).

Beyond sorting algorithms, what impressed me was how this technique
was applied to solving the 15 puzzle. See the winning entry for:

http://cplus.about.com/od/programmingchallenges/a/challenge16.htm

Say what you want about C++ templates, but I consider this to be one
amazing piece of code.

-- Greg

Message has been deleted

m_l_g3

unread,

Dec 13, 2008, 8:45:45 PM12/13/08

to

On Dec 9, 4:06 am, dreamwa...@yahoo.com wrote:
> Through the use of templates, C++ has a powerful feature known as
> "template metaprogramming". Basically, it allows for compile time
> execution of code. Now we all know that Forth allows this and one
> such trivial example would be to compute a constant such as factorial
> (x) at compile time and use the constant result during run-time.
>
> However, C++ style templates allow one to define an algorithm within a
> template. For example, one could define a sort algorithm within a
> template, pass the template the size of the input N (i.e. sort N
> elements, say N=5 here) and the compiler would create an inline sort
> algorithm that works only for a 5 element input. If loops are
> converted to recursive template calls, the 5-sort would be inlined by
> the compiler and would be highly efficient for sorting 10 elements.
>
> Now of course this could also be done in Forth, but I don't think it
> would be nearly as elegant. C++ allows you to specify the algorithm
> directly within a template. I think a Forth approach would be litered
> with [COMPILE]'s preceding every word. Maybe there's a better way.
> Does anyone have experience generating a specialized algorithm at
> compile time, and if so, was there a simple and elegant way to do it?
>

First of all, "let us separate flies from cutlets". There are
generic algorithms and there is generation of specialized code.
The first one is about expressing concepts, the second one is
about generated code optimization.

Forth is strong at expressing concepts, at the cost of
difficulty of optimized code generation.

(
This strength and weakness come from the fact that each Forth
word takes the rest of the Forth system as an input argument;
it usually happens that the most part of the system is passed
unchanged to the next word, but this is a good will of the
invoked word to let the next word execute.

For example, DUP does not change anything on the return stack
or reposition the input source; it duplicates the stack top,
fetches the next word from the [threaded] code, and lets it
execute. The next word is pointed to by the virtual
machine's IP register.

But the word EXIT does not touch the word at IP, instead,
it loads IP from the return stack and only then
fetches the word to be executed next.

The same about parsing: a word can parse anything from
the input stream before it returns control to the text
interpreter. For example, that "parsed anything" may be
a program in Pascal. And there used to be such
implementations.
)

Message has been deleted

Anton Ertl

unread,

Dec 14, 2008, 12:19:47 PM12/14/08

to

dream...@yahoo.com writes:
>My
>observation is that the Forth example intersperses compile time words
>within the base algorithm whereas the C++ template approach adds
>template instantiation syntax to the base algorithm so neither
>algorithm is "pure" in that sense.

For this kind of purity you write just the plain algorithm, and then
pass the function and the instantiated subset of the parameters to a
partial evaluator; or even more transparent, you write the call in a
form that an optimizer can split into a partially-instantiated call
(and pass that to a partial evaluator) and an evaluation of the rest.

However, then you have no control over what the compiler and/or
partial evaluator does and doesn't do. IMO, if you want it done, it's
better to see that in the source, and that's also very in line with
the usual Forth philosophy.

Concerning the syntax , you can write it in Forth in different ways,
that tend to support the different views.

E.g., the word COMPILE-MAP-ARRAY from
<http://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Advanced-macros-Tutorial.html>
could also be written as:

: compile-map-array ( compilation: xt -- ; run-time: ... addr u -- ... )
\ at run-time, execute xt ( ... x -- ... ) for each element of the
\ array beginning at addr and containing u elements
{ xt } ]] cells over + swap ?do
i @ [[ xt compile, 1 cells ]] literal
+loop [[ ;

which can be seen as "adding instantiation to the base algorithm".
The ]] ... [[ syntax has the brackets in this direction, because if
you then want to do something earlier, you write it in [[ ... ]], just
like you use [ ... ] during plain compilation if you want to do
something earlier.