Creating New Control Structures in Fexl


Lewis Van Winkle

Oct 3, 2010, 6:18:31 AM
to fexl
I was wondering if it was possible to create new control structures in
Fexl.

It seems that most high level languages allow the programmer to define
control structures. Let's take AND as an example of a user defined
control structure. In the statement "AND a b", b should be evaluated
only if a is true (short circuit evaluation). Could this AND function
be defined in a Fexl program?

There are three ways I'm familiar with to allow control structures
such as these. Scheme allows these structures through macros; Haskell
allows them through lazy evaluation; TCL allows them through up-level
evaluation. Each method has trade-offs.

Personally, I like Scheme's macros the least. The inclusion of macros
means that a caller doesn't know, and cannot reason about, how things
will be evaluated. Using the AND example above, a caller has no way to
know whether AND is a normal lambda or a macro. Therefore, the caller
has no way to know whether b will be evaluated always or conditionally
(until they look up the definition, obviously).

Haskell has a similar problem, in that a function caller cannot reason
about whether or not their code will be evaluated. Haskell however, is
at least consistent in that its laziness permeates the entire language
(unlike Scheme's macros). In Haskell side effects are taboo. Assuming
that the b argument in the AND example has no side effects, it doesn't
actually matter whether it gets evaluated or not. Since Fexl seems to embrace
side effects, lazy evaluation may not be suitable (in fact, I'll just
say it's unsuitable).

TCL allows control structures by having the caller pass unevaluated
code as a string, and then the function decides whether or not to
evaluate. In this way, the function caller is always aware whether
they are calling a control structure or not. In order to call a
control structure, the caller must know they are calling a control
structure. Like Haskell, TCL is also very consistent. From a
programmer's perspective, I think this elegant consistency is most
important.


There are probably several other methods I'm not aware of. Just last
month I was unaware that bound variables could be completely
eliminated from functions. Maybe Fexl already has an elegant solution
and I simply didn't notice?

Come to think of it, I don't believe I have seen any example of a
built-in Fexl control structure either. They do exist, right?


-Lewis

Patrick Chkoreff

Oct 3, 2010, 8:19:16 AM
to fexl


On Oct 3, 6:18 am, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:
> I was wondering if it was possible to create new control structures in
> Fexl.

Yes indeed. Excellent question Lewis, and thanks for posting!


> It seems that most high level languages allow the programmer to define
> control structures. Let's take AND as an example of a user defined
> control structure. In the statement "AND a b",  b should be evaluated
> only if a is true (short circuit evaluation). Could this AND function
> be defined in a Fexl program?

Your question goes right to the heart of how data (including Booleans)
are represented as functions. It is very easy to define the AND
function in Fexl as follows:

\and = (\x\y x y F)

So the value of (and x y) is (x y F), and that means if x is true,
return y, otherwise return false.

You'll note that Fexl has no keywords, so there is no need to say "if"
when testing something. If you wanted to say "if",
you could just define a dummy "if" function like this:

\if = (\x x)

So "if" is just the identity function serving no real purpose. You
could then say (if x y F), but I tend to avoid the
extra clutter and just become accustomed to treating Boolean values
directly as functions.

In case you're wondering about what the Boolean values T and F are,
they are defined as two functions like this:

\T = (\T\F T)
\F = (\T\F F)

In other words, the T function returns the first of two arguments you
give it, and the F function returns the second.
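For readers more at home in a conventional language, the same encoding can be sketched with Python lambdas. This is a hypothetical translation, not Fexl itself, and note one caveat: Python is strict, so both arguments are evaluated before a boolean selects between them, whereas Fexl's laziness gives the short-circuit behavior for free.

```python
# Church-style booleans: T selects its first argument, F its second.
T = lambda t: lambda f: t
F = lambda t: lambda f: f

# and x y = x y F: if x is true the result is y, otherwise F.
AND = lambda x: lambda y: x(y)(F)

# Decode a Church boolean back into a native Python bool.
def decode(b):
    return b(True)(False)
```

So `AND(T)(F)` reduces to `T(F)(F)`, which is `F`, exactly mirroring the reduction of `(and x y)` to `(x y F)` in the post.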

> There are three ways I'm familiar with to allow control structures
> such as these. Scheme allows these structures through macros; Haskell
> allows them through lazy evaluation; TCL allows them through up-level
> evaluation. Each method has trade-offs.

Fexl does it through lazy evaluation, so it sounds more like the
Haskell way.


> Personally, I like Scheme's macros the least. The inclusion of macros
> means that a caller doesn't know, and cannot reason about, how things
> will be evaluated. Using the AND example above, a caller has no way to
> know whether AND is a normal lambda or a macro. Therefore, the caller
> has no way to know whether b will be evaluated always or conditionally
> (until they look up the definition, obviously).

Right, this doesn't happen in Fexl. Fexl is always referentially
transparent, almost eerily so sometimes.
By that I mean you can reason about the equivalence of functions with
the same dexterity and clarity
that you reason about the equivalence of natural numbers.


> Haskell has a similar problem, in that a function caller cannot reason
> about whether or not their code will be evaluated. Haskell however, is
> at least consistent in that its laziness permeates the entire language
> (unlike Scheme's macros). In Haskell side effects are taboo. Assuming
> that the b argument in the AND example has no side effects, it doesn't
> actually matter whether it gets evaluated or not. Since Fexl seems to embrace
> side effects, lazy evaluation may not be suitable (in fact, I'll just
> say it's unsuitable).

Fexl embraces two things: (1) lazy evaluation of higher-order
functions, and (2) the inevitability of side effects.
But higher-order functions are powerful enough to let you migrate the
side effects anywhere you like. You can
program in a monadic style if you like, or you can just jam a side
effect right in the middle of something,
perhaps for testing purposes. As an example:

and x (print "hey wait, x is false!"; y)

Fexl just goes with the flow and doesn't try to fight the natural
order of the CPU with excessive denotational
semantics. :)


> TCL allows control structures by having the caller pass unevaluated
> code as a string, and then the function decides whether or not to
> evaluate. In this way, the function caller is always aware whether
> they are calling a control structure or not. In order to call a
> control structure, the caller must know they are calling a control
> structure. Like Haskell, TCL is also very consistent. From a
> programmer's perspective, I think this elegant consistency is most
> important.

I like the way you think -- and yes, I admire TCL for that elegance
even though that's not the way Fexl does it.


> There are probably several other methods I'm not aware of. Just last
> month I was unaware that bound variables could be completely
> eliminated from functions. Maybe Fexl already has an elegant solution
> and I simply didn't notice?

Sure thing. The possibilities are endless, e.g.:

\and = (\x\y x y F)
\or = (\x\y x T y)
\not = (\x x F T)
\nand = (\x\y not (and x y))
\eq = (\x\y x y (not y))
\xor = (\x\y not (eq x y))

You can also eliminate right-nesting if you prefer by using the
semicolon, for example:

\nand = (\x\y not; and x y)
\eq = (\x\y x y; not y)
\xor = (\x\y not; eq x y)
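These connectives can also be sketched with Python lambdas, as a hypothetical strict-language translation of the Fexl definitions above (the Fexl originals are shown in the comments):

```python
T = lambda t: lambda f: t
F = lambda t: lambda f: f

AND = lambda x: lambda y: x(y)(F)        # \and = (\x\y x y F)
OR  = lambda x: lambda y: x(T)(y)        # \or  = (\x\y x T y)
NOT = lambda x: x(F)(T)                  # \not = (\x x F T)
EQ  = lambda x: lambda y: x(y)(NOT(y))   # \eq  = (\x\y x y (not y))
XOR = lambda x: lambda y: NOT(EQ(x)(y))  # \xor = (\x\y not (eq x y))

decode = lambda b: b(True)(False)
```

Each derived connective is just function application on the booleans themselves; no special forms are involved at any point.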



> Come to think of it, I don't believe I have seen any example of a
> built-in Fexl control structure either. They do exist, right?

Actually they do not exist. Fexl has no keywords or built-in control
structures.


-- Patrick

Patrick Chkoreff

Oct 3, 2010, 8:46:50 AM
to fexl

On Oct 3, 6:18 am, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:

> There are probably several other methods I'm not aware of. Just last
> month I was unaware that bound variables could be completely
> eliminated from functions.

Yes, that's an amazing thing, isn't it? For example, let's just take
the two lowly functions T and F.

\T = (\T\F T)
\F = (\T\F F)

Those can be expressed without variables as:

\T = C
\F = (C I)

Now here are two more functions:

\and = (\x\y x y F)
\or = (\x\y x T y)

Without variables, those are:

\and = (S S (C (C F)))
\or = (S I (C T))

You can even do recursive functions without variables by using the
fixpoint operator Y.

\append = (\x\y x y \head\tail cons head (append tail y))

I could show this all the way to S and C, but that would not be
especially illuminating. The important thing is to show how to remove
the self-reference. So the append function is equivalent to this:

(Y (\append \x\y x y \head\tail cons head (append tail y)))

You'll notice that the call to append in there is *not* a circular
reference, but rather a simple reference to the enclosing lambda
variable called "append".
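The same trick can be sketched in Python. Since Python is strict, the classic Y combinator would diverge, so this hypothetical sketch uses its eta-expanded variant (often called the Z combinator), with Python lists standing in for cons cells. As in the Fexl form, `append` contains no self-reference: the recursive call goes through the variable bound by the fixpoint operator.

```python
# Z combinator: a strict-language variant of the fixpoint operator Y.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(
              lambda x: f(lambda v: x(x)(v)))

# append with the self-reference removed: `rec` is supplied by Z.
append = Z(lambda rec: lambda xs: lambda ys:
           ys if not xs else [xs[0]] + rec(xs[1:])(ys))
```

The lambda passed to `Z` never mentions `append` by name, yet the resulting function recurses correctly.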

Again, I could show it all the way down to S and C, but the Fexl
parser just maps it down to a closed form on the fly as it reads input
characters, with the aid of the "skip" and "route" functions. That's
a little tedious to demonstrate right now but I'll eventually blog
that example.


-- Patrick

Patrick Chkoreff

Oct 3, 2010, 9:51:25 AM
to fexl

> You can
> program in a monadic style if you like, or you can just jam a side
> effect right in the middle of something,
> perhaps for testing purposes.  As an example:
>
>   and x (print "hey wait, x is false!"; y)

By the way, in that example the value of (print "hey wait, x is
false!") is simply the identity function, and it's a "one-shot", so it
only happens once. As Fexl reduces a function, its value is replaced
inline.

So the value of (print "hey wait, x is false!" y) is simply y, and the
print only happens once.

But in general you can make side effects happen repeatedly, as in:

\ad_nauseum = (print "hello world" ad_nauseum)
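The one-shot behavior can be modeled in a conventional language with a memoizing thunk. This is a hypothetical Python sketch of the idea, not Fexl's actual machinery: once the form is reduced, its value replaces it, so the side effect fires at most once no matter how many times it is forced.

```python
# A memoizing thunk, modeling Fexl's in-place reduction: once a form is
# reduced, its value replaces it, so its side effect runs at most once.
class Thunk:
    _UNSET = object()

    def __init__(self, compute):
        self._compute = compute
        self._value = Thunk._UNSET

    def force(self):
        if self._value is Thunk._UNSET:
            self._value = self._compute()  # the side effect fires here
            self._compute = None           # the form is now just a value
        return self._value

log = []
y = Thunk(lambda: (log.append("hey wait, x is false!"), 42)[1])
y.force()
y.force()  # no second print: the thunk was already reduced in place
```

After any number of calls to `force`, the message has been logged exactly once.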


-- Patrick

Lewis Van Winkle

Oct 3, 2010, 3:39:13 PM
to fexl
Thank you for your detailed replies.

On Oct 3, 7:19 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> So the value of (and x y) is (x y F), and that means if x is true,
> return y, otherwise return false.

I'm trying to wrap my head around why this works. It seems that
you're calling a boolean value as a function, and it behaves as an
"if"-type statement by default (like C's ternary operator)?

So is "a b c" in Fexl the same as "a ? b : c" in C (assuming a is a
boolean value)?

>In case you're wondering about what the Boolean values T and F are,
>they are defined as two functions like this:

> \T = (\T\F T)
> \F = (\T\F F)

Actually, after a second reading, I think this answers my question.
Boolean values are functions. If "a" in my above example is T, then of
course it returns b. If "a" is F, then it obviously returns c. Nice.

Sorry, it's taking a bit for these concepts to click with me. When you
say everything is a function, you evidently mean it, though I still
don't fully understand all the consequences of that.

>> Come to think of it, I don't believe I have seen any example of a
>> built-in Fexl control structure either. They do exist, right?

>Actually they do not exist. Fexl has no keywords or built-in control
>structures.

Is it possible to do math in Fexl right out of the box? I know that
math can be done with pure functions (although it still blows my
mind). Have you read the book "The Little Schemer" by Friedman and
Felleisen? In that book they implement math using only functions and
recursion. Their numbers are represented by empty lists (i.e. 3 would
be a list containing three empty lists). They define an add1 function
first. Then add is repeated add1. Multiply is repeated add. Comparison
operators can also be easily defined. It's quite an amazing feat, in
my opinion. I'm sure you already understand the concept much better
than I, though.
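The Little Schemer construction described above can be sketched directly: a number n is represented as a list of n empty lists, add is repeated add1, and multiply is repeated add. This is a hypothetical Python rendering of that scheme (the helper names are mine, not from the book):

```python
# A number n is a list of n empty lists: 3 is [[], [], []].
def make(n):
    return [[] for _ in range(n)]

def add1(n):
    return [[]] + n

def sub1(n):
    return n[1:]

def is_zero(n):
    return n == []

def add(a, b):
    while not is_zero(b):       # add is repeated add1
        a, b = add1(a), sub1(b)
    return a

def mul(a, b):
    total = make(0)
    while not is_zero(b):       # multiply is repeated add
        total, b = add(total, a), sub1(b)
    return total
```

Comparison operators fall out the same way, by repeatedly applying sub1 to both numbers until one hits zero.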

I'm guessing that doing math in Fexl using the normal methods (like
the FPU) would involve writing an extension in C? Maybe it's a little
early for this, but do you have or are you planning a standard library
for Fexl?

>Fexl embraces two things: (1) lazy evaluation of higher-order
>functions, and (2) the inevitability of side effects.

OK. Fexl is lazy. That answers my original question. Since it's
lazy, no special forms are needed for control structures, just like in
Haskell. I'm not sure how I didn't notice that Fexl was lazy before
now.

So, in Fexl the following code: "a (b c)" would always literally pass
"b c" as the argument to a, and never the value of "b c". Does Fexl
have a way to force early evaluation?

I still need to think about what it means to have lazy evaluation and
side effects. At first glance it seems to me that could lead to
unpredictable evaluation order. As long as the language is consistent,
maybe that's not true. You have obviously thoroughly thought this
through, so I'm sure it works. I'm sure I just still don't fully
understand it. I still have a lot to learn.

I guess I'm kind of glad to learn that Fexl is lazy. I briefly played
with Haskell a couple years ago. I remember the two things I didn't
like about Haskell: its purity (no side effects), and its humongous
syntax. It seems like Fexl neatly improves both issues. And of course,
being lazy means Fexl can do the infinitely long lists like Haskell
has (which should just be a matter of infinite lazy recursion). I
always thought that infinite lists were one of Haskell's best
practical features.

I really appreciate you taking the time to respond, and I'm really
impressed with just how well you've thought out and designed Fexl.

Thanks,
Lewis

Patrick Chkoreff

Oct 3, 2010, 4:59:26 PM
to fexl
On Oct 3, 3:39 pm, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:

> Is it possible to do math in Fexl right out of the box? I know that
> math can be done with pure functions (although it still blows my
> mind). ...

I've actually written code that manipulates binary numbers as lists of
bits, and I even wrote division with quotient and remainder. So it is
possible to do all that in a purely functional manner as you mention.
But that ends up begging the question of how to convert a list like
[T,F,F,T] which represents the number 9 into an actual character '9'
when you're printing the thing out. For that you end up needing
"real" machine arithmetic anyway.

Fexl supports arbitrarily long arrays of words at the leaves of its
trees. By convention, the first word in a leaf array is a type
number, and this can designate various types like 32 or 64 bit
integer, 64-bit floating point, String, or even arbitrary BigNum.
That lets us write C code which operates on flat machine data. I use
a heap allocator which coalesces consecutive free arrays into a single
larger free array, so fragmentation is not an issue.

Lately I've been toying with a stripped down memory manager which
supports binary trees with *only* integers at the leaves. That way
all the nodes are the same size and I can use an ultra-simple free
list, with no length word overhead to support coalescing free blocks.
My thinking was that people could integrate larger data structures "on
the side" so to speak, and the Fexl tree leaves would sort of point
into a completely separate array of those larger data structures. The
problem with that approach is that it forces module designers to
implement their own memory management. It would also force the Fexl
programmer to create the data object and then finalize it when you're
done with it, sort of like a "with" concept in Lisp:

create_bignum \num
do_stuff_with num;
free_bignum num

But ugh, that's no fun -- even with a construct that automatically
frees the thing.

So I'm afraid the stripped-down tree with only integer leaves is a bit
*too* stripped down, and causes too many thorny implementation issues
for people writing modules like BigNum and String.


> I'm guessing that doing math in Fexl using the normal methods (like
> the FPU) would involve writing an extension in C? Maybe it's a little
> early for this, but do you have or are you planning a standard library
> for Fexl?

You bet I am, but I'm trying to keep that library as small as possible
right now. Just 32 bit signed integers so far, with addition and
comparison. I also have a string library, but lately I've been
handling strings as lists of integers. The string library allocates
arrays within the single Fexl heap, and does the correct memcpy
operations to handle append.

So yeah, I really do need to stick with the heap allocator which
allows arbitrarily sized leaves. My idea to handle that sort of thing
outside of the main Fexl memory arena is half-baked I think. Module
writers need the convenience of storing their slabs of data *right
there* in the Fexl tree without worrying about memory management.


> So, in Fexl the following code: "a (b c)" would always literally pass
> "b c" as the argument to a, and never the value of "b c". Does Fexl
> have a way to force early evaluation?

Not really, except you can always refactor your code to change the
order. In your example you might evaluate (b c) first, and have it
pass a value into a handler function like this:

b c \value ...

Then in the "..." there you'd be referring to the value produced by
evaluating (b c).

Your question has particular relevance for external module writers
though. I've grappled with this issue myself. Here's a simple
example:

add (add 1 2) (add 3 4)

When the "add" function written in C runs, it must examine the two
arguments to ensure that they are already leaf arrays of the proper
type (bignum or 32 bit int or whatever). If either argument is an
unevaluated form, the add function throws that argument on top of the
reduction stack to force it to evaluate first.

So (add 1 2) goes on the stack and reduces to a leaf num:3. The num:3
"function" knows just to pop the stack and not run anything. Now the
original add is seen again, and it looks like this now:

add num:3 (add 3 4)

So it says OK I had better push that second argument on the stack.
That reduces in place to num:7, and now the original add sees this:

add num:3 num:7

And at long last it can reduce itself to num:10.

Notice that because reduction is done IN PLACE, any references to the
original form (add (add 1 2) (add 3 4)) now just see a simple num:10
there -- a leaf value. So even if you have dozens of references to
the original form, the computation only happens once.
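This forcing-and-caching behavior can be sketched in Python: each argument is a zero-argument thunk, the strict `add` primitive forces both before doing the machine addition, and forcing caches the result so that every reference to the original form sees the already-computed leaf value. A hypothetical sketch, not the real reduction machinery:

```python
# A thunk that caches its result, modeling reduction in place.
def make_thunk(compute):
    cell = {}
    def force():
        if "value" not in cell:
            cell["value"] = compute()   # reduce once, keep the value
        return cell["value"]
    return force

# A strict primitive: force both arguments before the machine add.
def add(x, y):
    return x() + y()

inner1 = make_thunk(lambda: 1 + 2)      # models (add 1 2) -> num:3
inner2 = make_thunk(lambda: 3 + 4)      # models (add 3 4) -> num:7
result = add(inner1, inner2)            # num:10
```

Forcing `inner1` again just returns the cached 3, mirroring how dozens of references to the original form would all see the same leaf.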


> I still need to think about what it means to have lazy evaluation and
> side effects. At first glance it seems to me that could lead to
> unpredictable evaluation order. As long as the language is consistent,
> maybe that's not true. You have obviously thoroughly thought this
> through, so I'm sure it works. I'm sure I just still don't fully
> understand it. I still have a lot to learn.

I think the language is consistent enough to make that predictable.
In Fexl, you know that the evaluation is always chewing away on the
far LEFT side of the function tree.



> I guess I'm kind of glad to learn that Fexl is lazy. I briefly played
> with Haskell a couple years ago. I remember the two things I didn't
> like about Haskell: its purity (no side effects), and its humongous
> syntax. It seems like Fexl neatly improves both issues.

Certainly Fexl has the smallest syntax imaginable! Unlike some
functional languages, I have opted against the typical "pattern
matching" approach. I think the straight-up lambda forms with no
pattern matching clauses make reasoning and execution especially
easy.

As for purity, you can do the whole "monadic" thing in Fexl if you
like -- and actually that can be an amazingly compact way of
expressing things because you can completely eliminate references to
chained "state".

But if you want to write the "cat" program in Fexl, you really just
need to say this:

\cat = (get_char \ch eq EOF ch halt; put_char ch cat)

My point being, the calls to get_char and put_char have to go
*somewhere* ya know. You can disguise things if you like, for example
making the standard input look like a pure list of characters:

\input = (get_char \ch eq EOF ch null; cons ch input)

Then you can define a put_list function like this:

\put_list = (\list list halt \ch\tail put_char ch; put_list tail)

And then "cat" becomes:

\cat = (put_list input)

But what have you really accomplished? You can push the get_char and
put_char around like toothpaste in a tube all you want, but it's still
gonna be *in the tube*.
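The `input` / `put_list` pair above has a natural rendering with Python generators, which supply the laziness. A hypothetical sketch (`chars` and `put_list` are illustrative names, with `io.StringIO` standing in for real stdin/stdout):

```python
import io

# Standard input disguised as a lazy character list.
def chars(stream):
    while True:
        ch = stream.read(1)
        if ch == "":          # EOF plays the role of null
            return
        yield ch

# Consume the lazy list, writing each character out.
def put_list(char_iter, out):
    for ch in char_iter:
        out.write(ch)

src = io.StringIO("hello, world\n")
dst = io.StringIO()
put_list(chars(src), dst)     # cat = put_list input
```

The generator pulls one character at a time, so the "pure list" view never reads ahead of the consumer, just as with the lazy Fexl stream.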


And of course,
> being lazy means Fexl can do the infinitely long lists like Haskell
> has (which should just be a matter of infinite lazy recursion). I
> always thought that infinite lists were one of Haskell's best
> practical features.

Oh yes I *love* infinite data structures. In Fexl you can even do
arithmetic on the full set of *computable* numbers. For example, you
can use an infinite decimal expansion of pi if you like, and add that
to an infinite decimal expansion of e. Though it might slow down a
bit if you try printing out too many decimal places. :)
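The infinite-structure idea is easy to demonstrate with a lazy stream in Python, consumed by a `take` that forces only a finite prefix. A hypothetical sketch of the pattern:

```python
from itertools import islice

# The naturals as an unbounded lazy stream.
def naturals(n=0):
    while True:
        yield n
        n += 1

# Force only the first k elements of an infinite stream.
def take(k, xs):
    return list(islice(xs, k))
```

An infinite decimal expansion of pi would be the same shape: a stream of digits, of which any computation forces only as many as it needs.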


> I really appreciate you taking the time to respond, and I'm really
> impressed with just how well you've thought out and designed Fexl.

Thanks! These long replies do help me sort out my thoughts, and they
will serve as future documentation. It's great having a reader who
really knows his stuff when it comes to language semantics and trade-
offs, and can ask the demanding questions. You have a great deal of
experience and insight.


-- Patrick

Patrick Chkoreff

Oct 3, 2010, 5:55:02 PM
to fexl
On Oct 3, 4:59 pm, Patrick Chkoreff <loomsp...@gmail.com> wrote:

> Lately I've been toying with a stripped down memory manager which
> supports binary trees with *only* integers at the leaves.  That way
> all the nodes are the same size and I can use an ultra-simple free
> list, with no length word overhead to support coalescing free blocks.
> My thinking was that people could integrate larger data structures "on
> the side" so to speak, and the Fexl tree leaves would sort of point
> into a completely separate array of those larger data structures.  The
> problem with that approach is that it forces module designers to
> implement their own memory management.  It would also force the Fexl
> programmer to create the data object and then finalize it when you're
> done with it, sort of like a "with" concept in Lisp:
>
>   create_bignum \num
>       do_stuff_with num;
>      free_bignum num
>
> But ugh, that's no fun -- even with a construct that automatically
> frees the thing.


But on the other hand, sometimes I think the stripped-down tree idea
still has merit. Chances are the "BigNum" library is *already
written* anyway, and you just need to write the Fexl wrapper for it.
That BigNum code might already be doing the mallocs and such anyway!
So the BigNum values aren't going to be stored inside the Fexl arena,
period. Now if the BigNum library is written in such a way that you
just give it pointers to data and make your own storage decisions,
then you could just call malloc or whatever. Or perhaps better yet,
use the heap allocator which I wrote, even though I'm not using it for
the internal Fexl trees.

Here's another reason why the "with" idea for creating and finalizing
an object still has merit. Consider the simple act of opening a data
file, processing its contents, and then closing the file. Having the
file descriptor record inside the Fexl arena doesn't really help -- I
don't have type-specific "finalize" routines or anything like that.

So let's say the core Fexl interpreter just has the two "raw"
unguarded functions raw_open and raw_close. You could write your own
guarded "open" function like this:

\open = (\name\process raw_open name \file process file; raw_close file_handle)

Now you can write a file processor function such as this:

\parse_csv_data = (\file ... do stuff with file handle ...)

To run that on a file, you can say this:

open "data.csv" parse_csv_data

That prevents you from forgetting to close the file.
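The guarded-open pattern translates directly to Python. In this hypothetical sketch, `raw_opener` stands in for raw_open (passed as a parameter so the example is self-contained), and the `finally` clause plays the role of the guaranteed raw_close:

```python
import io

# A guarded open: run the processing function, then always close.
def guarded_open(name, process, raw_opener):
    file = raw_opener(name)       # raw_open name \file
    try:
        return process(file)      # process file;
    finally:
        file.close()              # raw_close file -- cannot be forgotten

# Usage, with an in-memory stream standing in for a real file:
result = guarded_open("data.csv",
                      lambda f: f.read(),
                      lambda name: io.StringIO("a,b\n1,2\n"))
```

Because the close lives inside the wrapper rather than at every call site, a processing function like `parse_csv_data` can never leak the handle, even if it raises.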

Now let's apply that same concept to the creation and destruction of
some external data type such as "BigNum". You might say something
like this:

new "40782265" \x
new "77511227" \y
add x y \z
string z \result
print result

(Note that I'm keeping the function names very short instead of
prefixing them all with "bignum_" or whatever. It's trivial to make
short synonyms for long names.)

That function would print "118293492". The programmer doesn't have to
worry about "freeing" the various bignum values created. What's
actually happening under the hood is that the various bignum functions
are wrappers around inner "raw" versions, and these wrappers
automatically destroy the results:

\new = (\string\process raw_new string \num process num; destroy num)

\add = (\x\y \process raw_add x y \z process z; destroy z)


> So I'm afraid the stripped-down tree with only integer leaves is a bit
> *too* stripped down, and causes too many thorny implementation issues
> for people writing modules like BigNum and String.

On second thought, now I'm thinking I should stick with the simple
integer tree approach a while longer, and hold onto that idea of
keeping arbitrarily sized data structures "on the side".

Patrick Chkoreff

Oct 3, 2010, 5:59:54 PM
to fexl
Small error here:

\open = (\name\process
raw_open name \file
process file;
raw_close file_handle)

I meant to say:

\open = (\name\process
raw_open name \file
process file;
raw_close file)

Lewis Van Winkle

Oct 3, 2010, 7:32:10 PM
to fexl


On Oct 3, 3:59 pm, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> Lately I've been toying with a stripped down memory manager which
> supports binary trees with *only* integers at the leaves. That way
> all the nodes are the same size and I can use an ultra-simple free
> list, with no length word overhead to support coalescing free blocks.
> My thinking was that people could integrate larger data structures "on
> the side" so to speak, and the Fexl tree leaves would sort of point
> into a completely separate array of those larger data structures. The
> problem with that approach is that it forces module designers to
> implement their own memory management. It would also force the Fexl
> programmer to create the data object and then finalize it when you're
> done with it, sort of like a "with" concept in Lisp:

In some cases having to explicitly free something isn't so bad. For
example, when a system resource is tied up, like with file I/O, there
is no choice but to have an explicit close statement.

I also think that keeping things simple has a lot of merit. If you
only give storage for integers, that's enough to store a pointer, and
therefore anything.

Lua gives users two ways to extend it with objects in C. They have
"userdata" which is memory managed by Lua itself, and "light userdata"
which is merely a pointer sized amount of unmanaged storage space. Lua
allows userdata to call a finalizer upon garbage collection, but light
userdata is completely unmanaged (and therefore the Lua program must
explicitly free it if/when required). I always thought that the memory
storage for the garbage collected data was mostly unnecessary. C
programmers are comfortable doing memory management already. Why try
to force them to use another method? All they need is some way for the
garbage collector to call a finalizer, and then the C programmer can
handle clean-up. (I'm sure Lua has good reasons for doing things the
way they did. Lua is actually very well thought out. Also, I think
that light userdata is a newer feature in Lua than userdata, so maybe
the reason is historical.)

In TCL, each new user defined type must be registered (by providing a
free function, a copy function, etc). Each object then stores a
pointer to its type. A user defined type object in TCL can store two
pointers. The only reason they allow two pointers, I believe, is
because a TCL object can be a double, and therefore already has 64 bits
(two pointers' worth) of space reserved. TCL does reference counting,
which seems to work pretty well for it. I think this system could work
well for Fexl too. In any case, I prefer TCL's extension and memory
management system over Lua's, personally.

Is there some inherent reason that you can't allow only 32 bits of
storage, but at the same time also call a finalizer when a leaf is
purged? I'm not following everything yet, so maybe that doesn't even
make sense. I do feel that giving C users more storage than a single
data pointer is unnecessary.

> create_bignum \num
> do_stuff_with num;
> free_bignum num
>
> But ugh, that's no fun -- even with a construct that automatically
> frees the thing.

With something like file I/O or sockets, that construct becomes
necessary. With something like a bignum, I think that construct is
inexcusable. I mean, you may as well just use C then, right?


> You bet I am, but I'm trying to keep that library as small as possible
> right now. Just 32 bit signed integers so far, with addition and
> comparison. I also have a string library, but lately I've been
> handling strings as lists of integers. The string library allocates
> arrays within the single Fexl heap, and does the correct memcpy
> operations to handle append.

If you haven't already considered it, you should probably break the
standard library up into separate modules. Lua, Pawn, and even C do
this. When I've used Lua before, I never wanted file I/O in the
standard library (among other things) so I simply didn't compile it
in. As an added bonus, the program used less memory too.

> So yeah, I really do need to stick with the heap allocator which
> allows arbitrarily sized leaves. My idea to handle that sort of thing
> outside of the main Fexl memory arena is half-baked I think. Module
> writers need the convenience of storing their slabs of data *right
> there* in the Fexl tree without worrying about memory management.

Again, I don't really see the benefit. If you allow them space to
store a pointer, they can store anything. C programmers are used to
doing manual memory management, and it would probably be a bigger
hassle to be forced into another method. I'm not sure having my data
stored *right there* in the Fexl tree would be convenient. In any
case, whether you give them 32 bits or an arbitrary amount, you really
need to allow a callback finalizer function.

Just my two cents.

The other big issue is how copying will work...


>I also have a string library, but lately I've been
>handling strings as lists of integers. The string library allocates
>arrays within the single Fexl heap, and does the correct memcpy
>operations to handle append.

You're probably not familiar with the Pawn programming language, but
it handles strings in this way. In Pawn, everything is an integer.
Even real numbers are stored as an integer, and C code must cast to a
float to work with them. I admire the consistency, but I'm not sure if
I like the idea or not. Every string communication between C and Pawn
has a linear conversion cost.

While I'm on the subject, I just want to throw in that Pawn has really
well done documentation. I've played my fair share with small
languages. Lua's documentation seems a bit lacking, TCL is better, and
all the Lua knock-offs (GameMonkey, Squirrel, etc.) seem to have
almost non-existent documentation. TinyScheme has no usable
documentation to speak of. Pawn really does have top notch
documentation though.

An advantage of using a 32-bit integer for strings is that multi-byte
character sets would be trivial to support (up to 4 bytes, anyway). A
down side is conversions back and forth, and not being able to simply
throw strings into C and back. Since Fexl may be well suited to web
applications, having it use normal C null-terminated strings seems
like it would be a huge speedup.

Why not just store a pointer to a null-terminated string? It's tried
and true, and its limitations (and security risks) are well
understood. Furthermore, it would allow you to easily piggy-back on
C's existing string library.


> Your question has particular relevance for external module writers
> though. I've grappled with this issue myself.
Thanks for explaining this. I had wondered about how it related to C
modules too. It seems that a force evaluation function could easily be
written in C, if need be.


> I think the language is consistent enough to make that predictable.
> In Fexl, you know that the evaluation is always chewing away on the
> far LEFT side of the function tree.
I think you've really thought this through, and I believe you're
right. Consistency is key, IMO.


> Certainly Fexl has the smallest syntax imaginable! Unlike some
> functional languages, I have opted against the typical "pattern
> matching" approach. I think the straight-up lambda forms with no
> pattern matching clauses makes reasoning and execution especially
> easy.
I agree. Pattern matching is neat in Haskell, but I wish they didn't
add ten pounds of syntax to accomplish it. I suppose a lot of people
like the terse and mathematical looking syntax of Haskell, but I much
prefer the simplicity and consistency of TCL and Lisp.

Which brings up another question. Are function definitions in Fexl a
special case? It seems like your syntax for functions could be more
consistent. For example in TCL the "proc" function is called to define
a function. In Scheme the "define" function in combination with the
"lambda" function does the same thing. In Lua the "function" function
defines a function and the = operator names it (although Lua also
allows additional syntax to accomplish the same thing in a more
traditional BASIC-like form).

But it looks like Fexl uses a special case, instead of a standard
looking function to define a function. So what made you decide on the
syntax to define a function like:

\T = (\T\F T)

Instead of something like:

= \T (\T\F T)

where it looks like you're just calling the function "=" to define a
new function. Simply pass the function "=" a name (\T) and an
anonymous function ((\T\F T)), and it adds it to the environment. In
that case, "=" is a normal function just like any other, but it has
the side effect of creating a new function in the interpreter.


> As for purity, you can do the whole "monadic" thing in Fexl if you
> like -- and actually that can be an amazingly compact way of
> expressing things because you can completely eliminate references to
> chained "state".
Honestly, I only toyed with Haskell a bit, and the concept of monads
never really clicked with me. I gathered it was supposed to be some
big elegant epiphany, but all I ever understood was that they forced
evaluation. I need to go back and take another look sometime.

Side effects are really the only practical thing a program can do.
Then again, I come from a hardware/assembly/C background, not a
mathematical background.


> Thanks! These long replies do help me sort out my thoughts, and they
> will serve as future documentation. It's great having a reader who
> really knows his stuff when it comes to language semantics and trade-
> offs, and can ask the demanding questions. You have a great deal of
> experience and insight.

Thanks for the kind words. I'm really just a dilettante, and I
probably don't know nearly as much as you're giving me credit for. I
do hope that this discussion is useful for you though, and I'm not
merely slowing you down. I can hardly wait to see the first release of
Fexl.


After re-reading this, I realized that a consistent language usually
has an "everything is a" type mantra. For example, in TCL everything
is a string. In Lua, everything is a table. In Io or Smalltalk,
everything's an object. In Pawn, everything's an integer. In J
everything's an array.

In Fexl, everything is a function.


-Lewis

Lewis Van Winkle

Oct 3, 2010, 7:53:53 PM10/3/10
to fexl
Well, I wrote my latest reply without seeing your last one. It's
interesting that we seem to agree on keeping the storage outside of
the Fexl tree.


Just to clarify on what I was saying, I think you should allow 64 bits
for each value (plus reference counting). Each object in Fexl could be
stored with the following structure (with reference counting tacked on
somehow in addition):

struct FexlObj
{
    struct ObjType* type;
    union
    {
        int data;
        void* ptr;
    };
};

If type is 0, then the value is an integer and no memory management
need be done. The ObjType structure might look like:

struct ObjType
{
    FreeFunctionPtr free;
    CopyFunctionPtr copy;
    // any extra functions, like DebugPrint etc...
};

A custom user data, like BigNum, would simply have a static instance
of ObjType with the proper functions pointers filled in. Then each
BigNum object would have its type set to the address of the static
type definition object, and ptr would point directly to the BigNum
object in C. The BigNum functions (like add) would be able to check if
they were passed a BigNum object by simply checking its type pointer.
Once the BigNum object is done being used (reference count = 0), the
interpreter calls the free function from the object's type pointer.

Hope that makes sense. It's just a slightly simplified version of how
TCL does it. From an extension writer's standpoint, I think it works
very well.

-Lewis

Lewis Van Winkle

Oct 3, 2010, 8:07:44 PM10/3/10
to fexl


On Oct 3, 4:55 pm, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> That BigNum code might already be doing the mallocs and such anyway!
Actually, not only may it already be doing mallocs, it probably has no
choice. A BigNum library with arbitrary precision requires an undefined
amount of memory. Therefore, freeing the memory probably means walking
a structure several pointers deep.

That's why I think Lua's userdata was implemented in a kind of silly
way. Sure, I can ask Lua for 57 bytes of garbage-collected storage,
but how often will I know exactly how much memory I need? How often
will my structure contain no deep pointers or system resources? Almost
never.
It's better to just have one pointer and let C programmers do what C
programmers do, IMO.


-Lewis

Patrick Chkoreff

Oct 4, 2010, 10:55:04 AM10/4/10
to fexl
Yes, great minds think alike ... just this morning I was mulling over
all my difficulties and I realized they stemmed from my desire to (1)
eliminate recursion completely from my C functions and (2) take over
memory management completely, even to the point of doing my own
execution stack, just so I could avoid the dreaded "Segmentation
fault" error if you evaluated a really insane function like (Y Y). My
fears were stoked by the thought of putting this interpreter on a
public web site.

But I'm thinking now that even a "seg fault" when evaluating (Y Y)
submitted by an evil black hat is not the end of the world. So I'm
just going to embrace recursive C functions. I'm certainly using them
for the parser, but I'll also call the "reduce" function recursively
as well, for those occasions like (add X Y) when the add function
needs to force X and Y to become normal forms. Of course, reduce
still handles tail calls with a simple loop instead of recursion, so I
can still evaluate infinite loops like (Y I) just fine.

Roughly speaking, I'm going back to my C code from a year ago. I'm
even going to embrace "malloc" again and not fight it so hard. I'll
start by just blindly calling malloc and free and get things working
there, and then almost as an afterthought I'll throw in a free list,
and perhaps resurrect the lazy free technique (instead of aggressively
freeing entire tree structures recursively all the way down when all
the ref counts drop to zero).

(It's amazing how differently I thought when implementing in Perl,
because I just took the language infrastructure for granted then, and
didn't try to fight it.)

The node structure is similar to what you envision, but a little
different because I elevate the concept of a "pair" to a special
status. It's essentially like this:

struct node
    {
    int count;  /* reference count */
    struct type *type;
    struct node *left;
    struct node *right;
    };

For portability reasons I wrap the left and right pointers into a
struct, and put that in a union with some other data types. I also
wrap a union around the reference count so it can double as a free
list pointer:

struct node
    {
    union
        {
        int count;          /* reference count of live node */
        struct node *prev;  /* links nodes on free list */
        } hdr;

    struct type *type;

    union
        {
        int v_int;
        short v_short;
        char *v_str;
        void *v_pointer;
        long v_long;
        long long v_long_long;
        float v_float;
        double v_double;
        struct
            {
            struct node *L;
            struct node *R;
            } pair;
        } data;
    };

Oh and by the way, the type structure is:

struct type
    {
    struct node *(*value)(struct node *parent);
    void (*clear)(struct node *);
    };

The clear function is called when the node's reference count drops to
zero.

The value function does the actual evaluation work. When we have a
function application node, we call the type-specific value function
for the *left* side of the function application, passing it a pointer
to the parent application node. That way it can evaluate itself,
replacing the left side of the function application with the new
value, and continue with the argument on the right side.

With that in mind, the S combinator becomes:

#include "node.h"
#include "type_app.h"
#include "type_S.h"

static struct node *reduce(struct node *parent)
    {
    if (0 == parent->LEFT->LEFT || 0 == parent->LEFT->LEFT->LEFT)
        {
        parent->type = parent->LEFT->type;
        return parent;
        }

    struct node *x = parent->LEFT->LEFT->RIGHT;
    struct node *y = parent->LEFT->RIGHT;
    struct node *z = parent->RIGHT;

    struct node *fun = make_app(x,z);
    struct node *arg = make_app(y,z);

    hold(fun);
    hold(arg);

    parent->type->clear(parent);
    parent->LEFT = fun;
    parent->RIGHT = arg;

    return fun->type->value(parent);
    }

static struct type type = { reduce, clear_pair };

struct node *make_S(void)
{
return make_pair(&type,0,0);
}


Which, by golly, is exactly what I posted on HN almost a year ago:

http://news.ycombinator.com/item?id=1101240

My goodness, talk about chasing one's tail in a circle. :) Commit,
man, commit!


-- Patrick

Patrick Chkoreff

Oct 4, 2010, 11:09:26 AM10/4/10
to fexl

> But it looks like Fexl uses a special case, instead of a standard
> looking function to define a function. So what made you decide on the
> syntax to define a function like:
>
> \T = (\T\F T)
>
> Instead of something like:
>
> = \T (\T\F T)


The function definition syntax is based on a very simple idea. This
syntactic form:

\name=def body

Is equivalent to this syntactic form:

(\name body) def

In other words, you don't actually *need* a special-purpose syntax for
defining functions -- I just provide it for convenience. The special
syntax allows you to say this:

\do_this = (... definition of do_this ...)
\do_that = (... definition of do_that ...)
... main program ...

Instead of this:

(
\do_this
\do_that
... main program ...
)
(... definition of do_this ...)
(... definition of do_that ...)


You see how in the latter version, I parameterized my main program in
terms of do_this and do_that, and then I *passed in* the definitions
of do_this and do_that as arguments. You can actually use this latter
style if you prefer to define your auxiliary functions *after* the
main function (kind of like saying "where do_this equals ..." after
the fact).

Also I think it's more natural to say this:

\name = def
body

Than this:

=\name def
body

I like the solid and natural feel of seeing "name = def" when I define
a name. Putting the '=' in between the name and the definition is
just the natural and conventional thing to do.


> where it looks like you're just calling the function "=" to define a
> new function. Simply pass the function "=" a name (\T) and an
> anonymous function ((\T\F T)), and it adds it to the environment. In
> that case, "=" is a normal function just like any other, but it has
> the side effect of creating a new function in the interpreter.

I like that you're going after elegance, but the current reality is
probably already more elegant than you imagine. I don't even *have* a
central table of defined function names anywhere in the interpreter!
Everything just reduces to S and C, and even the definitions of named
functions are just passed in as arguments to the other functions which
use them.


-- Patrick

Patrick Chkoreff

Oct 4, 2010, 11:26:13 AM10/4/10
to fexl
On Oct 3, 7:32 pm, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:

> where it looks like you're just calling the function "=" to define a
> new function. Simply pass the function "=" a name ...

Also, quite simply, "=" could never actually be a function because it
has no closed form definition in terms of S and C. How's that for a
succinct, absolutely true, but slightly cheeky answer? :)


Patrick Chkoreff

Oct 4, 2010, 11:31:08 AM10/4/10
to fexl
On Oct 4, 10:55 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:

> Roughly speaking, I'm going back to my C code from a year ago.  I'm
> even going to embrace "malloc" again and not fight it so hard.  I'll
> start by just blindly calling malloc and free and get things working
> there, and then almost as an afterthought I'll throw in a free list,
> and perhaps resurrect the lazy free technique (instead of aggressively
> freeing entire tree structures recursively all the way down when all
> the ref counts drop to zero).

Good grief, what's the matter with me? All that code is *already
done* in my "node.c" file dated (get this) 2009-11-29! I even have
code which allocates new node memory in pages, up to a fixed maximum
number of pages to avoid crashing the machine.

It's all there! All I really have to do is port my current parser
back into that version of the code. I really have made great strides
with parsing and converting to S and C on the fly as I go, so it's not
like I haven't achieved anything since 2009. :)

Lewis Van Winkle

Oct 5, 2010, 6:58:23 PM10/5/10
to fexl

On Oct 4, 9:55 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> But I'm thinking now that even a "seg fault" when evaluating (Y Y)
> submitted by an evil black hat is not the end of the world.  So I'm
> just going to embrace recursive C functions.  I'm certainly using them
> for the parser, but I'll also call the "reduce" function recursively
> as well, for those occasions like (add X Y) when the add function
> needs to force X and Y to become normal forms.  Of course, reduce
> still has a simple loop to avoid tail recursion, so I can still
> evaluate infinite loops like (Y I) just fine.
I'm probably not following all this yet, but why not just have a
counter that gets incremented each time a recursive function is
called, and decremented each time one returns? Then if the counter
exceeds a preset limit, bail out with an error. Otherwise, converting
recursive code into iterative code is usually just a matter of adding
a stack (which in C is usually just a pointer array and index).

For tail recursion, don't forget about the lowly goto. I've seen
people add tens of if, switch, and loop statements just to avoid one
goto. Seriously, it's not like a velociraptor is going to just spring
out of nowhere...

That said, having (Y Y) seg fault the interpreter is probably just
fine for a first release of Fexl. In the future though, I really think
it would be in your best interest to make the interpreter impossible
to hard crash (except through C extensions, obviously, which couldn't
be prevented). As far as I know, most other scripting languages would
consider it a bug if code could crash the interpreter/virtual machine.
And of course, people may be reluctant to embed Fexl in their process
if it were possible to crash it.

Of course, recursive functions are also much prettier for this type of
work. Especially this early in the game, you probably want the
easiest-to-change code, not the most optimized/robust.


> Roughly speaking, I'm going back to my C code from a year ago.  I'm
> even going to embrace "malloc" again and not fight it so hard.  I'll
> start by just blindly calling malloc and free and get things working
> there, and then almost as an afterthought I'll throw in a free list,
> and perhaps resurrect the lazy free technique (instead of aggressively
> freeing entire tree structures recursively all the way down when all
> the ref counts drop to zero).
For a first version, I don't see any reason to shun malloc. I mean,
that's what it's there for.



>     union
>         {
>         int v_int;
>         short v_short;
>         char *v_str;
>         void *v_pointer;
>         long v_long;
>         long long v_long_long;
>         float v_float;
>         double v_double;
>         struct
>             {
>             struct node *L;
>             struct node *R;
>             } pair;
>         } data;
>     };
I'd be interested to hear about Fexl's type system sometime (that's a
lot of types you have there!).

> Which, by golly, is exactly what I posted on HN almost a year ago:
>
> http://news.ycombinator.com/item?id=1101240
>
> My goodness, talk about chasing one's tail in a circle.  :)  Commit,
> man, commit!
That's why version control exists, right?

-Lewis

Lewis Van Winkle

Oct 5, 2010, 7:11:16 PM10/5/10
to fexl


On Oct 4, 10:09 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> I like that you're going after elegance, but the current reality is
> already probably more elegant than you imagine.  I don't even *have* a
> central table of defined function names anywhere in the interpreter!
> Everything just reduces to S and C, and even the definitions of named
> functions are just passed in as arguments to the other functions which
> use them.
I was pretty sure you had a good reason for doing things the way they
are. I just didn't know what that reason was. Honestly, I'm still not
really following it all.

>Also, quite simply, "=" could never actually be a function because it
>has no closed form definition in terms of S and C. How's that for a
>succinct, absolutely true, but slightly cheeky answer? :)
That's actually a really good point that I hadn't thought of. Wouldn't
that be true, though, for any function that has side effects, or any
function that's defined as a C extension?

>It's all there! All I really have to do is port my current parser
>back into that version of the code. I really have made great strides
>with parsing and converting to S and C on the fly as I go, so it's not
>like I haven't achieved anything since 2009. :)
Great. I'm still pretty anxious to see some code soon.

-Lewis

Patrick Chkoreff

Oct 7, 2010, 10:28:37 AM10/7/10
to fexl
On Oct 5, 7:11 pm, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:

> >Also, quite simply, "=" could never actually be a function because it
> >has no closed form definition in terms of S and C.  How's that for a
> >succinct, absolutely true, but slightly cheeky answer?  :)
>
> That's actually a really good point that I hadn't though of. Wouldn't
> that be true, though, for any function that has side-effects, or any
> function that's defined as a C extension?

True, but if you eliminate the side effects from a function like
"print" at least it still has a definition as (C I), and will still go
through the combinatoric motions so to speak. But "=" can't be
defined as a function in any sense, either in terms of S and C or even
as a C extension, any more than "(" or ")" can. It's just a syntactic
token.

Keep in mind that Fexl doesn't even *need* a special syntax for
definitions: they can be done with lambda variables and function
application. I only throw in the special syntax for convenience.

And hey, if someone wants to put the "=" sign before the name instead
of after it, I won't stop them from making their own version of Fexl.


> Great. I'm still pretty anxious to see some code soon.

Yeah, although I am in the throes of monthly hedge fund accounting
right now, I did take a little time yesterday to start integrating the
parser back into my really excellent year-old code.

Patrick Chkoreff

Oct 11, 2010, 9:38:36 AM10/11/10
to fexl


On Oct 5, 6:58 pm, Lewis Van Winkle <lewisvanwin...@gmail.com> wrote:
> On Oct 4, 9:55 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:
> > But I'm thinking now that even a "seg fault" when evaluating (Y Y)
> > submitted by an evil black hat is not the end of the world.  So I'm
> > just going to embrace recursive C functions. ...


> I'm probably not following all this yet, but why not just have a
> counter that gets incremented each time a recursive function is
> called, and decremented each time one returns? Then if the counter
> exceeds a preset, bail out with an error. Otherwise, converting
> recursive code into iterative code is usually just a matter of adding
> a stack (which in C is usually just a pointer array and index).


I've messed with doing my own stack in the past, but lately I've just
learned to love the built-in system stack. They do it faster anyway.
Plus, it's all a question of primary goals here. My primary goal is
not to put a super-safe Fexl interpreter on the web. My primary goal
is to wrap a thin functional programming layer around the C
programming language.


> For tail recursion, don't forget about the lowly goto. I've seen
> people add tens of if, switch, and loop statements just to avoid one
> goto. Seriously, it's not like a velociraptor is going to just spring
> out of nowhere...


Instead of goto, I just use a while loop. For example here is the
code which reduces a function application:

http://news.ycombinator.com/item?id=1101761

The "while" loop you see there implements tail recursion.

Of course you'll notice that after we drop out of the while loop, we
immediately call "reduce" on the resulting form, which might be an "S"
form or whatever. That form can evolve into another function
application, which would bring us right back to the original reduce
routine, called recursively.



> That said, having (Y Y) seg fault the interpreter is probably just
> fine for a first release of Fexl.

In light of my primary goal, which is to wrap a thin functional
programming layer around the C programming language, having (Y Y) seg
fault the interpreter is probably just fine *forever*. After all, if
you want to write a C program that segfaults, just call this function:

int do_segfault(void)
    {
    return do_segfault() + do_segfault();
    }

If it's easy to segfault in C, it should be easy in Fexl as well.


> In the future though, I really think
> it would be in your best interest to make the interpreter impossible
> to hard crash (except through C extensions, obviously, which couldn't
> be prevented). As far as I know, most other scripting languages would
> consider it a bug if code could crash the interpreter/virtual machine.
> And of course, people may be reluctant to embed Fexl in their process
> if it was possible to cause a crash it.


At some basic level, programming in raw Fexl gives you the same power
as programming in raw C. You can overflow the stack, eat up all the
memory in your machine, and delete entire directories of files. I
think when running Fexl on the web from untrusted sources, I could
manage that with safeguarded versions of the various reduction
functions, as you suggest.


> Of course, recursive functions are also much prettier for this type of
> work. Especially this early in the game, you probably want the
> easiest-to-change code, not the most optimized/robust.

Right. Then again, I figure that the C compiler does a fine job
managing an execution stack, so why should I bother doing it myself?
Sure OK, so I can detect when it's about to blow up -- but in the
pursuit of my primary goal, I don't actually care about that.




> >     union
> >         {
> >         int v_int;
> >         short v_short;
> >         char *v_str;
> >         void *v_pointer;
> >         long v_long;
> >         long long v_long_long;
> >         float v_float;
> >         double v_double;
> >         struct
> >             {
> >             struct node *L;
> >             struct node *R;
> >             } pair;
> >         } data;
> >     };
>
> I'd be interested to hear about Fexl's type system sometime (that's a
> lot of types you have there!).


Fexl has no type system. Or rather, there is exactly one type: a
function. A function maps a single input to a single output. Both
the input and the output are also functions.

The types you see in the big C union there have two purposes: 1. To
guarantee portably that the structure is large enough to hold all of
them, and 2. To avoid having to use typecasts when you write an
extension. For example if your extension needs to store a float, you
can just put it in v_float.

I could have avoided all of that by just having the two node pointers
L and R. I'm quite sure that two pointers are larger than a (long
long) or a double or any of those other types, and the extension
writer could just do a typecast. But the union guarantees the sizes,
avoids the need for typecast, and probably provides some alignment
benefits.


-- Patrick

Patrick Chkoreff

Oct 11, 2010, 9:44:15 AM10/11/10
to fexl
On Oct 11, 9:38 am, Patrick Chkoreff <loomsp...@gmail.com> wrote:

> Fexl has no type system.  Or rather, there is exactly one type:  a
> function.  A function maps a single input to a single output.  Both
> the input and the output are also functions.

I need to qualify that. Certainly a function like "add" will check
its arguments (after evaluation) to ensure that they are the proper
numeric types (e.g. int, long, float, special-purpose BigNum, or
whatever). But note that it's all *dynamic* type checking. Static
type checking a pure Fexl function is pretty much impossible. That
doesn't worry me though.

