On Oct 3, 3:59 pm, Patrick Chkoreff <
loomsp...@gmail.com> wrote:
> Lately I've been toying with a stripped down memory manager which
> supports binary trees with *only* integers at the leaves. That way
> all the nodes are the same size and I can use an ultra-simple free
> list, with no length word overhead to support coalescing free blocks.
> My thinking was that people could integrate larger data structures "on
> the side" so to speak, and the Fexl tree leaves would sort of point
> into a completely separate array of those larger data structures. The
> problem with that approach is that it forces module designers to
> implement their own memory management. It would also force the Fexl
> programmer to create the data object and then finalize it when you're
> done with it, sort of like a "with" concept in Lisp:
In some cases having to explicitly free something isn't so bad. For
example, when a system resource is tied up, like with file I/O, there
is no choice but to have an explicit close statement.
I also think that keeping things simple has a lot of merit. If you
only give storage for integers, that's enough to store a pointer, and
therefor anything.
Lua gives users to way to extend with objects in C. They have
"userdata" which is memory managed by Lua itself, and "light userdata"
which is merely a pointer sized amount of unmanaged storage space. Lua
allows userdata to call a finalizer upon garbage collection, but light
userdata is completely un-managed (and therefor the Lua program must
explicitly free if/when required). I always thought that the memory
storage for the garbage collected data was mostly unnecessary. C
programmers are comfortable doing memory management already. Why try
to force them to use another method? All they need is some way for the
garbage collector to call a finalizer, and then the C programmer can
handle clean-up. (I'm sure Lua has good reasons for doing things the
way they did. Lua is actually very well thought out. Also, I think
that light userdata is a newer feature in Lua than userdata, so maybe
the reason is historical.)
In TCL, each new user defined type must be registered (by providing a
free function, a copy function, etc). Each object then stores a
pointer to its type. A user defined type object in TCL can store two
pointers. The only reason they allow two pointers, I believe, is
because a TCL object can be a double, and therefor already has 64-bits
(two pointers worth) of space reserved. TCL does reference counting,
which seems to work pretty well for it. I think this system could work
well for Fexl too. In any case, I prefer TCL's extension and memory
management system over Lua's, personally.
Is there some inherit reason that you can't allow only 32 bits of
storage, but at the same time also call a finalizer when a leaf is
purged? I'm not following everything yet, so maybe that doesn't even
make sense. I do feel that giving C users more storage than a single
data pointer for storage is unnecessary.
> create_bignum \num
> do_stuff_with num;
> free_bignum num
>
> But ugh, that's no fun -- even with a construct that automatically
> frees the thing.
With something like file I/O or sockets, that construct becomes
necessary. With something like a bignum, I think that construct is
unexcusible. I mean, may as well just use C then, right?
> You bet I am, but I'm trying to keep that library as small as possible
> right now. Just 32 bit signed integers so far, with addition and
> comparison. I also have a string library, but lately I've been
> handling strings as lists of integers. The string library allocates
> arrays within the single Fexl heap, and does the correct memcpy
> operations to handle append.
If you haven't already considered it, you should probably break the
standard library up into separate modules. Lua, Pawn, and even C do
this. When I've used Lua before, I never wanted file I/O in the
standard library (among other things) so I simply didn't compile it
in. As an added bonus, the program used less memory too.
> So yeah, I really do need to stick with the heap allocator which
> allows arbitrarily sized leaves. My idea to handle that sort of thing
> outside of the main Fexl memory arena is half-baked I think. Module
> writers need the convenience of storing their slabs of data *right
> there* in the Fexl tree without worrying about memory management.
Again, I don't really see the benefit. If you allow them space to
store a pointer, they can store anything. C programmers are used to
doing manual memory management, and it would probably be a bigger
hassle to be forced into another method. I'm not sure having my data
stored *right there* in the Fexl tree would be convenient. In any
case, whether you give them 32 bits or an arbitrary amount, you really
need to allow a callback finalizer function.
Just my two cents.
The other big issue is how copying will work...
>I also have a string library, but lately I've been
>handling strings as lists of integers. The string library allocates
>arrays within the single Fexl heap, and does the correct memcpy
>operations to handle append.
You're probably not familiar with the Pawn programming language, but
it handles strings in this way. In Pawn, everything is an integer.
Even real numbers are stored as an integer, and C code must cast to a
float to work with them. I admire the consistency, but I'm not sure if
I like the idea or not. Every string communication between C and Pawn
has a linear conversion cost.
While I'm on the subject, I just want to throw in that Pawn has really
well done documentation. I've played my fair share with small
languages. Lua's documentation seems a bit lacking, TCL is better, and
all the Lua knock-offs (GameMoney, Squirrel, etc..) seem to have
almost non-existent documentation. TinyScheme has no usable
documentation to speak of. Pawn really does have top notch
documentation though.
An advantage of using a 32-bit integer for strings is that multi-byte
character sets would be trivial to support (up to 4 bytes, anyway). A
down side is conversions back and forth, and not being able to simply
throw strings into C and back. Since Fexl may be well suited to web
applications, having it use normal C null terminated strings seems
like it would be a huge speed up.
Why not just store a pointer to a null terminated string? It's a tried
and true, and its limitations (and security risks) are well
understood. Furthermore, it would allow you to easily piggy-back on
C's existing string library.
> Your question has particular relevance for external module writers
> though. I've grappled with this issue myself.
Thanks for explaining this. I had wondered about how it related to C
modules too. It seems that a force evaluation function could easily be
written in C, if need be.
> I think the language is consistent enough to make that predictable.
> In Fexl, you know that the evaluation is always chewing away on the
> far LEFT side of the function tree.
I think you've really thought this through, and I believe you're
right. Consistency is key, IMO.
> Certainly Fexl has the smallest syntax imaginable! Unlike some
> functional languages, I have opted against the typical "pattern
> matching" approach. I think the straight-up lambda forms with no
> pattern matching clauses makes reasoning and execution especially
> easy.
I agree. Pattern matching is neat in Haskell, but I wish they didn't
add ten pounds of syntax to accomplish it. I supposed a lot of people
like the terse and mathematical looking syntax of Haskell, but I much
prefer the simplicity and consistency of TCL and Lisp.
Which brings up another question. Are function definitions in Fexl a
special case? It seems like your syntax for functions could be more
consistent. For example in TCL the "proc" function is called to define
a function. In Scheme the "define" function in combination with the
"lambda" function does the same thing. In Lua the "function" function
defines a function and the = operator names it (although Lua also
allow additional syntax to accomplish the same thing in a more
traditional BASIC like form).
But it looks like Fexl uses a special case, instead of a standard
looking function to define a function. So what made you decide on the
syntax to define a function like:
\T = (\T\F T)
Instead of something like:
= \T (\T\F T)
where it looks like you're just calling the function "=" to define a
new function. Simply pass the function "=" a name (\T) and an
anonymous function ((\T\F T)), and it adds it to the environment. In
that case, "=" is a normal function just like any other, but it has
the side effect of creating a new function in the interpreter.
> As for purity, you can do the whole "monadic" thing in Fexl if you
> like -- and actually that can be an amazingly compact way of
> expressing things because you can completely eliminate references to
> chained "state".
Honestly, I only toyed with Haskell a bit, and the concept of monads
never really clicked with me. I gathered it was supposed to be some
big elegant epiphany, but all I ever understood was that they forced
evaluation. I need to go back and take another look sometime.
Side effects are really the only practical thing a program can do.
Then again, I come from a hardware/assembly/C background, not a
mathematical background.
> Thanks! These long replies do help me sort out my thoughts, and they
> will serve as future documentation. It's great having a reader who
> really knows his stuff when it comes to language semantics and trade-
> offs, and can ask the demanding questions. You have a great deal of
> experience and insight.
Thanks for the kind words. I'm really just a dilettante, and I
probably don't know nearly as much as you're giving me credit for. I
do hope that this discussion is useful for you though, and I'm not
merely slowing you down. I can hardly wait to see the first release of
Fexl.
After re-reading this, I realized that a consistent language usually
has an "everything is a" type mantra. For example, in TCL everything
is a string. In Lua, everything is a table. In Io or Smalltalk,
everything's an object. In Pawn, everything's an integer. In J
everything's an array.
In Fexl, everything is a function.
-Lewis