Announcement: new Lisp implementation

Donald Fisk

unread,

Feb 17, 2003, 7:50:57 PM2/17/03

to

I have been developing Emblem, an implementation of a dialect
of Lisp, similar to a subset of Common Lisp, but with a few
important differences.

The system is now useful to me and I have been using it for
development of software applications for my own use. It is
not perfect but I am running into problems with it less and
less frequently.

So I intend to release it, on a shared source license similar
to the one which covers Microsoft CLI, and similar to the non-
commercial Creative Commons License. This prohibits commercial
use, but allows private individuals to use it, free of charge,
for any other purpose, and allows academic institutions to use
it for teaching and research. Users are supplied with the
source. Emblem will also be available for commercial use, for
a fee.

At present, it only runs on Linux, though it should be easily
portable to other unices. A version running on Microsoft
Windows will have to wait until I have the time and resources
necessary to port it to that platform.

The total size of the code is around 14000 lines, 9000 of which
are in C, the remainder being in Lisp. There are also 2000 lines
of documentation. This represents about a man year of work.

A brief overview follows.
(
* Emblem runs on its own virtual machine.
* Functions are compiled to byte code when they are defined.
* When you exit Emblem, you can save the world (or image, if you
prefer to call it that).
* Emblem has an object system, which is a subset of CLOS.
* It has a graphics system, implemented using Xlib, which supports
3d-graphics and high-level event handling.
* It supports its own threading mechanism.
* There is a connexion to Emacs, so you can select regions of files
and load them into the world.
* It comes with a web server, which contains a Wiki, which is used
for the documentation, and also an object browser.
* There is source and object level debugging.
* There is (CGI and shell) scripting support: regular expressions,
sockets, pipes.
* There are hacks (small demo programs), mostly GPL'ed. These include
tetris, life, munching squares and a 3d map of the Fort William region
of Scotland (including Ben Nevis) which you can 'walk' around.
)

The most noticeable differences between Emblem and Common Lisp are
(
* Emblem is smaller;
* Emblem's reader doesn't convert input to upper case;
* Emblem is a Lisp 1 (like Scheme);
* Emblem separates the empty list () from FALSE;
* No dotted pairs -- if x is a list, (cdr x) is also a list;
* Emblem allows you to declare the types of a function's arguments;
* Association lists are a separate type;
* I/O is through bidirectional channels rather than unidirectional
files.
)

I am developing this for a number of reasons:
(
* I intend it to be the implementation language of a language
which I think will be even better than Lisp. I don't want to
tell you much more than this until I have something to show,
but those who do know what I am talking about are impressed.
* I think that, in order for Lisp to continue evolving, it needs
testing grounds for new ideas, which by necessity involve
breaking the standard before they make the standard.
* I want a language implementation I have complete control over,
that I can modify and improve when I need to, and that I can
use for developing applications rapidly in.
* I need to increase my profile, get my name recognized, get a
reputation to improve my prospects.
)

If anyone is interested in trying out Emblem, please mail me
(remove nospam from my address in the message header) and I'll
send you the source, to your home address (or college). Please
be patient, and expect some problems, which I'll do my best to
fix (color allocation has only been tested for 8 bit colors, for
example).

Donald Fisk (a.k.a. Le Hibou)
--
In any large organization, mediocrity is almost by definition
an overwhelming phenomenon; the systematic disqualification
of competence, however, is the managers' own invention, for
the sad consequences of which they should bear the full blame.
-- Edsger W. Dijkstra, 1986.

Ray Blaak

unread,

Feb 18, 2003, 2:00:09 AM2/18/03

to

Donald Fisk <hibou000...@enterprise.net> writes:
> The most noticeable differences between Emblem and Common Lisp are

> * Emblem is a Lisp 1 (like Scheme);
> * Emblem separates the empty list () from FALSE;

I am not bothered by these myself, but a lot of Lispers would be. My question
though is why not just use Scheme?

Don't use FALSE, though, since that prevents symbol manipulation involving
that word (as does t and nil, but they at least have an established
history). Scheme's notion of #t and #f, for example, lets those constants be
out of the user namespace of symbols.

I would recommend #true and #false (or #t and #f) and #nil if a non-list
"bottom value" is needed.

> * No dotted pairs -- if x is a list, (cdr x) is also a list;

This would bug me -- it destroys the simplicity of the list type, and is
unnecessary.

What lisp programmer would put up with this?

Overall, I would just pick a Scheme that runs in a Java VM: better defined and
farther evolved in terms of features, testing, and bug fixes.

> I am developing this for a number of reasons:

Unless you give an idea of the new features, there is not yet a single reason
for any existing Lisp or Scheme programmer to switch to Emblem.

--
Cheers, The Rhythm is around me,
The Rhythm has control.
Ray Blaak The Rhythm is inside me,
bl...@telus.net The Rhythm has my soul.

Jochen Schmidt

unread,

Feb 18, 2003, 2:47:18 AM2/18/03

to

Donald Fisk wrote:

> I am developing this for a number of reasons:
> (
> * I intend it to be the implementation language of a language
> which I think will be even better than Lisp. I don't want to
> tell you much more than this until I have something to show,
> but those who do know what I am talking about are impressed.

Difficult to say without knowing anything.
Would those ideas be so impossible to implement in one of the many Common
Lisp implementations available?
Your changes do not sound so fundamental to me that this would be obvious.

> * I think that, in order for Lisp to continue evolving, it needs
> testing grounds for new ideas, which by necessity involve
> breaking the standard before they make the standard.

You can do this with available implementations of Common Lisp too. If your
work is only valuable enough it will be accepted even if it breaks
standards. Having said that I think that much could be done without
_breaking_ the standard at all.

> * I want a language implementation I have complete control over,
> that I can modify and improve when I need to, and that I can
> use for developing applications rapidly in.

Take one of the free opensource implementations of Common Lisp. SBCL seems
like a nice testbed for experimenting with language extensions.

> * I need to increase my profile, get my name recognized, get a
> reputation to improve my prospects.

IMHO You would increase your profile by far more if you would work on
enhancing existing and accepted tools instead.

ciao,
Jochen

Kenny Tilton

unread,

Feb 18, 2003, 3:03:46 AM2/18/03

to

Jochen Schmidt wrote:
> Donald Fisk wrote:
>
>
>>I am developing this for a number of reasons:
>>(
>>* I intend it to be the implementation language of a language
>> which I think will be even better than Lisp. I don't want to
>> tell you much more than this until I have something to show,
>> but those who do know what I am talking about are impressed.
>
>
> Difficult to say without knowing anything.
> Would those ideas be so impossible to implement in one of the many Common
> Lisp implementations available?

That was my question.

> IMHO You would increase your profile by far more if you would work on
> enhancing existing and accepted tools instead.

Constraints have been hailed before and have died before, a couple of
times. I think because folks get so excited about their ideas that they
have the fatal thought, "Hey, let's create a whole new language!".

I say, make new data structures (for Lisp, via the MOP if necessary),
not new languages. And for god's sake, don't make new syntax.

my2

--

kenny tilton
clinisys, inc
http://www.tilton-technology.com/
---------------------------------------------------------------
"Cells let us walk, talk, think, make love and realize
the bath water is cold." -- Lorraine Lee Cudmore

Pascal Costanza

unread,

Feb 18, 2003, 6:00:34 AM2/18/03

to

In article <usmumk...@telus.net>, Ray Blaak <bl...@telus.net>
wrote:

> Donald Fisk <hibou000...@enterprise.net> writes:
> > The most noticeable differences between Emblem and Common Lisp are
> > * Emblem is a Lisp 1 (like Scheme);
> > * Emblem separates the empty list () from FALSE;
>
> I am not bothered by these myself, but a lot of Lispers would be. My question
> though is why not just use Scheme?

Why is it that new dialects of Lisp are almost always defined as Lisp-1.
My understanding so far is that Lisp-2 has some pragmatic advantages
(for example, it makes variable capture less likely). On the other hand,
Lisp-1 seems to be more aesthetically pleasing but obviously has no real
benefits. Am I missing something?

Pascal

--
"If I could explain it, I wouldn't be able to do it."
A.M.McKenzie

Donald Fisk

unread,

Feb 18, 2003, 6:58:38 AM2/18/03

to

Pascal Costanza wrote:

> Why is it that new dialects of Lisp are almost always defined as Lisp-1.
> My understanding so far is that Lisp-2 has some pragmatic advantages
> (for example, it makes variable capture less likely). On the other hand,
> Lisp-1 seems to be more aesthetically pleasing but obviously has no real
> benefits. Am I missing something?

Emblem started off as a Lisp2, but I as I was never using both
function and value cells in the same symbol (even in Common Lisp),
I decided I might as well drop them. Having a Lisp1 isn't a
religious issue for me, though, and if I were in future to find
a need for a separate function cell, I would, and could, easily
put it back in.

> Pascal

Le Hibou

Donald Fisk

unread,

Feb 18, 2003, 8:09:28 AM2/18/03

to

Ray Blaak wrote:
>
> Donald Fisk <hibou000...@enterprise.net> writes:
> > The most noticeable differences between Emblem and Common Lisp are
> > * Emblem is a Lisp 1 (like Scheme);
> > * Emblem separates the empty list () from FALSE;
>
> I am not bothered by these myself, but a lot of Lispers would be. My question
> though is why not just use Scheme?

I need a more typeful language. The sort of thing I have in mind
is

(defun csvToList ((s Str))
"Converts arg1 from a Str of CSV fields into a list of fields,
trimmed."
(the List (map1 trim (split \, s))))

where both the arguments and the return values can be declared
(if they aren't they default to type Any). This is a /requirement/ for
the new language I intend to implement using Emblem. This feature is
/not/ negotiable.

> Don't use FALSE, though, since that prevents symbol manipulation involving
> that word (as does t and nil, but they at least have an established
> history). Scheme's notion of #t and #f, for example, lets those constants be
> out of the user namespace of symbols.
>
> I would recommend #true and #false (or #t and #f) and #nil if a non-list
> "bottom value" is needed.

People get used to language details. Java has true and false, and not
even I complain about that. But if everyone were to tell me that they
preferred #true and #false I'd sit up and listen.

> > * No dotted pairs -- if x is a list, (cdr x) is also a list;
>
> This would bug me -- it destroys the simplicity of the list type, and is
> unnecessary.
>
> What lisp programmer would put up with this?

Well, me. Again, the reasons are so that car(x: List) -> Any and
cdr(x: List) -> List. This is needed for the language I intend to
implement.
I don't want dotted pairs -- I have association lists, and it's easy to
build up other data structures where the RHS not a list.

However, I have a /potential/ problem with the restriction, which has
not
yet realized itself. It's that I might want to introduce lazy lists,
implemented with a function as the cdr. But that would be done in C,
rather
than Lisp. A similar thing has already happened with association lists
in Emblem -- originally they only had symbols as keys, now any type.

> Overall, I would just pick a Scheme that runs in a Java VM: better defined and
> farther evolved in terms of features, testing, and bug fixes.

As you haven't tried Emblem, you're only assuming that. They may be
more
stable, because more time has been spent developing them, and because
they're using a tried and tested design rather than a new, experiemental
one. So if you need to build a large system that under no
circumstances
should crash, don't use Emblem (yet). Use Common Lisp until Emblem is
stable.

I doubt they are "farther evolved". Read through the list of features
again. Emblem is more than just a language. Along with Emacs and a
web browser, it's a full development environment. And it's not
dependent
on the virtual machine of a completely different language. It has
its own VM, designed specifically for it.

> > I am developing this for a number of reasons:
>
> Unless you give an idea of the new features, there is not yet a single reason
> for any existing Lisp or Scheme programmer to switch to Emblem.

/I'm/ using it, so there's at least one Lisp programmer who's
switched to it, for most things. But 'switch' is the wrong
word. Emblem is still a Lisp, strongly influenced by Common
Lisp and much closer to it than it is to Scheme (I think).
It should be easy to port most code to and from Common Lisp.

I'm not expecting everyone else to switch from Common Lisp to
Emblem, but for me, I have faster, cleaner graphics than I can
build on top of CLX, and I have multiple threads, which (unless
it's changed recently) are absent from CLisp. Plus, I don't
need to muck about with FASL files or explicitly compile things.
Plus, I can browse around data structures, while the program is
running, using Netscape.

I'm not doing this out of any opposition to Common Lisp, which
I like and encourage non-lispers to use. I'm not expecting
lots of people to suddenly drop Common Lisp and switch to my
new dialect. In its present state I'm hoping that a few people
might use it for /some/ things.

> Ray Blaak

Le Hibou

Wade Humeniuk

unread,

Feb 18, 2003, 10:04:35 AM2/18/03

to

"Donald Fisk" <hibou000...@enterprise.net> wrote in message
news:3E518371...@enterprise.net...

>
> The total size of the code is around 14000 lines, 9000 of which
> are in C, the remainder being in Lisp. There are also 2000 lines
> of documentation. This represents about a man year of work.
>

Why did you use C as the main implementation language? You
could have used CL to implement Emblem, in less time. Once you
got your details worked out then you could have moved to C.

Wade

Wade Humeniuk

unread,

Feb 18, 2003, 10:22:10 AM2/18/03

to

"Donald Fisk" <hibou000...@enterprise.net> wrote in message

news:3E523088...@enterprise.net...

> As you haven't tried Emblem, you're only assuming that. They may be
> more
> stable, because more time has been spent developing them, and because
> they're using a tried and tested design rather than a new, experiemental
> one. So if you need to build a large system that under no
> circumstances
> should crash, don't use Emblem (yet). Use Common Lisp until Emblem is
> stable.
>

I honestly do not think I would use Emblem. I used Scheme and other
languages before and have found CL to be the best engineered of all
of them. If you have trouble with CL because it is "too large" exhibit
some self-control and only use a subset. (Never use a variable name
that has a bound function, etc, etc. etc.)

> I doubt they are "farther evolved". Read through the list of features
> again. Emblem is more than just a language. Along with Emacs and a
> web browser, it's a full development environment. And it's not
> dependent
> on the virtual machine of a completely different language. It has
> its own VM, designed specifically for it.

Aside from the techincal issues involved, why did feel it necessary to
strike out in such an independent manner? Essentially giving up
on the work and support of a tried and tested community? (Java or
whoever else)

CL can be viewed as a VM or a Common Reference Platform.

Wade

Mark Hurd

unread,

Feb 18, 2003, 10:25:11 AM2/18/03

to

"Donald Fisk" <hibou000...@enterprise.net> wrote in message
news:3E523088...@enterprise.net...

> Ray Blaak wrote:
> > Donald Fisk <hibou000...@enterprise.net> writes:
> > > The most noticeable differences between Emblem and Common Lisp are

<snip>

> > > * No dotted pairs -- if x is a list, (cdr x) is also a list;
> >
> > This would bug me -- it destroys the simplicity of the list type, and is
> > unnecessary.

/Before/ I used it, I thought the same about this restriction in Rich Hickey's
dotlisp, announced in this newsgroup last year.

> > What lisp programmer would put up with this?
>
> Well, me. Again, the reasons are so that car(x: List) -> Any and
> cdr(x: List) -> List. This is needed for the language I intend to
> implement.
> I don't want dotted pairs -- I have association lists, and it's easy to
> build up other data structures where the RHS not a list.

Yes, after I played with dotlisp (and the features of the .NET Framework), I
almost never noticed it missing. It was only when I was doing "very" Lisp
things or attempting to convert existing Lisp programs that I missed it, a
bit.

> However, I have a /potential/ problem with the restriction, which has
> not
> yet realized itself. It's that I might want to introduce lazy lists,
> implemented with a function as the cdr. But that would be done in C,
> rather
> than Lisp. A similar thing has already happened with association lists
> in Emblem -- originally they only had symbols as keys, now any type.

In dotlisp, this is done using the IEnumerator interface, but essentially you
have Lists and LazyLists. Where it matters, or where LazyLists are likely you
cater for them as well as normal lists.

Regards,
Mark Hurd, B.Sc.(Ma.) (Hons.)

PS Rich Hickey's dotlisp is described here:
http://www.richhickey.com/dotlisp.htm

Jeremy H. Brown

unread,

Feb 18, 2003, 10:36:33 AM2/18/03

to

Kenny Tilton <kti...@nyc.rr.com> writes:
> Constraints have been hailed before and have died before, a couple of
> times. I think because folks get so excited about their ideas that
> they have the fatal thought, "Hey, let's create a whole new language!".
>
> I say, make new data structures (for Lisp, via the MOP if necessary),
> not new languages. And for god's sake, don't make new syntax.

There's a lovely, lovely paper on language design by Hoare, presently
available online at
http://www.cs.berkeley.edu/~necula/cs263/handouts/hoarehints.pdf

One of the key points of Hoare's paper is that one should either
invent new programming language features in the context of an extant
language, or one should create a new language presenting a set of
previously understood features; building a new language around new
constructs is discouraged. "One thing [a language designer] should
not do is to include untried ideas of his own. His task is
consolidation, not innovation."

Jeremy

PS Hoare's paper also contains the timeless quote, regarding Algol 60,
"Here is a language so far ahead of its time, that it was not only an
improvement on its predecessors, but also on nearly all its
successors."

Barry Margolin

unread,

Feb 18, 2003, 11:13:10 AM2/18/03

to

In article <3E521FEE...@enterprise.net>,

Donald Fisk <hibou000...@enterprise.net> wrote:
>Pascal Costanza wrote:
>
>> Why is it that new dialects of Lisp are almost always defined as Lisp-1.
>> My understanding so far is that Lisp-2 has some pragmatic advantages
>> (for example, it makes variable capture less likely). On the other hand,
>> Lisp-1 seems to be more aesthetically pleasing but obviously has no real
>> benefits. Am I missing something?
>
>Emblem started off as a Lisp2, but I as I was never using both
>function and value cells in the same symbol (even in Common Lisp),
>I decided I might as well drop them.

So you never have local variables named "list"? Schemers frequently use
the annoying abbreviation "lst" for their local variables because Lisp1
makes the regular word unavaiable to them.

--
Barry Margolin, barry.m...@level3.com
Genuity Managed Services, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

Paolo Amoroso

unread,

Feb 18, 2003, 11:11:11 AM2/18/03

to

On Tue, 18 Feb 2003 00:50:57 +0000, Donald Fisk
<hibou000...@enterprise.net> wrote:

> I have been developing Emblem, an implementation of a dialect
> of Lisp, similar to a subset of Common Lisp, but with a few
> important differences.

[...]

> The most noticeable differences between Emblem and Common Lisp are

[...]

> * Emblem's reader doesn't convert input to upper case;
> * Emblem is a Lisp 1 (like Scheme);
> * Emblem separates the empty list () from FALSE;
> * No dotted pairs -- if x is a list, (cdr x) is also a list;
> * Emblem allows you to declare the types of a function's arguments;
> * Association lists are a separate type;
> * I/O is through bidirectional channels rather than unidirectional
> files.

[...]

> I am developing this for a number of reasons:
> (
> * I intend it to be the implementation language of a language
> which I think will be even better than Lisp. I don't want to

[...]

> * I think that, in order for Lisp to continue evolving, it needs
> testing grounds for new ideas, which by necessity involve
> breaking the standard before they make the standard.

The ones you mention above are mostly basic--minor?--language features on
which there was extensive research and experimentation in past years,
especially in the Lisp world. Are there any compelling technical reasons to
deviate from the standard which emerged from that experience, i.e. Common
Lisp? Are new ideas for, say, dealing with boolean values worth such
deviations and incompatibilities?

I'm not against new ideas and experimentation. But I think it takes more
than a Lisp 1 or a different way of dealing with booleans to impress a
Lisper.

Paolo
--
Paolo Amoroso <amo...@mclink.it>

Paolo Amoroso

unread,

Feb 18, 2003, 11:11:10 AM2/18/03

to

On Tue, 18 Feb 2003 13:09:28 +0000, Donald Fisk
<hibou000...@enterprise.net> wrote:

> Plus, I can browse around data structures, while the program is
> running, using Netscape.

If I understand correctly what you mean by "browse around data structures",
you can do that with the cllib inspector which comes with CLOCC.

Christophe Rhodes

unread,

Feb 18, 2003, 11:20:11 AM2/18/03

to

Donald Fisk <hibou000...@enterprise.net> writes:

> Ray Blaak wrote:
> >
> > Donald Fisk <hibou000...@enterprise.net> writes:
> > I am not bothered by these myself, but a lot of Lispers would be. My question
> > though is why not just use Scheme?
>
> I need a more typeful language. The sort of thing I have in mind
> is
>
> (defun csvToList ((s Str))
> "Converts arg1 from a Str of CSV fields into a list of fields,
> trimmed."
> (the List (map1 trim (split \, s))))
>
> where both the arguments and the return values can be declared
> (if they aren't they default to type Any). This is a /requirement/ for
> the new language I intend to implement using Emblem. This feature is
> /not/ negotiable.

In Common Lisp, this either looks like
(defmethod csvToList ((s string))
(the list (map1 #'trim (split #\, s))))
or
(defun csvToList (s)
(declare (type string s))
(the list (map1 #'trim (split #\, s))))
depending on what the semantics of the declaration are. A word about
the second style might be adviseable: the use of DECLARE as an
assertion in high safety code is of course not portable across CL
implementations, as by the standard the consequences are undefined if
a type declaration is violated. If this bothers you, you could
instead write CHECK-TYPE for the type checking semantics.

Can you expand on why these aren't suitable for your needs?

> > This would bug me -- it destroys the simplicity of the list type, and is
> > unnecessary.
> >
> > What lisp programmer would put up with this?
>
> Well, me. Again, the reasons are so that car(x: List) -> Any and
> cdr(x: List) -> List. This is needed for the language I intend to
> implement.
> I don't want dotted pairs -- I have association lists, and it's easy to
> build up other data structures where the RHS not a list.

(defstruct (proper-list
(:constructor make-proper-list)
(:constructor kons (car cdr))
(:constructor %make-null-list (&aux car cdr)))
(car nil :type t)
(cdr (missing-arg) :type proper-list))

(defvar *null-proper-list*
(let ((x (%make-null-list)))
(setf (proper-list-car x) x)
(setf (proper-list-cdr x) x)))

Then, again, in an implementation that understands type declarations
as assertions in high safety code, this proper-list structure fulfils
these requirements.

Cheers,

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%") (pprint #36rJesusCollegeCambridge)

Barry Margolin

unread,

Feb 18, 2003, 11:19:24 AM2/18/03

to

In article <3e525...@news.iprimus.com.au>,

Mark Hurd <mark...@ozemail.com.au> wrote:
>"Donald Fisk" <hibou000...@enterprise.net> wrote in message
>news:3E523088...@enterprise.net...
>> Ray Blaak wrote:
>> > Donald Fisk <hibou000...@enterprise.net> writes:
>> > > The most noticeable differences between Emblem and Common Lisp are
><snip>
>> > > * No dotted pairs -- if x is a list, (cdr x) is also a list;
>> >
>> > This would bug me -- it destroys the simplicity of the list type, and is
>> > unnecessary.
>
>
>/Before/ I used it, I thought the same about this restriction in Rich Hickey's
>dotlisp, announced in this newsgroup last year.
>
>> > What lisp programmer would put up with this?
>>
>> Well, me. Again, the reasons are so that car(x: List) -> Any and
>> cdr(x: List) -> List. This is needed for the language I intend to
>> implement.
>> I don't want dotted pairs -- I have association lists, and it's easy to
>> build up other data structures where the RHS not a list.
>
>Yes, after I played with dotlisp (and the features of the .NET Framework), I
>almost never noticed it missing. It was only when I was doing "very" Lisp
>things or attempting to convert existing Lisp programs that I missed it, a
>bit.

I think I can understand this.

The general nature of Lisp conses is somewhat vestigial. When they were
the *only* structured data type in the language, they needed to be
flexible.

But since then we've added arrays, structures, and CLOS instances to the
languages. If you want a data structure that can hold two arbitrary
objects, you can create it easily using DEFSTRUCT:

(defstruct kons
kar kdr)

In my experience, dotted lists are an extreme rarity, and there's likely to
be a better alternative.

Donald Fisk

unread,

Feb 18, 2003, 12:50:06 PM2/18/03

to

If Emblem was an end in itself, there would have been little
point in developing it. There are good Lisps out there, and
while they often don't have all the features I want, I could
have added them. As I've said before, a slightly better Lisp
than Common Lisp as an end in itself is probably not worth the
effort at inception (it would be like building a cairn on top
of Ben Nevis (there is one, by the way)), but even a worse Lisp
as part of a larger system /is/ often worth the effort -- think
Abuse, Autocad, Emacs. Emblem was designed to be part of a
larger language system.

So it's a means to an end, a spin-off of a longer term language
research project, released because it's usable and (to me, at any
rate) useful.

I could, in theory, have used Common Lisp (or even Java) for
implementation of my proposed language instead of Emblem but
whichever of these I used would mean compromises. I have looked
into the use of both: with Common Lisp, the problems would have
been determining the argument and values types of built-in
functions such as car and cons (with user-defined methods there
might be a way using the MOP?), determining which functions apply
to particular argument types such as list or string, and the
leniency of functions such as cdr(x:DottedPair or NIL)->Any,
cons(x:Any, y:Any) -> DottedPair, etc. It would get messy.
Java has different problems, such as the lack of a decent object
system, multiple values, and arbitrary length argument lists.
I could get round these by implementing Emblem in Java, at the
cost of a severe performance hit. And I've already run into
problems with threads on Java/Linux.

I was never expecting those who are committed to Common Lisp to
to replace it with Emblem. The new language, however is
sufficiently different that even those who are so committed
might admire it even if they rarely or never use it, much as I
admire Smalltalk.

Oh, and now is not the time to press me on details of the
proposed language. When I have something to show, I'll
show it.

> Paolo

Le Hibou

Donald Fisk

unread,

Feb 18, 2003, 1:28:39 PM2/18/03

to

"Jeremy H. Brown" wrote:

> There's a lovely, lovely paper on language design by Hoare, presently
> available online at
> http://www.cs.berkeley.edu/~necula/cs263/handouts/hoarehints.pdf
>
> One of the key points of Hoare's paper is that one should either
> invent new programming language features in the context of an extant
> language, or one should create a new language presenting a set of
> previously understood features; building a new language around new
> constructs is discouraged. "One thing [a language designer] should
> not do is to include untried ideas of his own. His task is
> consolidation, not innovation."

I read this paper a long time ago, and will read it again to refresh
my memory.

However, if John McCarthy had followed this advice, we'd never have
had Lisp. If Alain Colmerauer had followed this advice, we'd never
have had Prolog. Both Lisp and Prolog were new languages built
around new constructs.

> Jeremy

Le Hibou

Donald Fisk

unread,

Feb 18, 2003, 1:31:02 PM2/18/03

to

Kenny Tilton wrote:
>
> Jochen Schmidt wrote:
> > Donald Fisk wrote:
> >
> >
> >>I am developing this for a number of reasons:
> >>(
> >>* I intend it to be the implementation language of a language
> >> which I think will be even better than Lisp. I don't want to
> >> tell you much more than this until I have something to show,
> >> but those who do know what I am talking about are impressed.
> >
> >
> > Difficult to say without knowing anything.
> > Would those ideas be so impossible to implement in one of the many Common
> > Lisp implementations available?
>
> That was my question.
>
> > IMHO You would increase your profile by far more if you would work on
> > enhancing existing and accepted tools instead.
>
> Constraints have been hailed before and have died before, a couple of
> times. I think because folks get so excited about their ideas that they
> have the fatal thought, "Hey, let's create a whole new language!".
>
> I say, make new data structures (for Lisp, via the MOP if necessary),
> not new languages. And for god's sake, don't make new syntax.

You'll either have to trust me on this, or disbelieve me entirely.

For me, there's only one reason for creation of a new general-purpose
language -- if it's significantly better (but not necessarily more
popular) than Common Lisp. I've not entered into this lightly and
I'm not 100% sure I'll be successful in producing such a language.
But I do know what it will look like.

The new language will be /completely/ different from Lisp. If you
were to take Lisp, Cobol and the new language, and classify them into
two groups, Lisp and Cobol would belong in one group and the new
language would belong in another group. The same applies to
the syntax -- Lisp and Cobol syntax are closer to each other than
either is to the syntax of my new language. And there are very
good reasons for the syntax of my new language, just as there
are very good reasons for the syntax of Lisp, which all of us
around here presumably know.

I feel I'm being drawn too much on this, and would rather everyone
waited as long as it takes for me to have something to show.
Meantime, there's Emblem for anyone who's interested. Things
might not work out, though I'm convinced the idea is sound, for
very good reasons.

> kenny tilton

Le Hibou

Kenny Tilton

unread,

Feb 18, 2003, 1:39:30 PM2/18/03

to

Donald Fisk wrote:

> Kenny Tilton wrote:
>>I say, make new data structures (for Lisp, via the MOP if necessary),
>>not new languages. And for god's sake, don't make new syntax.
>
>
> You'll either have to trust me on this, or disbelieve me entirely.

That's OK, I see from other stuff that you are going in directions
requiring much more than just new data structures. Well, hey, good luck
with your new toy. Sounds like you are having fun and have a very clear
idea where you are going.

Me, I'm stuck these days trying to make "C" libraries happy. :(

Don Geddis

unread,

Feb 18, 2003, 3:32:44 PM2/18/03

to

Donald Fisk <hibou000...@enterprise.net> writes:
> The new language will be /completely/ different from Lisp. [...]

> And there are very
> good reasons for the syntax of my new language, just as there

> are very good reasons for the syntax of Lisp [...]

> I feel I'm being drawn too much on this, and would rather everyone
> waited as long as it takes for me to have something to show.

You seem to have a very clear path into the future, and your Emblem is only
interesting because it's a step towards your goal.

Why are you so unwilling to discuss the ideas behind where you're heading?
It would seem to be an interesting topic, if you really have some idea where
the future of programming languages should be.

Surely it's possible to discuss the theory of programming languages, without
needing to fully implement the new one first. You've mentioned Lisp, Prolog,
and Smalltalk, all of which have very different approaches to expressing
computation. If you've come up with a new one, why not share it with us,
and seed some interesting discussion?

_______________________________________________________________________________
Don Geddis http://don.geddis.org d...@geddis.org

Tim Bradshaw

unread,

Feb 18, 2003, 3:52:24 PM2/18/03

to

* Barry Margolin wrote:

> But since then we've added arrays, structures, and CLOS instances to the
> languages. If you want a data structure that can hold two arbitrary
> objects, you can create it easily using DEFSTRUCT:

> (defstruct kons
> kar kdr)

One issue with this is that, in many implementations, objects created
with DEFSTRUCT have a significantly larger size than conses - often
the type of a cons is encoded in the tag (as one of the small number
of types which can be there), while a structure has an `other' tag and
then a whole word of type information. If you have many small
(2-slot, say) objects, this can really cost. I used to *never* use
conses to mean some abstract 2-slot type because it felt so old
fashioned, but I do now.

--tim

Ray Blaak

unread,

Feb 19, 2003, 1:38:37 AM2/19/03

to

Donald Fisk <hibou000...@enterprise.net> writes:
> I need a more typeful language. The sort of thing I have in mind
> is
>
> (defun csvToList ((s Str))
> "Converts arg1 from a Str of CSV fields into a list of fields,
> trimmed."
> (the List (map1 trim (split \, s))))
>
> where both the arguments and the return values can be declared
> (if they aren't they default to type Any). This is a /requirement/ for
> the new language I intend to implement using Emblem. This feature is
> /not/ negotiable.

There are variants of Scheme that allow this kind of thing.

CL lets you do (declare type ...) and (the ...) kinds of things, which
admittedly are implementation dependent, but least give a good starting point.

Dylan also has the same essential ability. Perhaps there is a Sexp version of
it still kicking around somewhere?

I don't see anything fundamentally new that requires a fresh language yet. You
should be able to augment an existing implementation.

> > Don't use FALSE, though, since that prevents symbol manipulation involving
> > that word (as does t and nil, but they at least have an established
> > history). Scheme's notion of #t and #f, for example, lets those constants be
> > out of the user namespace of symbols.
> >
> > I would recommend #true and #false (or #t and #f) and #nil if a non-list
> > "bottom value" is needed.
>
> People get used to language details. Java has true and false, and not
> even I complain about that. But if everyone were to tell me that they
> preferred #true and #false I'd sit up and listen.

Java is not a symbol manipulation language. One would either use strings (and
thus have their "symbols" be in their own namespace) or have upper case named
constants (again in their own namespace).

With the appropriate namespace/package control, however, this issue could very
well be moot.

Erann Gat

unread,

Feb 19, 2003, 10:42:49 AM2/19/03

to

In article <ud6lo4...@telus.net>, Ray Blaak <bl...@telus.net> wrote:

> Dylan also has the same essential ability. Perhaps there is a Sexp version of
> it still kicking around somewhere?

http://www.ai.mit.edu/~jrb/goo/

E.

Donald Fisk

unread,

Feb 19, 2003, 12:56:42 PM2/19/03

to

Ray Blaak wrote:
>
> Donald Fisk <hibou000...@enterprise.net> writes:
> > I need a more typeful language. The sort of thing I have in mind
> > is
> >
> > (defun csvToList ((s Str))
> > "Converts arg1 from a Str of CSV fields into a list of fields,
> > trimmed."
> > (the List (map1 trim (split \, s))))
> >
> > where both the arguments and the return values can be declared
> > (if they aren't they default to type Any). This is a /requirement/ for
> > the new language I intend to implement using Emblem. This feature is
> > /not/ negotiable.
>
> There are variants of Scheme that allow this kind of thing.
>
> CL lets you do (declare type ...) and (the ...) kinds of things, which
> admittedly are implementation dependent, but least give a good starting point.

But can the programmer get this information back, or does it,
as I supect, merely provide information for the compiler's
benefit?

> Dylan also has the same essential ability. Perhaps there is a Sexp version of
> it still kicking around somewhere?
>
> I don't see anything fundamentally new that requires a fresh language yet. You
> should be able to augment an existing implementation.

It's possible that in future Emblem and Common Lisp might converge,
and then Emblem would become yet another Common Lisp implementation.
The difficulty is that the new language /requires/ certain type
info that simply isn't there (or isn't easy to find) in current
Common Lisp implementations, and which Common Lisp developers
cannot be relied upon to add when defining new functions. As
a concrete example, I need to know that REVERSE takes a LIST and
returns /a single argument/ which is a LIST, and that LIST has,
among many functions, one called REVERSE which operates upon it.
I suppose an existing Common Lisp could be extended to support
this, though it would require a lot of work (not least of which
would be understanding the existing implementation first).

But if I went down this route then I'd be bound to accept
whatever license the existing system came with (probably GPL,
or even worse BSD) which would permit commercial organizations
to use (and even sell) my work without paying me a penny for it.
This to me is completely unacceptable.

Don Geddis

unread,

Feb 19, 2003, 5:15:05 PM2/19/03

to

Donald Fisk <hibou000...@enterprise.net> writes:
> The difficulty is that the new language /requires/ certain type
> info that simply isn't there (or isn't easy to find) in current
> Common Lisp implementations

> I suppose an existing Common Lisp could be extended to support
> this, though it would require a lot of work (not least of which
> would be understanding the existing implementation first).

A lot of work, definitely...but surely much less work than implementing
a new language from scratch!

And of course, you'd then get the benefit of years of bug fixes, tons of
library functions you'll probably never have time to get to, plus lots of
existing CL code that will already run.

> But if I went down this route then I'd be bound to accept
> whatever license the existing system came with (probably GPL,
> or even worse BSD) which would permit commercial organizations
> to use (and even sell) my work without paying me a penny for it.
> This to me is completely unacceptable.

Perhaps you should have checked into this before you began.
For example, CMUCL
http://www.cons.org/cmucl/
is "public domain", not GPL/BSD. Which means you could fork off it,
add your extensions, and release your version with whatever license you wish.

-- Don

_______________________________________________________________________________
Don Geddis http://don.geddis.org d...@geddis.org

When I was a boy of fourteen, my father was so ignorant I could hardly stand to
have the old man around. But when I got to be twenty-one, I was astonished at
how much the old man had learned in seven years. -- Mark Twain

Roger Corman

unread,

Feb 25, 2003, 8:30:50 PM2/25/03

to

I love conses (meaning the primitive, 2-slot type).
They are so basic, and so powerful.
Lisp systems typically optimize the heck out of them, making them much
faster to allocate and initialize than arbitrary types. I can allocate
a cons in Corman Lisp something like ten times as fast as allocating
the smallest, simplest type under the MS .NET engine. This is, of
course, one of the reasons porting Lisp to .NET is problematic. A cons
also uses exactly 8 bytes (not one byte more), which is at least half
the size of any object I could allocate in .NET. As a primitive type,
it has its own tag, as does an integer.

Having such an elegant, simple and efficient data structure has always
been and will always be one of Lisp's strong points, in my opinion.

Sure, structures or objects are better in many/most cases, but it's
nice to have different levels of abstraction and peformance available.
The "consing engine" is one of the many powerful features of Lisp.

Roger

Tom Lord

unread,

Feb 26, 2003, 2:00:16 AM2/26/03

to

If this were a rock concert, I'd be screaming "OOWWW!" and
"YEEEHAW" and flailing my arms about.

I guess this is what happens when you climb near TOP on the lattice:
it takes everyone else a while to catch up -- and meanwhile, they just
spit at you.

"will always be" -- your opinion is shared, brother.

-t

Paolo Amoroso

unread,

Feb 26, 2003, 9:09:02 AM2/26/03

to

On Wed, 26 Feb 2003 01:30:50 GMT, ro...@corman.net (Roger Corman) wrote:

> faster to allocate and initialize than arbitrary types. I can allocate
> a cons in Corman Lisp something like ten times as fast as allocating
> the smallest, simplest type under the MS .NET engine. This is, of
> course, one of the reasons porting Lisp to .NET is problematic. A cons
> also uses exactly 8 bytes (not one byte more), which is at least half
> the size of any object I could allocate in .NET. As a primitive type,

How many overhead bytes does a malloc()ed memory chunk typically take? I
guess the 8 bytes you mention don't include--and don't need under
Lisp--such overhead, right?

Tim Bradshaw

unread,

Feb 26, 2003, 9:34:23 AM2/26/03

to

* Paolo Amoroso wrote:
> How many overhead bytes does a malloc()ed memory chunk typically take? I
> guess the 8 bytes you mention don't include--and don't need under
> Lisp--such overhead, right?

I shouldn't think you need any overhead for conses. But, actually, a
malloc-based system doesn't need much either if it's done `cleverly' -
instead of allocating one thing for each object, allocate a whole area
for objects, and then type the area and not the things within it. Of
course this is essentially BIBOP. Whether .NET does this, I don't
know.

I guess the problem with BIBOP this for arbitrary types is that you
need to make some decision `OK, I'm going to have lots of these
things, so I will allocate a whole region for them and type the
region'. If you have thousands of types then you either end up with a
lot of empty space because you allocate large chunks of them, or you
allocate small chunks and you have overhead. So what you probably do
is have some small number of `optimized' types which you assume you
will allocate in large numbers and only do this trick for them ... Oh,
you've just reinvented conses.

(Incidentally, though I never explored it, I got the impression that
the Symbolics memory management system could do amazing things which
smelt kind of like BIBOP but were programmer-driven, so you'd say
`make a region (were they called regions?) for this kind of object'
and it would then allocate them there.)

--tim

Donald Fisk

unread,

Feb 26, 2003, 11:57:22 AM2/26/03

to

Tim Bradshaw wrote:

> I shouldn't think you need any overhead for conses. But, actually, a
> malloc-based system doesn't need much either if it's done `cleverly' -
> instead of allocating one thing for each object, allocate a whole area
> for objects, and then type the area and not the things within it. Of
> course this is essentially BIBOP. Whether .NET does this, I don't
> know.
>
> I guess the problem with BIBOP this for arbitrary types is that you
> need to make some decision `OK, I'm going to have lots of these
> things, so I will allocate a whole region for them and type the
> region'. If you have thousands of types then you either end up with a
> lot of empty space because you allocate large chunks of them, or you
> allocate small chunks and you have overhead. So what you probably do
> is have some small number of `optimized' types which you assume you
> will allocate in large numbers and only do this trick for them ... Oh,
> you've just reinvented conses.
>
> (Incidentally, though I never explored it, I got the impression that
> the Symbolics memory management system could do amazing things which
> smelt kind of like BIBOP but were programmer-driven, so you'd say
> `make a region (were they called regions?) for this kind of object'
> and it would then allocate them there.)

The way I did things was to have an array of structures for each
fundamental type, e.g. List (same as Cons), Sym, Str, etc., plus
one for Array and one for ArrayElem. Array is used for user-
defined objects as well as arrays. Each object is then a pointer
to one of those array elements. Garbage collection on fixed-
size objects is by marking everything still in use, then sweeping
whatever runs out of space, so those GCs are very fast and pointers
don't change. However, for variable size objects such as string
and array contents, the algorithm is o(N^2) and therefore very
slow, so I'd appreciate pointers to published algorithms there.

> --tim

Jeff

unread,

Feb 26, 2003, 2:39:56 PM2/26/03

to

Thanks for introducing me to the term BIBOP. I was reminded of a
paper, "Lisp Machine Progress Report", from 1977.

http://home.attbi.com/~prunesquallor/memo444.htm

See the first paragraph of "Representation of Data". Each page was
typed.

Re your second paragraph below: Zones instead of regions?

http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?BIBOP

Jeff

Tim Bradshaw <t...@cley.com> wrote in message news:<ey3isv7...@cley.com>...
...

> I shouldn't think you need any overhead for conses. But, actually, a
> malloc-based system doesn't need much either if it's done `cleverly' -
> instead of allocating one thing for each object, allocate a whole area
> for objects, and then type the area and not the things within it. Of
> course this is essentially BIBOP. Whether .NET does this, I don't
> know.

...

Roger Corman

unread,

Feb 26, 2003, 3:38:38 PM2/26/03

to

On Wed, 26 Feb 2003 15:09:02 +0100, Paolo Amoroso <amo...@mclink.it>
wrote:

>
>How many overhead bytes does a malloc()ed memory chunk typically take?

I am sure it varies a lot. I have written malloc implementations which
used 8-16 bytes of overhead per block. I know you can do much better,
but there are pros and cons to the different strategies. If you use a
BIBOP approach you can have a certain amount of unused space that has
to be factored in as overhead. By the way, as a jazz pianist, I have
always been attracted to the BIBOP approach. ;-)

>I guess the 8 bytes you mention don't include--and don't need under
>Lisp--such overhead, right?

Not if it's done right.

Pekka P. Pirinen

unread,

Mar 3, 2003, 12:32:35 PM3/3/03

to

Donald Fisk <hibou000...@enterprise.net> writes:

> Tim Bradshaw wrote:
> > (Incidentally, though I never explored it, I got the impression that
> > the Symbolics memory management system could do amazing things which
> > smelt kind of like BIBOP but were programmer-driven, so you'd say
> > `make a region (were they called regions?) for this kind of object'
> > and it would then allocate them there.)

Yes, region was the term on Symbolics, this kind of thing is also
often called "pool". You indicated the region when allocating:
e.g. CONS-IN-REGION. You could set memory management parameters per
region. This is very useful in large applications.

> Garbage collection [...] for variable size objects such as string

> and array contents, the algorithm is o(N^2) and therefore very slow,
> so I'd appreciate pointers to published algorithms there.

The standard reference for GC is _Garbage Collection: Algorithms for
Automatic Dynamic Memory Management_ by Jones and Lins, see
<http://www.cs.ukc.ac.uk/people/staff/rej/gcbook/gcbook.html>. You
really must get this book if you work on GC.

Using BEBOP and arrays with separate headers and element blocks
shouldn't affect the time complexity. What's taking o(N^2)?
--
Pekka P. Pirinen
The great problem with Lisp is that it is just good enough to keep us
from developing something really good. - Alan Kay

Donald Fisk

unread,

Mar 4, 2003, 5:20:03 PM3/4/03

to

"Pekka P. Pirinen" wrote:
>
> Donald Fisk <hibou000...@enterprise.net> writes:
> > Tim Bradshaw wrote:
> > > (Incidentally, though I never explored it, I got the impression that
> > > the Symbolics memory management system could do amazing things which
> > > smelt kind of like BIBOP but were programmer-driven, so you'd say
> > > `make a region (were they called regions?) for this kind of object'
> > > and it would then allocate them there.)
>
> Yes, region was the term on Symbolics, this kind of thing is also
> often called "pool". You indicated the region when allocating:
> e.g. CONS-IN-REGION. You could set memory management parameters per
> region. This is very useful in large applications.
>
> > Garbage collection [...] for variable size objects such as string
> > and array contents, the algorithm is o(N^2) and therefore very slow,
> > so I'd appreciate pointers to published algorithms there.
>
> The standard reference for GC is _Garbage Collection: Algorithms for
> Automatic Dynamic Memory Management_ by Jones and Lins, see
> <http://www.cs.ukc.ac.uk/people/staff/rej/gcbook/gcbook.html>. You
> really must get this book if you work on GC.

I've read much of it, but don't own a copy. I got some funny
looks once from other tube travellers who assumed it was about
waste disposal.

> Using BEBOP and arrays with separate headers and element blocks
> shouldn't affect the time complexity. What's taking o(N^2)?

Probably because my algorithm isn't very good. There's an
array of strings (pointer into an array of chars + length), and
the array of chars. When the array of chars is full, from the
lowest string I repeatedly find the next lowest string and shift
its characters down to close any gap, then I search the array of
strings again for substrings of it, and adjust the pointers.

It's dumb, but it works, which is the main thing. At some
point, I'll return to it and figure out a way of speeding it
up, but I have other things to do which are more urgent.
Also, while it's so slow it provides me with added incentive
to avoid generating string garbage in the first place.

If someone wants to suggest a faster way, I'll note it and
try it out when I'm ready to.

> Pekka P. Pirinen

Kalle Olavi Niemitalo

unread,

Mar 5, 2003, 1:31:23 AM3/5/03

to

Pekka.P...@globalgraphics.com (Pekka P. Pirinen) writes:

> You indicated the region when allocating: e.g. CONS-IN-REGION.

Did a regular CONS always use the same region, or was there a way
to change the default -- perhaps by rebinding a per-thread
special variable *REGION*? Because manually creating -IN-REGION
variants of all consing functions would surely be a huge pain,
and I don't think it can be reliably done with macros.

Tim Bradshaw

unread,

Mar 5, 2003, 5:23:39 AM3/5/03

to

* Kalle Olavi Niemitalo wrote:

> Did a regular CONS always use the same region, or was there a way
> to change the default -- perhaps by rebinding a per-thread
> special variable *REGION*? Because manually creating -IN-REGION
> variants of all consing functions would surely be a huge pain,
> and I don't think it can be reliably done with macros.

I think there was *DEFAULT-CONS-REGION* (or something like that).

--tim

Gordon Joly

unread,

Mar 6, 2003, 3:10:05 AM3/6/03

to

Donald Fisk <hibou000...@enterprise.net> wrote in message news:<3E518371...@enterprise.net>...

> I have been developing Emblem, an implementation of a dialect
> of Lisp, similar to a subset of Common Lisp, but with a few
> important differences.
>

> The system is now useful to me and I have been using it for
> development of software applications for my own use. It is
> not perfect but I am running into problems with it less and
> less frequently.

> * It has a graphics system, implemented using Xlib, which supports
> 3d-graphics and high-level event handling.

> If anyone is interested in trying out Emblem, please mail me
> (remove nospam from my address in the message header) and I'll
> send you the source, to your home address (or college). Please
> be patient, and expect some problems, which I'll do my best to
> fix (color allocation has only been tested for 8 bit colors, for
> example).
>
> Donald Fisk (a.k.a. Le Hibou)

I have a copy (thanks Donald!); version 010. I was very impressed with
the 3D graphics, since I worked on a 2D/3D Java project last year (I
was using "Java 2D" only!). 24bit support seems to work well.

I had an idea to experiment, but as other people have pointed out,
most other (mature) packages have more library support (CPAN comes to
mind!!).

Gordo The Grey

Pekka P. Pirinen

unread,

Mar 10, 2003, 12:01:54 PM3/10/03

to

Donald Fisk <hibou000...@enterprise.net> writes:
> "Pekka P. Pirinen" wrote:
> > Donald Fisk <hibou000...@enterprise.net> writes:
> > > Garbage collection [...] for variable size objects such as string
> > > and array contents, the algorithm is o(N^2) and therefore very slow,
> >

> > [...] Using BEBOP and arrays with separate headers and element

> > blocks shouldn't affect the time complexity. What's taking o(N^2)?
>
> Probably because my algorithm isn't very good. There's an
> array of strings (pointer into an array of chars + length), and
> the array of chars. When the array of chars is full, from the
> lowest string I repeatedly find the next lowest string and shift
> its characters down to close any gap, then I search the array of
> strings again for substrings of it, and adjust the pointers.

It's a local optimum in the space of algorithms. Anything else is
going to require more memory. Compaction doesn't come for free.

Maintaining a map of old->new address during compaction would at first
glance get to O(N*log(N)), but it might run out of memory to store the
map, in which case you'd still have to fall back to the slow method.
If you set aside O(N) words to store the map, you can avoid running
out during GC. There are various ways to implement this, but they
basically have three phases: calculate new positions, update pointers,
and move.

The most parsimonious way is not a separate header and body, but a
format with the length field, the new address field and the body in a
single block. Now the new address field is guaranteed to be
available, in a position that is trivial to compute from the old
address. This gets you O(N) compaction, but you have to update
pointers everywhere, not just in a header array, so the constant
factor is large and the code is complex. And it would mean changing
all your string code! Substrings would still have be represented as
displaced arrays.

It's worth noting, that a lot of systems, such as conservative GCs and
manual allocators, get by without moving memory blocks. Compaction
might not be worth the trouble, unless you're under tight memory
budget. Even if the system does compact, it's worth putting off as
long as possible.
--
Pekka P. Pirinen
Pick your enemies carefully. They're harder to get rid of than friends.