self-hosting gc
Erik Naggum  
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Tue, 05 Mar 2002 22:12:56 GMT
Local: Tues, Mar 5 2002 5:12 pm
Subject: Re: self-hosting gc
* Thomas Bushnell, BSG
| Consider that if a Lisp system's GC is written in some other language
| (like, say, C) then you now need two compilers to build the language.
| If your only use for a C compiler is to compile your GC, then you have
| really wasted a vast effort in writing one.

  It seems quite natural that someone who writes a Common Lisp system would
  write its guts in some other language first.  After a while, it would be
  possible to bootstrap the building process in the system itself, but it
  would seem natural to build some lower-level Lisp that would enable a
  highly portable substrate to be written, and then cross-compilation would
  be a breeze, but it still seems fairly reasonable to start off with a
  different compiler or language if you want anybody to repeat the building
  process from scratch, not just for GC, but for the initial substrate.  I
  remember having to compile GNU CC on SPARC with the SunOS-supplied C
  compiler and then with the GNU CC thus built, in order to arrive at a
  "native build" and that when Sun stopped shipping compilers with their
  application-only operating system, someone was nice enough to make
  binaries available for the rest of the world.

  Why is GC so special in your view?

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Christopher Browne
Newsgroups: comp.lang.lisp
From: Christopher Browne <cbbro...@acm.org>
Date: Tue, 05 Mar 2002 17:25:18 -0500
Local: Tues, Mar 5 2002 5:25 pm
Subject: Re: self-hosting gc

I don't think the `impasse' is passable.

Consider that the various Unix kernels out there do NOT use "all of
C;" they use subsets that on the one hand likely permit all the
_operators_ and control structures of the base language, but which
_EXCLUDE_ great gobs of "The Standard C Library," notably anything
that forcibly depends on malloc().

One of the Frequently Asked Questions about Linux is "So why don't you
port it to C++?  Wouldn't that make it lots better?"

The _real_ answer to that:  "Because the developers prefer C."

But another pointed reason not to is that C++ subsumes into the base
language a bunch of stuff that, in C, is part of LIBC, and, which, in
many cases, depends on having malloc()/free() (or equivalents thereof)
around to do their work, what with constructors and destructors and
the like.

In order to build an OS kernel in C++, you have to very carefully pick
a subset that doesn't require any underlying "runtime support."  By
the time you gut C++ that way, what you've got is basically C with
classes, and there's little point to calling it a "C++-based OS."

With Lisp, it's much the same story; you will at the "base" have to
have some basic set of functions and operations that DO NOT REQUIRE
RUNTIME SUPPORT, because the point of the exercise is to _implement_
that runtime support.

This actually suggests there being merit to the hoary question of
"What's a good `base CL?'" where you bootstrap with some minimal set
of operators, functions, and macros, and then implement the rest of
the system on top of that.

A necessary "base" would include some basic set of operators/functions
necessary for writing the garbage collector which do not themselves
make any use of dynamic memory allocation.  [Might this mean that the
'base' would exclusively use stack-based memory allocation?  I'd tend
to think so...]
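
As a rough illustration of such a "base", here is a minimal sketch of a
Cheney-style copying collector over a simulated heap.  Everything in it is an
assumption made for the example (the two preallocated semispace vectors, the
even/odd integer tagging, and all function names); it is not taken from any
real Lisp implementation.  The point is only that COLLECT and FORWARD touch
nothing but preallocated vectors and small integers, which a typical
implementation can compile without heap allocation.

```lisp
;; Simulated heap: two semispaces of words.  Even integers are immediates,
;; odd integers are pointers to two-word cells.  All names are hypothetical.
(defparameter *words* 16)                       ; words per semispace
(defvar *from* (make-array *words* :initial-element 0))
(defvar *to*   (make-array *words* :initial-element 0))
(defvar *free* 0)   ; next unused word in *from*
(defvar *next* 0)   ; next unused word in *to* during a collection

(defun make-imm (n) (* 2 n))                    ; immediate: even
(defun make-ptr (addr) (1+ (* 2 addr)))         ; pointer: odd
(defun ptr-p (v) (and (integerp v) (oddp v)))
(defun ptr-addr (v) (ash v -1))

(defun heap-cons (a d)
  "Bump-allocate a two-word cell in from-space and return a pointer to it."
  (when (> (+ *free* 2) *words*)
    (error "Semispace full; call COLLECT first."))
  (let ((addr *free*))
    (setf (svref *from* addr) a
          (svref *from* (1+ addr)) d)
    (incf *free* 2)
    (make-ptr addr)))

(defun forward (v)
  "Return the to-space version of V, copying its cell if not yet copied."
  (if (not (ptr-p v))
      v
      (let ((addr (ptr-addr v)))
        (if (eq (svref *from* addr) :forwarded)
            (svref *from* (1+ addr))            ; already moved
            (let ((new *next*))
              (setf (svref *to* new) (svref *from* addr)
                    (svref *to* (1+ new)) (svref *from* (1+ addr)))
              (incf *next* 2)
              ;; Leave a forwarding marker behind in the old cell.
              (setf (svref *from* addr) :forwarded
                    (svref *from* (1+ addr)) (make-ptr new))
              (make-ptr new))))))

(defun collect (roots)
  "Copy everything reachable from ROOTS (a simple vector, updated in place)
into to-space, then swap the spaces.  Returns the number of live words."
  (setf *next* 0)
  (dotimes (i (length roots))
    (setf (svref roots i) (forward (svref roots i))))
  ;; Cheney scan: to-space itself serves as the work queue.
  (do ((scan 0 (1+ scan)))
      ((>= scan *next*))
    (setf (svref *to* scan) (forward (svref *to* scan))))
  (rotatef *from* *to*)
  (setf *free* *next*))
```

Roots are passed explicitly here; a real collector would find them by walking
the stack and registers, which is exactly the part that needs the low-level
substrate discussed in this post.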

The notion that the system could bootstrap itself without that limited
'base' seems very wishful.

I'll bet an interesting OS to look at would be SPIN, which was
implemented in Modula-3.  M3 offers the same "chewy garbage collection
goodness" of Lisp; presumably the SPIN kernel has to have certain
sections that implement the "memory management runtime support" in
such a way that they require no such runtime support.

Forth would be another candidate; one of the longstanding traditions
there is the notion of implementing "target compilers" which start
with a basic set of CODE words (e.g. - assembly language) and then use
that as a bootstrap on top of which to implement the rest of the
language.  

That actually points to a somewhat reasonable approach:
 - Write a function that issues assembly language instructions
   into a function;
 - Write some functions that issue groups of assembly language
   instructions ("macros" in the assembler sense);
 - Implement a set of memory management functions using that
   "bootstrap";
 - Then you've got the basis for implementing everything else on
   top of that.

The notion of doing that without something like assembly language
macros underneath is just wishful thinking...
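
The first two steps above can be sketched in a few lines.  This is only an
illustration under stated assumptions: the buffer representation and every
function name are made up for the example, and only three 32-bit x86
encodings are covered (B8 for MOV EAX, imm32; 05 for ADD EAX, imm32; C3 for
RET).  The third step, memory management written on top of such emitters, is
not shown.

```lisp
(defun make-code-buffer ()
  "An adjustable byte vector that collects emitted machine code."
  (make-array 16 :element-type '(unsigned-byte 8)
                 :adjustable t :fill-pointer 0))

(defun emit-byte (buffer byte)
  (vector-push-extend (ldb (byte 8 0) byte) buffer))

(defun emit-imm32 (buffer value)
  "Emit VALUE as four bytes, least significant first (x86 is little-endian)."
  (dotimes (i 4)
    (emit-byte buffer (ldb (byte 8 (* 8 i)) value))))

;; Step 1: functions that issue single instructions.
(defun emit-mov-eax-imm32 (buffer value)
  (emit-byte buffer #xB8)               ; MOV EAX, imm32
  (emit-imm32 buffer value))

(defun emit-add-eax-imm32 (buffer value)
  (emit-byte buffer #x05)               ; ADD EAX, imm32
  (emit-imm32 buffer value))

(defun emit-ret (buffer)
  (emit-byte buffer #xC3))              ; RET (near)

;; Step 2: a "macro" in the assembler sense, i.e. a function that
;; issues a group of instructions.
(defun emit-return-sum (buffer a b)
  "Emit code that returns A + B in EAX."
  (emit-mov-eax-imm32 buffer a)
  (emit-add-eax-imm32 buffer b)
  (emit-ret buffer))
```

Calling (emit-return-sum (make-code-buffer) 40 2) fills the buffer with eleven
bytes: B8 and the four bytes of 40, 05 and the four bytes of 2, then C3.
Turning such a buffer into something callable is the implementation-specific
part.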
--
(reverse (concatenate 'string "moc.adanac@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/macros.html
Rules of  the Evil Overlord  #123. "If I  decide to hold a  contest of
skill  open to  the general  public, contestants  will be  required to
remove their  hooded cloaks and  shave their beards  before entering."
<http://www.eviloverlord.com/>


 
Matthias Blume
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Matthias Blume <matth...@shimizu-blume.com>
Date: Tue, 05 Mar 2002 17:45:09 -0500
Local: Tues, Mar 5 2002 5:45 pm
Subject: Re: self-hosting gc

On Tue, 05 Mar 2002 16:14:56 -0500, Tim Bradshaw wrote:
> * Christian Lynbech wrote:

>> The fix was to add signing of applets, such that also for Java you need
>> to trust the SW supplier.

> This is nice to know, and enables me to make my point more succinctly:
> (a) you need signing, and (b) do you think the average software vendor's
> digital signature is worth the bits it's made of?  Better check those
> system calls...

No-one was talking about Java.

Matthias


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 05 Mar 2002 22:41:55 -0800
Local: Wed, Mar 6 2002 1:41 am
Subject: Re: self-hosting gc

Marco Antoniotti <marc...@cs.nyu.edu> writes:
> I understand your points.  What I wanted to point out is that the
> `malloc' library you write under Unix is different from the one you
> write under Windows.  In (Common) Lisp, you have another layer to get
> past: the specific CL implementation, which may or may not give you
> the necessary hooks to control the OS interface in a way that does not
> interfere with the (Common) Lisp system itself.

I'm not talking about writing it for an existing CL system, I'm
talking about writing it from the standpoint of a systems designer.

 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 05 Mar 2002 22:43:21 -0800
Local: Wed, Mar 6 2002 1:43 am
Subject: Re: self-hosting gc

Erik Naggum <e...@naggum.net> writes:
>   Why is GC so special in your view?

One might well need bootstrap in designing and initially building the
system.  But now, one needs *only* GCC to build GCC, and not anything
else.  Once one has a running system with GCC, you don't any longer
need the pcc compilers that GCC was originally built with.

 
Frode Vatvedt Fjeld
Newsgroups: comp.lang.lisp
From: Frode Vatvedt Fjeld <fro...@acm.org>
Date: Wed, 06 Mar 2002 09:59:24 +0100
Local: Wed, Mar 6 2002 3:59 am
Subject: Re: self-hosting gc

Christopher Browne <cbbro...@acm.org> writes:
> That actually points to a somewhat reasonable approach:
>  - Write a function that issues assembly language instructions
>    into a function;
>  - Write some functions that issue groups of assembly language
>    instructions ("macros" in the assembler sense);
>  - Implement a set of memory management functions using that
>    "bootstrap";
>  - Then you've got the basis for implementing everything else on
>    top of that.

> The notion of doing that without something like assembly language
> macros underneath is just wishful thinking...

I'm working on a CL system that is pretty much based on (x86) assembly
macros like you are describing. It looks something like this:

(defun (setf car) (value cell)
  (check-type cell cons)
  (with-inline-assembly (:returns :eax)
    (:load-lexical cell :ebx)
    (:load-lexical value :eax)
    (:movl :eax (:ebx -1))))
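
The (:ebx -1) operand makes sense if pointers to conses carry a low tag.  The
sketch below assumes, and the post does not say this, that cons cells are
8-byte aligned, that a cons pointer has the low three bits set to 1, and that
words are 4 bytes, so subtracting the tag yields the address of the car word.
All names and constants are illustrative.

```lisp
(defparameter *cons-tag* 1)    ; assumed tag value, not stated in the post
(defparameter *word-bytes* 4)  ; 32-bit x86

(defun tag-cons-pointer (cell-address)
  "Turn the raw address of an 8-byte-aligned cell into a tagged cons pointer."
  (assert (zerop (mod cell-address 8)))
  (logior cell-address *cons-tag*))

(defun cons-pointer-p (word)
  "The type test that CHECK-TYPE would boil down to under this tagging."
  (= (logand word 7) *cons-tag*))

(defun car-address (tagged)
  "Address of the car word: the tagged pointer minus the tag."
  (- tagged *cons-tag*))

(defun cdr-address (tagged)
  "Address of the cdr word: one word past the car."
  (+ (car-address tagged) *word-bytes*))
```

Under these assumptions a store to the car needs no separate untagging
instruction, because the displacement -1 is folded into the addressing mode.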

--
Frode Vatvedt Fjeld


 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Wed, 06 Mar 2002 09:57:15 GMT
Local: Wed, Mar 6 2002 4:57 am
Subject: Re: self-hosting gc
* tb+use...@becket.net (Thomas Bushnell, BSG)
| One might well need bootstrap in designing and initially building the
| system.  But now, one needs *only* GCC to build GCC, and not anything
| else.  Once one has a running system with GCC, you don't any longer
| need the pcc compilers that GCC was originally built with.

  I actually tried to argue that the same would be true of a Common Lisp
  system, but that portability constraints dictate that those who want to
  port a Common Lisp compiler to System X on the Y processor should be able
  to use the portable assembler (C) instead of having to start off writing
  non-portable assembler and use the system's assembler to bootstrap from.

  Needing *only* GCC, as you say, is predicated on the existence of a
  binary for your system to begin with.  How do people port GCC to a new
  platform on which they intend to build the GNU system?  My take on this
  is that it is no less dependent on some other existing C compiler than
  the similar problem for CL compilers is.  Duane, please help.  :)

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Joe Marshall
Newsgroups: comp.lang.lisp
From: "Joe Marshall" <prunesqual...@attbi.com>
Date: Wed, 06 Mar 2002 09:06:07 GMT
Local: Wed, Mar 6 2002 4:06 am
Subject: Re: self-hosting gc

"Christopher Browne" <cbbro...@acm.org> wrote in message

news:m3zo1mejox.fsf@chvatal.cbbrowne.com...

> A necessary "base" would include some basic set of operators/functions
> necessary for writing the garbage collector which do not themselves
> make any use of dynamic memory allocation.  [Might this mean that the
> 'base' would exclusively use stack-based memory allocation?  I'd tend
> to think so...]

To be somewhat pedantic, it isn't necessary to eschew *all* dynamic
allocation in a GC.  You just have to collect more than you cons.

 
Nils Goesche
Newsgroups: comp.lang.lisp
From: Nils Goesche <car...@cartan.de>
Date: 6 Mar 2002 11:23:15 GMT
Local: Wed, Mar 6 2002 6:23 am
Subject: Re: self-hosting gc

IIRC, they first write a /cross/ compiler for the new system that
runs on an old system.  Then they use the cross compiler to compile
gcc itself and voila... done.  Hey, sounds easy, doesn't it?  :-))

Regards,
--
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x42B32FC9


 
Tim Bradshaw
Newsgroups: comp.lang.lisp
From: Tim Bradshaw <t...@cley.com>
Date: 06 Mar 2002 12:10:26 +0000
Local: Wed, Mar 6 2002 7:10 am
Subject: Re: self-hosting gc

* Erik Naggum wrote:
>   Needing *only* GCC, as you say, is predicated on the existence of a
>   binary for your system to begin with.  How do people port GCC to a new
>   platform on which they intend to build the GNU system?  My take on this
>   is that it is no less dependent on some other existing C compiler than
>   the similar problem for CL compilers is.  Duane, please help.  :)

I assume they add support for the new target to gcc, compile gcc on an
existing system targeted at the new system and then run this new
compiler on the new system.

--tim


 
Bijan Parsia
Newsgroups: comp.lang.lisp
From: Bijan Parsia <bpar...@email.unc.edu>
Date: Wed, 6 Mar 2002 07:44:44 -0500
Local: Wed, Mar 6 2002 7:44 am
Subject: Re: self-hosting gc
On 5 Mar 2002, Thomas Bushnell, BSG wrote:

> Erik Naggum <e...@naggum.net> writes:

> >   Why is GC so special in your view?

> One might well need bootstrap in designing and initially building the
> system.  But now, one needs *only* GCC to build GCC, and not anything
> else.  Once one has a running system with GCC, you don't any longer
> need the pcc compilers that GCC was originally built with.

But then why the restriction that you "must" have the "full" language
available? Sure, Squeak uses a subset "slang" which maps fairly directly
to C and is intended to generate C which is compiled by a separate C
compiler, but it *runs* inside Squeak. You can run/debug a slang based VM
in Squeak (well, it can be done, at least :)). It's *way* slower, but
presumably that's a "mere" implementational issue (the Squeak community
doesn't have the resources to be able to afford *not* to delegate this bit
to C compilers).

There are Smalltalks (and lisps) that let you inline C or asm code...would
that be ok?

Cheers,
Bijan Parsia.


 
Martin Simmons
Newsgroups: comp.lang.lisp
From: "Martin Simmons" <zne...@xanalys.com>
Date: Wed, 6 Mar 2002 14:49:11 -0000
Local: Wed, Mar 6 2002 9:49 am
Subject: Re: self-hosting gc

"Tim Bradshaw" <t...@cley.com> wrote in message news:ey3g03dgaml.fsf@cley.com...
> * Erik Naggum wrote:
> >   Needing *only* GCC, as you say, is predicated on the existence of a
> >   binary for your system to begin with.  How do people port GCC to a new
> >   platform on which they intend to build the GNU system?  My take on this
> >   is that it is no less dependent on some other existing C compiler than
> >   the similar problem for CL compilers is.  Duane, please help.  :)

> I assume they add support for the new target to gcc, compile gcc on an
> existing system targeted at the new system and then run this new
> compiler on the new system.

Correct, though it is often complicated by object file formats.  One approach is
to generate textual assembly language on the host machine, which is then
assembled and linked on the target machine (using existing tools).  Another
approach is to retarget the equivalent GNU tools and generate the binaries
directly on the host machine.
--
Martin Simmons, Xanalys Software Tools
zne...@xanalys.com
rot13 to reply

 
Ray Dillinger
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Ray Dillinger <b...@sonic.net>
Date: Wed, 06 Mar 2002 17:09:55 GMT
Local: Wed, Mar 6 2002 12:09 pm
Subject: Re: self-hosting gc

NAK!  This implies that nobody can modify the compiler.  If you
have a compiler that signs its output, then somebody can open up
the source code and find the signing key.  Then the signing key
can be used to sign arbitrary output.  That means you cannot
release the source code for your compiler.  

Or maybe read privileges to it are root-only and root can set the
signing key for a particular installation -- but then you have a
problem that nobody can compile on one system and run on another.

Far far better to have potentially-dangerous processes running in
their own memory arenas where the OS can keep an eye on them in
case they try messing anything up.

                                Bear


 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Wed, 06 Mar 2002 17:32:16 GMT
Local: Wed, Mar 6 2002 12:32 pm
Subject: Re: self-hosting gc
* Nils Goesche
| IIRC, they first write a /cross/ compiler for the new system that
| runs on an old system.  Then they use the cross compiler to compile
| gcc itself and voila... done.  Hey, sounds easy, doesn't it?  :-))

  It sounds like _vastly_ more work than building on the native system with
  a native assembler and linker to build the first executables until you
  could replace those, too.  

  Back in the old days, I wrote 8080 and Z80 code on the PDP-10 and its
  cross-assembler for "microcomputers", because it was so fantastically
  more convenient to work on a real computer and deploy on a toy than work
  on the toy computer -- mostly all I did on the toy computer was to write
  an excellent terminal emulation program, in assembler.  However, the only
  reason this was more convenient was that it was a royal pain in the butt
  to try to use the toy computer for any development.  However, I had to
  copy the ROMs in that machine to the PDP-10 and basically regenerate its
  symbol table in order to make things work correctly.  Luckily, it had an
  emulator, and curiously, the PDP-10 emulated the code about 100 times
  faster than my toy computer executed it.  Were it not for the 100,000
  times difference in the cost of acquisition and ownership of the two
  computers, I would certainly have replaced my Exidy Sorcerer with a
  PDP-10.  Come to think of it, my current home computer is strong enough to
  emulate a PDP-10 about 100 times faster than the real thing, too...

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Matthias Blume
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: Matthias Blume <matth...@shimizu-blume.com>
Date: Wed, 06 Mar 2002 12:35:23 -0500
Local: Wed, Mar 6 2002 12:35 pm
Subject: Re: self-hosting gc

Yes.  But there are far better methods than just signing the output of
the compiler.  In particular, read up on proof-carrying code:  It does
not require a certifying compiler (you can even write the code by hand
as long as you also write the corresponding proof).  Code (and
proof!) can come from anywhere. Finally, the trusted computing base can be
far smaller than a typical compiler.

Matthias


 
Bulent Murtezaoglu
Newsgroups: comp.lang.lisp
From: Bulent Murtezaoglu <b...@acm.org>
Date: Wed, 06 Mar 2002 18:14:01 GMT
Local: Wed, Mar 6 2002 1:14 pm
Subject: Re: self-hosting gc
>>>>> "NN" == Nicolas Neuss <Nicolas.Ne...@iwr.uni-heidelberg.de> writes:

[...]
    NN> I'm sorry for it, but the above is nonsense.  CMUCL allocates
    NN> full words also for booleans (as can be seen from the consed
    NN> bytes).  [Additionally, the original code contained an
    NN> omission (it does not reinitialize the array for each test
    NN> run), which I have augmented with another error...]

I went back and forth with Doug on this.  There are several issues you
need to think about:

-- If you use bit vectors, you pay for some shifting of bits etc.
   CMUCL actually generates somewhat suboptimal code for the X86 platform
   (an extra mask and returning a value that's not used).

-- If you use eight-bit bytes and don't coerce the compiler to use machine
   integers to address the array, you again pay for shifting for fixnum
   untagging.

-- If you use fixnums or machine integers (32 bit), fixnums can address
   them w/o untagging, and you get fast results, BUT this is cheating: it
   won't scale, and it will spill over to L2 cache even with a small array.
   Doug's machine has a 1/2 speed L2 cache (P-II), so your results will vary
   if you have a full speed L2 (e.g. a Celeron will beat a regular P-II).
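
The untagging costs in the three cases above come down to a little index
arithmetic.  The sketch below assumes a fixnum representation with two zero
tag bits in the low end of a 32-bit word; that layout, and all the names, are
assumptions for the example rather than something taken from CMUCL's sources.

```lisp
(defparameter *fixnum-tag-bits* 2)  ; assumed: low two bits of a fixnum are zero

(defun tag-fixnum (n)
  "The machine word that represents the fixnum N under the assumed tagging."
  (ash n *fixnum-tag-bits*))

(defun word-array-byte-offset (tagged-index)
  "32-bit elements are 4 bytes each, so the tagged index already is the
byte offset and no shift is needed."
  tagged-index)

(defun byte-array-byte-offset (tagged-index)
  "8-bit elements: the tag has to be shifted out before indexing."
  (ash tagged-index (- *fixnum-tag-bits*)))

(defun bit-vector-position (tagged-index)
  "Bit elements: untag, then split into a 32-bit word index and a bit number."
  (let ((n (ash tagged-index (- *fixnum-tag-bits*))))
    (values (ash n -5) (logand n 31))))
```

With index 37 the tagged word is 148, which is exactly the byte offset of
element 37 in a 32-bit array; the byte array needs one shift to get back to
37, and the bit vector needs a shift plus a mask to reach word 1, bit 5.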

This is about all I remember.

There was also the additional issue of the loop macro and declarations, if
I remember correctly.

Doug probably has a changelog of all this somewhere.  But disassemble and the
compiler trace facility of CMUCL should be helpful also.

cheers,

BM


 
mdanish
Newsgroups: comp.lang.lisp
From: mdan...@andrew.cmu.edu
Date: Wed, 6 Mar 2002 14:01:32 -0500
Local: Wed, Mar 6 2002 2:01 pm
Subject: Re: self-hosting gc

No need to sign output.  Simply disallow any binaries that were not
created by that machine's compiler.  In order to run source code it
must be passed through, and checked, by the compiler on that machine.

> Or maybe read privileges to it are root-only and root can set the
> signing key for a particular installation -- but then you have a
> problem that nobody can compile on one system and run on another.

Oh well.  FreeBSD gets by (though it's not required to compile, they
still do a lot).

> Far far better to have potentially-dangerous processes running in
> their own memory arenas where the OS can keep an eye on them in
> case they try messing anything up.

Context-switches are expensive, remember.  An OS/compiler that removed
as many layers as possible between program and underlying hardware would
be much faster; if the compiler has a chance to examine every piece
of code that goes in the system then it may be able to do this.

--
; Matthew Danish <mdan...@andrew.cmu.edu>
; OpenPGP public key: C24B6010 on keyring.debian.org
; Signed or encrypted mail welcome.
; "There is no dark side of the moon really; matter of fact, it's all dark."


 
Erik Naggum
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.net>
Date: Wed, 06 Mar 2002 19:20:27 GMT
Local: Wed, Mar 6 2002 2:20 pm
Subject: Re: self-hosting gc
* Tim Bradshaw
| I assume they add support for the new target to gcc, compile gcc on an
| existing system targeted at the new system and then run this new compiler
| on the new system.

  This is probably doable, but in my experience with cross-compilation, you
  do not just generate code, you effectively generate a module that works
  with a much larger system.  To make this _really_ work, you have to have
  intimate knowledge of the target system.  Since the compiler is often the
  first thing you build on a new system in order to build the other tools
  you want to use there, my thinking is that you save a lot of time using a
  pre-existing compiler and like tool, particularly to ensure that you get
  the linking information right for that particular environment, what with
  all the shared library dependencies and whatnot.

///
--
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 13:47:57 -0800
Local: Wed, Mar 6 2002 4:47 pm
Subject: Re: self-hosting gc

Erik Naggum <e...@naggum.net> writes:
>   I actually tried to argue that the same would be true of a Common Lisp
>   system, but that portability constraints dictate that those who want to
>   port a Common Lisp compiler to System X on the Y processor should be able
>   to use the portable assembler (C) instead of having to start off writing
>   non-portable assembler and use the system's assembler to bootstrap from.

You'll always need an assembler, of course; there isn't any way around
that.  And there are advantages for systems like the old KCL and its
descendants which use GCC as the back end for the compiler.

But I'm thinking about a different problem space, not the one you are.

>   Needing *only* GCC, as you say, is predicated on the existence of a
>   binary for your system to begin with.  How do people port GCC to a new
>   platform on which they intend to build the GNU system?  My take on this
>   is that it is no less dependent on some other existing C compiler than
>   the similar problem for CL compilers is.  Duane, please help.  :)

People port GCC to new platforms by having GCC cross-compile code.
No reliance on other compilers is necessary.

MIT Scheme gets ported the same way.


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp, comp.lang.scheme
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 13:45:32 -0800
Local: Wed, Mar 6 2002 4:45 pm
Subject: Re: self-hosting gc

Ray Dillinger <b...@sonic.net> writes:
> NAK!  This implies that nobody can modify the compiler.  If you
> have a compiler that signs its output, then somebody can open up
> the source code and find the signing key.  Then the signing key
> can be used to sign arbitrary output.  That means you cannot
> release the source code for your compiler.  

No, a trusted compiler is simply the only object that has the ability
to create compiled-procedure objects.  No problem at all!  

Well, the problem is still that only the one compiler is the trusted
one.  Two solutions for that problem are to use a subsetted bytecode
thing, like the Java VM, and to use proof-carrying code to validate
compiler output.


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 13:48:48 -0800
Local: Wed, Mar 6 2002 4:48 pm
Subject: Re: self-hosting gc

Erik Naggum <e...@naggum.net> writes:
>   This is probably doable, but in my experience with cross-compilation, you
>   do not just generate code, you effectively generate a module that works
>   with a much larger system.  To make this _really_ work, you have to have
>   intimate knowledge of the target system.  Since the compiler is often the
>   first thing you build on a new system in order to build the other tools
>   you want to use there, my thinking is that you save a lot of time using a
>   pre-existing compiler and like tool, particularly to ensure that you get
>   the linking information right for that particular environment, what with
>   all the shared library dependencies and whatnot.

No, Tim was totally right.  You don't use the pre-existing compiler in
general; often times the manufacturer isn't providing one.

Often you are the first person writing one: this is now rather often
the case with GCC.


 
Thomas Bushnell, BSG
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 13:49:59 -0800
Local: Wed, Mar 6 2002 4:49 pm
Subject: Re: self-hosting gc

Bijan Parsia <bpar...@email.unc.edu> writes:
> But then why the restriction that you "must" have the "full" language
> available?

Because I'm looking for solutions to the hard problem, not ways of
solving a different problem.

 
Thomas Bushnell, BSG  
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 13:49:24 -0800
Local: Wed, Mar 6 2002 4:49 pm
Subject: Re: self-hosting gc

Erik Naggum <e...@naggum.net> writes:
> * Nils Goesche
> | IIRC, they first write a /cross/ compiler for the new system that
> | runs on an old system.  Then they use the cross compiler to compile
> | gcc itself and voila... done.  Hey, sounds easy, doesn't it?  :-))

>   It sounds like _vastly_ more work than building on the native system with
>   a native assembler and linker to build the first executables until you
>   could replace those, too.  

You really have no clue how GCC works if you think it's more trouble.
Really, GCC is totally equipped to do cross-compilation (as are all
the other parts of the toolchain).

 
Duane Rettig  
Newsgroups: comp.lang.lisp
From: Duane Rettig <du...@franz.com>
Date: Wed, 06 Mar 2002 22:00:01 GMT
Local: Wed, Mar 6 2002 5:00 pm
Subject: Re: self-hosting gc

I don't view self-hosting as precluding any bootstrapping which is
necessary to get to that self-hosting state.  In fact, I would be
surprised to hear of _any_ kind of self-hosting which doesn't require
a non-self-hosted bootstrap.  This applies to both cross-compiling
from another architecture and re-compiling on the same architecture
starting with a different compiler.

Thomas Bushnell's challenge is a good one.  And this thread has been
a good one, as well.  Several times I considered answering some of the
statements made on this thread, but have refrained because there are
so many issues and stochastic requirements.  So I thought I'd put
together several ideas and present them at once, from the point of
view of an Allegro CL developer.

As an initial summary, I submit that the entire lisp _could_ be
written entirely in lisp, but that it is not convenient to do so,
given the fact that we run our lisp on Unix and MS systems, which
are all C based, and even embedded systems tend to have libc
equivalent interfaces.  However, I do disagree that it is necessary
to require that the whole language be available for a GC written in
lisp, and will explain that later as well.

First, a background review of Allegro CL's structure, for those
who don't yet know:

 1. Most of Allegro CL is written in Allegro CL, and compiles per
architecture to bits (represented in code vector lisp objects)
using the Allegro CL compile and compile-file functions.

 2. A subsection of the kernel or "runtime" of Allegro CL is written in
an extension of CL I call runtime or "rs" code, which also uses the
Allegro CL compiler, extended and hooked to produce assembler
source as output.

 3. Some small part of Allegro CL is written in C.  On some
architectures, the C++ compiler is used, but it is mostly written
in C style.  The major purpose of the C code is to parse the .h
header files of the system for the os interface.  We try mostly
to limit our C code to os-interface functionality and regularization.

In addition, as a kind of #3a: We also have written our garbage-collector
and our fasl-file reader in C.

The binaries from 2, 3, and 3a are all linked together using the system
linker to either produce a single executable, or to produce a simple
executable main and a shared-library.  In both cases, that link output
serves dual purpose as a bootstrap mechanism to load pure lisp code in
(i.e. from #1) or to re-establish the environment dumped in a previous
lisp session.

The rs code in #2 is sort of a combination of superset/subset of regular
CL code; it understands both C and Lisp calling conventions, but does not
set up a "current function" for its own operation.  Since the produced
code is just assembler source, and does not set up a function object,
local constants are not allowed; only constants that are in the lisp's
global table can be referenced by rs code.  Recently, I added an
exception to this; string constants can now be represented in rs
code - these will become .asciz or equivalent directives in the
assembler source.  This allows such rs functions as

(def-runtime-q print-answer (n)
  (q-c-call printf "The answer is %d
" n))

I have also recently extended the rs code to allow for large
stack-allocated specialized arrays; we've always been able to
allocate stack-based simple-vectors in rs code, but due to the
rs code stack frame descriptors we provided for gc purposes, non-lisp
data had been restricted to a few hundred bytes until now.

Theoretically, due to these and other changes, we should now be
able to rewrite both the fasl reader and the garbage-collector
in rs code, but it hasn't been a high priority.  For the garbage
collector especially, there must be an incentive to make such a
potentially regressive move; it may be that a new gc to handle
concurrent MP might be just that incentive.

For #3, I was almost ready to disagree with Thomas Bushnell
because I believed that it is necessary to use C functionality
to interface to C library functions.  This is especially true
for the need to parse .h files, and to get the correct
definitions and interfaces based on particular #define constants.
If you doubt this, just try to figure out, for example, HP's
sigcontext structure, which has layer upon layer of C macrology
to define a large number of incompatible structure and interface
definitions.

However, I had to back off on any such disagreement, because it
certainly is _possible_ to write any of these interface functions
in lisp, using such facilities as our Cbind tool to pre-parse the
header files and to thus present all pertinent information to
the lisp-in-lisp code.  However, I still am not inclined to do such
a thing, because it would be specialized toward lisp bootstrap, and
thus not useful for anything else.  And why not use C at what it does
best (parse C header files)?  Besides, even our Cbind facility uses
the gcc front-end to do the initial parsing, so in essence a non-lisp
compiler part would still be used.  Bottom line: it is more convenient
to write our os-interface code in C, because it interfaces to C
libraries.  I suppose that we would remove such C interfaces if we
were porting our lisp to a Lisp operating system.

Finally, I'd like to disagree wholeheartedly with the notion that
the full language must be available for the whole lisp implementation.
Specifically, I am responding to this point by Thomas Bushnell:

>> I thought I was bright-shining clear.  What I want is a GC written in
>> the language itself, with all the normal language facilities
>> available.

It is the notion of "availability" that I take issue with.  To
make my point, consider the statement that in every English
sentence, all letters of the alphabet are available.  That is,
of course, a true statement.  And as in the specific example where
"The quick brown fox jumped over the lazy dogs." it is obviously
possible to construct a sentence which _indeed_ does use every
letter in the alphabet.  However, does this require that every
sentence be constructed in such a way?  Of course not!  It is thus
not the whole alphabet which is available to a particular sentence,
but only those letters which in fact work toward constructing the
sentence, which are in fact "available".  Thus, for normal
conversation, the letter "q" is not generally available to me to
use unless I am using a word which has a "q" in it (or unless I'm
specifically talking about the letter "q" itself).

Let's extend this notion to an extensible language like Lisp.
Consider the start of a CL function foo:

(defun foo ()
   ...)

Now, the body of foo can refer to any CL functionality, including
foo itself.  However, it would generally be bad programming (i.e.
a bug) to allow a call to foo within foo which results in an
infinite recursion.  Thus, to some extent, foo is not fully
available to use as one wishes within foo.

Similar truths apply to a garbage-collector.  It might be
perfectly acceptable for a gc function to call cons, but it
had better be prepared to deal with the case where there is
no more room for a new cons cell, which would thus cause a
recursive call to the garbage-collector (presumably an infinite
recursion, since the reason for the initial gc call might have
been for lack of space).
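
To make the hazard concrete, here is a minimal sketch in portable CL.
It is a toy model, not Allegro CL's collector: the "heap" is only a
counter of free cells, and every name in it (toy-cons, toy-gc,
*gc-scratch-cells*, gc-reentered, demo-collect) is hypothetical.  It
assumes the collector needs some cells of its own while it runs, and it
signals an error at the point where a real system would otherwise
recurse into the collector.

```lisp
;; Toy model only.  All names are hypothetical, not Allegro CL internals.
(defvar *free-cells* 0 "Cells the toy allocator can still hand out.")
(defvar *garbage-cells* 0 "Cells a collection would reclaim.")
(defvar *gc-scratch-cells* 1 "Cells the collector conses for itself.")
(defvar *in-gc* nil "True while the toy collector is running.")

(define-condition gc-reentered (error) ()
  (:report "toy collector ran out of cells while collecting"))

(defun toy-cons ()
  "Take one cell from the pool, collecting first if the pool is empty."
  (when (zerop *free-cells*)
    (when *in-gc*
      ;; Calling TOY-GC again here would recurse without end.
      (error 'gc-reentered))
    (toy-gc)
    (when (zerop *free-cells*)
      (error "toy heap exhausted")))
  (decf *free-cells*)
  :cell)

(defun toy-gc ()
  "Reclaim the garbage cells, consing the scratch cells on the way."
  (let ((*in-gc* t))
    (dotimes (i *gc-scratch-cells*)
      (declare (ignorable i))
      (toy-cons))                       ; the collector's own allocation
    ;; The scratch cells are released again along with the garbage.
    (incf *free-cells* (+ *gc-scratch-cells* *garbage-cells*))
    (setf *garbage-cells* 0)))

(defun demo-collect (scratch)
  "Allocate from an empty pool whose collector needs SCRATCH cells."
  (let ((*free-cells* 0)
        (*garbage-cells* 5)
        (*gc-scratch-cells* scratch))
    (handler-case (progn (toy-cons) :ok)
      (gc-reentered () :gc-reentered))))
```

In this model (demo-collect 0) returns :OK, because a collector that
never allocates can always run, while (demo-collect 1) returns
:GC-REENTERED, because the collector was started for lack of space and
then asks for space itself.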

And, as the Oracle in The Matrix says, "What's really going to
bake your noodle ..." is that at least in CL, there is no
definition of what a garbage-collector actually _is_.  There are
a few references, but no definitions or specs...

--
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   du...@Franz.COM (internet)


 
Thomas Bushnell, BSG  
Newsgroups: comp.lang.lisp
From: tb+use...@becket.net (Thomas Bushnell, BSG)
Date: 06 Mar 2002 14:19:30 -0800
Local: Wed, Mar 6 2002 5:19 pm
Subject: Re: self-hosting gc

Duane Rettig <du...@franz.com> writes:
> As an initial summary, I submit that the entire lisp _could_ be
> written entirely in lisp, but that it is not convenient to do so,
> given the fact that we run our lisp on Unix and MS systems, which
> are all C based, and even embedded systems tend to have libc
> equivalent interfaces.  

So it should be pointed out that one of the reasons I'm interested in
this question is that I'm interested in lisp systems running on bare
metal.

> However, I do disagree that it is necessary
> to require that the whole language be available for a GC written in
> lisp, and will explain that later as well.

I agree that it may not be *necessary* depending on what that means.

But note that I began by asking about both Scheme and CL; the point is
that of course I could confine myself to a tiny subset of CL and do
things in PL/I (er, I mean "the loop macro").  

However, the real things I want are fairly simple:  I want complex
closures and I want cons.  I might want call/cc, at least, I'm not
willing to exclude that a priori.

> For #3, I was almost ready to disagree with Thomas Bushnell
> because I believed that it is necessary to use C functionality
> to interface to C library functions.  

If you really need to, you can do that, and it may well be the most
efficient implementation strategy if you want to run on Unix.  (As, of
course, you do.)

A "pure" implementation means that you would do the same work the C
library people do, and make Lisp equivalents for the C header files
yourself.  Remember, *I'm* always thinking of this from a systems
design perspective, so "tell the other group to do the work" isn't
really a solution. :) But if the other group is doing the work anyway
(as is the case for people running a Lisp environment on Unix), then of
course, it's convenient to piggyback on them.

> For #3, I was almost ready to disagree with Thomas Bushnell
> because I believed that it is necessary to use C functionality
> to interface to C library functions.  

The actual interfaces you need are the *kernel* interfaces, not
interfaces to the C library.  From the systems design perspective,
your system would be *replacing* the C library, not borrowing it.  If
you do want to borrow it, then it might be most convenient to use C to
hook into it, though as you correctly note, even then you can get
around it.

> Similar truths apply to a garbage-collector.  It might be
> perfectly acceptable for a gc function to call cons, but it
> had better be prepared to deal with the case where there is
> no more room for a new cons cell, which would thus cause a
> recursive call to the garbage-collector (presumably an infinite
> recursion, since the reason for the initial gc call might have
> been for lack of space).

So the *point* of my question is, in part, just this problem.  Now, if
there *isn't* a solution, then you have to subset the language, omit
cons, and then code your GC.

But why do that if there is a convenient solution?

One strategy:  suppose that to GC an arena of N bytes takes N/10 bytes
of memory to hold dynamically allocated GC data structures.

Then you can just keep that much space reserved always, so it's there.
Or, if one is using stop-and-copy, then it's even easier to find space.

If one is in a multi-threaded world, and each thread gets its own
allocation arena for normal allocation, and you are using a
stop-the-world approach to GC, then you can't reliably assume
(perhaps) that all the threads have left their arena in an ideal
state.  That means that the GC will probably have to allocate out of a
totally separate arena from what other programs use.  When it's done,
a quick GC pass (allocating from the main heap) can be run to clean
the special GC arena, and copy anything remaining there onto the main
heap.  
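
The reserve idea can be sketched with a toy model in portable CL.  This
is only an illustration under stated assumptions, not a real collector:
arenas are plain counters of free cells, the one-tenth figure is the
supposition from the text above, and all names (toy-heap, make-toy-heap,
heap-alloc, heap-discard, heap-collect) are hypothetical.  The final
"quick GC pass" over the special arena is simplified to resetting the
reserve once the collection is done.

```lisp
;; Toy model only.  All names are hypothetical.
(defstruct (toy-heap (:constructor %make-toy-heap))
  (size 0)          ; total cells in the main arena
  (main-free 0)     ; cells free in the main arena
  (reserve-free 0)  ; cells free in the arena set aside for the GC
  (garbage 0)       ; main-arena cells a collection would reclaim
  (in-gc nil))      ; true while the collector is running

(defun make-toy-heap (n)
  "Heap with N main cells plus a GC reserve of ceiling(N/10) cells."
  (%make-toy-heap :size n :main-free n :reserve-free (ceiling n 10)))

(defun heap-discard (heap n)
  "Record that N allocated main-arena cells have become garbage."
  (incf (toy-heap-garbage heap) n))

(defun heap-alloc (heap)
  "Allocate one cell: the collector draws on the reserve, everyone
else on the main arena, collecting first when that arena is empty."
  (cond ((toy-heap-in-gc heap)
         (when (zerop (toy-heap-reserve-free heap))
           (error "GC reserve exhausted"))
         (decf (toy-heap-reserve-free heap))
         :reserve)
        (t
         (when (zerop (toy-heap-main-free heap))
           (heap-collect heap))
         (when (zerop (toy-heap-main-free heap))
           (error "heap exhausted"))
         (decf (toy-heap-main-free heap))
         :main)))

(defun heap-collect (heap)
  "Reclaim garbage, allocating size/10 scratch cells from the reserve."
  (let ((scratch (ceiling (toy-heap-size heap) 10)))
    (setf (toy-heap-in-gc heap) t)
    (unwind-protect
         (progn
           (dotimes (i scratch)
             (declare (ignorable i))
             (heap-alloc heap))         ; served from the reserve
           (incf (toy-heap-main-free heap) (toy-heap-garbage heap))
           (setf (toy-heap-garbage heap) 0))
      ;; Simplification: clean the GC arena by resetting it.
      (setf (toy-heap-reserve-free heap) scratch
            (toy-heap-in-gc heap) nil))))
```

For example, after (make-toy-heap 20), twenty calls to heap-alloc, and
(heap-discard h 5), the next heap-alloc triggers a collection that
conses only from the two reserved cells and then succeeds from the main
arena.  The collector can use the ordinary allocation entry point
without ever re-entering itself.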


 