scheme coding help...

tem...@yahoo.com

Mar 26, 2008, 5:13:42 PM
Just a couple questions about scheme, thanks for all your help in
advance.

Say for example, you have a file with one line of text in it.
How would you go about doing the following:
Take the file name as a command line argument
Retrieve the first and only line of the file
Place the line into a list with each word as separate elements

Just to clarify,
Say the user wants to access a file in the current working directory
with the name "sample.txt"
And inside the file is a line of text that looks just like a list such
as:
(This is a list of elements)
How would you parse that text into a list that looks just like it?

Griff

Mar 26, 2008, 9:35:54 PM
Start by downloading DrScheme:

http://www.plt-scheme.org/software/drscheme/

tem...@yahoo.com

Mar 27, 2008, 1:01:52 AM
On Mar 26, 9:35 pm, Griff <gret...@gmail.com> wrote:
> Start by downloading DrScheme:
>
> http://www.plt-scheme.org/software/drscheme/

thanks, i have mzscheme and i've been playing with it.
i was looking for more of a practical coding way to go through it.

phi5...@yahoo.ca

Mar 27, 2008, 1:36:16 AM


I recommend that you try Bigloo. It is more powerful than other brands
of Scheme, and much easier to use. If you are using Linux or
Macintosh, then installation is straightforward:

./configure
make
make install

If you prefer, open Synaptic, search for the package, press Apply;
that is all. If you are using Windows, things are slightly more
difficult. Happily enough, Junia prepared an installation package,
with a nice GUI, and a very complete manual. The address of her page
is

code.google.com/p/stalingui

Now, let us see your problem. You should solve it by parts; this is
called incremental programming strategy, or IPS, for short. Let us
write a program that reads all lines of a given file, whose name you
will provide via command line. Here is the program:

(module readfile)

(with-input-from-file (cadr (command-line))
  (lambda ()
    (print (read-lines))))

That is all. This program reads all the lines of the file, and puts
them in a list. Now, let us see how to solve the problem, the way you
clarified it. If the first line is the only one you want, and it is in
the form of a list, then nothing could be easier than reading it:

(module readfile)

(with-input-from-file (cadr (command-line))
  (lambda ()
    (print (read))))
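
To see why this works: read parses one complete datum from the input port, so a parenthesized line comes back as a list of symbols. Here is a minimal sketch, assuming your Scheme provides string ports through open-input-string (Bigloo, mzscheme and R7RS systems do). The string port merely stands in for sample.txt, and the names sample-line and parsed are my own:

```scheme
;; Stand-in for the contents of sample.txt (an assumption for illustration).
(define sample-line "(This is a list of elements)")

;; read consumes one datum from the port; here the datum is a whole list.
(define parsed (read (open-input-string sample-line)))

(display (length parsed))   ; number of elements in the list
(newline)
(display (car parsed))      ; first element, which is a symbol
(newline)
```

Note that the elements are symbols, not strings, and that some older implementations fold the case of symbols when reading.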

Now, let us increase the difficulty of the problem just a little bit.
Let us suppose that the words are not between parentheses. Then, the
solution is:

(module readfile)

(with-input-from-file (cadr (command-line))
  (lambda ()
    (print (string-split (read-line) " "))))

It seems that we have solved your problem completely. There is only one
last thing that we can improve: the program has no error handling.
However, this is very easy to fix:

(module readfile)

;; Any error (missing argument, unreadable file) prints the usage message.
(with-handler (lambda (e) (print "Usage: rdfile <filename>"))
  (with-input-from-file (cadr (command-line))
    (lambda ()
      (print (string-split (read-line) " ")))))

That is your program: Small, simple, and safe. It is easy to compile
it in Bigloo:

bigloo -static-bigloo -Obench rdfile.scm -o rdfile

The option -static-bigloo makes sure that the whole runtime is linked
into the executable. In Bigloo, the runtime is very small. If your
program is an assignment, it is a good idea to bundle everything in the
same file; this prevents failures on your TA's machine. Of course, if
you want to run the program on a machine that has Bigloo installed,
then you can drop the -static-bigloo option:

bigloo rdfile.scm -o rdfile

BTW, I also tried to write the same program in DrScheme. The
distribution is three times larger than in the case of static Bigloo,
it takes longer to compile, and the code is slower. In any case, here
is one of the programs in DrScheme:

(module rdfile mzscheme
  (with-input-from-file
      (vector-ref (current-command-line-arguments) 0)
    (lambda ()
      (display (read)))))

Abdulaziz Ghuloum

Mar 27, 2008, 4:48:55 AM
phi5...@yahoo.ca wrote:

> I recommend that you try Bigloo. It is more powerful than other

> brands of Scheme, and much easier to use. ...

Weren't you hyping Stalin as the fastest, most powerful and easiest
to use just a couple of weeks ago? How did Bigloo rise on top of
Stalin in such short time?

Griff

Mar 27, 2008, 8:26:45 AM
Phi will you do my homework for me, too? :)

tem...@yahoo.com

Mar 27, 2008, 9:46:27 AM

phi5...@yahoo.ca

Mar 27, 2008, 10:54:49 PM
On 27 mar, 05:48, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu>
wrote:

Hi,...
If you take a look at the page of my group, you will see that we use
both Stalin and Bigloo; I favor Bigloo, but my fellow students favor
Stalin. I remember that I said that Stalin is the most friendly, but I
do not remember saying that it is the fastest. I really do not know
whether it is the fastest. Is it? However, I decided not to recommend
Stalin in the present case, because

1) Stalin is very slow to compile. It takes 20, 30 seconds to compile
even the simplest program. Therefore, my fellow students and I use it
only for the final compilation; we also limit the use of Stalin to
number crunching problems (Numerical Analysis) because...

2) It has no libraries that I know. Perhaps I am wrong in saying that,
but I never came across Stalin libraries to process strings, connect
to the Internet, tokenize texts, do syntactical analysis, etc.

My group tries every Scheme that we succeed in installing in all
machines that are available to us. Then we choose the best one for a
given problem. We also try to give an interesting solution to all
problems. Our goal is not to get rid of the assignment, but to learn how
to do things well. We want fast programs, small exec files, small
distributions, beautiful gui, etc. For instance, tomorrow we will
post a new GUI for Bigloo and Stalin.

Therefore, do not misunderstand me. I am not a propagandist of
Stalin, or Bigloo. I will be happy if I come across another Scheme
that is faster than Stalin (or Bigloo), easier to use, generates small
distributions, and has good libraries. Of these attributes, my group
cherishes most speed and the size of compiled stand alone code. These
attributes get us better marks in college.

Now, let me tell you what I did before answering to tem0's mail. The
first thing was to download DrScheme, since Griff has recommended it.
Then I tested the following Schemes, to see which one would provide
the best solution: 1) DrScheme, 2) Bigloo, 3) Stalin, 4) Gambit, 5)
Chicken. I also tried Ikarus, Larceny, and Chez Scheme. Since Chez
Scheme is not free, I ruled it out; I tried to install Ikarus on my
Windows platform, under mingw, and I failed; I heard that one can
compile it under cygwin, but since the rules of the group require
native feeling, and small stand alone distribution in all platforms
available to us, I gave up Ikarus. Now, let us see the Schemes that I
succeeded in installing.

4) Gambit. Installation was easy, but I failed in generating a stand
alone executable file. The manual does not help very much, since it
says that compiling the C files generated by the compiler depends on
the platform.

5) Chicken. Small stand alone, but very slow (typically, from 5 to 10
times slower than Bigloo). Not very friendly, i.e., does not help
programmers to catch errors at compile time. Libraries distributed
through eggs; since my college blocks external proxies, I was not able
to download string libraries.

3) Stalin. It does not provide string processing libraries; at least I
am not aware of a repository of Stalin libraries; since you seem to be
offended because I gave up Stalin for this particular problem, tell me
about good libraries for Stalin, and I will be happy in using it for
similar problems. In any case, I solved tem0's problem in Stalin, but
the solution was quite large (20 lines) because Stalin lacks libraries
to process strings. In any case, it generated the smallest stand alone
file.
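
For the curious, splitting a line into words without a string library takes only a handful of standard procedures. This is a minimal sketch in portable Scheme, not the 20-line Stalin program mentioned above; the name split-words is my own, and it assumes that only the space character separates words:

```scheme
;; Split a string into a list of words separated by spaces,
;; using only standard list and string procedures.
(define (split-words str)
  ;; Push the current word (held as a reversed list of characters)
  ;; onto the result list; empty words are skipped.
  (define (flush word words)
    (if (null? word)
        words
        (cons (list->string (reverse word)) words)))
  (let loop ((chars (string->list str))  ; characters still to scan
             (word '())                  ; current word, reversed
             (words '()))                ; finished words, reversed
    (cond ((null? chars)
           (reverse (flush word words)))
          ((char=? (car chars) #\space)
           (loop (cdr chars) '() (flush word words)))
          (else
           (loop (cdr chars) (cons (car chars) word) words)))))

(display (split-words "This is a list of elements"))
(newline)
```

Runs of several spaces produce no empty words, because flush ignores an empty accumulator.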

2) Bigloo. It is from 2 to 3 times slower than Stalin when it comes to
number crunching. For other kind of problems, it is about as fast as
Stalin. It has good string processing libraries, good GUI (based on
GTK, on dotNet, and on JVM), etc. When it comes to string processing,
it has regular expressions, Lalr grammars, etc. Distribution is
somewhat smaller than 1Mb, that is the limit set by the TA of
functional programming for the assignments.

1) DrScheme. It is very unfriendly, and very slow. Let me explain
what I mean by very slow. It takes about 15 seconds to enter the IDE.
It chooses the language that it thinks that you should speak (in my
case, it guessed wrong smile :-). Perhaps because I use a Canadian OS,
it guessed French. I suppose that there is a way to switch to English,
or to Greek, but I did not have time or disposition to find out. I
typed my program, and selected the Scheme/Build option (I am
translating from French). I was not prompted for optimizations. I
suppose that there are options for optimizations, but I could not find
any help explaining how to use them. I tried the option --help from
the command line, to no avail. The result was that the distribution
turned out to be huge (5M), and slow. Stalin distribution for the same
problem is less than 200K, and Bigloo static distribution is 945K. I
tried a few benchmarks pitting mzscheme against Bigloo, and Stalin
(binary tree, recursive programs, etc.) and it turned out to be from 6
to 20 times slower. I insist, I was not able to do any optimization in
mzscheme. The worst part is that I made two mistakes, and did not
receive any warning! In one of the mistakes, I had a wrong close
parentheses, in such a way that I got a (display ...) with many
arguments; only after running the program many times, I got a runtime
error. Stalin and Bigloo refused the buggy program right away.


BTW, I asked here for help in compiling programs with Gambit, but I
believe that I did not receive any answer. I would be happy in getting
help in setting optimization options for mzscheme, and in installing
Ikarus for mingw gcc.

phi5...@yahoo.ca

Mar 27, 2008, 11:08:15 PM
On 27 mar, 09:26, Griff <gret...@gmail.com> wrote:
> Phi will you do my homework for me, too? :)

Hi, Griff.
You are most welcome to our group. However, I advise you that the
group does not do homework for people who do not want to work hard.
What we do is the following:

(1) Select the most interesting assignments, and work together to
provide elegant and complete solutions; for instance, a TA asks for a
pretty print, and gets a fairly complete editor, with pretty print,
parentheses checking, etc. This example is real, and we will post it
tomorrow.

(2) Help beginners to learn Scheme. We have concise tutorials (maximum
number of pages: 100), written in LaTeX. You can get a flavor of
these tutorials on our page:

code.google.com/p/stalingui

If a member of the group is not working hard, he is expelled, i.e., he
has no right to propose assignments to the members of the group. In
any case, you are most welcome. However, I think that you are a good
Scheme programmer. I believe that you will help us more, than get help
from us (smile :-)

Griff

Mar 29, 2008, 5:55:30 PM
On Mar 27, 9:54 pm, phi50...@yahoo.ca wrote:
> 1) DrScheme. It is very unfriendly,

Ack! It is not supposed to be! DrScheme is really set up to make it
easy for folks to use. I would be interested to know, what were the
obstacles for you getting what you need in terms of usability?

> Let me explain what I mean by very slow. It takes about 15 seconds to enter the IDE.

It sounds like your primary goal is speed and compactness of compiled
artifacts. By your criteria, the best folks with whom to speak about
performing such optimizations are on the PLT discussion list:

http://www.plt-scheme.org/maillist/

My take on PLT is that they provide an "all-around best approach" in
terms of interpreter, language, libraries, ide, documentation, and
vision. That might not be the right place for your group! But, it is
right for some folks!

> The result was that the distribution turned out to be huge (5M), and slow.

I haven't solved the problem presented by the original poster, but I
can tell you that on Windows XP Pro with DrScheme v372, a "Hello,
world." application gets compiled down to 240KB. 5M sounds way too
big. Here is the code:

(module hello-world mzscheme
  (display "Hello, world."))

put it in a file called "hello-world.ss"

and compile it with

mzc --exe hello-world hello-world.ss

Aaron Hsu

Mar 29, 2008, 7:22:21 PM
Griff <gre...@gmail.com> writes:

>On Mar 27, 9:54 pm, phi50...@yahoo.ca wrote:
>> 1) DrScheme. It is very unfriendly,

>Ack! It is not supposed to be! DrScheme is really set up to make it
>easy for folks to use. I would be interested to know, what were the
>obstacles for you getting what you need in terms of usability?

Part of the issue this poster is having, I believe, stems from PLT's
handling of type mismatches, which are considered runtime errors in
PLT Scheme.

>> Let me explain what I mean by very slow. It takes about 15 seconds
>> to enter the IDE.

>It sounds like your primary goal is speed and compactness of compiled
>artifacts. By your criteria, the best folks with whom to speak about
>performing such optimizations are on the PLT discussion list:

>http://www.plt-scheme.org/maillist/

>My take on PLT is that they provide an "all-around best approach" in
>terms of interpreter, language, libraries, ide, documentation, and
>vision. That might not be the right place for your group! But, it is
>right for some folks!

PLT's strengths also serve to make for some confusing weaknesses. While
what you can do in PLT Scheme is vast and powerful, and while DrScheme
serves as a great pedagogical tool, sometimes figuring out how to fit
this whole beast into your workflow takes some effort. For example, I
know that I often see questions about why PLT Scheme is so slow. This
usually comes from people who enjoyed using PLT Scheme to learn Scheme
but who now wish to make faster programs. They do not realize that
mzscheme does JIT compilation and benefits from module encapsulation.
This kind of thing isn't always obvious, simply because there is so much
that PLT Scheme does.

>> The result was that the distribution turned out to be huge (5M), and slow.

>I haven't solved the problem presented by the original poster, but I
>can tell you that on Windows XP Pro with DrScheme v372, a "Hello,
>world." application gets compiled down to 240KB. 5M sounds way too
>big. Here is the code:

>(module hello-world mzscheme
> (display "Hello, world."))

>put it in a file called "hello-world.ss"

>and compile it with

>mzc --exe hello-world hello-world.ss

I believe the original poster is using some default settings in
DrScheme, IIRC, and with debugging and all the other pieces of
information out there, along with runtime libraries and the like, things
turn out to be rather big. I think this is just a matter of non-obvious
features, since there is almost always more than one way to do something
in PLT Scheme.
--
Aaron Hsu <arc...@sacrideo.us> | Jabber: arc...@jabber.org
``Government is the great fiction through which everybody endeavors to
live at the expense of everybody else.'' - Frederic Bastiat

George Neuner

Mar 30, 2008, 2:19:58 AM
On Sat, 29 Mar 2008 18:22:21 -0500, Aaron Hsu <arc...@sacrideo.us>
wrote:

Very true. PLT is an excellent learning tool and a good all-around
development platform, but it is quite difficult to figure out how to
create and deliver an optimized program using it. Much of the
relevant documentation is not included in the default distribution and
must be downloaded separately from the PLT site.

George
--
for email reply remove "/" from address

Griff

Mar 30, 2008, 12:35:52 PM
On Mar 30, 1:19 am, George Neuner <gneuner2/@/comcast.net> wrote:

> Much of the relevant documentation is not included in the default distribution and
> must be downloaded separately from the PLT site.

George are you referring to the relevant documentation on
optimization, or on the documentation in general, that needs to be
downloaded separately?

George Neuner

Mar 30, 2008, 9:26:52 PM
On Sun, 30 Mar 2008 09:35:52 -0700 (PDT), Griff <gre...@gmail.com>
wrote:

Sorry ... I should have been more specific.

The distribution does not include help desk files for a number of
advanced subjects: compiler, runtime, FFI, GUI framework, web server,
certain of the libraries, etc. These subjects are (supposed to be)
links to optional doc packages which can be downloaded on demand from the
PLT site when referenced.

Writing optimized programs for PLT requires (minimally) reading the
following manuals:

- Inside PLT MzScheme
- PLT mzc: MzScheme Compiler Manual
- PLT Foreign Interface Manual

I can't speak for v372 (the current version), but as of v370 - the
last one I installed - the links for these and a few other help desk
books did nothing at all ... the doc packages had to be downloaded and
installed manually (after which the links did work locally).

PDF versions of all the manuals are also available on the PLT site and
they remain manual downloads.

If the situation has changed with the latest version then I apologize
for injecting confusion.

Jens Axel Soegaard

Mar 31, 2008, 11:00:40 AM
George Neuner skrev:

> I can't speak for v372 (the current version), but as of v370 - the
> last one I installed - the links for these and a few other help desk
> books did nothing at all ... the doc packages had to be downloaded and
> installed manually (after which the links did work locally).

One man's manuals is another's almost automatically.

There are links to the "missing" manuals in the HelpDesk, and
if they are not present, you are presented with a link which
automatically downloads and installs them.

> PDF versions of all the manuals are also available on the PLT site and
> they remain manual downloads.
>
> If the situation has changed with the latest version then I apologize
> for injecting confusion.

An easy way to make sure you have all manuals is to fetch the
"full" version on the prerelease page.

--
Jens Axel Søgaard

George Neuner

Mar 31, 2008, 9:51:07 PM
On Mon, 31 Mar 2008 17:00:40 +0200, Jens Axel Soegaard
<inv...@soegaard.net> wrote:

>George Neuner skrev:
>
>> I can't speak for v372 (the current version), but as of v370 - the
>> last one I installed - the links for these and a few other help desk
>> books did nothing at all ... the doc packages had to be downloaded and
>> installed manually (after which the links did work locally).
>
>One man's manuals is another's almost automatically.
>
>There are links to the "missing" manuals in the HelpDesk, and
>if they are not present, you are presented with a link which
>automatically downloads and installs them.

I'll say it again.

Some (not all) of the help desk auto-download links in v370 (at least
for Windows) were broken ... they did not work.

_After_ manually downloading and installing the missing doc packages
those same links worked locally.


>An easy way to make sure you have all manuals is to fetch the
>"full" version on the prerelease page.

I don't use pre-release software so I've never seen this link.

Ray Dillinger

Mar 31, 2008, 11:28:29 PM
George Neuner wrote:

> The distribution does not include help desk files for a number of
> advanced subjects: compiler, runtime, FFI, GUI framework, web server,
> certain of the libraries, etc. These subjects are (supposed to be)
> links to optional doc packages which can be downloaded on demand from the
> PLT site when referenced.

Rrr. I hate systems that involve "on-demand download" - I do lots
of my programming work on my laptop, on the move, and it's not at
all uncommon for me to not be connected to a local network. I
start using something, I pull down "help" on a particular topic,
and it says "cannot connect to ...." and fails. I grind my teeth and
say "strike one." Then I try the man page for the application. If
that's also missing (or merely useless), strike two.


Bear

Jens Axel Soegaard

Apr 1, 2008, 3:52:56 AM
George Neuner skrev:

> On Mon, 31 Mar 2008 17:00:40 +0200, Jens Axel Soegaard
> <inv...@soegaard.net> wrote:
>
>> George Neuner skrev:
>>
>>> I can't speak for v372 (the current version), but as of v370 - the
>>> last one I installed - the links for these and a few other help desk
>>> books did nothing at all ... the doc packages had to be downloaded and
>>> installed manually (after which the links did work locally).
>> One man's manuals is another's almost automatically.
>>
>> There are links to the "missing" manuals in the HelpDesk, and
>> if they are not present, you are presented with a link which
>> automatically downloads and installs them.
>
> I'll say it again.
>
> Some (not all) of the help desk auto-download links in v370 (at least
> for Windows) were broken ... they did not work.

I used v370 for Windows and I can't remember any problems.

--
Jens Axel Søgaard

Jens Axel Soegaard

Apr 1, 2008, 3:55:31 AM
Ray Dillinger skrev:

If memory serves me, on the "Manuals" page in the HelpDesk there
was a "Download all" button.

In the SVN version the HelpDesk has been replaced with browser friendly
documentation.

--
Jens Axel Søgaard

andreu...@yahoo.com

Apr 1, 2008, 9:53:46 AM
On Mar 29, 4:55 pm, Griff <gret...@gmail.com> wrote:

> I haven't solved the problem presented by the original poster, but I
> can tell you that on Windows XP Pro with DrScheme v372, a "Hello,
> world." application gets compiled down to 240KB.

Ehm. That's a program twice the length of Shakespeare's "Hamlet"
that prints 13 characters. Not quite as bad as the proverbial
monkey at a typewriter, but nothing to brag about either.

Andre

Pascal J. Bourguignon

Apr 1, 2008, 10:30:11 AM
andreu...@yahoo.com writes:

Ah, but it does more than "Hamlet" with less resources.

To interpret "Hamlet" you need a pack of actors and a scene.
To interpret hello-world, you only need a few grains of sand and some
electrons.

And the script of hello-world, with its 240KB, surely is able to react
to some strange conditions you don't even imagine, like I/O errors, or
different kinds of output devices.


So the question, is whether you prefer thick or thin interpreters
vs. thin or thick scripts.

--
__Pascal Bourguignon__

andreu...@yahoo.com

Apr 1, 2008, 10:59:33 AM
On Apr 1, 9:30 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:

> andreuri2...@yahoo.com writes:
>
> > Ehm. That's a program twice the length of Shakespeare's "Hamlet"
> > that prints 13 characters. Not quite as bad as the proverbial
> > monkey at a typewriter, but nothing to brag about either.
>
> Ah, but it does more than "Hamlet" ...

Let's get a life. The program displays 13 blooming characters.

> And the script of hello-world, with its 240KB, surely is able to react
> to some strange conditions you don't even imagine, like I/O errors, or
> different kinds of output devices.

Really? If I/O is why it is so big, why then does the following
one-character program

1

have to compile to 342,138 bytes?

Andre

George Neuner

Apr 1, 2008, 2:44:07 PM
On Tue, 01 Apr 2008 09:55:31 +0200, Jens Axel Soegaard
<inv...@soegaard.net> wrote:

>Ray Dillinger skrev:
>> George Neuner wrote:
>>
>>> The distribution does not include help desk files for a number of
>>> advanced subjects: compiler, runtime, FFI, GUI framework, web server,
>>> certain of the libraries, etc. These subjects are (supposed to be)
>>> links to optional doc packages which can be downloaded on demand from the
>>> PLT site when referenced.
>>
>> Rrr. I hate systems that involve "on-demand download" - I do lots
>> of my programming work on my laptop, on the move, and it's not at
>> all uncommon for me to not be connected to a local network. I
>> start using something, I pull down "help" on a particular topic,
>> and it says "cannot connect to ...." and fails. I grind my teeth and
>> say "strike one." Then I try the man page for the application. If
>> that's also missing (or merely useless), strike two.
>
>If memory serves me, on the "Manuals" page in the HelpDesk there
>was a "Download all" button.

Looking at it right now, there is no "download all" but there is a
list of what has not yet been installed with the link to download it
... assuming that link works 8-)


For Ray: I also do a lot on laptops. I look at optional stuff and
download whatever I think might be important immediately when I
install the software. Only rarely have I missed something critical
that stopped me until I got a connection.

I know a lot of people don't like to carry around huge installations
on their laptops, but honestly what else are you gonna use those 250GB
disks for? Porn? Better not let airline security examine your drive.
Besides which, it is very hard to use MSDN effectively online.


>In the SVN version the HelpDesk has been replaced with browser friendly
>documentation.

Probably a smart move but not crucial AFAIC. Years of exposure to
MSDN has numbed me to using dedicated help browsers.

Jens Axel Soegaard

Apr 1, 2008, 4:24:01 PM
George Neuner wrote:

>> In the SVN version the HelpDesk has been replaced with browser friendly
>> documentation.
>
> Probably a smart move but not crucial AFAIC. Years of exposure to
> MSDN has numbed me to using dedicated help browsers.

Hopefully the new HelpDesk is easier to navigate than MSDN.
I always get lost in MSDN ...

Here are the new docs:

http://pre.plt-scheme.org/docs/html/

If you haven't seen them before then the Getting Started
series is worth a read.

--
Jens Axel Søgaard

Pascal J. Bourguignon

Apr 2, 2008, 10:35:21 AM
andreu...@yahoo.com writes:

As I said, because you've got a dumb interpreter, in the form of a
microprocessor with whatever OS you put on it. You don't believe
such an interpreter is smarter than, let's say, even Leonardo Di
Caprio, do you?


--
__Pascal Bourguignon__

Anton van Straaten

Apr 2, 2008, 1:43:15 PM
Pascal J. Bourguignon wrote:
> As I said, because you've got a dumb interpreter, in the form of a
> microprocessor with whatever OS you put on it. You don't believe
> such an interpreter is smarter than, let's say, even Leonardo Di
> Caprio, do you?

This Intel-bashing is getting out of hand!

andreu...@yahoo.com

Apr 4, 2008, 8:18:09 AM
On Apr 2, 9:35 am, p...@informatimago.com (Pascal J. Bourguignon)

wrote:
> andreuri2...@yahoo.com writes:
> >
> > Really? If I/O is why it is so big, why then does the following
> > one-character program
>
> > 1
>
> > have to compile to 342,138 bytes?
>
> As I said, because you've got a dumb interpreter, in the form of a
> microprocessor with whatever OS you put on it.

The reason cannot lie with the microprocessor, though.
A naive, unoptimized x86 assembly language version of the above
program should occupy 4, maybe 8 bytes. A very elementary
optimization will reduce it to 0 bytes. No serious compiler
should take 1/3 of a Megabyte for this.

Andre

Max Hailperin

Apr 4, 2008, 10:30:18 AM
andreu...@yahoo.com writes:
...

> A naive, unoptimized x86 assembly language version of the above
> program should occupy 4, maybe 8 bytes. A very elementary
> optimization will reduce it to 0 bytes. No serious compiler
> should take 1/3 of a Megabyte for this.

The designer of any compiler for any language -- even a serious one --
will make a conscious decision to not include optimizations that are
irrelevant for the kinds of real programs the designer expects real
programmers to write and compile using the compiler. This is not just
a matter of laziness on the compiler designer's part. Instead, it is
also a matter of reducing how many bugs are in the compiler -- and
compiler bugs are terrible things, which compiler designers go to
great lengths to try to avoid. One of the ways to reduce bugs is by
reducing code. The smaller and simpler the compiler, the fewer bugs
it will have. That virtue can legitimately be traded away in order
to include useful features in the compiler, such as including
optimizations that have significant impact on real programs. But
there is no question that simplicity (and hence bug freeness) should
not be traded away for an optimization that would be totally useless
on the sorts of real programs for which the compiler is intended.

As a concrete example of this, consider an optimization to only
include the garbage collector in the executable for those programs
that the compiler can't prove to do only a limited amount of memory
allocation. This optimization would be necessary to get your trivial
test program down to anything like 0, 4, or 8 bytes. A simple version
of this optimization, sufficient to handle your test program, wouldn't
even be that hard -- though there is a definite slippery slope, in
that compile-time proofs of limited memory allocation can get
arbitrarily hairy. But why would a rational compiler designer take
the risk of including such code in the compiler? (To say nothing of
expending the resources to include it -- of which time spent on
testing is probably the greatest.) How many real programs that people
actually care about do you think do such limited memory allocation --
and so transparently do so as to allow an easy compile-time proof of
this property?

Now we can quibble over whether 1/3 of a megabyte is in fact too
much. After all, there were garbage collectors considerably smaller
than that in the old days. But my general point is valid. No
compiler can get down to your 8 byte range without including
optimizations -- like garbage collector elimination -- that are
irrelevant for the programs anyone using these compilers cares about.
And, not only is there no reason for a compiler designer to include
such a useless-in-practice optimization, there is a positive reason to
leave it out.

andreu...@yahoo.com

Apr 4, 2008, 3:01:16 PM
On Apr 4, 10:30 am, Max Hailperin <m...@gustavus.edu> wrote:

> The designer of any compiler for any language -- even a serious one --
> will make a conscious decision to not include optimizations that are
> irrelevant for the kinds of real programs the designer expects real
> programmers to write and compile using the compiler.

Yet any serious compiler should, at a minimum, eliminate
dead code and data. This is not rocket science, and in
the example it would indeed discard the garbage collection
code.
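
As a rough illustration of the idea (not how any particular compiler implements it), dead-code elimination can be viewed as a reachability walk over the call graph: anything that cannot be reached from the entry point is dropped. The call graph and every procedure name in it are hypothetical:

```scheme
;; Hypothetical call graph: each entry is (procedure callee ...).
(define call-graph
  '((main display)
    (display write-char)
    (write-char)
    (gc-collect gc-mark gc-sweep)
    (gc-mark)
    (gc-sweep)))

;; Return the procedures reachable from root, in order of discovery.
(define (reachable root graph)
  (let loop ((pending (list root))  ; procedures still to visit
             (seen '()))            ; procedures already visited, reversed
    (cond ((null? pending) (reverse seen))
          ((memq (car pending) seen) (loop (cdr pending) seen))
          (else
           (let ((entry (assq (car pending) graph)))
             (loop (append (if entry (cdr entry) '()) (cdr pending))
                   (cons (car pending) seen)))))))

;; Only the reachable procedures need to be kept in the executable.
(display (reachable 'main call-graph))
(newline)
```

In this toy graph nothing reachable from main ever calls gc-collect, so the collector and its helpers would be discarded.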

Andre

Abdulaziz Ghuloum

Apr 4, 2008, 3:31:11 PM
andreu...@yahoo.com wrote:

> Yet any serious compiler should, at a minimum, eliminate
> dead code and data. This is not rocket science, and in
> the example it would indeed discard the garbage collection
> code.

By this definition, there are no serious compilers that I can
think of. Go ahead and create one if you feel so compelled.

Aziz,,,

Anton van Straaten

Apr 4, 2008, 4:02:22 PM

Two points here:

First, "serious compiler" has to be defined. Serious about what? As
Aziz points out, there aren't many serious compilers by your metric, and
it's not clear what the purpose of the metric is, e.g. what sort of
seriousness is it measuring, other than "smallest executable size"?
What is the purpose of this seriousness?

The second point is that the compiler being used as an example is one
which might not even be claimed to be a "serious compiler" by its
creators, at least for the kinds of purposes you appear to have in mind.

Anton

Jeffrey Mark Siskind

Apr 4, 2008, 4:11:58 PM
On Apr 4, 3:31 pm, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu>
wrote:

Stalin eliminates dead code and data, including the code for the four
or five different kinds of storage allocation/reclamation mechanisms
when each of them is unused. It has done so for more than a dozen
years. And when used properly, it generates much faster code and much
smaller executables than any Scheme compiler that came before it
or after it.

Abdulaziz Ghuloum

Apr 4, 2008, 4:48:55 PM
Jeffrey Mark Siskind wrote:

> Stalin eliminates dead code and data, including the code for the four
> or five different kinds of storage allocation/reclamation mechanisms
> when each of them is unused. It has done so for more than a dozen
> years. And when used properly, it generates much faster code and much
> smaller executables than any Scheme compiler that came before it
> or after it.

Sure. But that doesn't make Stalin a serious compiler by Andre's
metric (i.e., compiling the program 1 to an 8-, 4-, or 0-byte
executable).

Jeffrey Mark Siskind

Apr 4, 2008, 5:42:52 PM
On Apr 4, 4:48 pm, Abdulaziz Ghuloum <aghul...@cee.ess.indiana.edu>
wrote:

I think Andre was making a conceptual definition, about dead code and
data elimination and gc elimination, not a hard definition about number
of bytes. By that definition, Stalin is a serious compiler and the only
serious Scheme compiler.

Stalin even comes damn close to the number-of-bytes definition. The only
dead code/data that Stalin doesn't eliminate is the tables that implement
predicates like char-whitespace?. An earlier version of Stalin didn't
have such tables anyway, and used a different implementation of the
corresponding predicates. Ignoring those tables (which are trivial to
automatically eliminate), Stalin generates the following code for
Andre's example:

int f0(void)
{int t17;
t17 = 20;
return 1;}
int main(void)
{return f0();}

The only thing that appears to violate Andre's principle is t17. That
comes from the buffer used in the standard library to support converting
integers to strings in the implementation of display. The buffer is
declared to be 20 characters long. Stalin eliminates all of the code
implementing display, and eliminates the buffer. It just doesn't
eliminate the constant 20 for the size of the buffer. But gcc will do
that. So that doesn't count as violating Andre's principle. The rest
(i.e. returning 1 as the return code) is dictated by Stalin's semantics.
With appropriate options, gcc will inline f0 into main and yield the
minimal code possible.
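
To make the t17 point concrete, here is a minimal sketch in C. It is
not Stalin output: the names f0_as_emitted, f0_reduced and
check_equivalence are made up for illustration, and the claim about
what gcc produces is an assumption about typical optimizer behavior,
not something measured here. The store to t17 is never read, so
removing it cannot change what the function returns.

```c
#include <assert.h>

/* Same shape as the generated code quoted above: t17 is assigned
   but its value never influences the result. */
int f0_as_emitted(void)
{
    int t17;
    t17 = 20;    /* dead store: leftover size of display's buffer */
    (void)t17;   /* only silences an unused-variable warning */
    return 1;
}

/* What an optimizing C compiler is expected to reduce it to
   (assumption, not verified compiler output). */
int f0_reduced(void)
{
    return 1;
}

/* Hypothetical helper: the two forms must be observably identical. */
void check_equivalence(void)
{
    assert(f0_as_emitted() == f0_reduced());
}
```

One way to check this on a given machine is to compile both functions
with optimization enabled and compare the assembly listings; whether
they come out identical depends on the compiler and the options used.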

Brad Lucier

Apr 4, 2008, 9:07:43 PM
Here's how I built Gambit on my machine (unnecessary results removed);
the compiler options come directly from FLAGS_OBJ in lib/makefile:

frying-pan:~/software/gambc-v4_2_5> ./configure --enable-single-host -enable-shared
frying-pan:~/software/gambc-v4_2_5> make mostlyclean
frying-pan:~/software/gambc-v4_2_5> time make -j 2
frying-pan:~/software/gambc-v4_2_5> sudo make install
frying-pan:~/software/gambc-v4_2_5> grep FLAG lib/makefile
FLAGS_OBJ = -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp
FLAGS_DYN = -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp -rdynamic -shared
FLAGS_LIB = -rdynamic -shared
FLAGS_EXE = -rdynamic
$(C_COMPILER) $(INCLUDES) $(FLAGS_OBJ) $(DEFS) -D___PRIMAL -D___LIBRARY -D___GAMBCDIR="\"$(prefix)$(PACKAGE_SUBDIR)\"" -D___SYS_TYPE_CPU="\"x86_64\"" -D___SYS_TYPE_VENDOR="\"unknown\"" -D___SYS_TYPE_OS="\"linux-gnu\"" -c $(srcdirpfx)$*.c
$(C_COMPILER) $(FLAGS_LIB) -o $(LIBRARY) $(LIBRARY_OBJECTS) $(MAKE_LIBRARY_LIBS) $(LIBS)
frying-pan:~/software/gambc-v4_2_5> cat hello-world.scm
(display "Hello world!\n")
frying-pan:~/software/gambc-v4_2_5> gsc -link hello-world.scm
frying-pan:~/software/gambc-v4_2_5> gcc -Wall -W -Wno-unused -O1 -fno-math-errno -fschedule-insns2 -fno-trapping-math -fno-strict-aliasing -fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp -I/usr/local/Gambit-C/current/include -L/usr/local/Gambit-C/current/lib -o hello-world hello-world*.c -lgambc -lm -ldl
frying-pan:~/software/gambc-v4_2_5> strip hello-world
frying-pan:~/software/gambc-v4_2_5> ll hello-world
-rwxr-xr-x 1 lucier lucier 8024 2008-04-04 21:04 hello-world*
frying-pan:~/software/gambc-v4_2_5> printenv LD_LIBRARY_PATH
/usr/local/Gambit-C/current/lib:
frying-pan:~/software/gambc-v4_2_5> ./hello-world
Hello world!

So, it's not a dozen bytes, but it's not 1/3 of a meg either.

Brad
