Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

A lisp challenge.

29 views
Skip to first unread message

Thaddeus L Olczyk

unread,
Mar 18, 2002, 8:07:00 AM3/18/02
to
After thinking about the "philosophy" question it occured to me
that here is an ideal problem which will illustrate the Lisp way of
doing things.

You are input a list made up exclusively of
structs ( or classes, your choice ) of type foo.
foo contains at least five members, strings of length 20.
Because of this, copying of foos should be kept to a minimum.

The list will contain at least 10,000 elements and at most 1,000,000
elements.

One thing that will be required is that you maintain lists of elements
sorted with different members as key ( you would prefer to maintain
references of the elements rather than copies for two reasons:
1) You want to be able to insert modifications to elements and have it
reflect in all the lists.
2) You want to minimize used memory and so not copy anything large.
3) You want to make it fast, so try not to copy anthing large.


The operations allowed are:
1) Adding a list ( call it a key list ), this list is sorted by taking
a specified member.
2) Remove a key list.
3) Find an element or elements by a specified set of members.
3) Add an element ( and update all key lists ).
4) Delete a key element ( and update all key lists ).
5) Modify a specific element.

The program needs to be a fast as possible, and it needs to use as
little memory as is possible.

Kent M Pitman

unread,
Mar 18, 2002, 12:54:10 PM3/18/02
to

Maybe someone will in fact write this for you. I happen not to be that
person.

I'd bet you'll do better to write the program and offer it for critique.

Thaddeus L Olczyk

unread,
Mar 18, 2002, 1:46:39 PM3/18/02
to
On Mon, 18 Mar 2002 17:54:10 GMT, Kent M Pitman <pit...@world.std.com>
wrote:

If it was C++ I could write it in about two hours, but I couldn't
write it in Lisp both efficiently and stingy on memory. That's the
point. This problem has all the elements that styme me. So instead of
worrying about being productive, I keep worrying about how to be more
efficient.

Daniel Pittman

unread,
Mar 18, 2002, 9:27:24 PM3/18/02
to
On Mon, 18 Mar 2002, Thaddeus L. Olczyk wrote:
> After thinking about the "philosophy" question it occured to me
> that here is an ideal problem which will illustrate the Lisp way of
> doing things.

It doesn't illustrate anything about the "Lisp" way of doing things.
It is a reasonable test to see if someone has a very basic understanding
of data structuring, though.

> You are input a list made up exclusively of
> structs ( or classes, your choice ) of type foo.
> foo contains at least five members, strings of length 20.
> Because of this, copying of foos should be kept to a minimum.

This is the point at which I would object to the test, because the data
structure is broken by design. :)

Each foo should contain five or more keys, each of which is a reference
to the relevant string in an associative container (probably a hash)
somewhere.

Each foo should be identified by a key, so that each foo is stored in an
associative container.

> The list will contain at least 10,000 elements and at most 1,000,000
> elements.
>
> One thing that will be required is that you maintain lists of elements
> sorted with different members as key ( you would prefer to maintain
> references of the elements rather than copies for two reasons:

...so, you are manipulating lists of keys, each of which can be turned
into a foo...

> 1) You want to be able to insert modifications to elements and have it
> reflect in all the lists.
> 2) You want to minimize used memory and so not copy anything large.
> 3) You want to make it fast, so try not to copy anthing large.

...and these are all easy to achieve. It's a simple matter of
programming. Just don't forget to write the custom find/insert code for
your lists. A balanced tree would probably be reasonable, so long as the
rebalancing time wasn't too high for you.

[...]

> The program needs to be a fast as possible, and it needs to use as
> little memory as is possible.

Build from the sane data structure I describe and, no matter what the
language, you are going to manage this. Use arrays, not lists, as well,
since they can be constructed with zero-byte overhead per element. :)

Daniel

--
Remember that under the skin you fondle lie the bones,
waiting to reveal themselves.
-- Ikkyu

Scott McKay

unread,
Mar 18, 2002, 10:30:28 PM3/18/02
to

"Thaddeus L Olczyk" <olc...@interaccess.com> wrote in message
news:3c97357a....@nntp.interaccess.com...

> >> The program needs to be a fast as possible, and it needs to use as
> >> little memory as is possible.
> >
> >Maybe someone will in fact write this for you. I happen not to be that
> >person.
> >
> >I'd bet you'll do better to write the program and offer it for critique.
> If it was C++ I could write it in about two hours, but I couldn't
> write it in Lisp both efficiently and stingy on memory. That's the
> point. This problem has all the elements that styme me. So instead of
> worrying about being productive, I keep worrying about how to be more
> efficient.

You would be better off worrying about how to be more productive,
so you can spend your time working on efficient algorithms. If I
understand you're not particularly well specified problem, it hardly
seems a challenge, since it can be trivially implemented in practically
any language efficiently.

Michael Parker

unread,
Mar 18, 2002, 10:45:27 PM3/18/02
to
Thaddeus L Olczyk wrote:
> >I'd bet you'll do better to write the program and offer it for critique.
> If it was C++ I could write it in about two hours, but I couldn't
> write it in Lisp both efficiently and stingy on memory. That's the
> point. This problem has all the elements that styme me. So instead of
> worrying about being productive, I keep worrying about how to be more
> efficient.

First, don't worry about being efficient, unless it becomes necessary.

In this case, efficiency was emphasized pretty heavily in the problem
statement, so "worrying about how to be more efficient" seems to be
exactly what is called for :-)

Keep in mind that C++ has been paying my bills for over a decade now.
That said...

> After thinking about the "philosophy" question it occured to me
> that here is an ideal problem which will illustrate the Lisp way of
> doing things.

Well, the problem statement is much more in the spirit of C or C++
than Lisp, which is really going to color the implementation.



> You are input a list made up exclusively of
> structs ( or classes, your choice ) of type foo.

Some implementations do better (memory wise) with one or the other.
There doesn't seem to be any consistency between implementations here...

> foo contains at least five members, strings of length 20.
> Because of this, copying of foos should be kept to a minimum.

Since the strings will typically be stored by reference, copying foos
isn't a big deal... But I'd still avoid it if I really want to be
efficient. But short-lived allocations aren't as expensive in most
Lisps as they are in C.



> The list will contain at least 10,000 elements and at most 1,000,000
> elements.

m'kay. Personally I'd keep the "master list" in a vector. Usually
better space efficiency than a list, and better locality.


> One thing that will be required is that you maintain lists of elements
> sorted with different members as key ( you would prefer to maintain
> references of the elements rather than copies for two reasons:

Again, since references are the default in Lisp, this will tend to just
happen...

> 1) You want to be able to insert modifications to elements and have it
> reflect in all the lists.
> 2) You want to minimize used memory and so not copy anything large.
> 3) You want to make it fast, so try not to copy anthing large.

Having a garbage collector *really* reduces the need to copy things,
but you can't program in a functional style. Imperative style is
perfectly compatible with the CL way. I tend to mix and match the
two pretty freely, with some CLOS sprinkled in for spice.


> The operations allowed are:
> 1) Adding a list ( call it a key list ), this list is sorted by taking
> a specified member.
> 2) Remove a key list.
> 3) Find an element or elements by a specified set of members.
> 3) Add an element ( and update all key lists ).
> 4) Delete a key element ( and update all key lists ).
> 5) Modify a specific element.
>
> The program needs to be a fast as possible, and it needs to use as
> little memory as is possible.

Well, you don't mention which of these is to be preferred should
they come into conflict. If I'm gonna be dealing with wads of data,
I generally prefer to keep it in an array (whether I'm writing
Lisp or C). Modern processors are much better at dealing with
arrays than linked structures, plus linked structures tend to have
poor predictability which causes pipeline stalls, and poor locality
which increases cache miss rate and page-fault rate. The upshot
of these considerations is that an efficient lisp implementation
would likely look a lot like an efficient C implementation.

Do the "lists" really need to be lists? Can they be vectors? If
so, then vectors (a) use less space than lists (b) sort quicker
(c) can be binary-searched more easily (d) are more pipeline/cache/
paging-friendly. Insertion is the same big-O as linked lists, but
those constant factors for a list (pipeline stalls, increased cache
misses, page faults) can eat your lunch. You can leave holes in the
vector when you delete, which makes deletion O(1) per list, and you
can take advantage of these to help reduce insertion overhead to
O(n)/#holes which is the same big-O time, but will improve locality
and reduce your paging rate -- again, those constant factors can
easily swamp the theoretical considerations. You can keep track
of the #holes and compact when things get too sparse.

Can the "lists" be b+trees? More memory overhead, more complicated
implementation, but improved insertion/deletion time over a vector
implementation. How much improved depends. For C++ you could maybe
use the red-black tree implementation, but that imposes a fairly
high per-element memory overhead.

If they've gotta be lists, then both C++ (STL) and CL have good list
implementations. And CL offers destructive versions of most of its
list operations. You can use resources to reduce consing further,
which *really* makes the Lisp code look like C :-)

As for the list-of-lists, I'd just make that a list. It's not likely
to be enough to worry about.

Erik Naggum

unread,
Mar 19, 2002, 1:47:18 AM3/19/02
to
* Thaddeus L Olczyk

| If it was C++ I could write it in about two hours, but I couldn't write
| it in Lisp both efficiently and stingy on memory. That's the point. This
| problem has all the elements that styme me. So instead of worrying about
| being productive, I keep worrying about how to be more efficient.

How far along your learning path are you? Please remember that
efficiency is encountered very early on the C++ learning path, and very
late on the Common Lisp learning path. If you think "by now, I should
have learned how to be efficient", you are effectively still walking on
the C++ learning path. How come that path is the correct one? Would you
have worried so much about efficiency if you had not known C++ before
Common Lisp?

///
--
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.

Damond Walker

unread,
Mar 19, 2002, 9:52:13 AM3/19/02
to
olc...@interaccess.com (Thaddeus L Olczyk) wrote in message news:<3c97357a....@nntp.interaccess.com>...

[..snip..]

> If it was C++ I could write it in about two hours, but I couldn't
> write it in Lisp both efficiently and stingy on memory. That's the
> point. This problem has all the elements that styme me. So instead of
> worrying about being productive, I keep worrying about how to be more
> efficient.


I don't know if it's out of scope for this forum but maybe you could
code one in C++ and we, as a group, could run through the exercise of
converting portions of it to Lisp (or even the whole thing).

I don't know if it would be of interest to anyone else but I'd like to
see a Lisp version take shape with all the grey matter we have in this
group.

It could easily turn into a flame war though -- I can see it already.

Damond

Jon Allen Boone

unread,
Mar 19, 2002, 2:47:42 PM3/19/02
to
dwa...@syncreticsoft.com (Damond Walker) writes:

> olc...@interaccess.com (Thaddeus L Olczyk) wrote in message news:<3c97357a....@nntp.interaccess.com>...
>
> [..snip..]
>
>> If it was C++ I could write it in about two hours, but I couldn't
>> write it in Lisp both efficiently and stingy on memory. That's the
>> point. This problem has all the elements that styme me. So instead of
>> worrying about being productive, I keep worrying about how to be more
>> efficient.
>
> I don't know if it's out of scope for this forum but maybe you could
> code one in C++ and we, as a group, could run through the exercise of
> converting portions of it to Lisp (or even the whole thing).

Perhaps this is something that we newbies could undertake...?

> I don't know if it would be of interest to anyone else but I'd like to
> see a Lisp version take shape with all the grey matter we have in this
> group.

I know that I could benefit from getting feedback from the gurus
here on coding issues...

> It could easily turn into a flame war though -- I can see it already.

Even /more/ reason to do it... :-)

-jon
--
------------------
Jon Allen Boone
ipmo...@delamancha.org

Kenny Tilton

unread,
Mar 19, 2002, 4:12:21 PM3/19/02
to

Jon Allen Boone wrote:
>
> Perhaps this is something that we newbies could undertake...?

Great idea. And if the OP does not come through, maybe the newbies could
first do it in their preferred language (if they are not total
programming newbies) and share those.

Keep track of the time required as well.

I'm no newbie, but I learn a lot when the real heavyweights start
slinging code.

> > It could easily turn into a flame war though -- I can see it already.
>

Oh, as long you use IF* I do not think there will be any trouble.

--

kenny tilton
clinisys, inc
---------------------------------------------------------------
"Well, I've wrestled with reality for thirty-five years, Doctor, and
I'm happy to state I finally won out over it." Elwood P. Dowd

Bruce Miller

unread,
Mar 19, 2002, 11:53:10 PM3/19/02
to
On Mon, 18 Mar 2002 08:07:00 -0500, Thaddeus L Olczyk wrote:

> After thinking about the "philosophy" question it occured to me that
> here is an ideal problem which will illustrate the Lisp way of doing
> things.

[...]
> The operations allowed are:
[...]
Nope, you've already violated "the Lisp way of doing things".

larry

unread,
Mar 20, 2002, 1:56:26 PM3/20/02
to
> On Mon, 18 Mar 2002, Thaddeus L. Olczyk wrote:
> > After thinking about the "philosophy" question it occured to me
> > that here is an ideal problem which will illustrate the Lisp way of
> > doing things.

Trying to solve this problem won't illustrate the lisp way of doing
things.
It's a well defined (or could be well defined problem). Isn't the
advantage of lisp is that it's well suited to solving problems which
are not well specified, and whose specification in fact would be a
computer program.So maybe a better problem to illustrate the strengths
of lisp over C++ would be to write a program that understands written
commands in some limited domain of discourse. Here the problem at the
outset is not how to do things efficiently but just how to do it at
all. Or write a program that simulates human behavior like the sim
people game.

Damond Walker

unread,
Mar 20, 2002, 10:25:51 PM3/20/02
to
larry...@hotmail.com (larry) wrote in message news:<7b8f89d6.02032...@posting.google.com>...

> It's a well defined (or could be well defined problem). Isn't the
> advantage of lisp is that it's well suited to solving problems which
> are not well specified, and whose specification in fact would be a
> computer program.So maybe a better problem to illustrate the strengths
> of lisp over C++ would be to write a program that understands written
> commands in some limited domain of discourse. Here the problem at the
> outset is not how to do things efficiently but just how to do it at
> all. Or write a program that simulates human behavior like the sim
> people game.

In my short career (12 years) I've seen many problems/applications
which were "not well specified."

Are you talking about problems which are *hard* to specify or are we
talking about systems with just a *poor* definition?

Damond

Erik Naggum

unread,
Mar 21, 2002, 5:00:37 AM3/21/02
to
* Damond Walker

| In my short career (12 years) I've seen many problems/applications
| which were "not well specified."
|
| Are you talking about problems which are *hard* to specify or are we
| talking about systems with just a *poor* definition?

In my experience, the most interesting problems are very different before
and after they have a solution. If you implement the problem as it was
understood before it had a real solution, you have one more problem. If
you try to understand the problem as you keep trying to solve it, you may
end up with a lot fewer problems than you solved. I find it extremely
hard to do this in languages that require (1) an enormous amount of
typing to get anything useful done, and (2) an enormous amount of
thinking to get anything useful typed.

Now, since problems to which we already have one correct solution are
hardly _problems_ any longer, I consider only the problems that have no
solution to be worth solving, and quite frequently, the solution to such
a problem is to discover that it really is a different problem. This is
what understand "not well specified" to mean in a Common Lisp context.
In a C++ context, such a problem would just be dumb to give to a coder.

Damond Walker

unread,
Mar 21, 2002, 11:49:51 AM3/21/02
to
Erik Naggum <er...@naggum.net> wrote in message news:<32256936...@naggum.net>...

[..snip..]

>
> Now, since problems to which we already have one correct solution are
> hardly _problems_ any longer, I consider only the problems that have no
> solution to be worth solving, and quite frequently, the solution to such
> a problem is to discover that it really is a different problem. This is
> what understand "not well specified" to mean in a Common Lisp context.
> In a C++ context, such a problem would just be dumb to give to a coder.
>

Agree with the C++ reference. I'd figure that 60% to 90% of a C++
coders experience with a problem "worth solving" would be spent
slinging infrastructure. Odds are if the problem definition changes
then much of that infrastructure, if tied to closely to the original
problem, would be thrown out.

As an aside: A correct solution is not always the optimal one. Would
you consider sub-optimal solutions to still be in the class of
problems "worth solving?" I'd say yes though I'd qualify it with a
"it depends."

Damond

Erik Naggum

unread,
Mar 21, 2002, 5:13:00 PM3/21/02
to
* dwa...@syncreticsoft.com (Damond Walker)

| Agree with the C++ reference. I'd figure that 60% to 90% of a C++
| coders experience with a problem "worth solving" would be spent
| slinging infrastructure. Odds are if the problem definition changes
| then much of that infrastructure, if tied to closely to the original
| problem, would be thrown out.

Precisely, and the personal attachment that mere mortals feel towards
what they have written is such a strong barrier to throwing things out
that it is even a stronger motivator to keep going than all the waste of
time that went into what is thrown away. However, anyone who has ever
successfully been published (even in academia), will know that in order
to get an idea across, you need to be able to phrase it as least half a
dozen different ways and be able to expand on each of them. Successful
writing is much harder than chess or go in terms of thinking ahead and
setting up for future moves. Writing good software is even harder, and
that is why setting up the proper abstractions is so amazingly hard and
why both library and language design requires so much forethought and
experience. The problem with C++ is basically that it was not thrown out
in time, that Bjarne accepted almost every suggestion he received.

The more experience you have writing, the less attached you become to
what you have written. This does not appear to be true for programmers,
maybe because the function the copy-editor and editor has in a publishing
company is performed by the compiler, so everything that passes the
compiler is effectively "published".

| As an aside: A correct solution is not always the optimal one. Would you
| consider sub-optimal solutions to still be in the class of problems
| "worth solving?" I'd say yes though I'd qualify it with a "it depends."

Whether a problem has been solved sufficiently or not is usually an
engineering decision, not an academic exercise in perfection, but if
there is a satisfaactory general solution, the engineering decisions
change dramatically in nature.

William Newman

unread,
Mar 21, 2002, 7:40:40 PM3/21/02
to
olc...@interaccess.com (Thaddeus L Olczyk) wrote in message news:<3c97357a....@nntp.interaccess.com>...

It might help, then, if you'd show us the C++ version.
* It'd inspire the "anything C++ can do Lisp can do better" crowd.:-|
* It'd help supplement your specification. E.g. by reading it we
could guess
** Is it acceptable in this application to write your own
specialized memory manager which ties up at all times as much
memory as it needs for peak usage in this task?
** What is to be done with duplicate elements? In particular, if
they're not to be discarded, does your "avoid copies" mean that
the system should optimize performance in this case by keeping
a count of duplicates, instead of multiple copies?
* It'd also give a better idea of what level of microoptimization
cruft you want, thus giving people a better idea of whether it's
their kind of challenge.
* It'd soothe possible anxieties that you're going to do the debating
fallacy/trick/whatever of criticizing a proposal for not being
perfect, while ignoring any shortcomings of your explicit or implicit
alternative.
* It'd soothe possible anxieties that you might expect people to
work harder on what is, after all, your problem than you're willing
to work yourself.

Damond Walker

unread,
Mar 22, 2002, 10:32:47 AM3/22/02
to
Erik Naggum <er...@naggum.net> wrote in message news:<32257375...@naggum.net>...

>
> Precisely, and the personal attachment that mere mortals feel towards
> what they have written is such a strong barrier to throwing things out
> that it is even a stronger motivator to keep going than all the waste of
> time that went into what is thrown away. However, anyone who has ever
> successfully been published (even in academia), will know that in order
> to get an idea across, you need to be able to phrase it as least half a
> dozen different ways and be able to expand on each of them. Successful
> writing is much harder than chess or go in terms of thinking ahead and
> setting up for future moves. Writing good software is even harder, and
> that is why setting up the proper abstractions is so amazingly hard and
> why both library and language design requires so much forethought and
> experience. The problem with C++ is basically that it was not thrown out
> in time, that Bjarne accepted almost every suggestion he received.
>

The "One Right Way", as defined by one person -- and pushed by
industry, becomes the anchor that holds successive programmers back.

Coding doesn't have to be painful. C++ just makes it that way. I
keep searching for someone who has a Really Good Time(tm) writing C++
code.

Ignoring various implementation issues I havn't talked to one person
who really *enjoys* the C++ coding experience.

> The more experience you have writing, the less attached you become to
> what you have written. This does not appear to be true for programmers,
> maybe because the function the copy-editor and editor has in a publishing
> company is performed by the compiler, so everything that passes the
> compiler is effectively "published".
>

Code reviews are a terribly humbling experience. I havn't had to
endure one for years -- but they do force you to justify what you did
(assuming you have a good reviewer) -- much like the editor(s) looking
over a paper/article.


Damond

Damond Walker

unread,
Mar 22, 2002, 11:07:11 AM3/22/02
to
As a variant on the "Take a C++ program and show me the Lisp Way" how
about we take a Lisp program and "show me the C++ way?"

We would need a good, short (say under 5k to 10k of Lisp source --
excluding comments), example of how one solves a less-than-easy
problem in Common Lisp.

Would that be easier? It would allow the original poster to get the
code, run it through the interpreter, ask all the questions in the
world, and then implement a C++ solution.

I think the end result (learning) would be nearly the same.

Damond

Gareth McCaughan

unread,
Mar 22, 2002, 9:34:51 PM3/22/02
to
Damond Walker wrote:

Good idea. Here's a relatively easy example. At

http://homepage.ntlworld.com/g.mccaughan/software/oysters.lisp

there is a little bit of Lisp. 146 lines, 5283 bytes, including
a very few comments; 117 non-blank, non-comment lines, of which
10 could be omitted with no loss of functionality or speed.

It was prompted by reading that "Oysters oysters oysters
split split split" is a meaningful (albeit useless) English
sentence. The same is true "for arbitrary values of 3".

The interesting (i.e., "interesting if you find this sort
of weirdness interesting") functions are:

PARSE-SENTENCE

takes a list, each of whose elements is one of the
two symbols OYSTERS and SPLIT. Returns a list of
possible parses of that sequence of words as a sentence.
(I hope it returns a list of *all* possible parses.)

TRY-ALL

takes a length, and an optional threshold.
Constructs all sequences of the given length
built out of OYSTERS and SPLIT, and tells you
about all the ones that make sentences. Reports
on multiple parses. If the threshold is given,
it reports only sequences that have at least
that many different parses.

TRY-ALL-2

same semantics as TRY-ALL, but uses less memory.

For example, doing (TRY-ALL 7 2) reveals that there are
exactly two sequences with more than one parse. One of
them follows:

(SPLIT OYSTERS OYSTERS SPLIT OYSTERS SPLIT SPLIT)
parses as
(SENTENCE
(SUBJECT
(NOUN (NOUN SPLIT OYSTERS)
(QUALIFIER (SUBJECT OYSTERS) (TRANS-VERB SPLIT)))
(QUALIFIER (SUBJECT OYSTERS) (TRANS-VERB SPLIT)))
(INTRANS-VERB SPLIT))
or
(SENTENCE (IMP-VERB SPLIT)
(OBJECT (NOUN OYSTERS)
(QUALIFIER
(SUBJECT (NOUN OYSTERS)
(QUALIFIER (SUBJECT SPLIT OYSTERS) (TRANS-VERB SPLIT)))
(TRANS-VERB SPLIT))))

On my machine (Athlon, 1GHz, 256Mb, FreeBSD), running CMU CL 18c,
(TRY-ALL-2 12 5) takes just under 2 seconds of real time to find
the five sequences of length 12 that have 5 parses each, and describe
all those parses. TRY-ALL takes about the same time but conses
much more.

I'd be very interested to see how well this can be translated
into C++. It probably wouldn't be as grim as some non-C++-ers
expect (the standard library has a fairly decent selection of
data structures), but I think combining decent performance and
decent brevity could be tricky.

--
Gareth McCaughan Gareth.M...@pobox.com
.sig under construc

0 new messages