Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  Messages 51 - 71 of 71 - Collapse all  -  Translate all to Translated (View all originals) < Older 
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Barry Margolin  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Barry Margolin <bar...@bbnplanet.com>
Date: 1999/01/18
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)
In article <wkr9ss1obg....@mit.edu>, David Bakhash  <ca...@mit.edu> wrote:

>This, somehow, never was overlooked.  I cannot thank you enough for
>clearing up this issue.  it was bothering me.  I never thought about
>using the package system + CLOS for hiding data.  I guess I'll just
>_NOT_ export the slot names, and be done with that.  That makes perfect
>sense, though doing it might open up some issues.

Be careful, though.  I've seen some people try to use the package system to
implement the same granularity of data hiding as they can get with some
other OO languages' public/private distinctions.  It may be possible to do
this, but I would recommend against trying.  The package system is tricky,
even for expert CL programmers (it's no coincidence that the package system
is described in Chapter 11 of both CltL editions and the ANSI spec).  It's
best used for coarse levels of sharing -- typically an application or
library would define one or two packages.  You'll likely go crazy if you
try to implement per-class packages.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "hashtable w/o keys stored..." by Marc Battyani
Marc Battyani  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "Marc Battyani" <Marc_Batty...@csi.com>
Date: 1999/01/18
Subject: Re: hashtable w/o keys stored...

David Bakhash wrote in message ...
>"Marc Battyani" <Marc_Batty...@csi.com> writes:

>> There are also the "Ternary Search Trees" a TRIE variation.
>> They can be faster than hashtables.
>> More on these here: http://www.cs.princeton.edu/~rs/strings/index.html

>this sounds really interesting.  I must admit that I had not known of
>TRIE, and still know nothing.

>I highly appreciate this information, though I must admit that I'm
>skeptical at how it can _possibley_ be faster than a hashtable.

It's because to compute a hash value you have to process all the characters
of your string.

in a hash table you
   1 compute the hash value in O(length(key))
   2 compare the found item  to the key also in O(lengthKey)
   3 you compare more keys if they are collisions

On the other hand, TRIES and their look alike process character by
character.
so if the key is not there you can know if faster.

All this depends on the statistical repartition of your keys.
You can imagine a data set with 100 characters keys on where the first 5 are
usually enough to differentiate them.
In that case It's likely that hashtables will be slower.

Cheers,
Marc Battyani


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Barry Margolin  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Barry Margolin <bar...@bbnplanet.com>
Date: 1999/01/18
Subject: Re: hashtable w/o keys stored...
In article <7F6861BC8A65CC8B.F0645AE687214FDC.64E9F9773E5A7...@library-proxy.airnews.net>,

Marc Battyani <Marc_Batty...@csi.com> wrote:
>You can imagine a data set with 100 characters keys on where the first 5 are
>usually enough to differentiate them.
>In that case It's likely that hashtables will be slower.

If you have control over the hashing function, you could simply hash on
only the first 5 characters.  Items where the difference is only in the
remaining characters would collide, and be differentiated by the collision
resolution routine.

However, tries and their ilk don't require you to know where the cutoff is
a priori.  They'll get faster all by themselves depending on the actual
distribution of the data.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/18
Subject: Re: hashtable w/o keys stored...
* "Marc Battyani" <Marc_Batty...@csi.com>
| It's because to compute a hash value you have to process all the
| characters of your string.
:
| All this depends on the statistical repartition of your keys.   You can
| imagine a data set with 100 characters keys on where the first 5 are
| usually enough to differentiate them.  In that case It's likely that
| hashtables will be slower.

  in all likelihood, a hashing function is able to take advantage of the
  same property of the input, and but then somewhat more for collisions.

  the real difference, as I see it, is that tries will be able to take
  advantage of _any_ smallest set of differences, so if you have a bunch of
  people names starting with "Smith", the trie will effectively just move
  past the first 5 and use the next 5 characters, instead.  a hash function
  would have to be excessively intelligent to adapt the same way.

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Barry Margolin  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Barry Margolin <bar...@bbnplanet.com>
Date: 1999/01/18
Subject: Re: hashtable w/o keys stored...
In article <3125682171162...@naggum.no>, Erik Naggum  <e...@naggum.no> wrote:

>  in all likelihood, a hashing function is able to take advantage of the
>  same property of the input, and but then somewhat more for collisions.

>  the real difference, as I see it, is that tries will be able to take
>  advantage of _any_ smallest set of differences, so if you have a bunch of
>  people names starting with "Smith", the trie will effectively just move
>  past the first 5 and use the next 5 characters, instead.  a hash function
>  would have to be excessively intelligent to adapt the same way.

Well, everyone can stop worrying about the Y2K problem.  Erik and I just
posted almost the same comment, and I think agreement between us is one of
the signs of the Apocalypse. :-)

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)" by Lyman S. Taylor
Lyman S. Taylor  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: ly...@cc.gatech.edu (Lyman S. Taylor)
Date: 1999/01/18
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)
In article <AXLo2.319$oD6.26...@burlma1-snr1.gtei.net>,
Barry Margolin  <bar...@bbnplanet.com> wrote:

>In article <wkr9ss1obg....@mit.edu>, David Bakhash  <ca...@mit.edu> wrote:
....
>Be careful, though.  I've seen some people try to use the package system to
>implement the same granularity of data hiding as they can get with some
>other OO languages' public/private distinctions.  

   Yes.  I'd recommend more of Modula3 type of approach with an
   interface file which did the namespace definition and the import/export
   business and a definition file which did the "defining".  [ Even though
   keeping the interface and definition synchronized involves lots of
   manual grunt work.]

   Long chains of "public" hierarchies can all be in the same file since
   they, for the most part, share namespace.  Even "private" hierarchies as
   long at the author interally respects the abstraction boundaries.
   You do have to be  more "disciplined" when using CLOS+packages then when
   depending upon the C++ compiler as an abstraction boundary "enforcer".

   This gives you "libraries" (or a framework) of classes.

   Probably a closer analogy would be to Java (and its packages) than to
   C++ (namespaces).  You can develop quite a maze of namespaces
   with names space control comingled with the object class definitions quite
   a bit more easily with public/protected/private than with namespace
   and class definition mechanisms seperated.

>                                        The package system is tricky,
>even for expert CL programmers

   Unfortunately this is true. :-(   Not so much that the mechanisms' syntax
   is tricky,  but it is the "landmines" you may stumble across if certain
   aspects of the language interact in nonintuitive ways.  And the fact
   that package system changed significantly so there is legacy issues
   to deal with.

   The  "defsystem" , "modules" , and "package"  solution is one of those
   things CL doesn't quite do in an "inspired" fashion.  You can make it work.
   However, it usually seems more painful than it "should be".  

   I think some people get "bit" by some "package problem" and then just
   shy away from it in the future.  And I'm sure there are also more than
   few for whom in their standard "modus operandi" it all works well.  

--

Lyman S. Taylor          "The Borg --  party poopers of the Galaxy. "
(ly...@cc.gatech.edu)                 EMH Doctor  Star Trek Voyager.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "hashtable w/o keys stored..." by Erik Naggum
Erik Naggum  
View profile  
 More options Jan 18 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/18
Subject: Re: hashtable w/o keys stored...
* Barry Margolin <bar...@bbnplanet.com>
| Well, everyone can stop worrying about the Y2K problem.  Erik and I just
| posted almost the same comment, and I think agreement between us is one
| of the signs of the Apocalypse. :-)

  problem is, we might have caused the OSCE incident in Kosovo.  the sum
  total of hostilities in the world has to remain constant, so what was the
  world to do?  things just _had_ to get ugly somewhere else, instead.

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Marco Antoniotti  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Marco Antoniotti <marc...@copernico.parades.rm.cnr.it>
Date: 1999/01/19
Subject: Re: hashtable w/o keys stored...

Erik Naggum <e...@naggum.no> writes:
> * Barry Margolin <bar...@bbnplanet.com>
> | Well, everyone can stop worrying about the Y2K problem.  Erik and I just
> | posted almost the same comment, and I think agreement between us is one
> | of the signs of the Apocalypse. :-)

>   problem is, we might have caused the OSCE incident in Kosovo.  the sum
>   total of hostilities in the world has to remain constant, so what was the
>   world to do?  things just _had_ to get ugly somewhere else, instead.

On top of that I just got (again) the Internet Joke stating that

        (reduce #+ (beastly-encode "BILL GATES III")) => 666

and that you can encode my home telephone number in Rome as

        666 666 666

and you can start to think that every hash function eventually
converges to the Number of The Beast.

Cheers

--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - (0)6 - 68 10 03 17, fax. +39 - (0)6 - 68 80 79 26
http://www.parades.rm.cnr.it


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rob Warnock  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: r...@rigden.engr.sgi.com (Rob Warnock)
Date: 1999/01/19
Subject: Re: hashtable w/o keys stored...
Erik Naggum  <e...@naggum.no> wrote:
+---------------
|   the real difference, as I see it, is that tries will be able to take
|   advantage of _any_ smallest set of differences, so if you have a bunch of
|   people names starting with "Smith", the trie will effectively just move
|   past the first 5 and use the next 5 characters, instead.
+---------------

Exactly.  Plus, if you're doing typical symbol-table or other plaintext stuff,
you can *significantly* cut the amount of storage used with tries by various
forms of "compression" on each node (much like the way most C compilers
generate different styles of code for "switch" statements depending on
the pattern of "case" values), tagging each node with its "style" or kind:

- As Erik notes, you can collapse a run of fanout==1 nodes into a single
  "match"-only node consisting of a string and one further node pointer.

- If the degree of fanout is >1 but still very low, the node might be
  encoded as a small string that's searched sequentially, together with
  a vector of the same length (indexed by the searched-for character's
  position in the string). While this is a good deal slower than a pure
  "radix-sort" style index, it's also *much* smaller.

- With higher fanout, a node might contain "min" & "max" character codes,
  and a vector as large as "max - min + 1", indexed by "this_char - min".

- With still higher fanout, revert to a full max-char-code size vector.
  In most practical applications, you'll only need a few nodes like this
  (maybe even only the root node).

+---------------
| a hash function would have to be excessively intelligent to adapt
| the same way.
+---------------

That reminds me of a trick that can help hash tables: If you use a really
good hash, one for which nearly every bit position contains nearly a full
bit of information (a CRC-32 is such a hash), then if you store the *full*
hash value with the entry, when you want grow the hash table because the
per-bucket chains get too long, you don't have to re-hash the keys -- you
can just use more bits of the original hash value.

Finally, there's an interesting hybrid of hash tables and tries -- a
multi-level hash table that uses different bits of the hash at each level
of the tree/trie (that is, it's a binary tree if you only use one bit of
the hash per level). When you hit a leaf node you do a comparison on
the full hash first before wasting time with a full string compare.
If the hashes are the same but the strings differ, then the strings are
probably *very* different (assuming a "good" hash, such as a CRC), so
hanging a "compressed trie" (as above) off that node should be a cheap
way to resolve the conflict (that is, the strings will very likely differ
in the first character!).

[Hmmm... It would be interesting to compare these three methods for
space & speed to implement "intern"...]

-Rob

-----
Rob Warnock, 8L-855             r...@sgi.com
Applied Networking              http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.          Phone: 650-933-1673
2011 N. Shoreline Blvd.         FAX: 650-964-0811
Mountain View, CA  94043        PP-ASEL-IA


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Marco Antoniotti  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Marco Antoniotti <marc...@copernico.parades.rm.cnr.it>
Date: 1999/01/19
Subject: Re: hashtable w/o keys stored...

r...@rigden.engr.sgi.com (Rob Warnock) writes:
> That reminds me of a trick that can help hash tables: If you use a really
> good hash, one for which nearly every bit position contains nearly a full
> bit of information (a CRC-32 is such a hash), then if you store the *full*
> hash value with the entry, when you want grow the hash table because the
> per-bucket chains get too long, you don't have to re-hash the keys -- you
> can just use more bits of the original hash value.

It is not really a 'trick' it is the base for some fancy indexing
schemes used in data bases in alternative to BTree based data structures.

> Finally, there's an interesting hybrid of hash tables and tries -- a
> multi-level hash table that uses different bits of the hash at each level
> of the tree/trie (that is, it's a binary tree if you only use one bit of
> the hash per level). When you hit a leaf node you do a comparison on
> the full hash first before wasting time with a full string compare.
> If the hashes are the same but the strings differ, then the strings are
> probably *very* different (assuming a "good" hash, such as a CRC), so
> hanging a "compressed trie" (as above) off that node should be a cheap
> way to resolve the conflict (that is, the strings will very likely differ
> in the first character!).

> [Hmmm... It would be interesting to compare these three methods for
> space & speed to implement "intern"...]

Very good point indeed.

Cheers

--
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - (0)6 - 68 10 03 17, fax. +39 - (0)6 - 68 80 79 26
http://www.parades.rm.cnr.it


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "CLOS and Packages (Was: Why all virtual)" by Kenneth P. Turvey
Kenneth P. Turvey  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: ktur...@pug1.sprocketshop.com (Kenneth P. Turvey)
Date: 1999/01/19
Subject: CLOS and Packages (Was: Why all virtual)

On Mon, 18 Jan 1999 19:28:32 GMT, Barry Margolin <bar...@bbnplanet.com> wrote:
>In article <wkr9ss1obg....@mit.edu>, David Bakhash  <ca...@mit.edu> wrote:

>Be careful, though.  I've seen some people try to use the package system to
>implement the same granularity of data hiding as they can get with some
>other OO languages' public/private distinctions.  It may be possible to do
>this, but I would recommend against trying.  The package system is tricky,
>even for expert CL programmers (it's no coincidence that the package system
>is described in Chapter 11 of both CltL editions and the ANSI spec).  It's
>best used for coarse levels of sharing -- typically an application or
>library would define one or two packages.  You'll likely go crazy if you
>try to implement per-class packages.

This is exactly what I'm doing in an application I'm currently working
on.  Why would per-class packages be so difficult.  They seem to
implement the exact behavior I desire without any great complexity.

Could you please expand on the difficulties that one might encounter
using this method of data encapsulation?

Thanks,

--
Kenneth P. Turvey <ktur...@SprocketShop.com>

  Anyone who challenges the prevailing orthodoxy finds himself silenced
  with surprising effectiveness.  A genuinely unfashionable opinion is
  almost never given a fair hearing.  -- George Orwell


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "hashtable w/o keys stored..." by Gareth McCaughan
Gareth McCaughan  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Gareth McCaughan <gj...@dpmms.cam.ac.uk>
Date: 1999/01/19
Subject: Re: hashtable w/o keys stored...

Barry Margolin wrote:

[I wrote, replying to Erik, replying to Barry]

>> If he does much CLOS, he'll get so fed up of calling *everything*
>> virtual that he'll stop bothering. Then there'll be no problem.
>> (Until he goes back to C++ and starts asking about generic functions
>> in C++. (-: )

> He wasn't calling everything virtual, he was making a point in a discussion
> about programming methodologies.  The fact that CLOS methods have the
> properties associated with virtual member functions in another language I
> won't name was germaine to his point.  He expressed it very clearly, IMHO.

He said "all methods in CLOS are virtual". I have no quarrel with
that statement; I agree that it's a sensible thing for someone
whose background is in C++ to use; I have (so far as I can see)
no disagreement with you here. (I also understand why Erik is
bothered by people using foreign terminology to describe Lisp,
though I think he's wrong. That's why I made the observation that
anyone who spends a lot of time doing CLOS will get used to all
methods being "virtual" in CLOS, and therefore stop using the word
except when trying to relate CLOS and C++; for which reason, I
don't think his concern that you were encouraging someone in a
bad habit was justified.)

--
Gareth McCaughan       Dept. of Pure Mathematics & Mathematical Statistics,
gj...@dpmms.cam.ac.uk  Cambridge University, England.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)" by Lieven Marchand
Lieven Marchand  
View profile  
 More options Jan 19 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Lieven Marchand <m...@bewoner.dma.be>
Date: 1999/01/19
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)
ly...@cc.gatech.edu (Lyman S. Taylor) writes:

> In article <AXLo2.319$oD6.26...@burlma1-snr1.gtei.net>,
> Barry Margolin  <bar...@bbnplanet.com> wrote:
> >In article <wkr9ss1obg....@mit.edu>, David Bakhash  <ca...@mit.edu> wrote:
> ....
> >Be careful, though.  I've seen some people try to use the package system to
> >implement the same granularity of data hiding as they can get with some
> >other OO languages' public/private distinctions.  

>    Yes.  I'd recommend more of Modula3 type of approach with an
>    interface file which did the namespace definition and the import/export
>    business and a definition file which did the "defining".  [ Even though
>    keeping the interface and definition synchronized involves lots of
>    manual grunt work.]

An other alternative I've seen is having an interface file just defining
the package and a macro DEFEXPORT to define exported functions (and
analoguous macros for exported macros, constants etc.). This solves the
syncronisation problem.

Does anyone have experience with this style?

I find it hard to understand what exactly about the CL package mechanism
hinders people to create many small (fine grained) name spaces. Perhaps
the fact that for easy interactive use they all would tend to get used
in COMMON-LISP-USER anyway. Garnet seems to be the exception as their
recommended style advises against using the package and for an explicit
package:symbol style.

--
Lieven Marchand <m...@bewoner.dma.be>
--------------------------------------------------------------------------- ---
Few people have a talent for constructive laziness. -- Lazarus Long


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul Rudin  
View profile  
 More options Jan 20 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Paul Rudin <pa...@shodan.demon.co.uk>
Date: 1999/01/20
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)

Barry Margolin <bar...@bbnplanet.com> writes:

[]

>                                              The package system is tricky,
> even for expert CL programmers (it's no coincidence that the package system
> is described in Chapter 11 of both CltL editions and the ANSI spec).  It's
> best used for coarse levels of sharing -- typically an application or
> library would define one or two packages.

Choosing an appropriate size for packages, epecially in large
applications, is something that is difficult to call and, I guess,
there's no "right" answer.

What do people tend to do? How large a logical unit should a package
be? How large in terms of lines of code or exported symbols should a
package be? What other measures might be useful when considering the
size of packages? Are mutually dependent (either directly or
indirectly) packages always a Bad Thing? Are there any style guides
that address these issues in detail?

TIA.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Barry Margolin  
View profile  
 More options Jan 21 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Barry Margolin <bar...@bbnplanet.com>
Date: 1999/01/21
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)
In article <787nse$ni...@nickel.uunet.be>,
Lieven Marchand  <m...@bewoner.dma.be> wrote:

>I find it hard to understand what exactly about the CL package mechanism
>hinders people to create many small (fine grained) name spaces. Perhaps
>the fact that for easy interactive use they all would tend to get used
>in COMMON-LISP-USER anyway. Garnet seems to be the exception as their
>recommended style advises against using the package and for an explicit
>package:symbol style.

Unlike most other things in Lisp, it can be tricky to make package changes
incrementally.  Things work best if all the package settings (imports,
exports, shadowing, etc.) are made before any other uses of the package,
because of the way they pervade.  If you have lots of little packages, the
web of relationships gets more complicated.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Lieven Marchand  
View profile  
 More options Jan 22 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Lieven Marchand <m...@bewoner.dma.be>
Date: 1999/01/22
Subject: Re: Why all virtual ( was .... (Ex: Re: hashtable w/o keys stored...)

Barry Margolin <bar...@bbnplanet.com> writes:
> In article <787nse$ni...@nickel.uunet.be>,
> Lieven Marchand  <m...@bewoner.dma.be> wrote:
> >I find it hard to understand what exactly about the CL package mechanism
> >hinders people to create many small (fine grained) name spaces.

> Unlike most other things in Lisp, it can be tricky to make package changes
> incrementally.  Things work best if all the package settings (imports,
> exports, shadowing, etc.) are made before any other uses of the package,
> because of the way they pervade.  If you have lots of little packages, the
> web of relationships gets more complicated.

Good point. The languages in which I most like the module/package
system, Ada and Modula-3, are both statically compiled and come with a
build system that figures out the package dependencies automagically
which takes care of the web of relationships. In such a language,
small packages are beneficial to avoid long compilation times for
localized changes, but in Lisp you already have a better way for this.

--
Lieven Marchand <m...@bewoner.dma.be>
--------------------------------------------------------------------------- ---
Few people have a talent for constructive laziness. -- Lazarus Long


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "hashtable w/o keys stored..." by Graham Hughes
Graham Hughes  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Graham Hughes <ghug...@cs.ucsb.edu>
Date: 1999/01/23
Subject: Re: hashtable w/o keys stored...

>>>>> "David" == David Bakhash <ca...@mit.edu> writes:

    David> this sounds really interesting.  I must admit that I had
    David> not known of TRIE, and still know nothing.

    David> I highly appreciate this information, though I must admit
    David> that I'm skeptical at how it can _possibley_ be faster than
    David> a hashtable.

I'll use the regular trie here, because I know it better.  Roughly
speaking:

Searching a trie and coming up with an element if one exists is
roughtly an O(n) operation, with n being the length of the indexing
string.  So (trie-ref trie "foo") takes three steps.  But if the
element does not exist (suppose there are no strings starting with #\f
in the trie), then it can take less than n steps to reject the
reference.

A hashing function on strings takes some number of characters and
combines them to come up with an index into an array.  This is an O(n)
operation with n being the length of the indexing string (although
some implementations arbitrarily cut the hashing function off at say
the 10th character).  This hashing function must be performed
regardless of success or failure, and so a hashing function always
requires n steps.

In practice, this isn't really very interesting, and tries may or may
not have poor cache behaviour etc., so a simple version just says hash
tables and tries are about as fast unless you deal with ridiculously
long strings with the same, say, first 10 characters as a prefix.

Tries have another advantage; they grow better than hash tables
(lookup *always* takes about O(n) in length of string, regardless of
how big the trie is; closed hash tables often are worse than O(n) in
the length of the string for large tables and open hash tables often
need resizing, so they're only O(n) in the length of the string
amortized), but they're also nice from a functional point of view
because you can add nodes to a trie and retain the old one.

It's an interesting data structure to be sure, but not particularly
well known, when it is better than a hash table it is often not much
better, implementing them without wasting a great deal of space can be
very annoying, and various other caveats.  It is very useful for a
programmer in a purely functional language like Haskell because
Haskell doesn't *get* hash tables (copying the array around to update
one element is ridiculous but necessary), and I can see the use for a
symbol table in a compiler, but for the most part it's not really
worth bothering with, like splay trees[0].

[0] Splay trees are a famous amortized data structure that have
advantages but are often not *so* much better than e.g. red black
trees that it's worth the effort to implement them.
--
Graham Hughes <ghug...@cs.ucsb.edu>
PGP Fingerprint: 36 15 AD 83 6D 2F D8 DE  EC 87 86 8A A2 79 E7 E6


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
David Bakhash  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: David Bakhash <ca...@engc.bu.edu>
Date: 1999/01/23
Subject: Re: hashtable w/o keys stored...
hey,

It turns out that these data structures -- tries -- are exactly what I
was looking for.  I guess it's not important to explain why, because
what I'm doing is so specfic that it's not too interesting.  But since
you guys have been so helpful, I may as well try to thank you by
describing what I was doing.

I am making a huge string table.  There are times when I'll look up
the string "the" and, right afterwards, I might look up the string
"them".  In this case, it might help to retain the node that the "the"
got me to, and just keep going from there.  Either way, the point is
that there's a lot of shared data, and so the potential for these
tries in my code is great.  I was still hoping that someone would lend
me some code that implements tries (in CL, of course).  but otherwise,
I'll simply try to find out where the algorithms are documented and
redo the work.

but I have, very successfully, implemented keyless hash-tables (I call
them `array-tables'), and they perform comparably to ACL's built-in
hash-tables, and (in my opinion) are extremely easily integrated into
programs which use hash-tables.  I implementing a macro called
`array-table-loop', which lets you do things like:

(array-table-loop for entry being the array-table-values of atable
    [do|maximize|..etc..] ...[etc.])

so you can still get your hash-table iteration working.  CLOS made
making this very elegant, and it seems to run pretty fast.  Oh, and
one nice thing about the keyless array-tables: they arn't limited in
the :test that they can use.  so, for example, you can use
#'string-equal instead of #'equalp.  I think this is a big plus, and
I'm still wondering why ANSI CL limits the :test for hash-tables to
(member #'eql #'eq #'equal #'equalp).  I guess it's because
hash-tables in CL can store _any_ type of object, and #'string-equal,
#'char=, and friends don't operate on just any Lisp object.

dave


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hrvoje Niksic  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Hrvoje Niksic <hnik...@srce.hr>
Date: 1999/01/23
Subject: Re: hashtable w/o keys stored...

David Bakhash <ca...@engc.bu.edu> writes:
> I think this is a big plus, and I'm still wondering why ANSI CL
> limits the :test for hash-tables to (member #'eql #'eq #'equal
> #'equalp).  I guess it's because hash-tables in CL can store _any_
> type of object, and #'string-equal,
> #'char=, and friends don't operate on just any Lisp object.

I don't think this is the real reason the test functions are limited
to those you enumerated.

The choice of test function is limited because the test function must
always correspond to a hash function (specifically, for any two keys A
and B, it must be that `A equals B' implies `hash(A) == hash(B)'; the
reverse needn't be true.)

So if the user specifies a test function of his own, he should also
specify a hash function of his own, and that hash function should be
able to cope with all the Lisp types that could be a part of the key.
Implementing such a hash function without delving into object
allocation internals does not sound feasible.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Lyman S. Taylor  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: ly...@cc.gatech.edu (Lyman S. Taylor)
Date: 1999/01/23
Subject: Re: hashtable w/o keys stored...
In article <cxjk8ydoqh7....@engc.bu.edu>,
David Bakhash  <ca...@engc.bu.edu> wrote:
...

>I'm still wondering why ANSI CL limits the :test for hash-tables to
>(member #'eql #'eq #'equal #'equalp).  I guess it's because

    If your key is the object identity  EQ ( or to a lesser extent EQL )
    think about what happens when you do a garbage collection. Namely
    you have a nonconservative collector that moves objects.

    When those objects "move" their identity changes. The gc may
    change all of the references to the "new" identity, but the
    hash table was indirectly dependent upon those "old" ones. Namely
    your object will likely hash to different spot now.  

    That's OK. you can simply rehash the whole table before the next lookup.
    Or follow some other amortized resolution scheme.

    EQUAL and EQUALP should be uninpacted by this. They don't care because
    they're based on a different sense of identity.  So you need not only
    a test but also something that "says" what sort of policy you do
    post GC.    I think common lisp designers chose to let only the
    implementors worry about these issues.

   P.S.  this may mean that EQ is probably _not_ a good :TEST argument
         to your new construct.   You just may not have been bitten by
         it yet.
         [ if you exhaustively search the table until all keys are
            examined then fine. However, you can't really say that
            it has O(1) lookup characteristics.  ]

      You may have used  SXHASH in all cases regardless of resolution
      predicate.  The negates the problem of starting to "look"  in a
      different place. I'm not sure the designeres wanted to constrain the
      implementors to this or if this doesn't have hash code distrubution
      problems where there are more collisons than "necessary".
      [ For instance I think I posted an example where an implementation
        use the size of a vector as its SXHASH code, irregardless of
        contents.  A hash table of same size vectors would have horrific
        ( OK,  linear) performance in this case. ]

--

Lyman S. Taylor          "The Borg --  party poopers of the Galaxy. "
(ly...@cc.gatech.edu)                 EMH Doctor  Star Trek Voyager.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Steve Gonedes  
View profile  
 More options Jan 24 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Steve Gonedes <jgone...@worldnet.att.net>
Date: 1999/01/24
Subject: Re: hashtable w/o keys stored...

David Bakhash <ca...@engc.bu.edu> writes:

< hey,
<
< It turns out that these data structures -- tries -- are exactly what I
< was looking for.  I guess it's not important to explain why, because
< what I'm doing is so specfic that it's not too interesting.  But since
< you guys have been so helpful, I may as well try to thank you by
< describing what I was doing.
<
< I am making a huge string table.  There are times when I'll look up
< the string "the" and, right afterwards, I might look up the string
< "them".  In this case, it might help to retain the node that the "the"
< got me to, and just keep going from there.  Either way, the point is
< that there's a lot of shared data, and so the potential for these
< tries in my code is great.  I was still hoping that someone would lend
< me some code that implements tries (in CL, of course).  but otherwise,
< I'll simply try to find out where the algorithms are documented and
< redo the work.

I wrote some tries just recently... I think I got the main idea from
AICP - I've read about them in many algorithm books, but they usually
make it seem much more complicated than it needs to be. These aren't
too `fancy', but that's what I like about them. Ideally you would be
able to specify the equality function and other such-nots...

;;; Tries

(defstruct (trie)
  (value nil)
  (arcs ())
  parent)

(defun follow-trie-no-extend (key trie)
  (cdr (assoc key (trie-arcs trie))))

(defun follow-trie-extending (key trie)
  (let ((arc (assoc key (trie-arcs trie))))
    (if arc (cdr arc)
        (let ((newtrie (make-trie :parent trie)))
          (push (cons key newtrie) (trie-arcs trie))
          newtrie))))

(defun find-trie-extending (key trie)
  (dotimes (i (length key) trie)
    (setq trie (follow-trie-extending (aref key i) trie))))

(defun find-trie (key trie)
  (loop for code across key
      for newtrie = (follow-trie-no-extend code trie)
        then (follow-trie-no-extend code newtrie)
      while newtrie do (setq trie newtrie)
      finally (return trie)))

(defun put-trie (key trie value)
  (setf (trie-value (find-trie-extending key trie)) value))

(defun get-trie (key trie)
  (let ((found (find-trie key trie)))
    (if (not found) nil
        (trie-value found))))

(defsetf get-trie put-trie)

(defvar *trie* (make-trie))

(setf (get-trie "them" *trie*) "Nope!")
(setf (get-trie "the" *trie*) "Closer")
(setf (get-trie "the dog" *trie*) "has hair")

(trie-value (find-trie "them" *trie*)) => "Nope!"
(trie-value (find-trie "the" *trie*)) => "Closer"
(get-trie " dog" (trie-parent (find-trie "them" *trie*))) => "has hair"

Useful stuff; can be much better than hash tables.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages < Older 
« Back to Discussions « Newer topic     Older topic »