> ? That runs out of fixnum range on 32-bit machines pretty quickly.
I admit I haven't had time to really think about the problem, so I am not satisfied I know the optimal answer, but the above approach is not the most economical. A more economical way would be to use a variable-base positional system (I am not sure this is the correct English term). There the weights of the different digit positions are independent of one another (instead of all of them being powers of one and the same base), e.g. 1 for the rightmost digit, 31 for the middle one, and 31*12 for the leftmost one in a triplet that represents (year, month, day). So producing the integer for the triplet can be done e.g. by
(of course, the weights vector would be precomputed, and consing would be avoided, etc.---this is just the illustration). This is more economical in terms of space than allocating 5 bits for the day (32 different values, 1 unused) and 4 bits for the month (16 different values, 4 unused).
Vassil Nikolov <vniko...@poboxes.com> www.poboxes.com/vnikolov (You may want to cc your posting to me if I _have_ to see it.) LEGEMANVALEMFVTVTVM (Ancient Roman programmers' adage.)
-----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
> In article <7e9pec$64...@nnrp1.dejanews.com>, > Vassil Nikolov <vniko...@poboxes.com> wrote: > >In theory at least, an implementation could have a specific tag for > >a singleton reference so that the reference itself does take half the > >space for a cons cell. But even though it could, I don't expect that > >it would, it being too much trouble for something that's rarely needed.
I'd never really thought of it that way, but I can see why someone might. I can't remember all the details. A car of a dtp-locative returns the same as the car of a dtp-list, but I don't remember what cdr returns. However, since a locative is just a pointer into virtual memory, it can be used for pointing to anything, not just a list.
In article <7ec77e$6m...@nnrp1.dejanews.com>, Vassil Nikolov <vniko...@poboxes.com> wrote:
>Assuming Lisp Machine locatives are singleton references, I'm also >curious to know what were the reasons why they didn't make it into >Common Lisp (to avoid complexity? to avoid expensive implementation >on stock hardware? ...)
I only recall them being used in two ways:
1) As the reference to disembodied property lists (I think the GET and PUT functions automatically followed a locative, so they can be used interchangeably with symbols). In Maclisp we used an extra cons for this. Common Lisp provides GETF, and SETF of GETF updates the place that holds the reference to the property list, so this isn't as necessary.
2) As pointers in low-level system programming. This is the most common use.
Since low-level system programming was not a target use of Common Lisp, there was no incentive to include them in CL.
-- Barry Margolin, bar...@bbnplanet.com GTE Internetworking, Powered by BBN, Burlington, MA *** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups. Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
No. They are pointers to any location. One doesn't allocate a locative. One obtains a locative that points to an existing location. I don't recall the exact primitives on the Lisp Machine, but using Scheme-like notation they were:
Rob Warnock wrote: > It all depends on the implementation, I'd guess. In MzScheme a box is an > instance of "Scheme_Small_Object", which is a type field [a short, but > most compilers will force that to a PSV] and a pointer-sized value (PSV). > A cons is an instance of "Scheme_Object", which is a type field and > two PSVs. So for MzScheme, a box is 2/3 (or maybe 3/5) the size of a cons.
I modified the Boehm GC and MzScheme to implement CDR-coding and represent non-cdr-coded conses with 2 words on SGI. But, like my (tiny)clos support, those things did not make it into the distribution (Matt suggested that it was probably time for a `Fernando Scheme'). I sent the GC mods to Hans also, but I doubt they are there either.
So in this case, a box and a cons are the same size.
-- Fernando D. Mato Mira Real-Time SW Eng & Networking Advanced Systems Engineering Division CSEM Jaquet-Droz 1 email: matomira AT acm DOT org CH-2007 Neuchatel tel: +41 (32) 720-5157 Switzerland FAX: +41 (32) 720-5720
> Does anyone know of any good sources to information on the hardware > design of the Lisp Machines? OS information would be nice as well.
I have some manuals if you will pay postage. There are also some papers which I think are available at MIT on the CAR/CADR machines, and probably info on later commercial boxes was published -- there's certainly a cool document on the LMI K machine (of which there may have been one or two made) somewhere on the net.
>>>>> "Erik" == Erik Naggum <e...@naggum.no> writes:
Erik> I'm hinting at encoding year-month-day and hour-minute-second as bytes in Erik> a fixnum. little did I know this would be paper material in complexity.
Do you mean a single byte for each? Or multiple bytes for each?
Since there are (* 24 60 60) = 86400 possible values for hour-minute-second, and they're all equally likely, I don't see how you can take much less than 17 (= log2(86400)) bits for this.
Similarly, for year-month-day, there are at most 366 month-day combinations and 100 year values (I assume year is the year in a century, with the century stored somewhere else). This is a total of at most 366000 values but at least 36500 values (not every year is a leap year). This takes another 16 bits.
The total is 33 bits which won't fit in a typical fixnum of a 32-bit machine.
Obviously, there's some redundancy I'm overlooking. And of course, with this type of encoding, you have to do all sorts of divisions to get the individual pieces out, but that may not be so bad if the machine has integer division built in.
> No. They are pointers to any location. One doesn't allocate a locative.
(...) ;description of locatives
Thanks, that was interesting and useful.
I wonder if the following terminology would work: a locative is a _reference value_ and not a _reference object_ (and thus it is not a half-cons, i.e. a singleton reference).
Vassil Nikolov <vniko...@poboxes.com> www.poboxes.com/vnikolov (You may want to cc your posting to me if I _have_ to see it.) LEGEMANVALEMFVTVTVM (Ancient Roman programmers' adage.)
-----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
In article <3132487825915...@naggum.no>, Erik Naggum <e...@naggum.no> wrote: (...)
> I'm hinting at encoding year-month-day and hour-minute-second as bytes > in a fixnum.
Thinking about this problem, I encountered the following question.
Imagine I obtain some value out of two (or more) fixnums, each within a specific range (not too big). To do this fast, I can precompute a table represented as a two-dimensional array indexed by the fixnums in question. Accessing the array, however, will involve multiplication which I find too slow,^1 so I want to replace the two-dimensional array with a vector of vectors, roughly like this:
(defvar *table* (make-array max1 :element-type `(vector fixnum ,max2) :initial-contents ;; MAX1 fresh (VECTOR FIXNUM max2)'s: (map 'list #'(lambda (_) (declare (ignore _)) (make-array max2 :element-type 'fixnum)) (make-array max1))) "Precomputed values.") __________________________ ^1 if MAX2 is known to be a power of 2, multiplication can be replaced by bitwise operations which are fast, but this is not the point here
(Using C notation seems to provide the shortest way to describe the memory layout.) Normally, *TABLE* would be structured as table here (modulo headers and tags):
typedef int ivector[max2]; ivector *table[max1];
However, is an aggressively optimising CL compiler allowed to do as in
int table[max1][max2];
i.e. `open-code' the nested vectors?
If the latter representation happens, then again slow multiplication will have to be used.
(Note that in C the same expression will be used in both cases to obtain a value: ``table[i][j]'' but in the former case this would compile into two indexing operations plus one indirect reference through a pointer while in the latter case this would compile into a multiplication and an indexing operation (assuming the addition can be integrated with the indexing operation).)
Vassil Nikolov <vniko...@poboxes.com> www.poboxes.com/vnikolov (You may want to cc your posting to me if I _have_ to see it.) LEGEMANVALEMFVTVTVM (Ancient Roman programmers' adage.)
-----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
* Raymond Toy <t...@rtp.ericsson.se> | Since there are (* 24 60 60) = 86400 possible values for | hour-minute-second, and they're all equally likely, I don't see how | you can take much less than 17 (= log2(86400)) bits for this.
the whole idea is to avoid division by table lookups. therefore, we're talking about a means to ensure that the bits of a fixnum can hold 24 hours, 60 minutes, and 60 seconds in separate bytes, or 400 years, 12 months, and 31 days in separate bytes. then define a simple vector of precomputed values indexed by the undivided value.
| Obviously, there's some redundancy I'm overlooking. And of course, | with this type of encoding, you have to do all sorts of divisions to | get the individual pieces out, but that may not be so bad if the | machine has integer division built in.
division was the _primary_ cost in the algorithm...
>* Raymond Toy <t...@rtp.ericsson.se> >| Since there are (* 24 60 60) = 86400 possible values for >| hour-minute-second, and they're all equally likely, I don't see how >| you can take much less than 17 (= log2(86400)) bits for this.
> the whole idea is to avoid division by table lookups. therefore, we're > talking about a means to ensure that the bits of a fixnum can hold 24 > hours, 60 minutes, and 60 seconds in separate bytes, or 400 years, 12 > months, and 31 days in separate bytes. then define a simple vector of
^^^^^
If you were not the author of this text, Eric, I would assume that the author was lazy and didn't meant byte==piece of 8 bits, but now I'm doubting :-).
Anyway, I can't see the problem: (log 60 2) ~ 5.9 bits (log 24 2) ~ 4.6 bits
and (+ 6 6 5) = 17 bits -> the first fixnum, used in the table to convert between seconds-since-midnight -> (hour, minute, second).
(+ 5 4 9) -> 18 bits -> the second fixnum, in the table days-since-newyear -> (day, month, year). But I would do (days-since-newyear, year) -> (day, month), not?
And because all the fields are lower than 8 bits we can use pieces of 8 bit to encode them (could be more efficient on come CPU's, mainly Alpha's I think).
Groetjes, Peter
-- It's logic Jim, but not as we know it. | pvane...@debian.org for pleasure, "God, root, what is difference?",Pitr | pvane...@inthan.be for more pleasure!
* pvane...@mail.inthan.be (Peter Van Eynde) | If you were not the author of this text, Eric, I would assume that the | author was lazy and didn't meant byte==piece of 8 bits, but now I'm | doubting :-).
well, a "byte" is not "8 bits of a machine word at an 8-bit boundary", but "8 bits of a machine word at an 8-bit boundary" _is_ a "byte". that IBM usurped a perfectly general concept and used it for byte-addressable machines with 8-bit bytes, only, is an historic accident that Common Lisp does not accept. on the PDP-10, on which I for all practical purposes grew up, byte operations were very powerful instructions to work on bytes of user-specified sizes. LoaD Byte and DePosit Byte have survived into Common Lisp. Increment Byte Pointer and ILDB and IDBP have not -- they were used to read successive bytes out of a sequence of machine words, and that is properly handled by better primitives, such as STRING operations today. the byte pointer itself did not survive: it had a memory address, too, but the byte specification did -- see the functions BYTE, BYTE-SIZE, and BYTE-POSITION.
| Anyway, I can't see the problem: | (log 60 2) ~ 5.9 bits | (log 24 2) ~ 4.6 bits
um, how can I put this? INTEGER-LENGTH returns an integer, not a floating point number. LOG is the wrong function to use when measuring the required size of an integer, although CEILING of LOG yields the same value as INTEGER-LENGTH for positive numbers.
| and (+ 6 6 5) = 17 bits -> the first fixnum, used in the table to convert | between seconds-since-midnight -> (hour, minute, second).
yup, this is the idea. (it turned out not to be worth the hassle to use only 16 bits by partition the day into two 12-hour halves. if I were cramped for space, I'd revisit that space optimization.)
| (log 31 2) ~ 4.9 bits | (log 12 2) ~ 3.6 bits | (log 400 2) ~ 8.6 bits | | (+ 5 4 9) -> 18 bits -> the second fixnum, | in the table days-since-newyear -> (day, month, year). | But I would do | (days-since-newyear, year) -> (day, month), not?
I use day-since-beginning-of-leap-day-period. as I explained earlier, the current leap day period started at 1600-03-01, and ends 2000-02-29. the next leap day period starts 2000-03-01. since we're going to face that day pretty soon, I chose that day as day 0.
| And because all the fields are lower than 8 bits we can use pieces of 8 | bit to encode them (could be more efficient on come CPU's, mainly | Alpha's I think).
hm. I haven't actually investigated the possibility of using a (vector (unsigned-byte 8)) for the representation. I'll try that. (I guess I'm still not quite used to 8-bit addressable memory, and from what I read in the specifications of modern CPU's, the designers work really hard to avoid forcing people to stop believing in that old myth, too. think about it: if we scuttled the stupid 8-bit legacy and chose 32 bits as the smallest addressable unit and reinstated byte pointers, we'd have four times more addressable memory, and we wouldn't need 64-bit processors for another, um, 18 months!)
>* pvane...@mail.inthan.be (Peter Van Eynde) >| If you were not the author of this text, Eric, I would assume that the >| author was lazy and didn't meant byte==piece of 8 bits, but now I'm >| doubting :-).
> well, a "byte" is not "8 bits of a machine word at an 8-bit boundary", .. > BYTE, BYTE-SIZE, and BYTE-POSITION.
Hmm. Yes, now I remember how confused I was when I read the ldb spec... I should learn to say "octet"...
>| Anyway, I can't see the problem: >| (log 60 2) ~ 5.9 bits >| (log 24 2) ~ 4.6 bits
> um, how can I put this? INTEGER-LENGTH returns an integer, not a > floating point number. LOG is the wrong function to use when measuring > the required size of an integer, although CEILING of LOG yields the same > value as INTEGER-LENGTH for positive numbers.
I just wanted to show how many bits are needed to represent the ranges. I was expecting a trap, like a non-uniform encoding.
..
> I use day-since-beginning-of-leap-day-period. as I explained earlier, > the current leap day period started at 1600-03-01, and ends 2000-02-29. > the next leap day period starts 2000-03-01. since we're going to face > that day pretty soon, I chose that day as day 0.
So now it's day -329?
>| And because all the fields are lower than 8 bits we can use pieces of 8 >| bit to encode them (could be more efficient on come CPU's, mainly >| Alpha's I think).
> hm. I haven't actually investigated the possibility of using a (vector > (unsigned-byte 8)) for the representation. I'll try that. (I guess I'm > still not quite used to 8-bit addressable memory, and from what I read in > the specifications of modern CPU's, the designers work really hard to > avoid forcing people to stop believing in that old myth, too. think > about it: if we scuttled the stupid 8-bit legacy and chose 32 bits as the > smallest addressable unit and reinstated byte pointers, we'd have four > times more addressable memory, and we wouldn't need 64-bit processors for > another, um, 18 months!)
IIRC the Alpha only had 64 and 32 bit instructions for the first 2 generations, but with the 21264 they introduced 8 bit instructions especially for emulation and multi-media. Anyway we _need_ those extra 4 bits, where else would we place the type info?
Groetjes, Peter
-- It's logic Jim, but not as we know it. | pvane...@debian.org for pleasure, "God, root, what is difference?",Pitr | pvane...@inthan.be for more pleasure!
| Anyway we _need_ those extra 4 bits, where else would we place the type | info?
well, there's the "Big Bag of Pages" design that allocates objects of individual types to large areas of virtual memory which makes it possible to use the highest few bits of an address as an index into a type table, instead of using the bits _explicitly_ to encode type. since very few objects would require less than 64 bits, they could all be aligned on even addresses, and fixnums could be a whopping 31 bits wide. whee!
anyway... the 8-bit byte-addressable memory is here to stay as long as C is with us. are you happy now, Dennis Ritchie!? are you? </frustrated>
> (+ 5 4 9) -> 18 bits -> the second fixnum, > in the table days-since-newyear -> (day, month, year). > But I would do > (days-since-newyear, year) -> (day, month), not?
> And because all the fields are lower than 8 bits we can use pieces > of 8 bit to encode them (could be more efficient on come CPU's, mainly > Alpha's I think).
OK, this is where you lost me. 400 years takes 9 bits but you're going to stick it in an 8 bit memory location? How are you doing that again?
* mike...@mikemac.com (Mike McDonald) | OK, this is where you lost me. 400 years takes 9 bits but you're going | to stick it in an 8 bit memory location? How are you doing that again?
Mike, do me a favor, now. carefully, on your own, not posting anything until you're done, go away from the computer, pick up a refence book on Common Lisp, preferably the actual standard, but either CLtL or CLtL2 will do just fine, and _read_ about bytes operations in Common Lisp.
here's a summary of the situation from historic perspective: the 8-bit byte at an 8-bit boundary is a special case of the more general byte concept. the 1990's hardware "byte" is 8 bits with the same machine address. the _software_ "byte" is any contiguous number of bits in an _integer_. some processors can extract a byte out of a machine integer in one operation, while others need two: a shift and a mask. because Common Lisp is not a hardware-oriented language, it does not deal with the machine concepts, but the historically correct _concept_.
and _no_ bullshit about how this will confuse other ignorants, OK?
In article <3132593479131...@naggum.no>, Erik Naggum <e...@naggum.no> writes:
> * mike...@mikemac.com (Mike McDonald) >| OK, this is where you lost me. 400 years takes 9 bits but you're going >| to stick it in an 8 bit memory location? How are you doing that again?
> Mike, do me a favor, now. carefully, on your own, not posting anything > until you're done, go away from the computer, pick up a refence book on > Common Lisp, preferably the actual standard, but either CLtL or CLtL2 > will do just fine, and _read_ about bytes operations in Common Lisp.
Eric, do me a favor, pull your head out of your ass, you pompous twit! I am well aware that CL allows arbitrary size bytes. However, Peter said:
> (log 400 2) ~ 8.6 bits
and 6 lines later he said:
> And because all the fields are lower than 8 bits we can use pieces > of 8 bit to encode them
In the math they teach and use around here, 8.6 in not "lower" that 8. Since I know Peter is a reasonable person, I don't think that's what he really meant. So I'd still like Peter, not you, to clearify what he meant.
>> And because all the fields are lower than 8 bits we can use pieces >> of 8 bit to encode them
I meant that we can waste space on the fields that require less than 8 bits. Instead of packing them together as close as possible. On the x86 at least (and on modern Alpha's I believe, but I'm no expert, I just keep dreaming of buying one) the extract-8bits-on-a-8bit-boundary-from-that-register[1] commands are pretty common and fast[2], so this would give a speed bonus.
> In the math they teach and use around here, 8.6 in not "lower" that 8. >Since I know Peter is a reasonable person, I don't think that's what he >really meant. So I'd still like Peter, not you, to clearify what he meant.
Thanks for the complement, but as countless math teachers have told me: I should learn to express myself a bit more correctly.
Groetjes, Peter
1: like (rusty x86 assembler, I want to _forget_ this) mov day,al mov month,ah shr eax,16 mov year, ax
instead of masking bits and shifting 3 times.
2: except on a pentium pro, but Intel saw the errors of their ways...
-- It's logic Jim, but not as we know it. | pvane...@debian.org for pleasure, "God, root, what is difference?",Pitr | pvane...@inthan.be for more pleasure!
* pvane...@mail.inthan.be (Peter Van Eynde) | I meant that we can waste space on the fields that require less than 8 | bits. Instead of packing them together as close as possible. On the x86 | at least (and on modern Alpha's I believe, but I'm no expert, I just keep | dreaming of buying one) the | extract-8bits-on-a-8bit-boundary-from-that-register[1] commands are | pretty common and fast[2], so this would give a speed bonus.
this presupposes a number of hard-to-presuppose things. fixnums have to live in the same 32-bit space as addresses. first, this frequently means that some least or most significant bits are used for type information, and this impacts the 8-bit boundaries. second, even if you could extract a byte with a register operation, the same considerations as for the whole fixnum would apply to the result. in practice, it's easier to shift.
since we're dealing with Intels and I'm sure everybody has Pentium II by now :), a shift immediate is only one µop, and if you interleave the shifts, masks, and the register transfers or memory writes, you can fill the decoder and exploit the entire pipeline; you would save nothing by using other sequences of µops that also filled the pipeline, so there is nothing to be gained by handling 8-bit "shifts" specially. I don't think we want to get into the details, though...
Welcome back my friends, to the show that never ends, Ladies and gentlemen, INTEL EX EIGHTY SIX...
Vassil Nikolov <vniko...@poboxes.com> www.poboxes.com/vnikolov (You may want to cc your posting to me if I _have_ to see it.) LEGEMANVALEMFVTVTVM (Ancient Roman programmers' adage.)
-----------== Posted via Deja News, The Discussion Network ==---------- http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
Barry Margolin <bar...@bbnplanet.com> writes: > In article <7e9pec$64...@nnrp1.dejanews.com>, > Vassil Nikolov <vniko...@poboxes.com> wrote: > >In theory at least, an implementation could have a specific tag for > >a singleton reference so that the reference itself does take half the > >space for a cons cell. But even though it could, I don't expect that > >it would, it being too much trouble for something that's rarely needed.
A locative is not a cons at all. In fact, its feature is that it is an immediate pointer to a storage location in a way that is (mostly) neutral as to what it's pointing into. (location-contents <locative>) would read and its setf would write any location in any data structure without regard to whether it was a pointer into a cons or whatever, but the locative itself could be created without consing, and it was in fact critical that this was so. In effect, a locative was a pointer datatype and not dissimilar to the C style of pointer except that because the Lisp Machine is very cool, locatives in the Lisp Machine respect the data model and you can't (easily) get them to arbitrary memory so they don't confuse the GC.
However, the underlying storage still requires a cons. Then you can locf it and make it seem neater and prettier
HOWEVER, all is not lost because the lisp machine also has cdr-coding, so it only takes 2 extra bits to represent an empty cdr, and conveniently there are enough bits in a word that this doesn't take extra memory to do, so (LIST x) takes only one Q (pointer-width) of storage because it's a single cell with a 2-bit cdr-code of cdr-nil. And then since locf'ing takes no additional storage, voila' you get one-word storage for free. But then, you'd have gotten one word storage if you just used (LIST x) [though for trivia buffs, not if you did (CONS x NIL) which gets a two-cell cons with the first cell being CDR-NORMAL with a NIL pointer in it; the LispM makes an assumption which it publishes that (CONS x NIL) is more likely to be RPLACD'd than (LIST x), and LispM programmers know to use (CONS x NIL) when they plan to rplacd so the RPLACD doesn't have to cons. In the (LIST x) case, RPLACD has to replace the one cell with a dtp-body-forward that the hardware knows means "find the real cons elsewhere--there wasn't room here". It's an amazing piece of craftsmanship which is invisible to users and which works statistically very well with most people being wholly unaware, and even better if they are aware...].
I'm doing this from memory and haven't recently used any of it, but I think I got it right. If anyone thinks I've slipped, tell me and I'll check. I have a lispm here to do that with--but I have other news to answer after having been away a week, so I'm skimping on the details.
Also, Barry probably already knew all of the above and I bet he had it compiled in at such a low level that he was remembering the LOCF=1 word thing becuase he'd long ago collapsed out the fact that two serendipitous facilities were conspiring to give this one nice pretty result.
I vaguely recall writing things like (defun half-cons (x) (locf (car (list x)))) I also recall there is a thing about the cdr of the list where you don't want to try locfing that because the complexity of locfing the cdr is so bad that locf of cdr just returns the cons and location-contents knows specially how to deal with conses as if they were locatives by doing cdr because the cdr microcode (or ivory chip hardware) is so complicated (because of cdr codes) that it was too expensive to effectively duplicate in the locative-manipulating instructions and it was cheaper to just pretend a cons was a locative in that one case. So (let ((x (cons 1 2))) (cons x (locf (car x)))) ((1 . 2) . #<LOCATIVE 123456>) but (let ((x (cons 1 2))) (cons x (locf (cdr x)))) => ((1 . 2) 1 . 2)
That avoids the semicolon. Does that make it better? ;-)
I recall the Lisp Machine has some funny case where you find something on the stack that is a throw tag with the name If-you-throw-to-this-tag-you-deserve-to-lose. My assumption has always been that it is the secret stack thing used to implement lexical blocks or some such, but I never tracked it down. If anyone happens to know...