Is this undefined behavior ?
2. Why the requirement that the dimensions of an array are fixnums ?
3. Even if there is good reason that the dimensions of an array be
fixnums you cannot do (make-array most-positive-fixnum) but the most
you can do is (make-array (1- most-positive-fixnum)) because the page
for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
exclusive bound on each individual dimension of an array". Why not make
it an inclusive bound ?
--
Judicial punishment is revenge abstracted.
> 1. (setq *PRINT-ARRAY* t)
> (make-array 1)
>
> Is this undefined behavior ?
No, it is implementation dependant, but not undefined.
> 2. Why the requirement that the dimensions of an array are fixnums ?
Why not? This is not a big restriction, if an implementation wants
bigger arrays, it only has to provide bigger fixnums.
Really, the reason is that Common Lisp is not purely a high level
programming language, it is specified in such a way that
implementation can easily be efficient. If you allowed bignums as
index for arrays, each AREF would imply a bignum multiplication, which
would be slow. Moreover, there would be no point, because in general
the size of arrays slots is one word, and fixnum is all you need to
index all the words you can store in memory. (eg. with a 32-bit
processor, a fixnum of 30-bit is enough to index the whole memory.
Why would you want a bigger index?).
> 3. Even if there is good reason that the dimensions of an array be
> fixnums you cannot do (make-array most-positive-fixnum) but the most
> you can do is (make-array (1- most-positive-fixnum)) because the page
> for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
> exclusive bound on each individual dimension of an array". Why not make
> it an inclusive bound ?
(0 1 2 3 4) How many elements?
--
__Pascal Bourguignon__
> Spiros Bousbouras <spi...@gmail.com> writes:
>
>> 1. (setq *PRINT-ARRAY* t)
>> (make-array 1)
>>
>> Is this undefined behavior ?
>
> No, it is implementation dependant, but not undefined.
Oops, sorry, indeed it's undefined. That's what you get for asking to
a human instead of reading the reference!
Wrong. The bound is on the dimension, which is the number of elements, not on
the index of the highest element. So if the array-dimension-limit is 5,
it being exlusive means that you can make an array that has up to 4 elements,
which are addressed from 0 to 3.
Why array-dimension-limit is an exclusive limit is probably for historic
reasons.
Maybe when Lisp was standardized, implementations did not agree whether or not
this limit is inclusive or exclusive. If implementations don't agree about a
parameter like this, what can you recommend to programmers? Treat it as
exclusive, of course! That's the weaker, more portable assumption.
Or maybe all of the implementations had the exclusive behavior, which
would have been codified as-is.
And again, once you have filled the memory with your array, you need
at least one word to store the only instruction that can make you
program up, so there's no point in going to fixnum+1.
--
__Pascal Bourguignon__
The HS page on make-array says "If initial-element is not supplied, the
consequences of later reading an uninitialized element of new-array are
undefined unless either initial-contents is supplied or displaced-to is
non-nil". In order to print the elements of the array doesn't the
implementation need to read them ? If yes then it's undefined behavior.
> > 2. Why the requirement that the dimensions of an array are fixnums ?
>
> Why not? This is not a big restriction, if an implementation wants
> bigger arrays, it only has to provide bigger fixnums.
fixnums are supposed to be efficient , that's the criterion for
choosing the size of fixnums not the maximum size of arrays. On the
other hand the maximum size of arrays should be what the operating
system can handle even if it means that part of the array would be in
swap memory.
> Really, the reason is that Common Lisp is not purely a high level
> programming language, it is specified in such a way that
> implementation can easily be efficient. If you allowed bignums as
> index for arrays, each AREF would imply a bignum multiplication, which
> would be slow.
Multiply what with what ?
> Moreover, there would be no point, because in general
> the size of arrays slots is one word, and fixnum is all you need to
> index all the words you can store in memory. (eg. with a 32-bit
> processor, a fixnum of 30-bit is enough to index the whole memory.
Is it ?
> Why would you want a bigger index?).
> > 3. Even if there is good reason that the dimensions of an array be
> > fixnums you cannot do (make-array most-positive-fixnum) but the most
> > you can do is (make-array (1- most-positive-fixnum)) because the page
> > for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
> > exclusive bound on each individual dimension of an array". Why not make
> > it an inclusive bound ?
>
> (0 1 2 3 4) How many elements?
5 , so ? In fact if the point for the limitation in the size of arrays
is to never have a bignum as an array subscript then each dimension
could be up to (1+ most-positive-fixnum) so the definition of
ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
possibility.
--
Who's your mama ?
Nuh-uh.
1) That's already the case in plenty of CL implementations -- unless a
lot of people have 60 bits of physical memory they're not telling anyone
about. (Even on 32-bit operating systems, a single fixnum-denominated
array can already use up at least all of physical memory and quite
possibly more, depending on what's in the array.)
2) Now array-size limit depends not on the lisp but on the OS and the
disks you have handy and how much someone allocated for /swap and what
other programs are running. Good luck with that. Or do you just mean the
maximum allocatable file size, in which case array-size limit varies
dynamically with the contents of the array?
3) Abandoning the fixnum limit, especially given adjustability, is
potentially really bad for performance even for arrays that aren't
bigger than that limit; if you want really bad performance, you can roll
it yourself.
4) The problem has historically been with array-size-limit rather
smaller than the fixnum limit.
Speaking of which, does anyone remember the source of a quote that goes
something like "There's a problem with arrays larger than memory; in
particular, they have to be paged in twice to create them"?
paul
And if you find yourself concerned with such issues you are almost
certainly doing something wrong.
rg
> On Fri, 04 Dec 2009 01:10:12 +0100
> p...@informatimago.com (Pascal J. Bourguignon) wrote:
>> Spiros Bousbouras <spi...@gmail.com> writes:
>>
>> > 1. (setq *PRINT-ARRAY* t)
>> > (make-array 1)
>> >
>> > Is this undefined behavior ?
>>
>> No, it is implementation dependant, but not undefined.
>
> The HS page on make-array says "If initial-element is not supplied, the
> consequences of later reading an uninitialized element of new-array are
> undefined unless either initial-contents is supplied or displaced-to is
> non-nil". In order to print the elements of the array doesn't the
> implementation need to read them ? If yes then it's undefined behavior.
Indeed, and Pascal corrected his reply above.
>> > 2. Why the requirement that the dimensions of an array are fixnums ?
>>
>> Why not? This is not a big restriction, if an implementation wants
>> bigger arrays, it only has to provide bigger fixnums.
>
> fixnums are supposed to be efficient , that's the criterion for choosing
> the size of fixnums not the maximum size of arrays. On the other hand
> the maximum size of arrays should be what the operating system can
> handle even if it means that part of the array would be in swap memory.
Fixnums _can_ be made efficient, if the implementation allows that
(but it is not obligated to do it). I would guess that the underlying
motivation behind making array sizes fixnum is that one can address
elements using row-major-aref and fixnum arithmetic, which is indeed
a very efficient thing to do in most reasonable implementations.
You are approaching this problem from the wrong perspective. There is
nothing to prevent fixnums from being very large, and you can use them
to address your array, so there is no effective limitation.
>> Really, the reason is that Common Lisp is not purely a high level
>> programming language, it is specified in such a way that implementation
>> can easily be efficient. If you allowed bignums as index for arrays,
>> each AREF would imply a bignum multiplication, which would be slow.
>
> Multiply what with what ?
Aref converts indexes to a flat row major index. What is multiplied
with what is an exercise left to the reader.
> 5 , so ? In fact if the point for the limitation in the size of arrays
> is to never have a bignum as an array subscript then each dimension
> could be up to (1+ most-positive-fixnum) so the definition of
> ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> possibility.
Again, that is not the point. The point is that row major indexes are
fixnums. See array-row-major-index.
I am curious: why are you worried about this? Have you ran into a
difficulty where, for example, you find fixnums on 64-bit SBCL
limiting?
Tamas
The fixnums on some popular implementations (like CLISP on my 32-bit
XP machine) have a dramatically smaller range. 17 million element
arrays really are too small for many applications.
Cheers,
Pillsy
Yes, but I am under the impression that CLISP is not the
implementation one would pick for computationally intensive
applications, so I see no problem with it making this particular
choice.
I like the CL standard because it leaves a lot of choice to the
implementor regarding these efficiency hacks, and at the same time the
programmer can rely on certain things being true. Eg if you are
writing some method which handles array indexes, you can always rely
on them being fixnums, regardless of the implementation. And then the
implementation is allowed to say what it considers a fixnum, and if
you don't like that, just pick another implementation.
I was wondering about the following: could there be an implementation
where all possible integers are fixnums? Then I encountered the
"Issue FIXNUM-NON-PORTABLE Writeup", which says that
"It is possible that an implementation have a single representation
for all integers, and no way to identify any efficient range of
integers. Those implementations might need to set
MOST-POSITIVE-FIXNUM and MOST-NEGATIVE-FIXNUM to arbitrary values,
consistent with the requirement that (SIGNED-BYTE 16) is a subtype
of FIXNUM."
So apparently there is no way to say that most-positive-fixnum is not
binding (ie it is something like infinity), but an implementation
could still do everything with arbitrarily large integers, and just
pick an exorbitant value for most-positive-fixnum, without the
partition having any consequence performance wise. However, this is
not a practical concern at the moment :-)
Tamas
Don't complain:
Constant Variable ARRAY-TOTAL-SIZE-LIMIT
Constant Value:
A positive fixnum, the exact magnitude of which is
implementation-dependent, but which is not less than 1024.
^^^^
And since fixnums are at least (signed-byte 16), it's all that is needed.
--
__Pascal Bourguignon__
> > > 2. Why the requirement that the dimensions of an array are fixnums ?
> >
> > Why not? This is not a big restriction, if an implementation wants
> > bigger arrays, it only has to provide bigger fixnums.
>
> fixnums are supposed to be efficient , that's the criterion for
> choosing the size of fixnums not the maximum size of arrays.
Yes. And you want to make sure array access is efficient. Thus the
limitation on only allowing fixnums as indicies
> On the
> other hand the maximum size of arrays should be what the operating
> system can handle even if it means that part of the array would be in
> swap memory.
Um, the question of fixnum sizes and the physical/virtual memory sizes
are completely orthogonal. As Pascal pointed out, most lisp systems can
generally index their entire address space (at a word level) using their
fixnums, so that means that fixnums would be sufficient to address all
of the memory. Even if a lot of systems use 29 rather than 30 bits for
fixnums, this still gives you most of memory, and some copying GCs need
can only safely use half of the virtual memory in order to guarantee
enough space to copy into....
Besides, it would be ludicrous to expect that the fixnum size of a lisp
implementation would depend on the actual amount of physical RAM
installed in a computer, rather than the processor (and software)
architecture.
> > Really, the reason is that Common Lisp is not purely a high level
> > programming language, it is specified in such a way that
> > implementation can easily be efficient. If you allowed bignums as
> > index for arrays, each AREF would imply a bignum multiplication, which
> > would be slow.
>
> Multiply what with what ?
How do you address a multiple dimension array in RAM without
multiplying?
> > Moreover, there would be no point, because in general
> > the size of arrays slots is one word, and fixnum is all you need to
> > index all the words you can store in memory. (eg. with a 32-bit
> > processor, a fixnum of 30-bit is enough to index the whole memory.
>
> Is it ?
Sure. Why do you think it isn't?
> 5 , so ? In fact if the point for the limitation in the size of arrays
> is to never have a bignum as an array subscript then each dimension
> could be up to (1+ most-positive-fixnum) so the definition of
> ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> possibility.
Um, not really. Because the ARRAY-DIMENSION-LIMIT has to apply to all
arrays, including single-dimensional arrays that you want to displace to
your multi-dimensional array.
I also suspect that ROW-MAJOR-AREF also plays into this limit.
There is a lot of interaction between various parts of the specification
that you have to understand and consider when examining the design
decisions. It isn't enough to focus only on one small part without
considering how that impacts the language definition as a whole.
--
Thomas A. Russ, USC/Information Sciences Institute
> >> for ARRAY-DIMENSION-LIMIT says that it's a fixnum and "The upper
> >> exclusive bound on each individual dimension of an array". Why not make
> >> it an inclusive bound ?
...
> Why array-dimension-limit is an exclusive limit is probably for historic
> reasons.
>
> Maybe when Lisp was standardized, implementations did not agree whether or not
> this limit is inclusive or exclusive. If implementations don't agree about a
> parameter like this, what can you recommend to programmers? Treat it as
> exclusive, of course! That's the weaker, more portable assumption.
>
> Or maybe all of the implementations had the exclusive behavior, which
> would have been codified as-is.
Well, my guess would be related to considering how one often uses a loop
to access elements in an array. Typically, you would iterate starting
at 0 and ending when the index is >= the dimension.
So, by making the maximum dimension be max-positive-fixnum minus 1, you
can safely use fixnums to iterate through the indices without worrying
that the final increment will create a bignum.
This, like the limitation on using only fixnums an indicies is in the
spec to facilitiate efficient computation.
At least that is my take on this -- I wasn't there for any of the
standardization discussions.
> Yes. And you want to make sure array access is efficient. Thus the
> limitation on only allowing fixnums as indicies
This is an important point. People's expectations of arrays are that
they will be fast. If array indexing involves consing bignums all over
the place that is not likely to be true. People who are using machines
with relatively small word sizes where (by some trick), a Lisp
implementation can map much more memory than the machine can address (I
suppose this might be the case on some architecture with long pointers
but small integers) should still be able to assume that array access is
quick. If they need much larger objects, they can use arrays of
arrays, for instance.
I can't imagine any case where the limitation of array indices to
fixnums is a limitation: are there any current systems where an
individual process can address more than a machine-word's worth of
space? (Note, not where the system as a whole can have more physical
memory than that - I think that was reasonably common on some later
32-bit systems wasn't it?)
--tim
AFAIK, there remains no such processors currently.
On the Intel processors before the 386 (you know, when 640 KB was
enough for anybody), where segmented addressing was the rule, arrays
were often limited to 64 KB, so a 16-bit fixnum was largely enough to
index them.
--
__Pascal Bourguignon__
> I can't imagine any case where the limitation of array indices to
> fixnums is a limitation: are there any current systems where an
> individual process can address more than a machine-word's worth of
> space? (Note, not where the system as a whole can have more physical
> memory than that - I think that was reasonably common on some later
> 32-bit systems wasn't it?)
Well, we ran into some systems where the specfication LOWER limit on the
size of fixnums was actually realized. It was a 32-bit architecture
that had only 16-bit fixnums. That was a bit limiting.
(We ran into the problem with using our own hash codes, where we wanted
fixnums for speed, but ended up with some constants that turned out to
be too large for 16 bits....)
I think this was true for an early version of lispworks for Windows.
BTW, just for reference, here is the fixnum sizes for a number of lisps
that I happen to have easy access to.
Mac OS X:
bits implementation
---- --------------
29 LispWorks Personal 5.1 (32-bit)
60 ClozureCL x8664 (64-bit)
24 CLISP 2.47
29 CMUCL 19e
29 SBCL 1.0.30
31 ABCL 0.17.0
Red Hat Linux (32-bit)
bits implementation
---- --------------
29 Allegro CL 5.0, 6.2, 7.0
24 CLISP 2.41.1
29 CMUCL 19c
So, it looks like 29 bits is one of the most popular choices for 32-bit
architectures.
That's a separate issue from that of knowing whether array-dimension-limit
is exclusive or not. The actual limit may be as little as 1024, right,
or as large as most-positive-fixnum.
No matter how large or small the limit is, if you want to make arrays which
stretch to the limit, you have to know whether the limit is exclusive or
inclusive. If you are forced to guess, the safer guess is exclusive.
On CLISP, you have a most-positive-fixnum of 16.7 million. You can easily make
an array of 32 bit words on that order on the average desktop PC, without
coming anywhere near filling memory.
I recall that the Win32 API uses a kind of fixnum scheme for distinguishing
resource name strings and resource ID's. In the FindResource function,
a resource is specified as a pointer to a name: a LPCTSTR lpName
parameter. But if this pointer is in the address range 0 to 65535,
the pointer is treated as a numeric resource ID, and not a pointer to a name.
You can convert an ID number n into the string pointer using
the MAKEINTRESOURCE(n) macro.
Could it be that this Lisp did a similar trick? If the value lands in the 16
bit range, it's not a pointer to something, but a fixnum value. This is a
possible valid alternative to reserving tag bits. The nice thing is that you
don't have to do shifting and masking---just do arithmetic directly on it---but
the range limitation sucks.
I can't imagine otherwise why a Lisp in 32 bit land would chew off 16 bits in
the representation of a fixnum; nobody could possibly need that many bits
for a type tag.
I thought about doing this kind of trick in txr, but I went for a two bit mask
instead. One little fly in that ointment is not just the restricted range, but
the fact that I want nil to correspond to the null pointer, which is all zero
bits in most C compilers, and so the range-based scheme would alias 0 and NIL,
which is repugnant. :) Choosing a different NIL is possible, but then all
that Lispy C code can no longer be written like
``if (foo_list) { /* foo_list is not empty ... */ }''. It would always
have to be ``if (!null(foo_list)) ...'' which is irritating and error
prone.
Thus, the dimension can be max-positive-bignum!
> So, by making the maximum dimension be max-positive-fixnum minus 1, you
> can safely use fixnums to iterate through the indices without worrying
> that the final increment will create a bignum.
I thought about this same thing at first yesterday, but concluded that
it can't be the correct rationale.
Here is why. Even if max-positive-fixnum is an /inclusive/ bound on
array size, there is no overflow.
If an array has max-positive-fixnum elements, they are numbered
from 0 to (1- max-positive-fixnum).
Everything is representable in fixnum: all of the indices,
the array size, and even the index which is one element
past the end array.
You can walk through the array like this:
(loop for x below (length array) ...)
and x never exceeds max-positive-fixnum.
Your `>= dimension' test is the same. It does not preclude dimension
from being fixnum.
You just have to ensure you don't do something silly, like try to
increment the index even in the case when the loop guard fails.
This can't be right. CLISP's most-positive-fixnum is 16.7 million, which
means that the mantissa has 24 bits.
But remember that fixnum is signed; CLISP also has a
most-negative-fixnum that is -16.7 million.
Might you be neglecting to count the sign bit?
I'm approaching it from the perspective of what gives the programmer
the most choice and is in accord with the dynamic nature of Lisp. I
believe this is the right perspective.
> There is
> nothing to prevent fixnums from being very large, and you can use them
> to address your array, so there is no effective limitation.
On every architecture there is a natural size for fixnums which makes
arithmetic with them the fastest so a reasonable implementation will
probably choose this size for fixnums. It's not obligated by the
standard to do so but likely will. For an example of the unpleasant
dilemma which the standard creates for implementations see my reply to
Paul Wallich.
> >> Really, the reason is that Common Lisp is not purely a high level
> >> programming language, it is specified in such a way that implementation
> >> can easily be efficient. If you allowed bignums as index for arrays,
> >> each AREF would imply a bignum multiplication, which would be slow.
> >
> > Multiply what with what ?
>
> Aref converts indexes to a flat row major index. What is multiplied
> with what is an exercise left to the reader.
So the issue of multiplication doesn't even arise with vectors. For
arrays which are not vectors it could be handled with appropriate
declarations. Again , see my reply to Paul Wallich.
> > 5 , so ? In fact if the point for the limitation in the size of arrays
> > is to never have a bignum as an array subscript then each dimension
> > could be up to (1+ most-positive-fixnum) so the definition of
> > ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> > possibility.
>
> Again, that is not the point. The point is that row major indexes are
> fixnums. See array-row-major-index.
Right. So the standard could allow ARRAY-DIMENSION-LIMIT and
ARRAY-TOTAL-SIZE-LIMIT to be at most (1+ most-positive-fixnum) (which
means that both should be allowed to be bignums). This would still
ensure that ARRAY-ROW-MAJOR-INDEX can work with only fixnums while
giving the programmer 1 extra subscript to work with. Not that 1
subscript matters much of course , my main beef is with the decision to
only allow fixnums. Having made that decision I can kind of see why
they opted for simplicity and specified ARRAY-DIMENSION-LIMIT and
ARRAY-TOTAL-SIZE-LIMIT to also be fixnums at the cost of taking 1 extra
subscript away from the programmer.
> I am curious: why are you worried about this?
I'm not worried , I'm curious.
> Have you ran into a
> difficulty where, for example, you find fixnums on 64-bit SBCL
> limiting?
I haven't tried a 64-bit SBCL but with a 32-bit one yes it's limiting.
I'm almost certain that I could go around the limitation by either
using arrays of arrays or using C but it would be annoying.
--
The last scene is a killer, with the professor turning into the
protoplasm of life itself, and his wife turning into a glowing shell
of rock-like flesh, with her inner fires glowing through the crevices
(the effect is something like an overheated Spiderman).
Roger Ebert on "Altered states"
It may be the case with plenty CL implementations but not all. For
example on the computer I'm using right now most-positive-fixnum of
SBCL is 536870911. This means that I can't create a bit vector with
more than 536870911 members which is ridiculous since I have 1 GB ram.
> (Even on 32-bit operating systems, a single fixnum-denominated
> array can already use up at least all of physical memory and quite
> possibly more, depending on what's in the array.)
Of course it can use all the memory if each element is large enough.
The problem is that it might miss a lot of memory when every member of
the array is small.
> 2) Now array-size limit depends not on the lisp but on the OS and the
> disks you have handy and how much someone allocated for /swap and what
> other programs are running. Good luck with that.
I'm not sure what you mean "not on the lisp". If you do your
programming in Lisp then array size will obviously depend on the Lisp.
A good Lisp implementation in turn will make optimal use of what the
hardware and operating system make available. Actually this is true for
every programming language ; luck has nothing to do with it.
The problem here is that the standard does not allow a Lisp
implementation to do this on the computer I'm using. It's a 32 bit
computer so a fast fixnum should be at most 32 bits. I assume SBCL uses
2 bits for internal book-keeping. I don't know anything about the
internal design of SBCL but if we assume that it was a good design
decision to take these 2 bits away from the range of values then the
conundrum becomes apparent : either use a fixnum greater than 32 bits
which would have a negative impact on the speed of all integer
arithmetic or arrive at a situation where the programmer cannot create
a bit vector of size greater than 2**29 despite the 1 GB ram available.
It doesn't contradict my point if there are Lisp implementations ,
possibly many , where a similar conundrum does not arise. If there is
even one Lisp implementation where the conundrum arises then the
standard made a poor choice in enforcing the fixnum restriction. If the
standard allowed (but not enforced) for the array indices to be bignums
then the implementations where fixnum is large enough anyway would be
unaffected. But an implementation like the one I'm using would be
allowed to offer larger arrays while keeping the fixnum size to what's
optimal for the architecture.
> Or do you just mean the
> maximum allocatable file size, in which case array-size limit varies
> dynamically with the contents of the array?
I don't know what "maximum allocatable file size" is. If by "array-size
limit" you mean not what the standard specifies but the actual
practical size as a programme is running then of course it depends on
the contents of the array.
> 3) Abandoning the fixnum limit, especially given adjustability, is
> potentially really bad for performance even for arrays that aren't
> bigger than that limit;
Every time you pass from fixnums to bignums it is "potentially really
bad for performance". This is something for the programmer to worry
about. The standard could offer a declaration by which the programmer
could inform the compiler that the total size of an array will be
a fixnum. That's how it works with everything else in Lisp : the
language allows flexibility and if you want to increase performance you
restrict your flexibility through the appropriate declarations. I don't
see why array access should be different.
It's worth considering also that in the case of array access you may
not even need a declaration at all. When you create the array or adjust
its size the compiler knows if the (new) size is so large that it needs
bignums or not. If not then it can automatically arrange for all access
calculations to use fixnum arithmetic.
> if you want really bad performance, you can roll it yourself.
I don't want really bad performance , I want arrays which can be larger
than most-positive-fixnum. How can I roll this myself ?
> 4) The problem has historically been with array-size-limit rather
> smaller than the fixnum limit.
Well things change. Memory and disks get bigger while the standard
stays the same.
--
MR. KUNSTLER: I will assume with all of the people I ask you about
they are very fine young men and so on. It will save
time.
http://www.law.umkc.edu/faculty/projects/ftrials/Chicago7/Daley.html
Eww , that's pathetic.
> Yes, but I am under the impression that CLISP is not the
> implementation one would pick for computationally intensive
> applications, so I see no problem with it making this particular
> choice.
Wanting large arrays doesn't necessarily mean you want computationally
intensive applications. Having said that I have seen people here say
that CLISP is fast.
> I like the CL standard because it leaves a lot of choice to the
> implementor regarding these efficiency hacks, and at the same time the
> programmer can rely on certain things being true.
That's what declarations are for.
> Eg if you are
> writing some method which handles array indexes, you can always rely
> on them being fixnums, regardless of the implementation. And then the
> implementation is allowed to say what it considers a fixnum, and if
> you don't like that, just pick another implementation.
What if you want to use a 32-bit computer ?
No , you want to make sure that the programmer has the freedom to
choose the trade-off which is most appropriate for his application. If
he is willing to sacrifice some speed in order to gain larger arrays he
should have that choice.
> > On the
> > other hand the maximum size of arrays should be what the operating
> > system can handle even if it means that part of the array would be in
> > swap memory.
>
> Um, the question of fixnum sizes and the physical/virtual memory sizes
> are completely orthogonal.
I 99.99% agree. However in the part of my post you're quoting I'm not
making a connection between size of fixnum and size of memory , I'm
making a connection between size of arrays and size of memory.
> As Pascal pointed out, most lisp systems can
> generally index their entire address space (at a word level) using their
> fixnums,
I'm not sure what the "at a word level" part means but I believe I
understand your overall gist. So what about those lisp systems which
can't index their entire address space using their fixnums ? From what
Pillsy said CLisp probably can't.
> so that means that fixnums would be sufficient to address all
> of the memory.
What if you want a large bit vector ? Are you supposed to use something
larger than a bit for the array element and then calculate bit
addresses and do your own bit fiddling ?
> Even if a lot of systems use 29 rather than 30 bits for
> fixnums, this still gives you most of memory, and some copying GCs need
> can only safely use half of the virtual memory in order to guarantee
> enough space to copy into....
>
> Besides, it would be ludicrous to expect that the fixnum size of a lisp
> implementation would depend on the actual amount of physical RAM
> installed in a computer, rather than the processor (and software)
> architecture.
I agree. On the other hand it makes sense to expect that the maximum
array size would depend on the actual amount of physical RAM installed
in a computer so the standard is at fault for creating a connection
between fixnum size and array size.
> > > Really, the reason is that Common Lisp is not purely a high level
> > > programming language, it is specified in such a way that
> > > implementation can easily be efficient. If you allowed bignums as
> > > index for arrays, each AREF would imply a bignum multiplication, which
> > > would be slow.
> >
> > Multiply what with what ?
>
> How do you address a multiple dimension array in RAM without
> multiplying?
If all the dimensions are powers of 2 then you can do it by ORing and
shifting. Perhaps there are some nice tricks available in other cases
too.
> > > Moreover, there would be no point, because in general
> > > the size of arrays slots is one word, and fixnum is all you need to
> > > index all the words you can store in memory. (eg. with a 32-bit
> > > processor, a fixnum of 30-bit is enough to index the whole memory.
> >
> > Is it ?
>
> Sure. Why do you think it isn't?
I'm thinking it *may* not be because you can cover a larger range of
values with 32 bits than with 30.
> > 5 , so ? In fact if the point for the limitation in the size of arrays
> > is to never have a bignum as an array subscript then each dimension
> > could be up to (1+ most-positive-fixnum) so the definition of
> > ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> > possibility.
>
> Um, not really. Because the ARRAY-DIMENSION-LIMIT has to apply to all
> arrays, including single-dimensional arrays that you want to displace to
> your multi-dimensional array.
>
> I also suspect that ROW-MAJOR-AREF also plays into this limit.
As I've said to my reply to Tamas "the standard could allow
ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to be at most (1+
most-positive-fixnum) (which means that both should be allowed to be
bignums). This would still ensure that ARRAY-ROW-MAJOR-INDEX can work
with only fixnums while giving the programmer 1 extra subscript to work
with". Issue "ARRAY-DIMENSION-LIMIT-IMPLICATIONS" acknowledges this. It
says
Version 1 of this proposal had two parts:
1. to allow ARRAY-DIMENSION-LIMIT to be a bignum (the smallest
bignum) in implementations which wanted to permit the full
fixnum range for array size.
> There is a lot of interaction between various parts of the specification
> that you have to understand and consider when examining the design
> decisions. It isn't enough to focus only on one small part without
> considering how that impacts the language definition as a whole.
Feel free to point out any parts which you think are relevant. Which
makes me wonder is there an "official" rationale for the Lisp standard
somewhere ?
--
Our shoppers returned, and we raised a jaunty toast
To the solutions provider that delivers the most
As the ordeal did teach me, there's just no disputing
The performance and value of Sun network computing
http://mirror.optus.com.au/pub/flash/sun2003/SunHolidayPoem12.pdf
After the index reaches the maximum dimension you do not increment the
index anymore. So even if the maximum dimension were
max-positive-fixnum you could still safely declare your index to be a
fixnum. That assuming you increment by 1 at each repetition. If you
increment by 3 or more you need bignums anyway. So the only situation
where it might make a difference is when you increment by 2. But even
then you can unroll the final repetition and only use fixnums. And of
course you also have the option of counting downwards.
--
"Ensign Crusher is in Sickbay receiving fellatio from his mother."
"Oh no, not again," groans the Captain.
"He is not a true warrior!" exclaims Worf.
http://tinyurl.com/die-wesley-crusher
> Well things change. Memory and disks get bigger while the standard
> stays the same.
This is purely an implementation issue. When memory and disks get
bigger, then implementations need to increase the size of fixnums. An
implementation with small fixnums needs to change, not the standard.
--
Raffael Cavallaro
The trade off is there. You are perfectly free to ask your Lisp for
an array biger than array-dimension-limit, and your implementation
is free to provide it to you.
In this case, you merely can't use the ANSI standard as a negotiation
tool.
> On the other hand it makes sense to expect that the maximum
> array size would depend on the actual amount of physical RAM installed
> in a computer so the standard is at fault for creating a connection
> between fixnum size and array size.
Again, this is an *implementation* issue, not an issue with the
standard. Implementors are responsible for assuring that their fixnums
are large enough to address arrays that are reasonable on the platform.
The standard gives them the flexibility to do that, so any failure to
provide fixnums large enough to address reasonable sized arrays on that
platform is a failure of that specific implementation, not a flaw in
ANSI common lisp.
--
Raffael Cavallaro
More probably you will implement a bigarray data abstraction using an
array of arrays to store the data.
Remember that ARRAY-TOTAL-SIZE-LIMIT may be as low as 1024, so if you
want to write a portable program you will have to implement this
bigarray abstraction anyway.
--
__Pascal Bourguignon__
> The problem here is that the standard does not allow a Lisp
> implementation to do this on the computer I'm using.
The standard doesn't prevent Lisp implementations from allowing bignum
array indices. It just doesn't require it, so portable applications
can't depend on it.
--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
> > Aref converts indexes to a flat row major index. What is multiplied
> > with what is an exercise left to the reader.
>
> So the issue of multiplication doesn't even arise with vectors.
The index has to be multiplied by the element size to get the memory
address. Lisp references and immediate data types are typicaly a power
of 2 in size, so this multiplication can usually be done using shifting,
but that probably takes about the same amount of effort in the CPU.
If you want to allocate a bignum array, why don't you roll your own
implementation of bignum-sized arrays that make use of fixnum-sized
arrays? Then you'll be able to use all the memory your implementation
provides portably.
[ ... ]
> 29 Allegro CL 5.0, 6.2, 7.0
Wow, these are old! Well, even though the newer 32-bit versions of
Allegro CL haven't changed, with 29 bits plus sign bit, most of the
architectures also have 64-bit versions, which also have 60 bits plus
sign bit for fixnum size.
Duane
However, if the fixnum representation is selected so that the machine-
integer represents precisely the number of bytes per natural word,
then all arrays with natural-sized elements (e.g. 4 bytes on a 32-bit
system and 8 bytes on a 64-bit system) are accessible with machine
instructions using the fixnum indices as if they are machine integers,
with no shifting needed. This is a strong reason why fixnum
representations in most lisps are so similar. It also hints at why
going ex-spec and potentially using bignums is such a losing
proposition (since access using fixnums is _so_ fast). In fact, on
machines which don't have a "shift-left-{2,3}-and-add" instruction
(added especially for these array accesses in C) and which have C
implementations which don't do strength reduction in array
manipulation, these Lisp implementations stand a chance at actually
_beating_ C implementations.
Duane
Each array dimension has to be smaller than ARRAY-DIMENSION-LIMIT and
ARRAY-DIMENSION-LIMIT has to be a fixnum therefore array dimensions
have to be fixnums. Each array index has to be smaller than the
corresponding dimension therefore each array index also has to be a
fixnum. (I'm assuming that if a,b are integers and 0 <= a <= b and b is
a fixnum then a also has to be a fixnum.)
The problem is that I may need to implement a bigarray even if I just
want a programme to work on a specific 32-bit computer.
Yes. However it's not a big problem:
1- it is easily solutionned in lisp, or
2- you can also easily switch to another implementation where the
limit is big enough for you.
--
__Pascal Bourguignon__
No it's not necessarily a failure of the implementation because an
implementation is also responsible for assuring that the fixnums it
provides are as fast as possible for the the architecture on which the
implementation is running. So for example if an implementation is
running on a 32-bit computer then fixnums likely have to ([1]) be at
most 32 bits. So there are at most 31 bits to represent positive
numbers. This means that you cannot create an array with more than
2**31 elements even if the elements are just bits in which case the
array would comfortably fit in memory.
[1] When I say "have to" I mean it's a requirement of good design , not
that the standard demands it.
And I would almost certainly get "no" for an answer :-D
> and your implementation
> is free to provide it to you.
Not if it wants to conform to the standard.
> In this case, you merely can't use the ANSI standard as a negotiation
> tool.
And that's the problem. I'm asking for fast fixnums plus arrays as
large as the system can support plus standard conformance. I don't
think that's a lot to ask for but I can't get it.
--
- Do you speak any English ?
- Of course I speak English.
- I hope you don't mind if we move our men so that the two of us
will have more room to groove.
> So for example if an implementation is
> running on a 32-bit computer then fixnums likely have to ([1]) be at
> most 32 bits. So there are at most 31 bits to represent positive
> numbers. This means that you cannot create an array with more than
> 2**31 elements even if the elements are just bits in which case the
> array would comfortably fit in memory.
The purpose of the array data structure is to allow very fast random
access to *any* content, not just bits (which seems to be your use
case). For this to be possible the standard must mandate that array
indices fit in machine words. See Duane's post above for more details.
So to say that a 32-bit lisp can only have so many bits for
fixnums/array-indices is to speak of an inerent limitation of that
platform, not of the ansi spec. If you really want large bit arrays,
use a 64-bit lisp with large fixnums/array-dimensions. For example, in
Clozure CL64 array-total-size-limit is 72,057,594,037,927,936 which
gives you over 9 million gigabytes worth of bit-array.
--
Raffael Cavallaro
So you propose that if a program asks for an oversized array, bigger
than array-dimension-limit, and the implementation gives it to the program,
the implementation is not conforming to the standard?
I am not able to find a citation which would support this belief.
It looks like the behavior of requesting a too-large array is simply
not defined.
There is no requirement for this situation to be detected and signaled
with an error condition.
Any response whatsoever, to undefined behavior, is conforming.
> On Sat, 5 Dec 2009 05:42:42 +0000 (UTC) Kaz Kylheku <kkyl...@gmail.com>
> wrote:
>> On 2009-12-05, Spiros Bousbouras <spi...@gmail.com> wrote:
>> > On 04 Dec 2009 10:01:06 -0800
>> > t...@sevak.isi.edu (Thomas A. Russ) wrote:
>> >> Spiros Bousbouras <spi...@gmail.com> writes:
>> >>
>> >> > > > 2. Why the requirement that the dimensions of an array are
>> >> > > > fixnums ?
>> >> > >
>> >> > > Why not? This is not a big restriction, if an implementation
>> >> > > wants bigger arrays, it only has to provide bigger fixnums.
>> >> >
>> >> > fixnums are supposed to be efficient , that's the criterion for
>> >> > choosing the size of fixnums not the maximum size of arrays.
>> >>
>> >> Yes. And you want to make sure array access is efficient. Thus the
>> >> limitation on only allowing fixnums as indicies
>> >
>> > No , you want to make sure that the programmer has the freedom to
>> > choose the trade-off which is most appropriate for his application.
>> > If he is willing to sacrifice some speed in order to gain larger
>> > arrays he should have that choice.
>>
>> The trade off is there. You are perfectly free to ask your Lisp for an
>> array biger than array-dimension-limit,
>
> And I would almost certainly get "no" for an answer :-D
With the open source Lisps, you can define your own answer.
> And that's the problem. I'm asking for fast fixnums plus arrays as large
> as the system can support plus standard conformance. I don't think
> that's a lot to ask for but I can't get it.
You mean you don't have internet access to download 64 bit SBCL?
Tamas
This is a good question. If the OP invested the same amount of time
to programming instead of whining about a non-issue, he would have
solved the problem already.
Tamas
No, it would be non conforming on the part of the program.
Check Section "1.5 Conformance", there are two classes of conformances.
To make the program conformant, it would have to test against
ARRAY-DIMENSION-LIMIT (and ARRAY-TOTAL-SIZE-LIMIT in the case of
multi-dimensional arrays).
;; non-conformant program:
(make-array size)
;; conformant program:
(if (< size array-dimension-limit)
(make-array size :initial-element 0)
#+implementation-with-bigger-arrays (make-array size :initial-element 0)
#-implementation-with-bigger-arrays (error "Cannot make an array big enough))
--
__Pascal Bourguignon__
> On Sat, 05 Dec 2009 05:54:30 -0500
> Barry Margolin <bar...@alum.mit.edu> wrote:
> > In article <OylSm.7781$6p6....@newsfe05.ams2>,
> > Spiros Bousbouras <spi...@gmail.com> wrote:
> >
> > > The problem here is that the standard does not allow a Lisp
> > > implementation to do this on the computer I'm using.
> >
> > The standard doesn't prevent Lisp implementations from allowing bignum
> > array indices. It just doesn't require it, so portable applications
> > can't depend on it.
>
> Each array dimension has to be smaller than ARRAY-DIMENSION-LIMIT and
> ARRAY-DIMENSION-LIMIT has to be a fixnum therefore array dimensions
> have to be fixnums. Each array index has to be smaller than the
> corresponding dimension therefore each array index also has to be a
> fixnum. (I'm assuming that if a,b are integers and 0 <= a <= b and b is
> a fixnum then a also has to be a fixnum.)
All those limits only apply to portable applications. They don't
restrict implementations from allowing larger arrays as extensions, and
non-portable applications can use these larger indices.
But that's a completely different claim. Of course it's nonconforming
on the part of the program.
It's also nonconforming to open a socket, create a thread, or request
UTF-8 encoding for a stream.
An array of 2**32 elements will also fit comfortably into memory if they
are bits.
A computer with 32 bit addresses which reference bytes is not capable
of addressing every bit of every byte in its address space.
Can you name one programming language whose ANSI or ISO standard
requires implementations to overcome this issue in their implementation
of bit arrays?
How many other languages even HAVE bit arrays as a standard language
feature?
The simple answer to this whole thread is that when we were writing the
standard, we created this limit as a lowest common denominator. We
didn't want the standard to be dependent on how bit arrays are
represented and the number of bits in a byte. We didn't want to force
implementations to perform bignum arithmetic when doing array indexing,
because this is supposed to be a fast operation -- we wanted Lisp arrays
to be as fast as arrays in other languages.
Bit arrays already have inherent inefficiency. Few CPUs support bit
array indexing natively, so it has to be done by splitting the index
into a byte or word index and offset, then shift/mask to get that bit
(there might be a machine instruction for this). Requiring the first
step to use bignums would make things even worse.
As I said in another post, implementations can offer this if they wish.
This part of the standard is a restriction on conforming programs, not
implementations. If you need something like this in a portable
application, write your own API, and use conditional compilation to
translate it to native array indexing on implementations that support
it, otherwise use an array of arrays.
One is C++. :)
The std::vector<bool> template can be implemented as a specialization
which packs bits.
And, guess what, we run into limitations.
The size and index values are of std::vector<T>::size_type.
Typically, this will be some fixed with like 32 bits.
In the libstdc++ implementation, this is typedefed to size_t, which
is going to be 32 on 32 bit machines.
So you can ``only'' have only 2**32-1 bools in that implementation;
500 megs worth of bits.
> So the issue of multiplication doesn't even arise with vectors. For
> arrays which are not vectors it could be handled with appropriate
> declarations. Again , see my reply to Paul Wallich.
Yes, it does. Most (all?) modern systems are byte-addressed. So
unless your array is not an array of bytes, the implementation will
have to multiply or divide, typically by a power of two, to get the
reference.
> And that's the problem. I'm asking for fast fixnums plus arrays as
> large as the system can support plus standard conformance. I don't
> think that's a lot to ask for but I can't get it.
Of course you can. Just use an implementation whose fixnums agree with
what the platform you are using supports.
> No it's not necessarily a failure of the implementation because an
> implementation is also responsible for assuring that the fixnums it
> provides are as fast as possible for the the architecture on which the
> implementation is running. So for example if an implementation is
> running on a 32-bit computer then fixnums likely have to ([1]) be at
> most 32 bits. So there are at most 31 bits to represent positive
> numbers. This means that you cannot create an array with more than
> 2**31 elements even if the elements are just bits in which case the
> array would comfortably fit in memory.
And that's just fine, because an array that large will exhaust the
address space of the system, except possibly for bit-arrays.
If you really want to create bit-arrys that large, then you either have
a really smart compiler or you don't care much about performance - it
is generally much, much better to create arrays of larger objects and
then roll your own bitwise access.
> Bit arrays already have inherent inefficiency. Few CPUs support bit
> array indexing natively, so it has to be done by splitting the index
> into a byte or word index and offset, then shift/mask to get that bit
> (there might be a machine instruction for this). Requiring the first
> step to use bignums would make things even worse.
As I hinted in an earlier post, this is quite right. I had an
application where the "natural" representation was large bitvectors
(large as in would-just-fit-in-4G, so not huge by today's standards,
but large enough). It became very rapidly apparent that if I wanted
performance not to suck, I needed to use vectors of bytes or larger
objects and do the bit operations myself. (I suspect 32-bit objects
would be the safe compromise - comfortably smaller than a fixnum on a
64-bit implementation, but large enough to reduce the number of
fetches).
> On Thu, 03 Dec 2009 20:49:51 -0500
> Paul Wallich <p...@panix.com> wrote:
> >
> > Nuh-uh.
> >
> > 1) That's already the case in plenty of CL implementations -- unless a
> > lot of people have 60 bits of physical memory they're not telling anyone
> > about.
>
> It may be the case with plenty CL implementations but not all. For
> example on the computer I'm using right now most-positive-fixnum of
> SBCL is 536870911. This means that I can't create a bit vector with
> more than 536870911 members which is ridiculous since I have 1 GB ram.
Well, that's the point of having more than one implementation. You can
choose the implementation that suits your needs. So it looks like you
need to get a 64-bit lisp to use. That should get you around 60 bits of
fixnum space.
> > (Even on 32-bit operating systems, a single fixnum-denominated
> > array can already use up at least all of physical memory and quite
> > possibly more, depending on what's in the array.)
>
> Of course it can use all the memory if each element is large enough.
> The problem is that it might miss a lot of memory when every member of
> the array is small.
I suppose. But then you still end up with the situation that array
access will be, perhaps, unnecessarily slow. The issue is that you
would need to be prepared to do bignum arithmetic on all array accesses
unless there was specific information to the contrary.
And since the default in lisp is to not include type declarations, you
would end up with slow array access in the majority of cases. I think
that would be a bad trade-off on the part of the standard.
> > 2) Now array-size limit depends not on the lisp but on the OS and the
> > disks you have handy and how much someone allocated for /swap and what
> > other programs are running. Good luck with that.
>
> I'm not sure what you mean "not on the lisp". If you do your
> programming in Lisp then array size will obviously depend on the Lisp.
> A good Lisp implementation in turn will make optimal use of what the
> hardware and operating system make available. Actually this is true for
> every programming language ; luck has nothing to do with it.
Well, I think it would be rather odd to find a programming language
implementation that changed the sizes of things based on the particular
hardware it is running on. I mean, in C you wouldn't expect that "int"
would have its size depend on the amount of RAM installed.
> The problem here is that the standard does not allow a Lisp
> implementation to do this on the computer I'm using. It's a 32 bit
> computer so a fast fixnum should be at most 32 bits. I assume SBCL uses
> 2 bits for internal book-keeping.
Actually, I believe that it uses 3, since fixnums are 29 bits long.
> I don't know anything about the
> internal design of SBCL but if we assume that it was a good design
> decision to take these 2 bits away from the range of values then the
> conundrum becomes apparent : either use a fixnum greater than 32 bits
> which would have a negative impact on the speed of all integer
> arithmetic or arrive at a situation where the programmer cannot create
> a bit vector of size greater than 2**29 despite the 1 GB ram available.
Well, you as the programmer can always create larger bit vectors if you
really need them. It should be relatively easy to do.
(defstruct big-bit-vector
size
subvectors)
(defun create-big-bit-vector (size)
(multiple-value-bind (full remainder)
(floor size (1- most-positive-fixnum))
(let ((subvectors (make-array (if (zerop remainder) full (1+ full)))))
(loop for i from 0 below full
do (setf (aref subvectors i)
(make-array (1- most-positive-fixnum)
:element-type 'bit :initial-element 0)))
(unless (zerop remainder)
(setf (aref subvectors full)
(make-array remainder
:element-type 'bit :initial-element 0)))
(make-big-bit-vector :size size :subvectors subvectors))))
(defun big-bit-vector-aref (vector index)
(multiple-value-bind (full remainder)
(floor index (1- most-positive-fixnum))
(unless (zerop remainder) (incf full))
(aref (aref (big-bit-vector-subvectors vector) full) remainder)))
;; The setf method is left as an exercise for the reader.
... etc.
Note that this is just a sketch, I didn't really do much testing. I do
suggest you set *print-array* to NIL before working with this code.
CL-USER> (setq *print-array* nil)
NIL
CL-USER> (defvar *bv* (create-big-bit-vector (* 2 most-positive-fixnum)))
#S(BIG-BIT-VECTOR :SIZE 1073741822 :SUBVECTORS #<Vector @ #x81612732>)
CL-USER> (length (big-bit-vector-subvectors *bv*))
3
CL-USER> (map 'list #'length (big-bit-vector-subvectors *bv*))
(536870910 536870910 2)
BTW, why do you need bit vectors with more than 500 million elements?
> It doesn't contradict my point if there are Lisp implementations ,
> possibly many , where a similar conundrum does not arise. If there is
> even one Lisp implementation where the conundrum arises then the
> standard made a poor choice in enforcing the fixnum restriction.
I think you have this backwards. As long as the standard allows an
implementation that doesn't have the problem, then the defect cannot be
in the standard. It must be in the implementation.
> > 3) Abandoning the fixnum limit, especially given adjustability, is
> > potentially really bad for performance even for arrays that aren't
> > bigger than that limit;
>
> Every time you pass from fixnums to bignums it is "potentially really
> bad for performance". This is something for the programmer to worry
> about.
Except that the programmer can't really do anything about the internal
implementation of AREF. So it really is the language implementer that
has to worry about such things. And I want my array references to be
fast, thank you very much.
> The standard could offer a declaration by which the programmer
> could inform the compiler that the total size of an array will be
> a fixnum. That's how it works with everything else in Lisp : the
> language allows flexibility and if you want to increase performance you
> restrict your flexibility through the appropriate declarations. I don't
> see why array access should be different.
I suppose. Although that would mean that by default array accesses
would be slow rather than fast.
> It's worth considering also that in the case of array access you may
> not even need a declaration at all. When you create the array or adjust
> its size the compiler knows if the (new) size is so large that it needs
> bignums or not. If not then it can automatically arrange for all access
> calculations to use fixnum arithmetic.
Um, the compiler can only know this if the array dimensions are
compile-time constants. If you dynamically create an array, then the
compiler doesn't have this information.
For example:
(defun create-my-new-array (size)
(make-array (list size size) :element-type ...))
The compile can't possibly know if size is fixnum or bignum, unless you
put in type declarations.
Similarly
(defun get-element (x y)
(aref *my-array* x y))
runs into similar problems.
> > if you want really bad performance, you can roll it yourself.
>
> I don't want really bad performance , I want arrays which can be larger
> than most-positive-fixnum. How can I roll this myself ?
>
> > 4) The problem has historically been with array-size-limit rather
> > smaller than the fixnum limit.
>
> Well things change. Memory and disks get bigger while the standard
> stays the same.
The standard just mandates things in terms of FIXNUM size, which does
not have an upper bound in the standard. Nothing stops lisp systems on
64-bit architectures from offering larger fixnums. And, surprise,
surprise, the 64-bit lisp systems do have larger fixnums. So just get a
64-bit lisp and all your problems will be solved. This is all well
within the range of design choices allowed by the standard.
--
Thomas A. Russ, USC/Information Sciences Institute
> On 2009-12-04, Thomas A. Russ <t...@sevak.isi.edu> wrote:
> > BTW, just for reference, here is the fixnum sizes for a number of lisps
> > that I happen to have easy access to.
> >
> > Mac OS X:
> > bits implementation
> > ---- --------------
> > 29 LispWorks Personal 5.1 (32-bit)
> > 60 ClozureCL x8664 (64-bit)
> > 24 CLISP 2.47
>
> This can't be right. CLISP's most-positive-fixnum is 16.7 million, which
> means that the mantissa has 24 bits.
That's what the table says, doesn't it?
> But remember that fixnum is signed; CLISP also has a
> most-negative-fixnum that is -16.7 million.
>
> Might you be neglecting to count the sign bit?
I used
(integer-length most-positive-fixnum)
to generate the sizes.
> On 2009-12-04, Thomas A. Russ <t...@sevak.isi.edu> wrote:
> >
> > Well, my guess would be related to considering how one often uses a loop
> > to access elements in an array. Typically, you would iterate starting
> > at 0 and ending when the index is >= the dimension.
>
> Thus, the dimension can be max-positive-bignum!
Doh!
> On 4 Dec 2009 15:54:00 GMT
> Tamas K Papp <tkp...@gmail.com> wrote:
> > Eg if you are
> > writing some method which handles array indexes, you can always rely
> > on them being fixnums, regardless of the implementation. And then the
> > implementation is allowed to say what it considers a fixnum, and if
> > you don't like that, just pick another implementation.
>
> What if you want to use a 32-bit computer ?
Then don't complain about limitations imposed by the computer
architecture on your lisp system.
> On 04 Dec 2009 10:01:06 -0800
> t...@sevak.isi.edu (Thomas A. Russ) wrote:
> > Spiros Bousbouras <spi...@gmail.com> writes:
> >
> > > > > 2. Why the requirement that the dimensions of an array are fixnums ?
> > > >
> > > > Why not? This is not a big restriction, if an implementation wants
> > > > bigger arrays, it only has to provide bigger fixnums.
> > >
> > > fixnums are supposed to be efficient , that's the criterion for
> > > choosing the size of fixnums not the maximum size of arrays.
> >
> > Yes. And you want to make sure array access is efficient. Thus the
> > limitation on only allowing fixnums as indicies
>
> No , you want to make sure that the programmer has the freedom to
> choose the trade-off which is most appropriate for his application. If
> he is willing to sacrifice some speed in order to gain larger arrays he
> should have that choice.
He does have that choice.
He can always implement his own array storage system. See the sketch of
such an implementation in another reply.
But I would expect that an implementation that didn't use fixnum limits
would have generally slow array accesses, and so it would tend to be
avoided by must users. And thus such an implementation is unlikely to
exist for long in the lisp ecosystem.
> > > On the
> > > other hand the maximum size of arrays should be what the operating
> > > system can handle even if it means that part of the array would be in
> > > swap memory.
> >
> > Um, the question of fixnum sizes and the physical/virtual memory sizes
> > are completely orthogonal.
>
> I 99.99% agree. However in the part of my post you're quoting I'm not
> making a connection between size of fixnum and size of memory , I'm
> making a connection between size of arrays and size of memory.
>
> > As Pascal pointed out, most lisp systems can
> > generally index their entire address space (at a word level) using their
> > fixnums,
>
> I'm not sure what the "at a word level" part means but I believe I
> understand your overall gist.
As Duane Rettig also points out, the standard array which holds any lisp
object has storage sizes for machine-word size objects, since that is
typically what is used for FIXNUM and POINTERS. Since, on a 32-bit
architecture, there are 4 bytes per word, and addresses are typically
measured in bytes, you can cleverly array to address all of virtual
memory using fixnum indices for undifferentiated arrays.
The only exception would seem to be bit vectors, where -- if the
implementations supports it -- you can pack 8 elements into a single
byte. [Note that the standard doesn't require this.]
> So what about those lisp systems which
> can't index their entire address space using their fixnums ? From what
> Pillsy said CLisp probably can't.
Well, that certainly isn't the standard's fault. Choose a different
lisp implementation. That's why there are so many -- different design
points.
> What if you want a large bit vector ? Are you supposed to use something
> larger than a bit for the array element and then calculate bit
> addresses and do your own bit fiddling ?
Well, that's one option.
There are others.
> > Even if a lot of systems use 29 rather than 30 bits for
> > fixnums, this still gives you most of memory, and some copying GCs need
> > can only safely use half of the virtual memory in order to guarantee
> > enough space to copy into....
> >
> > Besides, it would be ludicrous to expect that the fixnum size of a lisp
> > implementation would depend on the actual amount of physical RAM
> > installed in a computer, rather than the processor (and software)
> > architecture.
>
> I agree. On the other hand it makes sense to expect that the maximum
> array size would depend on the actual amount of physical RAM installed
> in a computer so the standard is at fault for creating a connection
> between fixnum size and array size.
Why?
Why physical RAM and not virtual address space?
Typcially the fixnum size is tied to virtual address space instead, so
you get what you want.
And tying the array size to particular machine configurations makes it
really hard to distribute your program, since you can't count on the
maximum array size being anything in particular.
And you don't have the option of running more slowly with less RAM. You
wouldn't be able to run at all.
> > How do you address a multiple dimension array in RAM without
> > multiplying?
>
> If all the dimensions are powers of 2 then you can do it by ORing and
> shifting. Perhaps there are some nice tricks available in other cases
> too.
Well, yes, but they get harder when you have to do them with BIGNUMs.
They are no longer single machine instructions.
> > > > Moreover, there would be no point, because in general
> > > > the size of arrays slots is one word, and fixnum is all you need to
> > > > index all the words you can store in memory. (eg. with a 32-bit
> > > > processor, a fixnum of 30-bit is enough to index the whole memory.
> > >
> > > Is it ?
> >
> > Sure. Why do you think it isn't?
>
> I'm thinking it *may* not be because you can cover a larger range of
> values with 32 bits than with 30.
See comment about object sizes, word alignment and byte addressing.
It's only an issue for bit vectors or other specialized arrays that take
less than one word per element.
> > > 5 , so ? In fact if the point for the limitation in the size of arrays
> > > is to never have a bignum as an array subscript then each dimension
> > > could be up to (1+ most-positive-fixnum) so the definition of
> > > ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> > > possibility.
> >
> > Um, not really. Because the ARRAY-DIMENSION-LIMIT has to apply to all
> > arrays, including single-dimensional arrays that you want to displace to
> > your multi-dimensional array.
> >
> > I also suspect that ROW-MAJOR-AREF also plays into this limit.
>
> As I've said to my reply to Tamas "the standard could allow
> ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to be at most (1+
> most-positive-fixnum) (which means that both should be allowed to be
> bignums). This would still ensure that ARRAY-ROW-MAJOR-INDEX can work
> with only fixnums while giving the programmer 1 extra subscript to work
> with". Issue "ARRAY-DIMENSION-LIMIT-IMPLICATIONS" acknowledges this. It
> says
I think this is getting to the ridiculous hair-splitting point. Are
your requirements really so constrained that getting ONE or TWO
additional array elements out of 500 million makes a difference?
> Well, that's one option.
> There are others.
That actually turns out to be the best option if you care about
performance for large bit-based structures as well even if you have an
implementation where the fixnum limit is not a problem, unless you have
a heroic compiler.
Ah, but on x86, the only architecture Spiros knows about, you can do a load
with two registers (base and offset), and a small power-of-two scale
by which the offset is multiplied, all in one instruction.
:)
I think that, barring some fantastically exceptional circumstances, the use of
a 500 megabyte bit array indicates a computer science flunkie between the
keyboard and chair.
For one thing, only if the data is random, with a ``white'' distribution
is such a representation space-efficient.
If there is any obvious sparseness (reams of 1's or 0's), you can
compress the data with some alternate data structure: radix tree
or whatever.
Even if compression takes more operations, it may be beneficial.
For one thing, you don't have a 500 megabyte monster mapping in your
address space. For another, you have much better cache behavior.
For instance consider random accesses to a huge bit vector.
In a compressed scheme, most of the all-zero or all-one areas can
be represented efficiently, by a small number of nodes in the
radix tree or whatever structure. These can fit into your
L1 cache, and if not that, then the L2.
In the naive representation, random read accesses are spread through a vast
area of memory. Each access is very likely to go to a different L1
cache line from that of any other recent access, and so on.
You will likely trash your L1, most of your L2 and the TLB.
Yuck, talk about dumb.
Anyway, on the theme about how to overcome the bit array + fixnum limitation:
without switching to any bit packing tricks, what you can also do is this:
Instead of using some least significant bits of the index as a bit offset into
a word which contains packed bits, what you can do is use bits on the /opposite/
end of the index to select which huge array to access. Then the least
significant 30 bits (or whatever is the fixnum mantissa size) are used to index
into that array. This way, you keep the bit arrays and don't have to write any
code to pack bits, or worry about choosing the best cell size, etc.
While it is true that (32-bit) CMUCL uses a 3-bit "lowtag" for object
encoding, it is also the case that CMUCL allocates *two* lowtag codepoints
(of the eight available) to fixnums. Thus fixnums in CMUCL are actually
30 bits, not 29. [And given Duane's comments about scaling fixnums to
machine addresses, I suspect the same is true for ACL.]
You later wrote:
+---------------
| I used (integer-length most-positive-fixnum) to generate the sizes.
+---------------
But that only counts the positive fixnums and ignores the negative ones.
To get the correct answer, try this instead:
cmu> (integer-length (- most-positive-fixnum most-negative-fixnum))
30
cmu>
Similar experiments in other CLs should bump all of your fixnum
sizes by one. ;-}
-Rob
-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
It is, and likely a widely used concept. Tags 0 and 4 would be used
for evan and odd fixnums, thus doubling the range of fixnums over what
bits are usually available for addresses when dealing with the normal
3-bit tags (for 32-bit lisps). The same concept is true for 64-bit
lisps, with the tag width being 4 and the two tags used to get fixnums
are 0 and 8.
This relationship would tend to be constant for any power-of-two
natural-word-size, for any implementation which tends to allocate Lisp
objects on a two-word basis. At Franz, we call these "Allocation
Units", or AUs. An AU is always two natural words, and always starts
on a boundary aligned to whatever the tag-width dictates (e.g. LSbits
of #b000 for 32 bit lisps and #b0000 for 64-bit lisps). Allegro CL
allocates the cons with the smallest allocation possible, i.e. one
AU. But because a natural word, which is used to represent a Lisp
tagged-pointer, is half the size of an AU, it requires one extra bit
to encode one of these tagged pointers, which we call a LispVal. If
you do the math, you will see that if we do a port to a 128-bit
machine, the numbers will scale perfectly well; you already know what
our design will be.
> You later wrote:
>
> +---------------
> | I used (integer-length most-positive-fixnum) to generate the sizes.
> +---------------
>
> But that only counts the positive fixnums and ignores the negative ones.
> To get the correct answer, try this instead:
>
> cmu> (integer-length (- most-positive-fixnum most-negative-fixnum))
>
> 30
> cmu>
>
> Similar experiments in other CLs should bump all of your fixnum
> sizes by one. ;-}
I think "29 bits plus sign" is a perfectly acceptable way to say it,
as long as you don't forget the "plus sign" part.
Duane
I read section 1.5 .I still don't see why it is the responsibility of
the programme to not create arrays which are too big. For example the
page on ARRAY-TOTAL-SIZE-LIMIT says "The actual limit on the array
total size imposed by the implementation"... Note the "imposed by the
implementation" part. And note also that the MAKE-ARRAY page does not
say anywhere "a created array shall have size smaller than
ARRAY-TOTAL-SIZE-LIMIT".
> To make the program conformant, it would have to test against
> ARRAY-DIMENSION-LIMIT (and ARRAY-TOTAL-SIZE-LIMIT in the case of
> multi-dimensional arrays).
>
>
> ;; non-conformant program:
> (make-array size)
>
> ;; conformant program:
> (if (< size array-dimension-limit)
> (make-array size :initial-element 0)
> #+implementation-with-bigger-arrays (make-array size :initial-element 0)
> #-implementation-with-bigger-arrays (error "Cannot make an array big enough))
--
Who's your mama ?
That's what I was saying.
> I am not able to find a citation which would support this belief.
I cannot give you a citation which says this exactly but I thought this
is the gist of what the standard is saying. I elaborate in my reply
to Pascal.
> It looks like the behavior of requesting a too-large array is simply
> not defined.
I hope you're right.
> There is no requirement for this situation to be detected and signaled
> with an error condition.
>
> Any response whatsoever, to undefined behavior, is conforming.
--
There are only two kinds of storage devices - those that have failed,
and those that are about to fail.
Jonathan Schwartz
Well , if you say so I'm sure you're right but it's not obvious to me
how it follows from (or is consistent with) the HS. So then the
question becomes why have ARRAY-DIMENSION-LIMIT and
ARRAY-TOTAL-SIZE-LIMIT at all ? Was it for backwards compatibility ?
Actually even before the standard was created , why would an
implementation introduce ARRAY-DIMENSION-LIMIT or
ARRAY-TOTAL-SIZE-LIMIT ? It seems simpler to me not to have these
constants and instead have MAKE-ARRAY and ADJUST-ARRAY return nil if
the requested array was too large.
Define my own answer ? What does that mean ?
> > And that's the problem. I'm asking for fast fixnums plus arrays as large
> > as the system can support plus standard conformance. I don't think
> > that's a lot to ask for but I can't get it.
>
> You mean you don't have internet access to download 64 bit SBCL?
Does the 64 bit SBCL run on a 32 bit computer ? If it does does it
provide the fastest fixnums for the architecture ?
It's so that standard-conforming programs can be written which check and obey
these limits.
Moreover, there are minimum values on these limits so that you can write
conforming programs that use small arrays without bothering with these checks.
I'll see your 2**32 elements and raise you to 2**33 elements.
> A computer with 32 bit addresses which reference bytes is not capable
> of addressing every bit of every byte in its address space.
What do you mean addressing ? Who said anything about addressing ?
You seem to be talking about what assembly instructions will be output
by the compiler in order to manipulate the array. That's an entirely
different issue. If the compiler can allocate the necessary number of
bytes then it can manipulate individual bits even if it has to use
masks and AND and OR. My concern was with whether the standard allows
the programmer to create the array to begin with (plus my other
requirements).
> Can you name one programming language whose ANSI or ISO standard
> requires implementations to overcome this issue in their implementation
> of bit arrays?
I'm afraid I don't understand what issue you have in mind.
--
If at first you don't succeed, well, so much for skydiving.
Victor O'Reilly
Yes, that's because the CLHS is written avoiding as much as possible
redundancy. If it is written at one place that there is a limit on
the size of arrays, then it is not repeated at another place that you
shouldn't try to allocate bigger arrays.
Just don't be an ass hole!
--
__Pascal Bourguignon__
But where does it say whose responsibility it is to enforce the size
restriction ?
> Just don't be an ass hole!
Tsk , tsk , tsk.
Nowhere. So that means that the behavior is undefined.
Similarly, whose responsibility is to enforce that literal lists can't be
modified?
Requesting a larger array than array-dimension-limit or array-total-size-limit
is simply outside of the scope of the ANSI Common Lisp standard.
Implementations can diagnose such an attempt with an error signal,
honor the attempt, if possible, or just behave unpredictably:
allow the program to become corrupted and produce incorrect
results, or the entire Lisp image to crash, with or without
some error diagnostic, etc.
Ensuring that some of these really bad things don't happen is a quality of
implementation issue, as is the provision of huge arrays that can be indexed by
bignums.
> So then the
> question becomes why have ARRAY-DIMENSION-LIMIT and
> ARRAY-TOTAL-SIZE-LIMIT at all ? Was it for backwards compatibility ?
> Actually even before the standard was created , why would an
> implementation introduce ARRAY-DIMENSION-LIMIT or
> ARRAY-TOTAL-SIZE-LIMIT ? It seems simpler to me not to have these
> constants and instead have MAKE-ARRAY and ADJUST-ARRAY return nil if
> the requested array was too large.
Well, let's assume for the sake of argument that these constants didn't
exist and that the specification adopted your suggestion of just
returning NIL if the sizes were too big.
So, now you write a program and it fails because the array you asked for
is too big and MAKE-ARRAY returns NIL. How do you then find out what
the limit is? Do you then need to write a loop trying successively
bigger values until you happen to locate the implementation limit?
It seems far friendlier to mandate that this information is available,
and much more useful to have it in a form that can be inspected by a
program rather than hidden in the implementation notes in a loose-leaf
binder on my bookshelf.
Lots of these limits are exposed in Common Lisp (and often not in other
languages), allowing programs to do some reasoning or testing before
trying operations. It also makes it possible for such programs to
provide more useful error messages to their users when limits are
exceeded.
I wrote a small bit-flags utility that used FIXNUMs to encode small bit
vectors for storing binary option information. One of the items when
defining a new class of such bit-flags was to make sure that the fixnum
range was big enough for the number of flags one wanted to have. This
involved checking the size of MOST-POSITIVE-FIXNUM against the number of
bits required for the particular set of flags.
> I'm afraid I don't understand what issue you have in mind.
I presume the issue is that the elements of such a bit array can not be
addressed by the machine. To run into this problem a language
implementation would need (a) to have bit arrays, and (b) to support
integer types larger than the machine's (largest) pointer type.
Actually, a simpler form of the same question might be: are there
languages whose ISO/ANSI standard require that they should support
arrays, with any element type, with more elements than the machine's
largest pointer type can address?
The answer is: there are, but all of them have gotos that do not return.
So the idiot is stuck, not being able to evolve into a developer.
:)
Which is the case with all other calculations involving integers.
> And since the default in lisp is to not include type declarations, you
> would end up with slow array access in the majority of cases. I think
> that would be a bad trade-off on the part of the standard.
It's the same trade-off functions like + , * , - etc. make. Do you also
consider this a bad trade-off ?
> > I'm not sure what you mean "not on the lisp". If you do your
> > programming in Lisp then array size will obviously depend on the Lisp.
> > A good Lisp implementation in turn will make optimal use of what the
> > hardware and operating system make available. Actually this is true for
> > every programming language ; luck has nothing to do with it.
>
> Well, I think it would be rather odd to find a programming language
> implementation that changed the sizes of things based on the particular
> hardware it is running on.
No it wouldn't be odd at all , it happens all the time.
> I mean, in C you wouldn't expect that "int"
> would have its size depend on the amount of RAM installed.
No but you would expect malloc() to return NULL or not based on how
much memory you asked for and how much memory there is available.
> > The problem here is that the standard does not allow a Lisp
> > implementation to do this on the computer I'm using. It's a 32 bit
> > computer so a fast fixnum should be at most 32 bits. I assume SBCL uses
> > 2 bits for internal book-keeping.
>
> Actually, I believe that it uses 3, since fixnums are 29 bits long.
* (- most-positive-fixnum most-negative-fixnum)
1073741823
The range covered is 2**30 so fixnums are 30 bits long.
> Well, you as the programmer can always create larger bit vectors if you
> really need them.
As I've said myself many posts ago.
> BTW, why do you need bit vectors with more than 500 million elements?
I don't at present. As I've said in a previous post I'm curious i.e.
I'm trying to understand why the standard says what it says. The
mention of bit vectors was for illustration purposes , you could modify
my examples to use something other than bit vectors
> > It doesn't contradict my point if there are Lisp implementations ,
> > possibly many , where a similar conundrum does not arise. If there is
> > even one Lisp implementation where the conundrum arises then the
> > standard made a poor choice in enforcing the fixnum restriction.
>
> I think you have this backwards. As long as the standard allows an
> implementation that doesn't have the problem, then the defect cannot be
> in the standard. It must be in the implementation.
When I wrote the quote above I was under the impression that the
standard makes it impossible for an implementation to provide arrays
with bignum indices. If that impression had been correct then the
conundrum would arise by the combination of architecture and standard.
But my impression was apparently mistaken so the conundrum does not
arise because an implementation can provide arrays with bignum indices
as an extension.
> > Every time you pass from fixnums to bignums it is "potentially really
> > bad for performance". This is something for the programmer to worry
> > about.
>
> Except that the programmer can't really do anything about the internal
> implementation of AREF. So it really is the language implementer that
> has to worry about such things. And I want my array references to be
> fast, thank you very much.
Me too , I want my arrays fast and my women big breasted.
> > The standard could offer a declaration by which the programmer
> > could inform the compiler that the total size of an array will be
> > a fixnum. That's how it works with everything else in Lisp : the
> > language allows flexibility and if you want to increase performance you
> > restrict your flexibility through the appropriate declarations. I don't
> > see why array access should be different.
>
> I suppose. Although that would mean that by default array accesses
> would be slow rather than fast.
Only if you're a pessimist. If you're an optimist they would be fast as
opposed to very fast.
> > It's worth considering also that in the case of array access you may
> > not even need a declaration at all. When you create the array or adjust
> > its size the compiler knows if the (new) size is so large that it needs
> > bignums or not. If not then it can automatically arrange for all access
> > calculations to use fixnum arithmetic.
>
> Um, the compiler can only know this if the array dimensions are
> compile-time constants. If you dynamically create an array, then the
> compiler doesn't have this information.
>
> For example:
>
> (defun create-my-new-array (size)
> (make-array (list size size) :element-type ...))
>
> The compile can't possibly know if size is fixnum or bignum, unless you
> put in type declarations.>
>
> Similarly
>
> (defun get-element (x y)
> (aref *my-array* x y))
>
> runs into similar problems.
The implementation can internally have 2 aref functions , one for
arrays whose size is a fixnum and another for arrays whose size is a
bignum. Furthermore the implementation can arrange for the internal
storage of arrays to include a pointer to the aref function which is
the appropriate one for the specific array. So at the time an array is
created it will be examined whether its size is a fixnum or not and its
aref pointer will be assigned accordingly. Similarly when an array is
adjusted. When the array is referenced then the pointer will be
obtained and the appropriate aref function called. So the access time
for arrays whose dimensions are only known at runtime requires only 1
additional pointer dereferencing relative to arrays whose dimensions
are known at compile time. How much extra time that costs I don't know.
Perhaps the optimizations some processors provide are so good as to
reduce this extra time to 0.
> So just get a
> 64-bit lisp and all your problems will be solved. This is all well
> within the range of design choices allowed by the standard.
My "problems" were the questions I asked in my opening post and they
couldn't possibly be answered by changing the Lisp I use.
--
I don't understand why there are so many unsolved crimes. It seemed
like every time I did something wrong, it was solved overnight.
Joe Mammana
My first computer had a Z80 so there.
Are you calling me a computer scientist ?
I guess it's time for you to upgrade your computer.
64-bit machines are not uncommon anymore, and it seems like you need a
larger address space. So it may be time to start saving up for a
hardware upgrade.
You would try smaller values rather than bigger but yes you would write
a loop. And you're not interested in the implementation limit ,
whatever that is , but on whether you can get an array as big as you
need to do your job , at the time you ask for it and under the memory
usage situation of your programme.
> It seems far friendlier to mandate that this information is available,
> and much more useful to have it in a form that can be inspected by a
> program rather than hidden in the implementation notes in a loose-leaf
> binder on my bookshelf.
I don't see how it is useful. It's not as if there is any guarantee
that as long as your array size is below ARRAY-TOTAL-SIZE-LIMIT and
each dimension is below ARRAY-DIMENSION-LIMIT the array will be
succesfully created. Every time you create a new Lisp object the
possibility exists that you will run out of memory and that garbage
collection won't be able to release enough memory. I don't think the
standard specifies a way for such an occurrence to be signalled whereas
with my method you would have a way which works for array creation (but
not for the creation of other objects).
Apart from that why is there a need for both ARRAY-DIMENSION-LIMIT and
ARRAY-TOTAL-SIZE-LIMIT ? In what situation ARRAY-TOTAL-SIZE-LIMIT on
its own wouldn't be enough ?
> I wrote a small bit-flags utility that used FIXNUMs to encode small bit
> vectors for storing binary option information. One of the items when
> defining a new class of such bit-flags was to make sure that the fixnum
> range was big enough for the number of flags one wanted to have. This
> involved checking the size of MOST-POSITIVE-FIXNUM against the number of
> bits required for the particular set of flags.
Yes , MOST-POSITIVE-FIXNUM is useful but it's not what I asked about.
--
Even real life has plot holes.
> On 08 Dec 2009 16:50:54 -0800
> t...@sevak.isi.edu (Thomas A. Russ) wrote:
> > Spiros Bousbouras <spi...@gmail.com> writes:
> >
> > > So then the
> > > question becomes why have ARRAY-DIMENSION-LIMIT and
> > > ARRAY-TOTAL-SIZE-LIMIT at all ? Was it for backwards compatibility ?
> > > Actually even before the standard was created , why would an
> > > implementation introduce ARRAY-DIMENSION-LIMIT or
> > > ARRAY-TOTAL-SIZE-LIMIT ? It seems simpler to me not to have these
> > > constants and instead have MAKE-ARRAY and ADJUST-ARRAY return nil if
> > > the requested array was too large.
> >
> > Well, let's assume for the sake of argument that these constants didn't
> > exist and that the specification adopted your suggestion of just
> > returning NIL if the sizes were too big.
> >
> > So, now you write a program and it fails because the array you asked for
> > is too big and MAKE-ARRAY returns NIL. How do you then find out what
Return NIL? What is this, C? It should signal an error.
> > the limit is? Do you then need to write a loop trying successively
> > bigger values until you happen to locate the implementation limit?
>
> You would try smaller values rather than bigger but yes you would write
> a loop. And you're not interested in the implementation limit ,
> whatever that is , but on whether you can get an array as big as you
> need to do your job , at the time you ask for it and under the memory
> usage situation of your programme.
But often you're doing something based on input from the user. It's
useful to have these constants so that the program can provide useful
feedback. I suppose the limit could be in the error object, rather than
a constant. But it seems inappropriate for the error object to contain
something not specific to the particular context, but just a general
attribute of the implementation.
Another thing that makes the constants useful is that the application
can check them at the start, and abort immediately with an error like
"Sorry, this implementation doesn't support large enough arrays." Isn't
this better than running for a while and then aborting when it gets to
the point where it needs the large array?
--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
> 64-bit machines are not uncommon anymore, and it seems like you need a
> larger address space. So it may be time to start saving up for a
> hardware upgrade.
I think it's worth looking at what he wrote a couple of articles back:
On 2009-12-09 01:30:15 +0000, Spiros Bousbouras <spi...@gmail.com> said:
>
>> BTW, why do you need bit vectors with more than 500 million elements?
>
> I don't at present. As I've said in a previous post I'm curious i.e.
> I'm trying to understand why the standard says what it says. The
> mention of bit vectors was for illustration purposes , you could modify
> my examples to use something other than bit vectors
Or, in other words, he *doesn't actually have a problem he wants to
solve*. Watch out for mad people.
Now imagine that you are trying to implement bignum-array, as
discussed before. For the sake of argument, suppose that you are
doing this by slicing up a larger array to pieces that CL can manage.
Then the size of these slices would be deduced from
ARRAY-TOTAL-SIZE-LIMIT.
HTH,
Tamas
> On 6 Dec 2009 06:23:09 GMT
> Tamas K Papp <tkp...@gmail.com> wrote:
>> On Sat, 05 Dec 2009 21:34:52 +0000, Spiros Bousbouras wrote:
>>
>> With the open source Lisps, you can define your own answer.
>
> Define my own answer ? What does that mean ?
That you can just implement your own extensions. If they are good,
they will be incorporated to the main branch, otherwise you just fork.
>> > And that's the problem. I'm asking for fast fixnums plus arrays as
>> > large as the system can support plus standard conformance. I don't
>> > think that's a lot to ask for but I can't get it.
>>
>> You mean you don't have internet access to download 64 bit SBCL?
>
> Does the 64 bit SBCL run on a 32 bit computer ? If it does does it
> provide the fastest fixnums for the architecture ?
Oh, you didn't tell us that you were Robert Maas's cousin. But we
figured it out eventually.
Cheers,
Tamas
> What if you want to use a 32-bit computer ?
Then you're basically screwed. OK, sure, today you "just" need a
billion array elements, but tomorrow you'll need twice or three times
as many and the problem will no longer have anything to do with the
size of your fixnums.
Or, you know, you could get a 64-bit computer.
Cheers,
Pillsy
> On Fri, 04 Dec 2009 07:19:28 -0800, Pillsy wrote:
[...]
> > The fixnums on some popular implementations (like CLISP on my 32-bit XP
> > machine) have a dramatically smaller range. 17 million element arrays
> > really are too small for many applications.
> Yes, but I am under the impression that CLISP is not the
> implementation one would pick for computationally intensive
> applications, so I see no problem with it making this particular
> choice.
In principle, I don't either. In practice, it's a pain in the ass
because CLISP is the only Lisp system I can use on my work machine.
Of course, at work I use Lisp as an interactive scratch pad and
occasional alternative to bash+awk, so it's a mere annoyance. If it
were a real problem, it would be worth the various forms of hassle and
expense to solve it.
Cheers,
Pillsy
[...]
> On 07 Dec 2009 10:03:32 -0800
> t...@sevak.isi.edu (Thomas A. Russ) wrote:
> > Spiros Bousbouras <spi...@gmail.com> writes:
> >
> > I suppose. But then you still end up with the situation that array
> > access will be, perhaps, unnecessarily slow. The issue is that you
> > would need to be prepared to do bignum arithmetic on all array accesses
> > unless there was specific information to the contrary.
>
> Which is the case with all other calculations involving integers.
>
> > And since the default in lisp is to not include type declarations, you
> > would end up with slow array access in the majority of cases. I think
> > that would be a bad trade-off on the part of the standard.
>
> It's the same trade-off functions like + , * , - etc. make. Do you also
> consider this a bad trade-off ?
The difference is that +, -, etc. are all user-level functions. That
means I, as the programmer, have control over whether the declarations
have some effect or not.
The computations inside of AREF are not user-level functions, so there
isn't anyway to make them efficient. I don't even think it would be
really possible, even with some heroic compiler work. That is because
it isn't enough to know that the indices for a particular AREF
invocation are fixnums, but one has to also know that all of the
dimensions of the array (or at least all but one?) are also fixnums.
That is really well beyond what one could expect a compiler to be able
to do.
You are free to come up with your own generic array implementation that
solves these problems, however.
> > Well, I think it would be rather odd to find a programming language
> > implementation that changed the sizes of things based on the particular
> > hardware it is running on.
>
> No it wouldn't be odd at all , it happens all the time.
>
> > I mean, in C you wouldn't expect that "int"
> > would have its size depend on the amount of RAM installed.
>
> No but you would expect malloc() to return NULL or not based on how
> much memory you asked for and how much memory there is available.
And that happens exactly the same way in lisp as well.
Fixnum sizes don't change, but any attempt to allocate memory will
either work or fail based on what's available.
MAKE-ARRAY is not a primitive allocation function like malloc.
> > Well, you as the programmer can always create larger bit vectors if you
> > really need them.
>
> As I've said myself many posts ago.
So, what is the problem, then?
It shouldn't be the responsibility of a language specification to
provide all possible data structures for you. As long as the
specfication allows you to create the data structure you want, why do
you care? Just build what you need.
> > BTW, why do you need bit vectors with more than 500 million elements?
>
> I don't at present. As I've said in a previous post I'm curious i.e.
> I'm trying to understand why the standard says what it says.
Well, we answered that question a long time ago: Implementation
efficiency for array access.
> The
> mention of bit vectors was for illustration purposes , you could modify
> my examples to use something other than bit vectors
So this entire argument is really about nothing of any practical import
at all? So it's all a large, albeit polite, flamefest?
And it really does pretty much only apply to bitvectors. If you use
anything other than bit vectors (maybe with the exception of strings),
you end up with a situation where FIXNUMs are sufficient, given a
reasonable implmentation, to address an array that takes up all
available memory -- leaving no room for a program to actually do
anything with that array.
> > > It doesn't contradict my point if there are Lisp implementations ,
> > > possibly many , where a similar conundrum does not arise. If there is
> > > even one Lisp implementation where the conundrum arises then the
> > > standard made a poor choice in enforcing the fixnum restriction.
> >
> > I think you have this backwards. As long as the standard allows an
> > implementation that doesn't have the problem, then the defect cannot be
> > in the standard. It must be in the implementation.
>
> When I wrote the quote above I was under the impression that the
> standard makes it impossible for an implementation to provide arrays
> with bignum indices. If that impression had been correct then the
> conundrum would arise by the combination of architecture and standard.
> But my impression was apparently mistaken so the conundrum does not
> arise because an implementation can provide arrays with bignum indices
> as an extension.
Even without that there isn't a problem with the standard, since it
doesn't prevent you, the programmer, from creating an array
implementation with bignum indices. Again, see my sketch of just such
an implementation.
> > > Every time you pass from fixnums to bignums it is "potentially really
> > > bad for performance". This is something for the programmer to worry
> > > about.
> >
> > Except that the programmer can't really do anything about the internal
> > implementation of AREF. So it really is the language implementer that
> > has to worry about such things. And I want my array references to be
> > fast, thank you very much.
>
> Me too , I want my arrays fast...
Except that you also want something else, namely bignum indices, that
will make your arrays not fast. It is an engineering tradeoff. You
can't have both. Which do you want more?
Ok. So you are requiring a pointer dereference and a function call.
With the need for the dereferencing, you eliminate the ability of the
compiler to open-code the array access and completely dispense with even
the function call.
And you would have to do this every time inside a loop.
I think you haven't really implemented a compiler or language
interpreter, since this suggestion is really rather naive. In fact, a
lot of your argumentation indicates that you rarely consider the real
world ramifications of your theoretical suggestions.
I will treat this thread as closed.
> On 07 Dec 2009 10:03:32 -0800
> t...@sevak.isi.edu (Thomas A. Russ) wrote:
[...]
> > I suppose. But then you still end up with the situation that array
> > access will be, perhaps, unnecessarily slow. The issue is that you
> > would need to be prepared to do bignum arithmetic on all array accesses
> > unless there was specific information to the contrary.
> Which is the case with all other calculations involving integers.
There are reasons to want to exactly represent 42! without wanting to
use 42! as an array index. So there is a different set of trade-offs
involved in saying that everything must be a fixnum and saying that
array indices must be fixnums.
Cheers,
Pillsy
So what did you do manually that your implementation wasn't clever
enough to do on its own ? It seems to me that the most naive thing an
implementation can do for implementing bit arrays is to use 1 byte per
bit. If this uses too much memory and leads to swapping then I can see
how it would damage performance. But I don't think it takes a
particularly clever compiler to pack bits into one byte (or word
if it leads to faster access) and then access individual bits using
masks. After all why would you declare a bit array if you don't want
that to happen ? If it's good enough to have each bit occupy a whole
byte you would declare each array element to be of type (mod 256) or
whatever is appropriate instead of 256.
--
[Answering the question of whether Harry Potter or Wesley Crusher is
more annoying.]
Harry Potter by far. Wesley Crusher has nothing occultic to him.
http://tinyurl.com/die-wesley-crusher-2
I on the other hand would expect that most users wouldn't even notice
the "slow" array accesses. After all , under default optimization
settings , aref will probably do more than simply fetch the value from
a certain address. That's "slow" but still perfectly adequate for many
applications.
[...]
> As Duane Rettig also points out, the standard array which holds any lisp
> object has storage sizes for machine-word size objects, since that is
> typically what is used for FIXNUM and POINTERS. Since, on a 32-bit
> architecture, there are 4 bytes per word, and addresses are typically
> measured in bytes, you can cleverly array to address all of virtual
> memory using fixnum indices for undifferentiated arrays.
>
> The only exception would seem to be bit vectors, where -- if the
> implementations supports it -- you can pack 8 elements into a single
> byte. [Note that the standard doesn't require this.]
It doesn't require it but I would expect a decent Lisp implementation
to work this way perhaps with declaration (optimize space). And it's
not just with bit vectors but with any object with size smaller than 1
byte.
[...]
> > > Besides, it would be ludicrous to expect that the fixnum size of a lisp
> > > implementation would depend on the actual amount of physical RAM
> > > installed in a computer, rather than the processor (and software)
> > > architecture.
> >
> > I agree. On the other hand it makes sense to expect that the maximum
> > array size would depend on the actual amount of physical RAM installed
> > in a computer so the standard is at fault for creating a connection
> > between fixnum size and array size.
>
> Why?
> Why physical RAM and not virtual address space?
Ok , virtual address space then.
> Typcially the fixnum size is tied to virtual address space instead, so
> you get what you want.
>
> And tying the array size to particular machine configurations makes it
> really hard to distribute your program, since you can't count on the
> maximum array size being anything in particular.
Since ARRAY-TOTAL-SIZE-LIMIT and related constants can be as large as
they want one cannot count on the maximum array size being anything in
particular anyway.
> And you don't have the option of running more slowly with less RAM. You
> wouldn't be able to run at all.
If an application requires a certain amount of RAM in order to do its
job then yes , it will not run on less RAM. But what does this have to
do with the discussion ?
> > > How do you address a multiple dimension array in RAM without
> > > multiplying?
> >
> > If all the dimensions are powers of 2 then you can do it by ORing and
> > shifting. Perhaps there are some nice tricks available in other cases
> > too.
>
> Well, yes, but they get harder when you have to do them with BIGNUMs.
> They are no longer single machine instructions.
But still simpler than multiplication.
[...]
> > > > 5 , so ? In fact if the point for the limitation in the size of arrays
> > > > is to never have a bignum as an array subscript then each dimension
> > > > could be up to (1+ most-positive-fixnum) so the definition of
> > > > ARRAY-DIMENSION-LIMIT should be different in order to allow for such a
> > > > possibility.
> > >
> > > Um, not really. Because the ARRAY-DIMENSION-LIMIT has to apply to all
> > > arrays, including single-dimensional arrays that you want to displace to
> > > your multi-dimensional array.
> > >
> > > I also suspect that ROW-MAJOR-AREF also plays into this limit.
> >
> > As I've said to my reply to Tamas "the standard could allow
> > ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to be at most (1+
> > most-positive-fixnum) (which means that both should be allowed to be
> > bignums). This would still ensure that ARRAY-ROW-MAJOR-INDEX can work
> > with only fixnums while giving the programmer 1 extra subscript to work
> > with". Issue "ARRAY-DIMENSION-LIMIT-IMPLICATIONS" acknowledges this. It
> > says
>
> I think this is getting to the ridiculous hair-splitting point.
Why is it "ridiculous hair-splitting" ? When you say "Um, not
really..." above and then make a point which isn't valid I think it's
perfectly reasonable to point out that your point isn't valid.
> Are
> your requirements really so constrained that getting ONE or TWO
> additional array elements out of 500 million makes a difference?
Not at all. As I said in [1]
Not that 1 subscript matters much of course , my main beef is
with the decision to only allow fixnums. Having made that
decision I can kind of see why they opted for simplicity and
specified ARRAY-DIMENSION-LIMIT and ARRAY-TOTAL-SIZE-LIMIT to
also be fixnums at the cost of taking 1 extra subscript away
from the programmer.
[1] <RylSm.7782$6p6....@newsfe05.ams2>
< http://groups.google.co.uk/group/comp.lang.lisp/msg/0268f9abf3bd8eca >
--
I want to hug you in a non-sexual way and tell you that you make my
heart burst of joy and cuddle up like a cute little cookie monster and
ask for more milk.
http://static.thepiratebay.org/legal/indianagregg_resp2.txt
[...]
> Anyway, on the theme about how to overcome the bit array + fixnum limitation:
> without switching to any bit packing tricks, what you can also do is this:
> Instead of using some least significant bits of the index as a bit offset into
> a word which contains packed bits, what you can do is use bits on the /opposite/
> end of the index to select which huge array to access. Then the least
> significant 30 bits (or whatever is the fixnum mantissa size) are used to index
> into that array. This way, you keep the bit arrays and don't have to write any
> code to pack bits, or worry about choosing the best cell size, etc.
I don't follow. In the solution you propose will each huge array use 1
byte to store 1 bit ? If no then I don't see how your solution avoids
bit packing. Or is your solution intended for situations where you
don't mind wasting memory ?
Ok , I kind of see it. Still , I think it would be clearer if the
specification of ARRAY-TOTAL-SIZE-LIMIT for example said "This shall be
the upper exclusive bound on the array total size of an array".
> Similarly, whose responsibility is to enforce that literal lists can't be
> modified?
If the programmer wants to avoid undefined behavior then he must ensure
that his programme does not attempt to modify any literal objects. This
is clear to me because the page on quote special operator says "The
consequences are undefined if literal objects (including quoted
objects) are destructively modified". Although I have to confess I
can't see the significance of "destructively". Is it possible to modify
an object non-destructively ?
> Requesting a larger array than array-dimension-limit or array-total-size-limit
> is simply outside of the scope of the ANSI Common Lisp standard.
>
> Implementations can diagnose such an attempt with an error signal,
> honor the attempt, if possible, or just behave unpredictably:
> allow the program to become corrupted and produce incorrect
> results, or the entire Lisp image to crash, with or without
> some error diagnostic, etc.
>
> Ensuring that some of these really bad things don't happen is a quality of
> implementation issue, as is the provision of huge arrays that can be indexed by
> bignums.
--
Who's your mama ?
Uh... The CLHS already *DOES* say exactly that, quite explicitly, in fact!
http://www.lispworks.com/documentation/HyperSpec/Body/v_ar_tot.htm
...
Constant Variable ARRAY-TOTAL-SIZE-LIMIT
...
Description:
The upper exclusive bound on the "array total size" of an array.
I don't know how one could be much more clear than that.
Furthermore, ARRAY-TOTAL-SIZE-LIMIT is the *lowest* of all of the
possible reasons for limiting array size in a given implementation:
The actual limit on the array total size imposed by the implementation
might vary according the element type of the array; in this case,
the value of array-total-size-limit will be the smallest of these
possible limits.
-Rob
-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
So the phrase
"This shall be the upper exclusive bound on the array total size of an array"
is "exactly" the same as
``The upper exclusive bound on the "array total size" of an array''
is that right ?
One might argue that the meaning is the same but the first phrase makes
it clear to me that the bound is a constraint the violation of which
causes undefined behavior.
Link is wrong , the correct is
< http://groups.google.co.uk/group/comp.lang.lisp/msg/3cdf465376c4870c >
> So what did you do manually that your implementation wasn't clever
> enough to do on its own ? It seems to me that the most naive thing an
> implementation can do for implementing bit arrays is to use 1 byte per
> bit. If this uses too much memory and leads to swapping then I can see
> how it would damage performance. But I don't think it takes a
> particularly clever compiler to pack bits into one byte (or word
> if it leads to faster access) and then access individual bits using
> masks. After all why would you declare a bit array if you don't want
> that to happen ? If it's good enough to have each bit occupy a whole
> byte you would declare each array element to be of type (mod 256) or
> whatever is appropriate instead of 256.
The problem was not paging, because the implementation did pack things
(I'm sure that all serious implementations do). The problem was that
the compiler was not up to doing the kind of loop-unrolling needed to
generate a good pattern of fetches and masks. That's absolutely
critical for good performance on huge bit-arrays, as otherwise you get
something like a fetch/shift per element which is horrible.
What I ended up doing (and I didn't completely implement this, as I ran
out of time and energy) was using an array of some good, larger type,
then using a macro to unroll the loop myself.