No doubt different CL environments have different limits on
performance. I was wondering in general how well any of the current
generation might handle that large a dataset, given sufficient
hardware. I was thinking of a situation where multiple arrays are
allocated to get around array-dimension-limit, array-total-size-limit,
etc., so the aggregate might form the basis for a logical matrix of
sufficient size.
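Roughly what I have in mind is something like this (an untested sketch,
names made up; each chunk stays well under the single-array limits):

(defconstant +block-rows+ 1000)   ; rows per chunk; 1000 x 20000 doubles = 160MB

(defstruct (big-matrix (:constructor %make-big-matrix (n blocks)))
  (n 0 :type fixnum)
  (blocks #() :type simple-vector))

(defun make-big-matrix (n)
  ;; Allocate ceiling(n / +block-rows+) row blocks; the last block may be
  ;; partly unused.
  (let* ((nblocks (ceiling n +block-rows+))
         (blocks (make-array nblocks)))
    (dotimes (b nblocks)
      (setf (svref blocks b)
            (make-array (* +block-rows+ n)
                        :element-type 'double-float
                        :initial-element 0d0)))
    (%make-big-matrix n blocks)))

(defun bref (m i j)
  "Element (I,J) of the logical N x N matrix M."
  (multiple-value-bind (blk row) (floor i +block-rows+)
    (aref (svref (big-matrix-blocks m) blk)
          (+ (* row (big-matrix-n m)) j))))

(defun (setf bref) (new m i j)
  (multiple-value-bind (blk row) (floor i +block-rows+)
    (setf (aref (svref (big-matrix-blocks m) blk)
                (+ (* row (big-matrix-n m)) j))
          new)))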
Thanks,
Greg Menke
* (* 20000 20000 8)
3200000000
No matter what sort of system you use, that size of matrix will be
problematic, as it pushes the edges of the permissible size of a
process.
8 byte floats will lead to memory consumption of ~3.2GB, which won't
be a happy thing irrespective of what language you're working in.
And that needs to be at least doubled, as you need space for the
"source" matrix as well as the "destination" matrix.
--
(concatenate 'string "cbbrowne" "@cbbrowne.com")
http://cbbrowne.com/info/rdbms.html
Share and Enjoy!!
| No matter what sort of system you use, that size of matrix will be
| problematic, as it pushes the edges of the permissible size of a
| process.
|
| 8 byte floats will lead to memory consumption of ~3.2GB, which won't
| be a happy thing irrespective of what language you're working in.
4GB of memory costs only a few hundred dollars, so array indexing limited to
20-something bits (a few million elements) is getting inadequate really fast.
A natural point for converting to 32- or preferably 40-bit (or definitely at
least 30-bit) array indexing is the migration to 64-bit processors, for
example x86-64 - indeed, the increased address space and usable physical memory
can be the primary motivator, beside the 3-5 times as many usable registers.
(I assume 64-bit Lisp implementations allow bigger array sizes and direct
integers already).
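(Easy enough to check what a given implementation offers:

* (list array-dimension-limit array-total-size-limit most-positive-fixnum)

The values are implementation-dependent, which is exactly the problem.)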
Robert
Do you have a computer that allows you to address that much memory in
one process?
64 bit systems that have all the hardware to support huge amounts of
memory _aren't_ cheap.
It may cost a few hundred dollars to move from 512MB of memory to 2GB
of memory; jumping onwards to 6GB requires spending the bigger bucks
on an upgrade to [UltraSPARC | IBM PPC | Itanic | SGI MIPS | HPaq
Alpha].
The price immediately jumps sharply, the RAM probably _isn't_ cheap,
and you probably get to gulp on $10K USD before you've gotten started,
and there may not be a suitable CL yet for the system you've bought.
The problem is that no 64 bit architecture has leapt out as being a
really good "risk." None are cheap; all are _highly_ tied to specific
vendors; there has been lots of FUD since the late '90s about "Merced"
being the only 64 bit architecture that could conceivably survive.
Note that there is a serious dearth of vendors hawking it, even now.
--
(concatenate 'string "chris" "@cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
Rules of the Evil Overlord #214. "If a malignant being demands a
sacrificial victim have a particular quality, I will check to make
sure said victim has this quality immediately before the sacrifice and
not rely on earlier results. (Especially if the quality is virginity
and the victim is the hero's girlfriend.)"
<http://www.eviloverlord.com/>
If a 64-bit architecture doesn't allow this, what's the point of it?
>The price immediately jumps sharply, the RAM probably _isn't_ cheap,
>and you probably get to gulp on $10K USD before you've gotten started,
>and there may not be a suitable CL yet for the system you've bought.
Prices will come down when they become more common.
It's probably the case that none of the CL implementations have yet been
ported to 64-bit environments (but I think I've seen mention of Emacs on
64-bit systems allowing multi-GB buffers). That should also happen when
the systems get cheaper, or if a big customer has an application that
requires a 64-bit Lisp.
--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
> It's probably the case that none of the CL implementations have yet been
> ported to 64-bit environments
AFAIK, at least ACL and Clisp have been ported to 64-bit.
P.
> It's probably the case that none of the CL implementations have yet been
> ported to 64-bit environments (but I think I've seen mention of Emacs on
> 64-bit systems allowing multi-GB buffers).
There are 64-bit Lisp implementations available for the popular 64-bit
UNIX systems. Franz's Allegro CL has been ported to three 64-bit
architectures: the Alpha, PA 2.0 and the UltraSPARC.
Interestingly enough, even though the width of a fixnum was increased
in Franz's 64-bit Lisp, no upward adjustment was made to the maximum
array dimension limit. How strange.
| Interestingly enough, even though the width of a fixnum was increased
| in Franz's 64-bit Lisp, no upward adjustment was made to the maximum
| array dimension limit. How strange.
It depends on how you define what a 64-bit lisp means - does it mean it runs
in a 64-bit environment, or that it has at least one 64-bit feature, or does it
have to exploit all the benefits of 64 bits? These would be my criteria in the order
of importance:
1. heap size
2. heap size
3. heap size
4. array indexing
5. fixnum size
Robert
Unless you're implementing complex AI systems like Cyc, which have
databases containing an enormous number of medium-sized objects, I think
array size is likely to be the highest criterion. For most other
applications, you're not going to be able to fill up that huge heap unless
you can create huge arrays, or kludge it with nested arrays (as the
OP mentioned).
| >1. heap size
| >2. heap size
| >3. heap size
| >4. array indexing
| >5. fixnum size
|
| Unless you're implementing complex AI systems like Cyc, which have
| databases containing an enormous number of medium-sized objects, I think
| array size is likely to be the highest criteria. For most other
| applications, you're not going to be able to fill up that huge heap unless
| you can create huge arrays, or kludge it with nested arrays (as the
| OP mentioned).
You are right. Though would it really be so untypical for non-AI applications to
require a big heap and still not use large arrays? For example, an image
processing system would use large arrays (larger than what current
implementations can handle), but a CAD system perhaps would not.
Regarding your second point: if the Lisp image can grow large, we can build
virtual arrays on top of it, but if the image size is constrained, then we are
worse off even if a single array can be as big as the image -- hence the prioritization.
Large fixnum range is also useful, for example for currency amounts.
Robert
Indeed. 32 bits isn't nearly enough to express Bill Gates' net worth
in pennies, and that's not a facetious remark, as it only takes a
company of "moderate" size to get sales of >$21.4M (2^31 pennies) wherein 32 bit
values "blow up."
Throw in a couple of tag bits, and MOST-POSITIVE-FIXNUM shrinks
further.
For instance, the biggest amount CLISP's fixnums can count in pennies
is a mere $167,772.16, and hopefully many of the gentle readers have pension
savings exceeding that :-).
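(CLISP's fixnums have historically been about 24 bits wide, hence:

* (/ (expt 2 24) 100)
4194304/25        ; i.e. $167,772.16
)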
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://cbbrowne.com/info/finances.html
"In computing, invariants are ephemeral." -- Alan Perlis
> The world rejoiced as "Robert Monfera" <mon...@fisec.com> wrote:
> > Large fixnum range is also useful, for example for currency amounts.
>
> Indeed. 32 bits isn't nearly enough to express Bill Gates' net worth
> in pennies, and that's not a facetious remark, as it only takes a
> company of "moderate" size to get sales of >$214M wherein 32 bit
> values "blow up."
As has been pointed out here before, IEEE Doubles can be used as
perfectly good 53 bit fixnums. You lose 11 bits out of 64, but that's
better than many alternatives -- and it's fast.
For those worried about silent overflow ... the Inexact Result trap
and/or flag can take care of that. I don't have the necessary
documentation for other architectures here but the PowerPC has three
options:
1) you can enable an exception that is taken if an inexact result is
produced
2) there is a status flag that is set/cleared based on whether the last
FP operation produced an inexact result
3) there is a status flag that is set when an FP operation produces an
inexact result, and never cleared (though you can clear it manually, of course).
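Portable CL gives no standard access to those FPU flags, but a cheap
(sketch-only) guard is to check that results stay inside the range where
doubles represent integers exactly:

;; Integers of magnitude <= 2^53 are exactly representable as IEEE doubles;
;; past that, exactness of integer-valued results is no longer guaranteed.
(defconstant +max-exact-double+ (float (expt 2 53) 1d0))

(defun checked+ (a b)
  ;; A and B are assumed to be integer-valued double-floats.
  (let ((sum (+ a b)))
    (when (> (abs sum) +max-exact-double+)
      (error "~A may no longer be an exact integer" sum))
    sum))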
-- Bruce
> Regarding your second point: if image can grow large, we can make virtual
> arrays, but if image size is constrained, then we are worse off even if an
> array can be as big as the image - reason for the prioritization.
>
> Large fixnum range is also useful, for example for currency amounts.
For this sort of application, I would consider large fixnums to be
purely an optimization-time convenience. There shouldn't be any added
cost in the development or maintenance of your wares if currency
objects are represented as bignums. All of the standard Common Lisp
numerical operations "just work" for bignums as they do for fixnums.
In contrast, creating a truly useful virtual array type strikes me as
being a potentially expensive undertaking. While just about anybody
could hack together a simple virtual array abstraction over the
weekend, I would imagine that reimplementing the relevant parts of the
arrays and sequences dictionary for this new type as well as devising
a reliable discipline for marshalling virtual arrays through your FFI
requires a more considerable effort.
Having large arrays especially wins when dealing with external library
code where somebody else set the rules. I am especially fond of the
array and sequence frobs Common Lisp offers me--resorting to
operations on large arrays in foreign code just because my data is too
big for representation on the Lisp run-time is unquestionably
loathsome.
On the other hand, an array may have a million elements in one of its dimensions,
and a limit of 20000 elements in one dimension can be an unacceptable limitation
in some cases.
--
Gisle Sælensminde ( gi...@ii.uib.no )
With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going
to land, and it could be dangerous sitting under them as they fly
overhead. (from RFC 1925)
> Do you have a computer that allows you to address that much memory in
> one process?
Yes, several. Other than the oldest of them they were all less than
£2,000 new. Although I don't think any of the (small) boxes we have
can physically have more than 4GB, so this would involve paging unless
the space was sparse. But we buy very low-end boxes.
> 64 bit systems that have all the hardware to support huge amounts of
> memory _aren't_ cheap.
They're not expensive, either, other than for silly values of
expensive. I mean, how much is it going to cost you to design the
manual-paging scheme you need to solve some problem that won't fit in
2-4GB?
> The price immediately jumps sharply, the RAM probably _isn't_ cheap,
> and you probably get to gulp on $10K USD before you've gotten started,
> and there may not be a suitable CL yet for the system you've bought.
That's a silly value of expensive, I guess. However if you're really
constrained, start looking for used boxes. I nearly bought (and
regret not buying) an ex-lease Sun E3k last year, which had, I think
6GB of memory in it (and 4 processors), for £3,000.
> The problem is that no 64 bit architecture has leapt out as being a
> really good "risk." None are cheap; all are _highly_ tied to specific
> vendors; there has been lots of FUD since the late '90s about "Merced"
> being the only 64 bit architecture that could conceivably survive.
> Note that there is a serious dearth of vendors hawking it, even now.
This is only really true on the desktop. Sun and IBM and HP (and maybe
SGI) have sold billions of dollars worth of 64bit kit over the last
few years.
--tim
> Yes, several. Other than the oldest of them they were all less than
> £2,000 new. Although I don't think any of the (small) boxes we have
> can physically have more than 4GB, so this would involve paging
> unless the space was sparse. But we buy very low-end boxes.
If they don't have slots for >4GB of memory, then I have to protest
that they aren't computers that "allow you to address that much memory
in one process."
I've got an Alpha box that could cope with having 1.5GB of memory on
it; I don't claim that's >4GB, despite the fact that the CPU would be
_theoretically_ capable of the job. I can't put 4GB of RAM in the box
(except via the futile exercise of taping a bag containing an extra
20GB to the side of it; that's not _usable_ memory :-)).
>> 64 bit systems that have all the hardware to support huge amounts of
>> memory _aren't_ cheap.
> They're not expensive, either, other than for silly values of
> expensive. I mean, how much is it going to cost you to design the
> manual-paging scheme you need to solve some problem that won't fit
> in 2-4GB?
The machines with slots for 16GB of RAM don't sell for $800, and more
than likely don't use the RAM that's cheap.
>> The price immediately jumps sharply, the RAM probably _isn't_ cheap,
>> and you probably get to gulp on $10K USD before you've gotten started,
>> and there may not be a suitable CL yet for the system you've bought.
>
> That's a silly value of expensive, I guess. However if you're really
> constrained, start looking for used boxes. I nearly bought (and
> regret not buying) an ex-lease Sun E3k last year, which had, I think
> 6GB of memory in it (and 4 processors), for £3,000.
I'm assuming _new_ hardware here. After all, _somebody_ will be
paying full price.
I agree on the merits of the E3K; that would indeed be a slick
addition to the stable.
>> The problem is that no 64 bit architecture has leapt out as being a
>> really good "risk." None are cheap; all are _highly_ tied to specific
>> vendors; there has been lots of FUD since the late '90s about "Merced"
>> being the only 64 bit architecture that could conceivably survive.
>> Note that there is a serious dearth of vendors hawking it, even now.
>
> This is only really true on the desktop. Sun and IBM and HP (and maybe
> SGI) have sold billions of dollars worth of 64bit kit over the last
> few years.
I _was_ thinking of servers when I wrote that paragraph. SPARC,
Alpha, MIPS, PPC, IA-64. People don't think of them as "desktop"
platforms.
I got a response via email suggesting AMD "Sledgehammer" as a possible
upcoming alternative.
It is indeed a very attractive idea that in another year, there may be
machines that are compatible with IA-32 (thus making them acceptable
to the teeming hordes of desktop folk), and which offer 64 bit
extensions.
Two objections still emerge:
1. It's not available now, so it's no answer, now.
2. It's not obvious that motherboards supporting 8GB or 16GB of RAM
will be immediately available upon release of the architecture.
(Arguably there's #3: It may take time for there to be an x86-64 "code
generator" for GCC and/or Python...)
I have periodically griped about much the same issue vis-a-vis
"alternative" CPU architectures and their usability for Linux. Linux
has supported StrongARM for some years now, but it has been
problematic to get inexpensive hardware for that. Ditto for PPC.
You can get some cheap CPUs for some of these variant architectures;
if there's not a source of cheap motherboards, those CPUs are useless.
The only way to get a Linux/PPC system is to buy your hardware from
Apple (thus buying MacOS), or to pay _BIG_ bucks for third party
"embedded system" hardware.
In comparison, I can walk down the street and get an IA-32 motherboard
that supports 2GB of RAM for maybe a couple hundred bucks. But it
won't support 8GB.
--
(concatenate 'string "cbbrowne" "@cbbrowne.com")
http://www3.sympatico.ca/cbbrowne/
:FATAL ERROR -- ERROR IN USER
You are aware that IA-32 has support for 36-bit addressing, right? (The
PSE36 CPU flag reports its presence in a particular processor.) Chipsets
with support for the full 64GB address space are certainly available, and
Linux supports them in recent kernel versions, giving you 4GB of physical
memory per process. I am not so sure the raw need for 64-bit processors
will be all that imminent, given the ability to provide so much RAM to
the 32-bit processors. Not that 32 bits is enough, of course, but we may
find better ways to work with 64-bit units than switching the whole
processor. The only serious shortcoming in modern implementations of
IA-32 is the register starvation. It is quite fascinating what Intel
(and AMD) have been able to do with this architecture.
--
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.
70 percent of American adults do not understand the scientific process.
> If they don't have slots for >4GB of memory, then I have to protest
> that they aren't computers that "allow you to address that much memory
> in one process."
Well. I can write a program that allocates an array of 2^32
double-floats. That program will run (it may even run in reasonable
time if the access patterns are not toxic). That's my criterion for
`address that much memory in one process'. Maybe I should have
phrased it as `have that much (virtual) linear address space in one
process', sorry.
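(Concretely, something like

  (make-array (expt 2 32) :element-type 'double-float)

modulo whatever array-total-size-limit the particular Lisp imposes, or the
moral equivalent with mmap() from C.)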
> The machines with slots for 16GB of RAM don't sell for $800, and more
> than likely don't use the RAM that's cheap.
Yes, like I said `silly value of expensive'.
> I'm assuming _new_ hardware here. After all, _somebody_ will be
> paying full price.
Sure, then it's 10k or something. That's still not really an
expensive machine if you're getting paid to solve problems that need
significant memory (I realise that `expensive machine' has come to
mean `anything over $500', to many people: I'm just not one). It is
out of the hobby market, I agree (but hobbyists can find slower such
machines if they are willing to pay for the power and live with the
noise...)
All I'm trying to say is that if you have a problem that needs
slightly more than 32bits of address space which is in any way
commercially interesting, then the cost of a machine, with memory, to
solve it, is not in any sense prohibitive. Certainly, if the problem
is in the 4-16GB range hardware is likely to be cheap compared to the
time cost of implementing manual paging or whatever you need to do to
get the problem down to 2-4GB.
--tim
[...]
> If they don't have slots for >4GB of memory, then I have to protest
> that they aren't computers that "allow you to address that much memory
> in one process."
This is an incredibly naive qualification. While a low end 64-bit
machine may not have enough "slots" for over four gigabytes of RAM,
the processor and MMU can still address that amount of memory. For
example, unlike your average 32-bit machine a low-end 64-bit machine
can do such things as mmap() multi-gigabyte files without employing a
windowing hack.
> I've got an Alpha box that could cope with having 1.5GB of memory on
> it; I don't claim that's >4GB, despite the fact that the CPU would be
> _theoretically_ capable of the job. I can't put 4GB of RAM in the box
> (except via the futile exercise of taping a bag containing an extra
> 20GB to the side of it; that's not _usable_ memory :-)).
I myself have an Alpha with "only" 1.5GB of memory, but I have tens of
gigabytes of virtual memory and I can mmap() my entire 72GB disk if I
genuinely feel the need to. Since none of my applications have a
working set which exceeds the size of my physical memory, I am winning
big with a 64-bit machine which has well under 4GB of memory.
Do you actually have an application with a working set of over four
gigabytes?
[...]
> (Arguably there's #3: It may take time for there to be a x86-64 "code
> generator" for GCC and/or Python...)
Operating systems and compilers for x86-64 have been available for over a
year. While they may not be production quality at this time, working
code generators for this architecture are old news.
> The world rejoiced as Tim Bradshaw <t...@cley.com> wrote:
> > * Christopher Browne wrote:
> >> Do you have a computer that allows you to address that much memory in
> >> one process?
>
> > Yes, several. Other than the oldest of them they were all less than
> > £2,000 new. Although I don't think any of the (small) boxes we have
> > can physically have more than 4GB, so this would involve paging
> > unless the space was sparse. But we buy very low-end boxes.
>
> If they don't have slots for >4GB of memory, then I have to protest
> that they aren't computers that "allow you to address that much memory
> in one process."
>
> I've got an Alpha box that could cope with having 1.5GB of memory on
> it; I don't claim that's >4GB, despite the fact that the CPU would be
> _theoretically_ capable of the job. I can't put 4GB of RAM in the box
> (except via the futile exercise of taping a bag containing an extra
> 20GB to the side of it; that's not _usable_ memory :-)).
Physical memory is one thing, process size is another. I've written
software on IA-32 that had to do manual paging because of the process
size limit, and it would have been *so* much nicer/faster/easier to
just use a 64-bit architecture and let the kernel do the paging.
The memory access was more or less random within each 250MB
chunk, but not across all the several GB that the process wanted to
address, so paging would have been fine.
(I think this is relevant because didn't this start by wondering about
the usefulness of having a fully 64-bit lisp?)
--
/|_ .-----------------------.
,' .\ / | No to Imperialist war |
,--' _,' | Wage class war! |
/ / `-----------------------'
( -. |
| ) |
(`-. '--.)
`. )----'
http://citeseer.nj.nec.com/wilson92pointer.html
Paul
| You are aware that IA-32 has support for 36-bit addressing, right? (The
| PSE36 CPU flag reports its presence in a particular processor.) Chipsets
| with support for the full 64GB address space are certainly available, and
| Linux supports them in recent kernel versions, giving you 4GB of physical
| memory per process.
I remember discussions about how old OS limits (max. memory size,
fragmentation) prevented lisp images from growing much beyond about 1GB. With
4GB physical memory per process, is it likely that the new practical image
size limit on x86 approaches 4GB?
| The only serious short-coming in modern implementations of
| IA-32 is the register starvation. It is quite fascinating what Intel
| (and AMD) have been able to do with this architecture.
The Hammer may ease register starvation a good bit - even though they only
doubled the number of registers, it's worth more than that because the
number of general purpose registers just about tripled and the width is
doubled too (plus twice as many 128-bit SSE registers). Together, 32
registers aren't bad, and tens of millions of Hammers will be sold yearly.
AMD processors have long used internal register renaming, and probably they
have 64 or more internal registers. It is thus noteworthy and somewhat
credible that AMD defended the choice of 16+16 registers, saying the
benefits of further increase diminish quickly beyond that point, and there
would be some drawbacks.
AMD introduced the RISC-core-inside-a-CISC concept first, and now the Pentium 4
translates instructions to RISC-like micro-ops when they are first fetched, and its
trace cache retains the decoded form. (Probably VLIW or micro-ops are preferable terms over RISC.)
With some dramatization, we can say that the memory controller gets rid of
IA-32 and CISC-ness before the processor is even reached.
Robert
| I got a response via email suggesting AMD "Sledgehammer" as a possible
| upcoming alternative.
Sorry Christopher, I meant to post it here (see at the bottom).
| It is indeed a very attractive idea that in another year, there may be
| machines that are compatible with IA-32 (thus making them acceptable
| to the teeming hordes of desktop folk), and which offer 64 bit
| extensions.
|
| Two objections still emerge:
| 1. It's not available now, so it's no answer, now.
Correct.
| 2. It's not obvious that motherboards supporting 8GB or 16GB of RAM
| will be immediately available upon release of the architecture.
Indeed, according to roadmaps, server motherboards will arrive several
months after desktop (<=4GB) versions for the 64-bit AMD chips.
Robert
----- Original Message -----
From: "Christopher Browne" <cbbr...@acm.org>
Newsgroups: comp.lang.lisp
Sent: Friday, June 14, 2002 1:14 PM
Subject: Re: Upper limits of CL
| In the last exciting episode, "Robert Monfera" <mon...@fisec.com> wrote:
| > 4GB memory costs only a few hundred dollars, so addressing arrays
| > with 20-something bits (a few MB) is getting inadequate really fast.
| 64 bit systems that have all the hardware to support huge amounts of
| memory _aren't_ cheap.
IBM wants to sell you a 32-bit dual-processor capable server expandable to
4GB for $949 -
http://commerce.www.ibm.com/cgi-bin/ncommerce/CategoryDisplay?cgrfnbr=2072550&smrfnbr=2072488&cntrfnbr=1&cgmenbr=1&cntry=840&lang=en_US&scrfnbr=73
(You may get 10% off if you ask or pay with Amex.)
For 2GB memory, add $600. With 4GB memory, it is still less than $3000. I
expect similar or lower prices and larger physical memory capability with
the upcoming Hammers, but even the $10k price you mention does not sound
horrible for deployment.
| The problem is that no 64 bit architecture has leapt out as being a
| really good "risk." None are cheap; all are _highly_ tied to specific
| vendors; there has been lots of FUD since the late '90s about "Merced"
| being the only 64 bit architecture that could conceivably survive.
| Note that there is a serious dearth of vendors hawking it, even now.
The success of the Hammer looks inevitable, simply by virtue of it automatically
becoming the first mass-market 64-bit processor. It does not need
64-bit applications, it just has to replace the K7 Athlon, which has 20% of
the market. Reportedly Intel is developing Pentiums in the same direction.
Whereas in the past 7-8 families of low-volume (thus expensive) 64-bit
processors and 32-bit server processors competed, now it reduces to x86,
PowerPC, Itanium and Sparc at the most, of which 3 are already supported by
Lisp implementations to a good extent, and the 4th (Itanium) hasn't started
selling yet. Memory for them is getting really cheap. _Laptops_ can have
1GB RAM now.
Robert
Robert> "Erik Naggum" <er...@naggum.net> wrote in message
Robert> news:32331562...@naggum.net...
Robert> | You are aware that IA-32 has support for 36-bit addressing, right? (The
Robert> | PSE36 CPU flag reports its presence in a particular processor.) Chipsets
Robert> | with support for the full 64GB address space are certainly available, and
Robert> | Linux supports them in recent kernel versions, giving you 4GB of physical
Robert> | memory per process.
Robert> I remember discussions about how old OS limits (max. memory size,
Robert> fragmentation) prevented lisp images to grow much beyond about 1GB. With
Robert> 4GB physical memory per process, is it likely that the new practical image
Robert> size limit on x86 approaches 4GB?
For CMUCL, the heap can be 2 GB on FreeBSD, 1.75 GB for the other BSDs
and Linux. The remaining memory space goes to the static space, the
read-only space, the binding stack, and the control stack. It also
has to leave some room for C code, C stack, and dynamic libraries.
Does this qualify as "approaches 4GB"? :-)
Robert> | The only serious short-coming in modern implementations of
Robert> | IA-32 is the register starvation. It is quite fascinating what Intel
Robert> | (and AMD) have been able to do with this architecture.
Robert> The Hammer may ease register starvation a good bit - even though they only
Robert> doubled the number of registers, it's worth more than that because the
Robert> number of general purpose registers just about tripled and the width is
Robert> doubled too (plus double of the 128-bit SSE registers). Together, 32
Robert> registers aren't bad, and tens of millions of Hammers will be sold yearly.
Considering that MIPS, HP-PA, PPC, and Sparc all have 32 registers, I
think 32 is a good number.
Ray
| For CMUCL, the heap can be 2 GB on FreeBSD, 1.75 GB for the other BSDs
| and Linux. The remaining memory space goes to the static space, the
| read-only space, the binding stack, and the control stack. It also
| has to leave some room for C code, C stack, and dynamic libraries.
|
| Does this qualify as "approaches 4GB"? :-)
Can you actually create 1900 to 2000 instances of 1MB arrays? I just consed
1024 such arrays in ACL on Windows (XP Professional has an OS-imposed
limit of 2GB per process), but as someone, probably Duane, pointed out, ACL now
requires a contiguous block of memory, and 1.1GB was the largest I could get.
After some tweaking, ACL and the OS behaved pretty well; I ran some tests
looping through the vectors and doing global GCs.
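The test was roughly like this (simplified, from memory):

(defparameter *chunks*
  (loop repeat 1024
        collect (make-array (* 128 1024)        ; 128K doubles = 1MB per vector
                            :element-type 'double-float
                            :initial-element 0d0)))

(dolist (v *chunks*) (incf (aref v 0) 1d0))     ; touch every chunk
(excl:gc t)                                     ; ACL's (implementation-specific) global GC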
Robert
Yup. Thanks for reminding me when I visited. The bigger array sizes are
going to be in place for 6.2. The new limit for 64-bit versions is going
to be 72057594037927936, or (expt 2 56). Hopefully that will be enough
for a while.
--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)
> Christopher Browne <cbbr...@acm.org> writes:
>
> > I've got an Alpha box that could cope with having 1.5GB of memory on
> > it; I don't claim that's >4GB, despite the fact that the CPU would be
> > _theoretically_ capable of the job. I can't put 4GB of RAM in the box
> > (except via the futile exercise of taping a bag containing an extra
> > 20GB to the side of it; that's not _usable_ memory :-)).
>
> I myself have an Alpha with "only" 1.5GB of memory, but I have tens of
> gigabytes of virtual memory and I can mmap() my entire 72GB disk if I
> genuinely feel the need to. Since none of my applications have a
> working set which exceeds the size of my physical memory, I am winning
> big with a 64-bit machine which has well under 4GB of memory.
>
> Do you actually have an application with a working set of over four
> gigabytes?
We have a customer of a customer who is up to a requirement for at least
6 GB for their dataset. We had to provide a patch in order for that to
work (it was stupid, really; we still had a few int usages where longs were
required, and so our heap_sbrk calls were going negative when requesting
more than 2 GB at a time for new lisp heap). I know of no others that are
up to datasets that large yet, but many are close, and it's only a matter
of time...
> It's probably the case that none of the CL implementations have yet
> been ported to 64-bit environments (but I think I've seen mention of
> Emacs on 64-bit systems allowing multi-GB buffers).
There's no doubt about it -- on a 64-bit system such as Tru64 under
Alpha, XEmacs has 63-bit integers, which means the size of your
buffers is pretty much only limited by the amount of free virtual
memory.
The same goes for UltraSPARC in 64-bit mode ("LP64"), which XEmacs
supports just fine.
Duane> Yup. Thanks for reminding me when I visited. The bigger array sizes are
Duane> going to be in place for 6.2. The new limit for 64-bit versions is going
Duane> to be 72057594037927936, or (expt 2 56). Hopefully that will be enough
Duane> for a while.
Why (expt 2 56)? Does this mean you are using 8 bits for tags?
Ray
Robert> "Raymond Toy" <t...@rtp.ericsson.se> wrote in message
Robert> news:4nznxwc...@edgedsp4.rtp.ericsson.se...
Robert> | For CMUCL, the heap can be 2 GB on FreeBSD, 1.75 GB for the other BSDs
Robert> | and Linux. The remaining memory space goes to the static space, the
Robert> | read-only space, the binding stack, and the control stack. It also
Robert> | has to leave some room for C code, C stack, and dynamic libraries.
Robert> |
Robert> | Does this qualify as "approaches 4GB"? :-)
Robert> Can you actually create 1900 to 2000 instances of 1MB arrays? I just consed
I don't have BSD to try but on Linux you should be able to create 1500
or so such arrays. However, I just tried that and it stops after 500
MB or so, despite my telling CMUCL to use more heap. I'll have to
look into that.
Ray
Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
>> >>>>> "Duane" == Duane Rettig <du...@franz.com> writes:
>>
Duane> Yup. Thanks for reminding me when I visited. The bigger array sizes are
Duane> going to be in place for 6.2. The new limit for 64-bit versions is going
Duane> to be 72057594037927936, or (expt 2 56). Hopefully that will be enough
Duane> for a while.
>>
>> Why (expt 2 56)? Does this mean you are using 8 bits for tags?
Duane> For types, in the header word of the array.
Ok. I guess this means that the header word contains the 8-bit header
type and the remaining bits are used to indicate the length of the
array.
If so, this is different from CMUCL where the length of the array is
stored as a fixnum in the word following the header word.
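That would also account for the quoted limit exactly:

* (expt 2 (- 64 8))
72057594037927936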
Ray
> >>>>> "Duane" == Duane Rettig <du...@franz.com> writes:
>
> Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
> >> >>>>> "Duane" == Duane Rettig <du...@franz.com> writes:
> >>
> Duane> Yup. Thanks for reminding me when I visited. The bigger
> array sizes are going to be in place for 6.2. The new limit for
> 64-bit versions is going to be 72057594037927936, or (expt 2 56).
> Hopefully that will be enough for a while.
> >>
> >> Why (expt 2 56)? Does this mean you are using 8 bits for tags?
>
> Duane> For types, in the header word of the array.
>
> Ok. I guess this means that the header word contains the 8-bit header
> type and the remaining bits are used to indicate the length of the
> array.
>
> If so, this is different from CMUCL where the length of the array is
> stored as a fixnum in the word following the header word.
Yes. This is a tradeoff that every implementation has to make. On the
one hand, there is a slightly lower limit for array sizes if the size
field is encroached upon by the type byte. On the other hand, using
an extra word for a size field forces all arrays (actually, half, on
average) to be larger. We chose the former situation because a
one-element array can be only one Allocation Unit (AU) but with an
extra word being used for size, the same array would have to be 2 AUs.
> Yes. This is a tradeoff that every implementation has to make. On the
> one hand, there is a slightly lower limit for array sizes if the size
> field is encroached upon by the type byte. On the other hand, using
> an extra word for a size field forces all arrays (actually, half, on
> average) to be larger. We chose the former situation because a
> one-element array can be only one Allocation Unit (AU) but with an
> extra word being used for size, the same array would have to be 2 AUs.
This is the sort of thing that really interests me. I wish there were
a Lisp Implementors list somewhere where people could yak about
implementation concepts.
Of course, there's the GC-List, but that really only covers GC, and it
is not particularly active lately. (I suppose it'll pick up once
OOPSLA comes along again.)
'james
--
James A. Crippen <ja...@unlambda.com> ,-./-. Anchorage, Alaska,
Lambda Unlimited: Recursion 'R' Us | |/ | USA, 61.20939N, -149.767W
Y = \f.(\x.f(xx)) (\x.f(xx)) | |\ | Earth, Sol System,
Y(F) = F(Y(F)) \_,-_/ Milky Way.
Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
>>
>> Ok. I guess this means that the header word contains the 8-bit header
>> type and the remaining bits are used to indicate the length of the
>> array.
>>
>> If so, this is different from CMUCL where the length of the array is
>> stored as a fixnum in the word following the header word.
Duane> Yes. This is a tradeoff that every implementation has to make. On the
Duane> one hand, there is a slightly lower limit for array sizes if the size
Duane> field is encroached upon by the type byte. On the other hand, using
Duane> an extra word for a size field forces all arrays (actually, half, on
Duane> average) to be larger. We chose the former situation because a
Duane> one-element array can be only one Allocation Unit (AU) but with an
Duane> extra word being used for size, the same array would have to be 2 AUs.
I see. Just curious how this design decision came about. Are one-element
arrays that common? Or at least common enough to warrant this
savings?
According to the CMUCL design doc, the extra word "allows for much
larger vectors, and it allows LENGTH to simply access a single word
without masking or shifting". Of course, until recently, no version
of CMUCL allowed you to allocate even 2^24 of anything except
strings. (The heap was 64 MB on all platforms, I think.)
Ray
James> Duane Rettig <du...@franz.com> writes:
>> Yes. This is a tradeoff that every implementation has to make. On the
>> one hand, there is a slightly lower limit for array sizes if the size
>> field is encroached upon by the type byte. On the other hand, using
>> an extra word for a size field forces all arrays (actually, half, on
>> average) to be larger. We chose the former situation because a
>> one-element array can be only one Allocation Unit (AU) but with an
>> extra word being used for size, the same array would have to be 2 AUs.
James> This is the sort of thing that really interests me. I wish there was
James> a Lisp Implementors list somewhere that people could yak about
James> implementation concepts.
The CMUCL Design document has some information about this which you
might find interesting.
Ray
> >>>>> "Duane" == Duane Rettig <du...@franz.com> writes:
>
> Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
> >>
> >> Ok. I guess this means that the header word contains the 8-bit header
> >> type and the remaining bits are used to indicate the length of the
> >> array.
> >>
> >> If so, this is different from CMUCL where the length of the array is
> >> stored as a fixnum in the word following the header word.
>
> Duane> Yes. This is a tradeoff that every implementation has to make. On the
> Duane> one hand, there is a slightly lower limit for array sizes if the size
> Duane> field is encroached upon by the type byte. On the other hand, using
> Duane> an extra word for a size field forces all arrays (actually, half, on
> Duane> average) to be larger. We chose the former situation because a
> Duane> one-element array can be only one Allocation Unit (AU) but with an
> Duane> extra word being used for size, the same array would have to be 2 AUs.
>
> I see. Just curious on how this design decision came about. Are one
> element arrays that common? Or at least common enough to warrant this
> savings?
No, that's backwards. It is the lighter weight of arrays that makes it
more conducive to allocate more of them with impunity. This applies to
arrays of all sizes. It also applies to cons cells as well (I know there
are implementations that use three words for cons cells rather than two,
and I think I remember hearing of one which used 4 words per cons).
It is also the same reason why we allow hash-tables to be created that
have only three keys. The lighter weight any particular object can be,
the more of them that can be allocated by a user.
> According to the CMUCL design doc, the extra word "allows for much
> larger vectors, and it allows LENGTH to simply access a single word
> without masking or shifting". Of course, until recently, no version
> of CMUCL allowed you to allocate even 2^24 of anything except
> strings. (The heap was 64 MB in all platforms, I think.)
Kind of an ironic twist, eh, that the design intention was thwarted by
reality, even if only temporarily?
I wouldn't make the same claim (i.e. "allows for much larger vectors")
when calling out (expt 2 56) as the array-*-limit - I know of nobody who
could even test such a limit currently, so calling out this limit
simply states that the total size of an array a CL user can allocate
on one of our 64-bit lisps will not be limited by this value, but by how
much virtual memory users can actually build into their machines.
There's a difference between what is _theoretically_ allowed, and what
is actually achievable.
Obviously, there are _no_ systems out there allowing you to truly have
2^64 bytes/words/nybbles/... of RAM. That quantity of RAM would
probably cost as much as all computers constructed thus far in history
:-).
Even on the more reasonable side of things, many of the earlier 64 bit
systems predate RAM becoming cheap, and may simply not have enough
slots around to allow having 2GB (2^31) of memory. That's
unfortunate, but nonetheless the case. That is likely to apply to
just about all of the "desktop" 64 bit systems that are getting
released to the used market.
>>The price immediately jumps sharply, the RAM probably _isn't_ cheap,
>>and you probably get to gulp on $10K USD before you've gotten
>>started, and there may not be a suitable CL yet for the system
>>you've bought.
> Prices will come down when they become more common.
Doubtless so.
> It's probably the case that none of the CL implementations have yet
> been ported to 64-bit environments (but I think I've seen mention of
> Emacs on 64-bit systems allowing multi-GB buffers). That should
> also happen when the systems get cheaper, or if a big customer has
> an application that requires a 64-bit Lisp.
What is unfortunate is that nobody got around to building a _nice_
segmented OS on the IA-32 architecture. There are addressing modes
that should allow that to be handled nicely, and the horrors of the
64KB segments associated with the earlier days of Intel wouldn't apply
the same way to the use of less-drainbamaged segmentation schemes.
Having _huge_ linear memory spaces isn't forcibly mandatory, so long
as there's a way of reasonably accessing very large amounts of memory.
--
(reverse (concatenate 'string "gro.gultn@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/multics.html
The light at the end of the tunnel may be an oncoming dragon.
Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
>> I see. Just curious on how this design decision came about. Are one
>> element arrays that common? Or at least common enough to warrant this
>> savings?
Duane> No, that's backwards. It is the lighter weight of arrays that make it
Duane> more conducive to allocate more of them with impunity. This applies to
Duane> arrays of all sizes. It also applies to cons cells, as well (I know there
Duane> are implementations that use three words for cons cells rather than two,
Duane> and I think I rememeber hearing of one which used 4 words per cons).
Duane> It is also the same reason why we allow hash-tables to be created that
Duane> have only three keys. The lighter weight any particular object can be,
Duane> the more of them that can be allocated by a user.
I can understand why you might want cons cells to be small. I find
less of a reason for arrays, but that might be because my arrays tend
to be fairly large and one extra word out of thousands for each array is
irrelevant to me. :-) As an implementor, I'm sure you have a much
different view. Thanks for the info.
>> According to the CMUCL design doc, the extra word "allows for much
>> larger vectors, and it allows LENGTH to simply access a single word
>> without masking or shifting". Of course, until recently, no version
>> of CMUCL allowed you to allocate even 2^24 of anything except
>> strings. (The heap was 64 MB in all platforms, I think.)
Duane> Kind of an ironic twist, eh, that the design intention was thwarted by
Duane> reality, even if only temporarily?
But given that the original CMUCL developers stopped around 1990 or
so, it's not really a surprise. We used to drool over 64 MB
Sparcstations back then. I'm pretty sure that not until a few years
ago did anyone even ask for more heap space.
Duane> I wouldn't make the same claim (i.e. "allows for much larger vectors")
Duane> when calling out (expt 2 56) as the array-*-limit - I know of nobody who
Duane> could even test such a limit currently, and so calling this limit out
Duane> simply states that the total size array that a CL user could allocate
Duane> on one of our 64-bit lisps will not be limited by this value, once users
Duane> are able to build up machines with this much virtual memory.
Indeed. If I remember correctly, the Sparc V9 architecture says that
there are only 44 bits worth of virtual memory even though you have
64-bit addresses. There's a huge hole in the middle of the space.
Ray
So? I was just talking about more than 4GB, not the maximum theoretical
possibility.
Furthermore, the amount of RAM is irrelevant, since "memory in one process"
refers to *virtual* memory, not physical memory. Virtual memory has almost
always been used to allow addressing more memory than is available as
physical RAM.
--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
If the O.P. was actually trying to invert matrices of dimension
'(20000 20000), then they more than likely care somewhat about the
amount of memory physically available.
--
(concatenate 'string "chris" "@cbbrowne.com")
http://cbbrowne.com/info/spiritual.html
I'm a Lisp variable -- bind me!
Christopher> Barry Margolin <bar...@genuity.net> wrote:
>> In article <aeotnp$8r9qj$1...@ID-125932.news.dfncis.de>,
>> Christopher Browne <cbbr...@acm.org> wrote:
>>> Barry Margolin <bar...@genuity.net> wrote:
>>>> In article <aed89h$61hf1$1...@ID-125932.news.dfncis.de>,
>>>> Christopher Browne <cbbr...@acm.org> wrote:
>>>>> Do you have a computer that allows you to address that much memory in
>>>>> one process?
>>>>
>>>> If a 64-bit architecture doesn't allow this, what's the point of it?
>>>
>>> There's a difference between what is _theoretically_ allowed, and what
>>> is actually achievable.
>>>
>>> Obviously, there are _no_ systems out there allowing you to truly have
>>> 2^64 bytes/words/nybbles/... of RAM. That quantity of RAM would
>>> probably cost as much as all computers constructed thus far in history
>>> :-).
>>
>> So? I was just talking about more than 4GB, not the maximum theoretical
>> possibility.
>>
>> Furthermore, the amount of RAM is irrelevant, since "memory in one process"
>> refers to *virtual* memory, not physical memory. Virtual memory has almost
>> always been used to allow addressing more memory than is available as
>> physical RAM.
Christopher> If the O.P. was actually trying to invert matrices of dimension
Christopher> '(20000 20000), then they more than likely care somewhat about the
Christopher> amount of memory physically available.
If I did the math right, this matrix takes about 3GB of memory for
double-precision values. I'm pretty sure an inversion can be done
in-place, so you only need the array itself and probably a couple more
20000-element arrays to do it.
Even if not, the OP has other more serious problems if he wants to
invert such a matrix and get anything meaningful, so he's probably
caring about the wrong thing.
Ray
Whether the memory is real or virtual affects the performance, but not the
ability to do it. In some applications you need an answer, and you don't
mind waiting a while for it.
Although I shall not speak of the efficiency issue since I have never done
anything like this before, it is fairly obvious that all you need to do this
is some fairly minimal amount of memory (only kilobytes in magnitude) and
enough disk space to hold one or two copies of this matrix. Binary I/O with
objects of the appropriate type is a requirement, of course. There is, also
of course, no material difference between this and system-managed virtual
memory, but it might not even matter that it is not random access if you are
actually going to traverse the data sequentially. You could even do the old
tape-station trick and write a temporary tape/file with redundant data in
your access pattern if this is faster than random access to disk or virtual
memory -- there is strong empirical evidence that paging is slower than
predictable sequential disk access, and you get a free 40G disk with every
three pairs of sneakers these days.
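A minimal sketch of that kind of sequential, one-block-at-a-time pass, assuming
the implementation supports :element-type 'double-float file streams (otherwise
you read octets and decode them yourself):

;; Sketch: walk an N x N double-float matrix stored row-major in a file,
;; holding only one block of rows in memory at a time.
(defun map-row-blocks (fn path n &optional (rows-per-block 100))
  (let ((buffer (make-array (* rows-per-block n) :element-type 'double-float)))
    (with-open-file (in path :element-type 'double-float)
      (loop for start from 0 below n by rows-per-block
            for nrows = (min rows-per-block (- n start))
            do (read-sequence buffer in :end (* nrows n))
               (funcall fn start nrows buffer)))))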
A good engineer would balance the tradeoffs and solve the problem within the
existing resource constraints. A theoretical computer scientist would whine
until he got a big enough machine to implement the mathematical solution with
the least amount of fussing about the constraints of the real world. I know
which one I would hire to get the job done.
When I hear about these amazing datasets, I keep thinking that they need to
be generated somehow. The only applications I know of that require multiple
gigabytes of memory (and then tens of gigabytes) are those that require
multiple terabytes of disk space (and then easily tens of terabytes, too). A
friend of mine from the U of Oslo once headed the IT department in Statoil
responsible for geological surveys. They acquired the first homogeneous
disk farm in Norway capable of 1 terabyte of simultaneous access (they already
had _hundreds_ of terabytes of "virtual disk space" on enormous tape systems
that were loaded onto disk as needed). These datasets were so huge that they
took many months to accumulate -- they had to send physical _ships_ out to
sea to scan the seabed, at a cost of a dollar a byte, and the sheer transfer
time from tape to disk could be a couple _days_ after the physical tapes had
also taken several days to move from ship to the computer facility. Then, of
course, came the processing of these things. Dedicated hardware costing many
millions of dollars churned on the data for weeks on end before it finally
spit out a colorful map of the seabed that took several days to plot on their
monster plotters so they could hang them up on the walls, hundreds of meters
long in all. Later, computers became powerful enough to do virtual reality
with real-time 3D navigation over and into the analyzed seabed. This amazing
feat was all done in Fortran, unsurprisingly, in software costing additional millions
of dollars and comprising tens of millions of lines of code developed over
decades. This was all worth it because it takes approximately one day of
actual production from an oil well to pay for all the computing power
necessary to punch the right hole in the ground. The whole country of Norway
can run for twice as long as it takes to pump up the oil, from the _taxation_
on the _profits_ alone, socialist plan economy and social security and
defense and everything. This is not some jerkoff hobbyist whining about not
getting a 64-bit Common Lisp -- this is big enough to invest 100 man-years to
develop dedicated programming languages and spawning moderately large
companies just to cater to a single industry of a _very_ small number of
companies. (We have no mom-and-pop oil rigs in Norway.) It would be cheaper
to implement a Common Lisp system of your own for this kind of operation than
to ask someone else to do it for you. (In an unrelated industry, that is
just how Erlang got started.) The moral of this long-winded example is
this: If you really need to invert a 20,000 by 20,000 matrix of doubles, and
it is not just some kind of academic masturbation, the memory and Common Lisp
and 64-bit hardware and whatever else you can come up with on the IT side are
going to be minuscule compared to the rest of the costs of the system. Take
meteorology -- the first Cray purchased in Norway was used to help our
fishing industry plan their expeditions to sea. According to legend, the
hardware paid for itself the first week it had been in operation, but the
cost of satellite imaging, telecommunications equipment capable of feeding
the machine in real time, etc, dwarfed the Cray hardware by two orders of
magnitude. Apart from the oil, we have _weather_ in this mountainous coastal
country. Billions of dollars have been poured into meteorological prediction
in this country alone since computers were able to help at all. Under such
circumstances, I can fully sympathize with the need for more than 32-bit
addressing and I appreciate the need for the raw computational power that
goes into these things, but if this talk about 64-bit implementations is only
some academic exercise, I actually find it _insulting_ to the brilliant minds
who have had to do without it. Besides, if you really need gigabyte upon
gigabyte of memory and the hardware to utilize it, the only thing between you
and satisfying that need has been money for at least 10 years. It's not like
it's Common Lisp's fault that you haven't had that money -- and if you had
had it, would you have had anything left over to do anything useful with it
over a reasonable period of time? Like, you don't buy a 150-million-dollar
printing press just because you got an idea about publishing a newspaper --
you upgrade from the 75-million-dollar printing press when you approach 20
out of 24 hours running time 7 days a week and want to not run out of hours
of the day before the new press can be delivered and run in.
So pardon my cynical twist, but what are you doing with that 20,000×20,000
double-precision floating point matrix you say you need to invert _today_?
If you answer "nutt'n, I jus kinda wondered what it'd be like, you know", you
should be very happy that I am most likely more than 3000 miles away from
you, or I would come over and slap you hard.
And if you _are_ doing serious stuff of this magnitude, why do you even
bother with run-of-the-mill Common Lisp implementations on stock hardware?
Implement your own goddamn Common Lisp system optimized for your hardware and
your domain-specific language and other needs. That was -- after all -- how
several of the Common Lisp implementations out there got started. It wasn't
a miracle then, and it won't be a miracle now. Just f-ing do it.
[ If I sound grumpy, it is only because I have come across too many idiots of
the "it can't be done" persuasion lately, the kind of managers who have an
aquarium in their office because fifteen brains think better than one. ]
--
Guide to non-spammers: If you want to send me a business offer, please be
specific and do not put "business offer" in the Subject header. If it is
urgent, do not use the word "urgent". If you need an immediate answer,
give me a reason, do not shout "for your immediate attention". Thank you.
It was asserted to me that inverting the array was meaningful -- however
I don't know the exact application beyond that it's something a CS prof at
the local university is working with. Our default position with
respect to memory was to stuff the machine with as much RAM as
possible & use plenty of swap. Either that, or beg, bribe or steal
time on the university's fairly studly Sun hardware.
The prof in question ended up writing a Pascal program (poor guy --
the C people have it good compared to the Pascal people) using the
"Strassen algorithm". I don't know if it's finished running yet...
Gregm