Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

64-bit arrays

0 views
Skip to first unread message

Roedy Green

unread,
Sep 16, 2009, 7:11:09 AM9/16/09
to
I was wondering how you might modify java to permit 64-bit array
indexing.

Here is one possibility:

int[long] x = new int[ 3000000000L ];

String[long][int] y = new String[ 3000000000L ] [ 10 ];
--
Roedy Green Canadian Mind Products
http://mindprod.com

"Many people tend to look at programming styles and languages like religions: if you belong to one, you cannot belong to others. But this analogy is another fallacy."
~ Niklaus Wirth (born: 1934-02-15 age: 75)

Roedy Green

unread,
Sep 16, 2009, 9:49:07 AM9/16/09
to
On Wed, 16 Sep 2009 04:11:09 -0700, Roedy Green
<see_w...@mindprod.com.invalid> wrote, quoted or indirectly quoted
someone who said :

>int[long] x = new int[ 3000000000L ];


You might even allow Strings as indexes for abbreviated HashMap
syntax.

Daniel Pitts

unread,
Sep 16, 2009, 6:31:30 PM9/16/09
to
Roedy Green wrote:
> On Wed, 16 Sep 2009 04:11:09 -0700, Roedy Green
> <see_w...@mindprod.com.invalid> wrote, quoted or indirectly quoted
> someone who said :
>
>> int[long] x = new int[ 3000000000L ];
>
>
> You might even allow Strings as indexes for abbreviated HashMap
> syntax.
Why stop at Strings?
Why not allow arbitrary Object types as indexes?

Any way, I don't think you can enable 64bit indexes without a major
disruptive change. You can however, use arrays as "primitive" building
blocks to create LargeArray classes, which may have arrays of arrays,
where the Most Significant 32bits of the index is the outter array
index, and the Least Significant 32bits of the inner:

private class LargeIntArray {
int[][] array;

public int get(long index) {
array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE];
}

public void set(long index, int value) {
array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE] =
value;
}
}


--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

Lew

unread,
Sep 16, 2009, 9:59:35 PM9/16/09
to
Daniel Pitts wrote:
> Any way, I don't think you can enable 64bit indexes without a major
> disruptive change. You can however, use arrays as "primitive" building
> blocks to create LargeArray classes, which may have arrays of arrays,
> where the Most Significant 32bits of the index is the outter array
> index, and the Least Significant 32bits of the inner:
>
> private class LargeIntArray {
> int[][] array;
>
> public int get(long index) {
> array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE];
> }
>
> public void set(long index, int value) {
> array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE] =
> value;
> }
> }

You need to downcast those array indexes to int.

<sscce>
package testit;

import java.io.Serializable;
import java.lang.reflect.Array;

public class LargeArray <E extends Serializable> implements Serializable
{
private static final long serialVersionUID = 1L;

private final E [] [] storage;

public LargeArray( Class <? extends E> clazz, int outer, int inner )
{
@SuppressWarnings( "unchecked" ) // uses underlying class
E [] [] stor = (E [] []) Array.newInstance( clazz, outer, inner );
this.storage = stor;
}

public E get( long index )
{
return this.storage [(int)(index>>32)] [(int) index];
}

public void set( long index, E value )
{
this.storage [(int)(index>>32)] [(int) index] = value;
}
}
</sscce>

--
Lew

Daniel Pitts

unread,
Sep 17, 2009, 4:36:47 PM9/17/09
to
Lew wrote:
> Daniel Pitts wrote:
>> Any way, I don't think you can enable 64bit indexes without a major
>> disruptive change. You can however, use arrays as "primitive"
>> building blocks to create LargeArray classes, which may have arrays of
>> arrays, where the Most Significant 32bits of the index is the outter
>> array index, and the Least Significant 32bits of the inner:
>>
>> private class LargeIntArray {
>> int[][] array;
>>
>> public int get(long index) {
>> array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE];
>> }
>>
>> public void set(long index, int value) {
>> array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE] =
>> value;
>> }
>> }
>
> You need to downcast those array indexes to int.
Thanks, it was only pseudo-code meant to demonstrate the concept, but
you are indeed technically correct :-)

Lew

unread,
Sep 17, 2009, 7:39:10 PM9/17/09
to

However, I did not correct the major flaws, i.e., I left the shift at 32
instead of 31, I ignored the calculation of initial size from a long, I didn't
lock down the second dimension to Integer.MAX_VALUE, and of course, I
completely ignored the memory pressure of having an array exceed 8 GiB in size
and thus the question of whether 'LargeArray' is even useful or when.

--
Lew

Roedy Green

unread,
Sep 19, 2009, 6:04:06 AM9/19/09
to
On Thu, 17 Sep 2009 19:39:10 -0400, Lew <no...@lewscanon.com> wrote,

quoted or indirectly quoted someone who said :

> I

>completely ignored the memory pressure of having an array exceed 8 GiB in size
>and thus the question of whether 'LargeArray' is even useful or when.
>
>--

I hope that turns out to be one of those Bill Gates comments about a
desktop computer never having a need for more than 640K.


--
Roedy Green Canadian Mind Products
http://mindprod.com

"Perfect reusable components are not obtained at the first shot."
~ Bertrand Meyer (born: 1950 age: 59) 1989, creator of design by contract and the Eiffel language.

Lew

unread,
Sep 19, 2009, 10:42:29 AM9/19/09
to
Lew quoted or indirectly quoted someone who said :

>> I completely ignored the memory pressure of having an array exceed 8 GiB
>> in size and thus the question of whether 'LargeArray' is even useful or when.

Roedy Green wrote:
> I hope that turns out to be one of those Bill Gates comments about a
> desktop computer never having a need for more than 640K.

I hope it doesn't. In any event my comment was not a prediction but historic.

How many of your computers have more than 8 GiB of RAM on them, much less to
spare for a single data structure?

--
Lew

Casey Hawthorne

unread,
Sep 20, 2009, 2:22:51 AM9/20/09
to
>How many of your computers have more than 8 GiB of RAM on them, much less to
>spare for a single data structure?

Computers for video processing, scientific computing where one usually
has data of huge^3, etc.
--
Regards,
Casey

Lew

unread,
Sep 20, 2009, 10:10:50 AM9/20/09
to
(Please attribute quotations.)

Lew wrote:
>> How many of your computers have more than 8 GiB of RAM on them, much less to
>> spare for a single data structure?

Casey Hawthorne wrote:
> Computers for video processing, scientific computing where one usually
> has data of huge^3, etc.

You didn't answer any part of the question - "how many", "of your" and "for a
single data structure" being salient. Is there even one video-processing app
that requires a single structure to exceed 8 GB?

I work on enterprise systems where they allocate 16-64 GB per JVM and process
tens of millions of documents during a peak week. These applications can't
tolerate a single data structure that requires 16 GB. When things start to
exceed one GB per structure they look for disk to supplement the activity.

So, Casey, do you from your own experience know of a specific situation that
tolerates, let alone requires a single Java structure of 16 GB or larger? (16
GiB is the minimum size of a 'LargeArray', since a single-row one is
equivalent to a regular int-indexed array and 'LargeArray' wouldn't be needed
there.) If you do, cite it instead of vaguely waving your hands about "video
processing" or "huge^3", whatever that is.

Note that I'm not saying such a thing cannot ever exist. Only that at present
the use case is so rare as not to justify a language change yet.

--
Lew

Roedy Green

unread,
Sep 21, 2009, 10:46:01 AM9/21/09
to
On Sat, 19 Sep 2009 10:42:29 -0400, Lew <no...@lewscanon.com> wrote,

quoted or indirectly quoted someone who said :

>


>How many of your computers have more than 8 GiB of RAM on them, much less to
>spare for a single data structure?

Back in the 90s the oil companies had such beasts where their
databases resided entirely in RAM. They could not process nearly fast
enough otherwise, and they had money to throw at the problem.

RAM is cheap enough so that for even a modest increase in
productivity, it will pay businesses to load up to whatever the
current motherboard limit is.

Recall the long and short pointers you had to fool around with in C to
deal with the segmented architecture of the 80286. It greatly
complicates coding. I don't think many people were willing to stick
with it, even if the code were tighter than flat 32-bit addressing.

--
Roedy Green Canadian Mind Products
http://mindprod.com

"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."
~ Howard Aiken (born: 1900-03-08 died: 1973-03-14 at age: 73)

Roedy Green

unread,
Sep 21, 2009, 10:52:32 AM9/21/09
to
On Sat, 19 Sep 2009 23:22:51 -0700, Casey Hawthorne
<caseyhHA...@istar.ca> wrote, quoted or indirectly quoted
someone who said :

>Computers for video processing, scientific computing where one usually


>has data of huge^3, etc.

A friend worked for Creo. They produced high speed PostScript colour
printers for the magazine industry. The images were huge and high res
with very accurate colour depth -- in other words huge amounts of RAM.
They needed special hardware to shovel these bits around quickly.

Another place you might need huge amounts of RAM is in a cable TV
provider studio. You have hundreds of simultaneous video inputs and
outputs.

--
Roedy Green Canadian Mind Products
http://mindprod.com

"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."

Roedy Green

unread,
Sep 21, 2009, 11:51:22 AM9/21/09
to
On Sun, 20 Sep 2009 10:10:50 -0400, Lew <no...@lewscanon.com> wrote,

quoted or indirectly quoted someone who said :

>>> How many of your computers have more than 8 GiB of RAM on them, much less to

>>> spare for a single data structure?

anything 3D gobbles RAM amazingly. I suspect the weather predicting
models could use giant arrays.


--
Roedy Green Canadian Mind Products
http://mindprod.com

"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."

Lew

unread,
Sep 21, 2009, 8:34:19 PM9/21/09
to
Roedy Green wrote:
> On Sun, 20 Sep 2009 10:10:50 -0400, Lew <no...@lewscanon.com> wrote,
> quoted or indirectly quoted someone who said :
>
>>>> How many of your computers have more than 8 GiB of RAM on them, much less to
>>>> spare for a single data structure?
>
> anything 3D gobbles RAM amazingly. I suspect the weather predicting
> models could use giant arrays.

Yes, but by their nature 3D models tend to have multidimensional arrays. A
lot of them will not at this time be written in Java, and if individual axes
of the multi-D arrays in those that are Java-based require more than
Integer.MAX_VALUE elements in any one dimension, an even rarer requirement,
they have workarounds.

However, you missed the point of my question. I didn't ask if you can
envision a requirement for that much memory. I asked how many of *your*
computers (expand to those you work with, not necessarily own yourself) have
>8 GiB RAM (really, >16 GiB to accommodate 'LargeArray' as envisioned here),
never mind enough to spare that much for a single array. My point is that
large-array applications are rare, not that they're non-existent, and that
existing language features allow Java programmers to handle them without
changing how array indexes work. Yet.

Even were that much RAM available, it is likely that it would be distributed
around multiple nodes (JVMs) so that multiple processing resources would come
to bear on the problem. Large-model arrays don't necessarily need long indexes.

--
Lew

Lothar Kimmeringer

unread,
Sep 24, 2009, 9:21:59 AM9/24/09
to
Lew wrote:

> I work on enterprise systems where they allocate 16-64 GB per JVM and process
> tens of millions of documents during a peak week.

So? That just shows that you're in a business where you don't have
to cope with large data-structures.

> These applications can't
> tolerate a single data structure that requires 16 GB. When things start to
> exceed one GB per structure they look for disk to supplement the activity.

So? It just shows that your application is

- bad designed if that happens very often and has a bad impact on the
overall performance
- is not designed to cope with large data-structures because it
rarely happens

> So, Casey, do you from your own experience know of a specific situation that
> tolerates, let alone requires a single Java structure of 16 GB or larger?

In dclj there were discussions about this as well and there are people
who implement applications in the health industry where large high-
quality pictures have to be processed and you more and more reach
the limit of arrays in java. If you have to analyse three-dimensional
data within a single array, you reach the magical point with a cube
of 322 ARGB pixels side-length. With 2D data you reach the limit
with 11585 pixels. This sounds like much but medical data increases in
size quite fast.

> (16
> GiB is the minimum size of a 'LargeArray', since a single-row one is
> equivalent to a regular int-indexed array and 'LargeArray' wouldn't be needed
> there.)

Integer.MAX_VALUE is something about 2 billion, so where do the
remaining 8 GB come from in your calculation? With a primitive
int-array, the values should reside in a continuous area of the
memory with no need to have a reference to each element. You
might be correct with a 64 bit-system but that should be VM-
dependent how an int-array is saved in memory (where a C-int is
64 bytes).

> Note that I'm not saying such a thing cannot ever exist. Only that at present
> the use case is so rare as not to justify a language change yet.

Creating a class LargeIntArray is no change in the language
and that we are reaching the point where larger array-sizes
are becoming common you can easily see that the bug has been
found in the Arrays-sort-algorithm that happened when you
tried to sort arrays of sizes larger than 1GB.


Regards, Lothar
--
Lothar Kimmeringer E-Mail: spam...@kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!

Lew

unread,
Sep 24, 2009, 7:25:37 PM9/24/09
to
Lothar Kimmeringer wrote:
> Integer.MAX_VALUE is something about 2 billion, so where do the
> remaining 8 GB come from in your calculation?

An integer takes four bytes. Four times two is eight. If you need
'LargeArray' as we've been discussing, it's because you need at least two rows
of the int [] []. Eight times two is sixteen.

--
Lew

Roedy Green

unread,
Sep 27, 2009, 8:25:49 PM9/27/09
to
On Mon, 21 Sep 2009 20:34:19 -0400, Lew <no...@lewscanon.com> wrote,

quoted or indirectly quoted someone who said :

>


>Yes, but by their nature 3D models tend to have multidimensional arrays

One you personally might like to play with is arrays representing a
genome, programs that look for repeating patterns etc.


--
Roedy Green Canadian Mind Products
http://mindprod.com

When you are programming and you feel paralysis from overwhelm:
(o) Write down your worries in an electronic todo list, so you can temporarily stop thinking about them.
(o) Write smaller modules and classes so you don�t need to juggle many facts to complete the code.
(o) Do some necessary, but simple, clerical task.
(o) Solve some simple problem with at least some probability of being useful in the big solution. It will break the logjam.
~ Roedy

Lew

unread,
Sep 27, 2009, 11:53:55 PM9/27/09
to
Roedy Green wrote:
> One you personally might like to play with is arrays representing a
> genome, programs that look for repeating patterns etc.

That actually is a fascinating project, perhaps one you might put on
mindprod's projects page if it isn't there already.

My mind tosses up several notions virtually on sight of this idea:

- Java for this? Really? I'm thinking Fortran or C++.
- Array accelerator hardware.
- Distributed algorithms - OK, maybe Java.

Then I start thinking.

I wouldn't necessarily use monolithic arrays in RAM. I'd factor the problem
into chunks that would fit within Java's current limits and play games with
multiple cooperating structures.

When you start thinking about big problems, you have to start thinking about
scalability. Singleton arrays no matter the index size aren't scalable.

Since I'm thinking in terms of distributed algorithms and scalability, I don't
regard the index limit of Integer.MAX_VALUE as a barrier.

When it does become a barrier, and there are situations where it could be, I'd
favor an API change over a language change. Something along the lines of the
LargeArray class, maybe one that takes a BigInteger index, with a service
provider that uses a custom classloader to plug in a library that leverages
add-in matrix boards. When we get into big structures like that, we're
talking about something different from plain ol' vanilla arrays and more
potent than just a bunch of indexable holding pens.

The big problems won't be amenable to current techniques anyway. A
computation mechanism that uses holographic interference to modulate light in
accordance with processing algorithms would be helpful. The computer science
of simultaneity will be as interesting as the computer science of concurrency.

--
Lew

Roedy Green

unread,
Sep 29, 2009, 11:15:11 AM9/29/09
to
On Wed, 16 Sep 2009 04:11:09 -0700, Roedy Green
<see_w...@mindprod.com.invalid> wrote, quoted or indirectly quoted
someone who said :

>I was wondering how you might modify java to permit 64-bit array
>indexing.

Why do you need it? There are class of arguments of the form "If you
build it, they will come."


A successful [software] tool is one that was used to do something
undreamed of by its author.
~ Stephen Curtis Johnson , winner of the first Turing award.

A large part of mathematics which becomes useful developed with
absolutely no desire to be useful, and in a situation where nobody
could possibly know in what area it would become useful; and there
were no general indications that it ever would be so. By and large it
is uniformly true in mathematics that there is a time lapse between a
mathematical discovery and the moment when it is useful; and that this
lapse of time can be anything from 30 to 100 years, in some cases even
more; and that the whole system seems to function without any
direction, without any reference to usefulness, and without any desire
to do things which are useful.
~ John von Neumann (born: 1903-12-28 died: 1957-02-08 at age: 53)

Scientific advances are enabled by a technology advance that allows us
to see what we have not been able to see before.
~ Lloyd Watts (born: 1961-10-02 age: 47)

--
Roedy Green Canadian Mind Products
http://mindprod.com

On two occasions I have been asked [by members of Parliament], "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
~ Charles Babbage (born: 1791-12-26 died: 1871-10-18 at age: 79)

0 new messages