Here is one possibility:
int[long] x = new int[ 3000000000L ];
String[long][int] y = new String[ 3000000000L ] [ 10 ];
--
Roedy Green Canadian Mind Products
http://mindprod.com
"Many people tend to look at programming styles and languages like religions: if you belong to one, you cannot belong to others. But this analogy is another fallacy."
~ Niklaus Wirth (born: 1934-02-15 age: 75)
>int[long] x = new int[ 3000000000L ];
You might even allow Strings as indexes for abbreviated HashMap
syntax.
Any way, I don't think you can enable 64bit indexes without a major
disruptive change. You can however, use arrays as "primitive" building
blocks to create LargeArray classes, which may have arrays of arrays,
where the Most Significant 32bits of the index is the outter array
index, and the Least Significant 32bits of the inner:
private class LargeIntArray {
int[][] array;
public int get(long index) {
array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE];
}
public void set(long index, int value) {
array[(index>>32)&Integer.MAX_VALUE][index&Integer.MAX_VALUE] =
value;
}
}
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
You need to downcast those array indexes to int.
<sscce>
package testit;
import java.io.Serializable;
import java.lang.reflect.Array;
public class LargeArray <E extends Serializable> implements Serializable
{
private static final long serialVersionUID = 1L;
private final E [] [] storage;
public LargeArray( Class <? extends E> clazz, int outer, int inner )
{
@SuppressWarnings( "unchecked" ) // uses underlying class
E [] [] stor = (E [] []) Array.newInstance( clazz, outer, inner );
this.storage = stor;
}
public E get( long index )
{
return this.storage [(int)(index>>32)] [(int) index];
}
public void set( long index, E value )
{
this.storage [(int)(index>>32)] [(int) index] = value;
}
}
</sscce>
--
Lew
However, I did not correct the major flaws, i.e., I left the shift at 32
instead of 31, I ignored the calculation of initial size from a long, I didn't
lock down the second dimension to Integer.MAX_VALUE, and of course, I
completely ignored the memory pressure of having an array exceed 8 GiB in size
and thus the question of whether 'LargeArray' is even useful or when.
--
Lew
> I
>completely ignored the memory pressure of having an array exceed 8 GiB in size
>and thus the question of whether 'LargeArray' is even useful or when.
>
>--
I hope that turns out to be one of those Bill Gates comments about a
desktop computer never having a need for more than 640K.
--
Roedy Green Canadian Mind Products
http://mindprod.com
"Perfect reusable components are not obtained at the first shot."
~ Bertrand Meyer (born: 1950 age: 59) 1989, creator of design by contract and the Eiffel language.
Roedy Green wrote:
> I hope that turns out to be one of those Bill Gates comments about a
> desktop computer never having a need for more than 640K.
I hope it doesn't. In any event my comment was not a prediction but historic.
How many of your computers have more than 8 GiB of RAM on them, much less to
spare for a single data structure?
--
Lew
Computers for video processing, scientific computing where one usually
has data of huge^3, etc.
--
Regards,
Casey
Casey Hawthorne wrote:
> Computers for video processing, scientific computing where one usually
> has data of huge^3, etc.
You didn't answer any part of the question - "how many", "of your" and "for a
single data structure" being salient. Is there even one video-processing app
that requires a single structure to exceed 8 GB?
I work on enterprise systems where they allocate 16-64 GB per JVM and process
tens of millions of documents during a peak week. These applications can't
tolerate a single data structure that requires 16 GB. When things start to
exceed one GB per structure they look for disk to supplement the activity.
So, Casey, do you from your own experience know of a specific situation that
tolerates, let alone requires a single Java structure of 16 GB or larger? (16
GiB is the minimum size of a 'LargeArray', since a single-row one is
equivalent to a regular int-indexed array and 'LargeArray' wouldn't be needed
there.) If you do, cite it instead of vaguely waving your hands about "video
processing" or "huge^3", whatever that is.
Note that I'm not saying such a thing cannot ever exist. Only that at present
the use case is so rare as not to justify a language change yet.
--
Lew
>
>How many of your computers have more than 8 GiB of RAM on them, much less to
>spare for a single data structure?
Back in the 90s the oil companies had such beasts where their
databases resided entirely in RAM. They could not process nearly fast
enough otherwise, and they had money to throw at the problem.
RAM is cheap enough so that for even a modest increase in
productivity, it will pay businesses to load up to whatever the
current motherboard limit is.
Recall the long and short pointers you had to fool around with in C to
deal with the segmented architecture of the 80286. It greatly
complicates coding. I don't think many people were willing to stick
with it, even if the code were tighter than flat 32-bit addressing.
--
Roedy Green Canadian Mind Products
http://mindprod.com
"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."
~ Howard Aiken (born: 1900-03-08 died: 1973-03-14 at age: 73)
>Computers for video processing, scientific computing where one usually
>has data of huge^3, etc.
A friend worked for Creo. They produced high speed PostScript colour
printers for the magazine industry. The images were huge and high res
with very accurate colour depth -- in other words huge amounts of RAM.
They needed special hardware to shovel these bits around quickly.
Another place you might need huge amounts of RAM is in a cable TV
provider studio. You have hundreds of simultaneous video inputs and
outputs.
--
Roedy Green Canadian Mind Products
http://mindprod.com
"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."
>>> How many of your computers have more than 8 GiB of RAM on them, much less to
>>> spare for a single data structure?
anything 3D gobbles RAM amazingly. I suspect the weather predicting
models could use giant arrays.
--
Roedy Green Canadian Mind Products
http://mindprod.com
"Don�t worry about people stealing an idea; if it�s original, you�ll have to shove it down their throats."
Yes, but by their nature 3D models tend to have multidimensional arrays. A
lot of them will not at this time be written in Java, and if individual axes
of the multi-D arrays in those that are Java-based require more than
Integer.MAX_VALUE elements in any one dimension, an even rarer requirement,
they have workarounds.
However, you missed the point of my question. I didn't ask if you can
envision a requirement for that much memory. I asked how many of *your*
computers (expand to those you work with, not necessarily own yourself) have
>8 GiB RAM (really, >16 GiB to accommodate 'LargeArray' as envisioned here),
never mind enough to spare that much for a single array. My point is that
large-array applications are rare, not that they're non-existent, and that
existing language features allow Java programmers to handle them without
changing how array indexes work. Yet.
Even were that much RAM available, it is likely that it would be distributed
around multiple nodes (JVMs) so that multiple processing resources would come
to bear on the problem. Large-model arrays don't necessarily need long indexes.
--
Lew
> I work on enterprise systems where they allocate 16-64 GB per JVM and process
> tens of millions of documents during a peak week.
So? That just shows that you're in a business where you don't have
to cope with large data-structures.
> These applications can't
> tolerate a single data structure that requires 16 GB. When things start to
> exceed one GB per structure they look for disk to supplement the activity.
So? It just shows that your application is
- bad designed if that happens very often and has a bad impact on the
overall performance
- is not designed to cope with large data-structures because it
rarely happens
> So, Casey, do you from your own experience know of a specific situation that
> tolerates, let alone requires a single Java structure of 16 GB or larger?
In dclj there were discussions about this as well and there are people
who implement applications in the health industry where large high-
quality pictures have to be processed and you more and more reach
the limit of arrays in java. If you have to analyse three-dimensional
data within a single array, you reach the magical point with a cube
of 322 ARGB pixels side-length. With 2D data you reach the limit
with 11585 pixels. This sounds like much but medical data increases in
size quite fast.
> (16
> GiB is the minimum size of a 'LargeArray', since a single-row one is
> equivalent to a regular int-indexed array and 'LargeArray' wouldn't be needed
> there.)
Integer.MAX_VALUE is something about 2 billion, so where do the
remaining 8 GB come from in your calculation? With a primitive
int-array, the values should reside in a continuous area of the
memory with no need to have a reference to each element. You
might be correct with a 64 bit-system but that should be VM-
dependent how an int-array is saved in memory (where a C-int is
64 bytes).
> Note that I'm not saying such a thing cannot ever exist. Only that at present
> the use case is so rare as not to justify a language change yet.
Creating a class LargeIntArray is no change in the language
and that we are reaching the point where larger array-sizes
are becoming common you can easily see that the bug has been
found in the Arrays-sort-algorithm that happened when you
tried to sort arrays of sizes larger than 1GB.
Regards, Lothar
--
Lothar Kimmeringer E-Mail: spam...@kimmeringer.de
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)
Always remember: The answer is forty-two, there can only be wrong
questions!
An integer takes four bytes. Four times two is eight. If you need
'LargeArray' as we've been discussing, it's because you need at least two rows
of the int [] []. Eight times two is sixteen.
--
Lew
>
>Yes, but by their nature 3D models tend to have multidimensional arrays
One you personally might like to play with is arrays representing a
genome, programs that look for repeating patterns etc.
--
Roedy Green Canadian Mind Products
http://mindprod.com
When you are programming and you feel paralysis from overwhelm:
(o) Write down your worries in an electronic todo list, so you can temporarily stop thinking about them.
(o) Write smaller modules and classes so you don�t need to juggle many facts to complete the code.
(o) Do some necessary, but simple, clerical task.
(o) Solve some simple problem with at least some probability of being useful in the big solution. It will break the logjam.
~ Roedy
That actually is a fascinating project, perhaps one you might put on
mindprod's projects page if it isn't there already.
My mind tosses up several notions virtually on sight of this idea:
- Java for this? Really? I'm thinking Fortran or C++.
- Array accelerator hardware.
- Distributed algorithms - OK, maybe Java.
Then I start thinking.
I wouldn't necessarily use monolithic arrays in RAM. I'd factor the problem
into chunks that would fit within Java's current limits and play games with
multiple cooperating structures.
When you start thinking about big problems, you have to start thinking about
scalability. Singleton arrays no matter the index size aren't scalable.
Since I'm thinking in terms of distributed algorithms and scalability, I don't
regard the index limit of Integer.MAX_VALUE as a barrier.
When it does become a barrier, and there are situations where it could be, I'd
favor an API change over a language change. Something along the lines of the
LargeArray class, maybe one that takes a BigInteger index, with a service
provider that uses a custom classloader to plug in a library that leverages
add-in matrix boards. When we get into big structures like that, we're
talking about something different from plain ol' vanilla arrays and more
potent than just a bunch of indexable holding pens.
The big problems won't be amenable to current techniques anyway. A
computation mechanism that uses holographic interference to modulate light in
accordance with processing algorithms would be helpful. The computer science
of simultaneity will be as interesting as the computer science of concurrency.
--
Lew
>I was wondering how you might modify java to permit 64-bit array
>indexing.
Why do you need it? There are class of arguments of the form "If you
build it, they will come."
A successful [software] tool is one that was used to do something
undreamed of by its author.
~ Stephen Curtis Johnson , winner of the first Turing award.
A large part of mathematics which becomes useful developed with
absolutely no desire to be useful, and in a situation where nobody
could possibly know in what area it would become useful; and there
were no general indications that it ever would be so. By and large it
is uniformly true in mathematics that there is a time lapse between a
mathematical discovery and the moment when it is useful; and that this
lapse of time can be anything from 30 to 100 years, in some cases even
more; and that the whole system seems to function without any
direction, without any reference to usefulness, and without any desire
to do things which are useful.
~ John von Neumann (born: 1903-12-28 died: 1957-02-08 at age: 53)
Scientific advances are enabled by a technology advance that allows us
to see what we have not been able to see before.
~ Lloyd Watts (born: 1961-10-02 age: 47)
--
Roedy Green Canadian Mind Products
http://mindprod.com
On two occasions I have been asked [by members of Parliament], "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
~ Charles Babbage (born: 1791-12-26 died: 1871-10-18 at age: 79)