This is actually a problem sometimes. I think it would be better to
return -1 if you have to return an int, or some value that indicates that
the size is unknown, but greater than Integer.MAX_VALUE.
Just my 2 cents. This will end up being a problem for Java at some
point.
--
Kenneth P. Turvey <kt-u...@squeakydolphin.com>
The size method returns an int, so it simply cannot return
a larger value.
BTW, some collection types cannot contain more elements. What
type are you using?
Arne
I think it may have been an attempt to break as little existing code as
possible while dealing with the problem that some collections might
contain more than Integer.MAX_VALUE elements. For example, any
size-based test that a collection is non-empty or contains at least N
elements still works correctly.
Patricia
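A minimal sketch of the point above, assuming Java 8 or later. LongRange and longSize() are hypothetical names invented for illustration; only AbstractCollection and the documented saturation rule for Collection.size() come from the JDK. Because size() saturates instead of wrapping, emptiness checks and "at least N elements" checks keep giving correct answers.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical collection of the values 0, 1, ..., count-1, generated on demand.
class LongRange extends AbstractCollection<Long> {
    private final long count;

    LongRange(long count) {
        if (count < 0) {
            throw new IllegalArgumentException("negative count");
        }
        this.count = count;
    }

    // Hypothetical extra method: the true element count, which may exceed Integer.MAX_VALUE.
    long longSize() {
        return count;
    }

    // Saturates at Integer.MAX_VALUE, as the Collection.size() documentation describes.
    @Override
    public int size() {
        return count > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) count;
    }

    @Override
    public Iterator<Long> iterator() {
        return new Iterator<Long>() {
            private long next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public Long next() {
                if (next >= count) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }
}
```

With a count of 5,000,000,000, size() reports Integer.MAX_VALUE, isEmpty() is false, and a test such as size() >= 1000 still holds. Only code that needs the exact count has to call the hypothetical longSize().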
I'm with you -- don't lie.
I'd return a long, just to be safer, or maybe some equivalent of BigNum.
Or throw an exception: "You called the int size() routine but we have
more than int so bummer for you" sounds like a good message to me. The
exception will be the speed freaks' cue that they need to call a
different size() routine.
> I'm with you -- don't lie.
>
> I'd return a long, just to be safer, or maybe some equivalent of BigNum.
>
> Or throw an exception: "You called the int size() routine but we have
> more than int so bummer for you" sounds like a good message to me. The
> exception will be the speed freaks' cue that they need to call a
> different size() routine.
I like this better than my return -1 suggestion. If you must stick with
an int, and I wouldn't if you don't have to, then throw an exception.
In Java, something like a ReturnValueTooBigException(). :-)
[Snip]
>
> Therefore, it seems that this regulation can not break existing code.
I think Patricia was referring to breakage that might occur as the size
of a collection passes the Integer.MAX_VALUE boundary.
The ReturnValueTooBigException could even have a method declared as
"long size()" that reports the actual size of the collection.
Patricia
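The exception idea discussed above could be sketched roughly as follows. ReturnValueTooBigException is the name floated in this thread, not a JDK class, and SizeReporter.sizeOrThrow is a hypothetical helper added only so the example is self-contained.

```java
// Hypothetical unchecked exception; the name comes from this thread, not from the JDK.
class ReturnValueTooBigException extends RuntimeException {
    private final long actualSize;

    ReturnValueTooBigException(long actualSize) {
        super("size " + actualSize + " does not fit in an int");
        this.actualSize = actualSize;
    }

    // Reports the real element count to callers that catch the exception.
    long size() {
        return actualSize;
    }
}

// Hypothetical helper showing where such an exception would be thrown.
class SizeReporter {
    // Returns the count as an int, or throws if an int cannot represent it.
    static int sizeOrThrow(long actualSize) {
        if (actualSize > Integer.MAX_VALUE) {
            throw new ReturnValueTooBigException(actualSize);
        }
        return (int) actualSize;
    }
}
```

A caller that catches the exception can read the exact count from e.size(). Code that never catches it fails loudly instead of continuing with a wrong number.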
In Java, if a method's signature declares that it returns a type, the
implementation cannot return an incompatible type. For reasons known
only to the Java 1.2 team, the Collections API uses 'int' as the
return type from Collection.size(), so the largest value that can
possibly be returned is Integer.MAX_VALUE.
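To make the constraint concrete, here is a small sketch of the two ways an implementation that tracks its count in a long could produce the required int. SizeNarrowing and both method names are hypothetical, and non-negative counts are assumed.

```java
// Hypothetical helper contrasting two long-to-int conversions for element counts.
class SizeNarrowing {
    // Plain narrowing cast: keeps only the low 32 bits, so large counts wrap around.
    static int truncate(long count) {
        return (int) count;
    }

    // Saturating conversion: clamps to Integer.MAX_VALUE, which is what
    // the Collection.size() documentation specifies for oversized collections.
    static int saturate(long count) {
        return (int) Math.min(count, (long) Integer.MAX_VALUE);
    }
}
```

For a count of 2^31, the plain cast yields Integer.MIN_VALUE while the saturating version yields Integer.MAX_VALUE, so only the latter keeps a "non-empty" or "at least N" check meaningful.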
Not so in Python: methods have no signatures as such, and (as with
Smalltalk, Ruby, Lisp, and many other languages) arithmetic on bignums
is identical to and compatible with arithmetic on machine integers.
Report the real size, if the size is available at all; use a bignum if
you have to. Your users won't notice the difference (except perhaps
in speed, but generally size isn't called in a tight loop) and your
API will be consistent between small, in-memory collections and
massive or procedurally-generated collections.
1. The bulk of the API prefers 32-bit integers unless they are
insufficient, and Integer.MAX_VALUE is the largest value a Java int can
represent. The other choices would have been to return a negative
number, e.g., -1, which is not intuitive, or to throw an exception,
which violates the maxim that exceptions represent `exceptional'
circumstances.
> The reason I'm asking is because we, the Python (programming language)
> developers, are considering imitating this with our sequences. It
> seems to be that this is akin to silently lying and could be quite
> confusing. Am I missing some practical benefit from this?
My opinion, along with most of the others in this thread, is to not
emulate this quirk. It is the nature of Java that forces this return
value, and Python does not have the same structures hindering it.
That said, I would not call it "silently lying," but more an
understatement. AFAICT, the collections that Java provides (excluding
wrappers) cannot have a size greater than Integer.MAX_VALUE anyways, so
the idea was probably a future-proofing decision. In addition, I feel
that most collections code which needs the precise size would fail with
collections for which the return value is wrong anyways, so the return
value issue is more or less moot.
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
But changing that would make a lot of existing code not compile.
Arne
The collections that are backed by an array cannot contain
more elements than an int can index.
It seems rather consistent (but not necessarily wise) that
all collections, both array-backed and others, have the same
limits as arrays.
Arne
Those that seek to create collections with more elements than can be counted
by int will surely need to bump up the JVM's -Xmx parameter somewhat.
--
Lew
Yep.
I find it difficult to believe that it is a real problem today.
But it will become a problem in maybe 10 years.
Arne
> Yep.
>
> I find it difficult to believe that it is a real problem today.
>
> But it will become a problem in maybe 10 years.
There has been at least one poster to this group that has had a problem
with it. I would be surprised to find that nobody was dealing with
collections with this many elements now.
In 10 years it will be a problem we have to grapple with on a regular
basis. It is a bad design. Throwing an exception would have made much
more sense.
Although I have not personally encountered the problem, I was running
Java on machines with enough memory for a collection with more than
Integer.MAX_VALUE elements in 2002.
Patricia
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
I don't think so.
long is Java, not Python.
Arne
Somebody probably has. But not many.
> In 10 years it will be a problem we have to grapple with on a regular
> basis. It is a bad design. Throwing an exception would have made much
> more sense.
The real problem is not the return value but the return type.
Arne
>
> Although I have not personally encountered the problem, I was running
> Java on machines with enough memory for a collection with more than
> Integer.MAX_VALUE elements in 2002.
If you count virtual memory and hard drive space, I think all of us have
had machines capable of holding Collections 2 gigabytes and larger in
size for longer than that.
However, if one has a Collection backed by, say, a database with
100,000,000,000 rows, I don't think a Collection would be the best API
to access it. So this whole discussion might be highly theoretical.
Perhaps Sun (or someone else) could come up with an API that's more
convenient for manipulating databases and is similar to Collections, if
that's desired.
For many tasks that would require more than Integer.MAX_VALUE, the
Collections interface is just fine. The only reason we don't do it often
now is that our machines aren't really up to the task.
It won't be long before that simply isn't true.
Several other APIs already exist for accessing database objects. I don't
think we really need another one. You might really find a database
backed collection to be a good interface for many operations.
Databases nicely combine huge key domains with (relatively) sparse key sets,
then add built-in storage and caching, remote accessibility, sophisticated
query capabilities and a soupçon of programmatic connectivity.
Even terabyte-scale stores draw their keys from domains with sagans and sagans
of possible values, so in that sense they're sparse while still being huge.
Serving that oxymoronic goal is what DBMSes do well, for varying degrees of well.
Some are even perfect for in-memory embedded work, scale permitting. Apache
Derby (Java DB) comes to mind.
These won't be a drop-in replacement for Collections, of course. They can map
to Collections readily enough, via JPA and such, but your app will necessarily
think in wider terms than mere keystore. So coding, deployment and operations
effort increases, but you do get a huge capability boost with a DBMS.
--
Lew
Those working with servers.
> However, if one has a Collection backed by, say, a database with
> 100,000,000,000 rows, I don't think a Collection would be the best API
> to access it. So this whole discussion might be highly theoretical.
I agree. I must admit that I have never seen a non-memory backed
collection.
Arne
[Snip]
> These won't be a drop-in replacement for Collections, of course. They
> can map to Collections readily enough, via JPA and such, but your app
> will necessarily think in wider terms than mere keystore. So coding,
> deployment and operations effort increases, but you do get a huge
> capability boost with a DBMS.
[Snip]
The point of all this was that database backed Collections may have their
place. I think they do. So we already have a use case for collections
with more than Integer.MAX_VALUE entries.
> I agree. I must admit that I have never seen a non-memory backed
> collection.
I've thought about writing one on several occasions for various reasons.
I've never actually done it, but it does come up with regularity. I'm
sure others have actually implemented them.
If for no other reason than the limitation on size(), I think a
different API would be the best way to start.
Given that such an API would almost certainly have to deal with persistent
stores, and that such collections actually are persistence abstractions, we
could call the API the Java Persistence API, and have it manifest a library to
interface an object-oriented, collection-based model to an arbitrary
persistence engine. We should make it annotation- or descriptor-file-based at
the architect's will, and it can then use more-or-less POJO types to represent
the entities to be collected. This whole JPA layer could abstract the mapping
between the backing store and the object model. In an ideal world such a thing
would already be standardized,
<http://java.sun.com/javaee/5/docs/tutorial/doc/bnbpy.html>
and have at least two solid, free implementations,
<http://www.hibernate.org/>
<http://openjpa.apache.org/>
<https://glassfish.dev.java.net/downloads/persistence/JavaPersistence.html>
and perhaps work well with, even be included as part of existing application
servers and frameworks.
<https://glassfish.dev.java.net/>
<http://www.springframework.org/>
<http://www-306.ibm.com/software/webservers/appserv/was/>
--
Lew
> <http://java.sun.com/javaee/5/docs/tutorial/doc/bnbpy.html>
> and have at least two solid, free implementations,
> <http://www.hibernate.org/>
> <http://openjpa.apache.org/>
> <https://glassfish.dev.java.net/downloads/persistence/JavaPersistence.html>
> and perhaps work well with, even be included as part of existing
> application servers and frameworks.
> <https://glassfish.dev.java.net/>
> <http://www.springframework.org/>
> <http://www-306.ibm.com/software/webservers/appserv/was/>
>
Wow that Kenneth guy is a fast worker! ;)
I cannot google one either.
The closest is:
http://java.sun.com/j2se/1.5.0/docs/guide/collections/designfaq.html#5
Arne
> Wow that Kenneth guy is a fast worker! ;)
Wait.. I was the guy that said we didn't need another API!
>Does anyone know the reason that Collection.size returns
>Integer.MAX_VALUE when the collection size is greater than that?
I suppose that when Collection was written, the authors were
unconsciously thinking in terms of 32-bit JVMs, so they had size return
int rather than long. It would have been impossible to have a RAM-based
collection with more elements than that.
--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
It is still impossible to have an array-backed collection with
more elements on a 64-bit JVM.
Arne
>> I suppose that when Collection was written, the authors were
>> unconsciously thinking in terms of 32-bit JVMs so had size return int
>> rather than long. It would have been impossible to have a ram-based
>> collection with more elements than that.
>
> It is still impossible to have an array backed collection with more
> elements on a 64 bit JVM.
True enough, but I'm not sure why Roedy brings this up anyway. There is
no reason to believe that a Collection must be array backed (or for that
matter, backed by a single array).
Some are.
And having some collections able to contain more elements
than others could be said to expose implementation.
It could be argued that the List interface should contain
a note that says max. 2G elements, because then the interface
would be consistent.
Arne
>It is still impossible to have an array backed collection with
>more elements on a 64 bit JVM.
Collections might be backed on disk or using an array of arrays which
could in theory break the Integer.MAX_VALUE limit.
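As a rough illustration of the array-of-arrays idea, the sketch below stores long values behind a long index. ChunkedLongArray and its methods are hypothetical names, and the chunk size is a constructor parameter only so the structure can be tried with small numbers.

```java
// Hypothetical long-indexed store built from an array of arrays.
class ChunkedLongArray {
    private final long[][] chunks;
    private final int chunkSize;
    private final long length;

    ChunkedLongArray(long length, int chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("bad length or chunk size");
        }
        this.length = length;
        this.chunkSize = chunkSize;
        // Number of chunks, rounded up without risking overflow.
        long chunkCount = length / chunkSize + (length % chunkSize == 0 ? 0 : 1);
        if (chunkCount > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("too many chunks");
        }
        chunks = new long[(int) chunkCount][];
        for (int i = 0; i < chunks.length; i++) {
            // The last chunk may be shorter than the others.
            long remaining = length - (long) i * chunkSize;
            chunks[i] = new long[(int) Math.min(chunkSize, remaining)];
        }
    }

    long length() {
        return length;
    }

    long get(long index) {
        checkIndex(index);
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void set(long index, long value) {
        checkIndex(index);
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

Each inner array still obeys the int limit, but the outer array multiplies the addressable range, so total length is bounded by memory rather than by Integer.MAX_VALUE.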
Collection.toArray implies an Integer.MAX_VALUE limit on collection
size.
At some point Java will acquire 64-bit indexed arrays. It will be
interesting to see how they stitch them in.