Does anyone know the reason that Collection.size returns Integer.MAX_VALUE when the collection size is greater than that? The reason I'm asking is because we, the Python (programming language) developers, are considering imitating this with our sequences. It seems to me that this is akin to silently lying and could be quite confusing. Am I missing some practical benefit from this?
On Wed, 30 Apr 2008 17:45:51 -0700, Benjamin wrote: > Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that? The > reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It seems > to be that this is akin to silently lying and could be quite confusing. > Am I missing some practical benefit from this?
This is actually a problem sometimes. I think it would be better to return -1 if you have to return an int, or some value that indicates that the size is unknown, but greater than Integer.MAX_VALUE.
Just my 2 cents. This will end up being a problem for Java at some point.
-- Kenneth P. Turvey <kt-use...@squeakydolphin.com>
Benjamin wrote: > Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that? > The reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It > seems to be that this is akin to silently lying and could be quite > confusing. Am I missing some practical benefit from this?
The size method returns an int, so it simply cannot return a larger value.
BTW, some collection types cannot contain more elements. What type are you using?
Benjamin wrote: > Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that? > The reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It > seems to be that this is akin to silently lying and could be quite > confusing. Am I missing some practical benefit from this?
I think it may have been an attempt to break as little existing code as possible while dealing with the problem that some collections might contain more than Integer.MAX_VALUE elements. For example, any size-based test that a collection is non-empty or contains at least N elements still works correctly.
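To make that concrete, here is a minimal sketch of the saturating behaviour, assuming a hypothetical LongRange collection (not a JDK class) that generates its elements on demand; exactSize() is likewise an invented helper, not part of the Collection interface. Only the capping rule in size() comes from the documented Collection.size() contract.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical collection of the longs in [0, count), generated on demand. */
final class LongRange extends AbstractCollection<Long> {
    private final long count;

    LongRange(long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be >= 0");
        }
        this.count = count;
    }

    /** Exact element count; an invented helper, not part of Collection. */
    long exactSize() {
        return count;
    }

    /** Caps the reported size at Integer.MAX_VALUE, as Collection.size() documents. */
    @Override
    public int size() {
        return count > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) count;
    }

    @Override
    public Iterator<Long> iterator() {
        return new Iterator<Long>() {
            private long next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

With new LongRange(1L << 40), the inherited isEmpty() and a check such as size() >= 1000 still give the right answer; only code that needs the exact count is misled.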
Benjamin wrote: > Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that? > The reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It > seems to be that this is akin to silently lying and could be quite > confusing. Am I missing some practical benefit from this?
I'm with you -- don't lie.
I'd return a long, just to be safer, or maybe some equivalent of BigNum.
Or throw an exception: "You called the int size() routine but we have more than int so bummer for you" sounds like a good message to me. The exception will be the speed freaks' cue that they need to call a different size() routine.
On Wed, 30 Apr 2008 21:06:41 -0700, Mark Space wrote: > I'm with you -- don't lie.
> I'd return a long, just to be safer, or maybe some equivalent of BigNum.
> Or throw an exception: "You called the int size() routine but we have > more than int so bummer for you" sounds like a good message to me. The > exception will be the speed freaks cue that they need to call a > different size() routine.
I like this better than my return -1 suggestion. If you must stick with an int, and I wouldn't if you don't have to, then throw an exception.
In Java, something like a ReturnValueTooBigException()... :-)
-- Kenneth P. Turvey <kt-use...@squeakydolphin.com>
Kenneth P. Turvey wrote: > On Wed, 30 Apr 2008 21:06:41 -0700, Mark Space wrote:
>> I'm with you -- don't lie.
>> I'd return a long, just to be safer, or maybe some equivalent of BigNum.
>> Or throw an exception: "You called the int size() routine but we have >> more than int so bummer for you" sounds like a good message to me. The >> exception will be the speed freaks cue that they need to call a >> different size() routine.
> I like this better than my return -1 suggestion. If you must stick with > an int, and I wouldn't if you don't have to, then throw an exception.
> In java something like a ReturnValueTooBigException().. :-)
The ReturnValueTooBigException could even have a method declared as "long size()" that reports the actual size of the collection.
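A rough sketch of that idea, with the caveat that everything here is hypothetical: ReturnValueTooBigException is the name floated in this thread, not a JDK class, and StrictSized is an invented holder that deliberately does not implement java.util.Collection, since throwing from size() would contradict that interface's documented behaviour.

```java
/** Hypothetical exception, named after the suggestion in this thread. */
final class ReturnValueTooBigException extends RuntimeException {
    private final long actualSize;

    ReturnValueTooBigException(long actualSize) {
        super("size " + actualSize + " does not fit in an int");
        this.actualSize = actualSize;
    }

    /** The real element count that the int-returning size() could not report. */
    long size() {
        return actualSize;
    }
}

/** Hypothetical holder of a possibly huge element count. */
final class StrictSized {
    private final long count;

    StrictSized(long count) {
        this.count = count;
    }

    /** Throws instead of capping when the count exceeds Integer.MAX_VALUE. */
    int size() {
        if (count > Integer.MAX_VALUE) {
            throw new ReturnValueTooBigException(count);
        }
        return (int) count;
    }
}
```

A caller that cares can catch the exception and read e.size(); a caller that does not care fails loudly instead of computing with a capped value.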
On Apr 30, 8:45 pm, Benjamin <musiccomposit...@gmail.com> wrote:
> Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that? > The reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It > seems to be that this is akin to silently lying and could be quite > confusing. Am I missing some practical benefit from this?
In Java, if a method's signature declares that it returns a type, the implementation cannot return an incompatible type. For reasons known only to the Java 1.2 team, the Collections API uses 'int' as the return type from Collection.size(), so the largest value that can possibly be returned is Integer.MAX_VALUE.
Not so in Python: methods have no signatures as such, and (as with Smalltalk, Ruby, Lisp, and many other languages) arithmetic on bignums is identical to and compatible with arithmetic on machine integers.
Report the real size, if the size is available at all; use a bignum if you have to. Your users won't notice the difference (except perhaps in speed, but generally size isn't called in a tight loop) and your API will be consistent between small, in-memory collections and massive or procedurally-generated collections.
Benjamin wrote: > Does anyone know the reason that Collection.size returns > Integer.MAX_VALUE when the the collection size is greater than that?
1. The bulk of the API prefers 32-bit integers unless they cannot do the job, and Integer.MAX_VALUE is the largest value a 32-bit int can represent. The other choices would have been to return a negative number, e.g., -1, which is not intuitive, or to throw an exception, which violates the maxim that exceptions represent `exceptional' circumstances.
> The reason I'm asking is because we, the Python (programming language) > developers, are considering imitating this with our sequences. It > seems to be that this is akin to silently lying and could be quite > confusing. Am I missing some practical benefit from this?
My opinion, along with most of the others in this thread, is to not emulate this quirk. It is Java's static typing that forces the capped return value, and Python does not have the same structures hindering it.
That said, I would not call it "silently lying," but more an understatement. AFAICT, the collections that Java provides (excluding wrappers) cannot have a size greater than Integer.MAX_VALUE anyways, so the idea was probably a future-proofing decision. In addition, I feel that most collections code which needs the precise size would fail with collections for which the return value is wrong anyways, so the return value issue is more or less moot.
-- Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
Mark Space wrote: > Benjamin wrote: >> Does anyone know the reason that Collection.size returns >> Integer.MAX_VALUE when the the collection size is greater than that? >> The reason I'm asking is because we, the Python (programming language) >> developers, are considering imitating this with our sequences. It >> seems to be that this is akin to silently lying and could be quite >> confusing. Am I missing some practical benefit from this?
> I'm with you -- don't lie.
> I'd return a long, just to be safer, or maybe some equivalent of BigNum.
But changing that would make a lot of existing code not compile.
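A small sketch of why: Java does not implicitly narrow a long to an int, so every call site that stores the result in an int would need editing. Here wideSize() is a hypothetical stand-in for a size() whose return type had been widened to long; it is not a real API.

```java
final class NarrowingDemo {
    /** Hypothetical stand-in for a size() widened to return long. */
    static long wideSize() {
        return 3L;
    }

    /** Mimics existing code that keeps the size in an int variable. */
    static int existingCaller() {
        // int n = wideSize();    // rejected by the compiler: long does not narrow to int implicitly
        int n = (int) wideSize(); // compiles only with an explicit cast added
        return n;
    }
}
```

Classes that implement Collection with "public int size()" would stop compiling as well, because an override cannot change the return type from long back to int.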
Owen Jacobson wrote: > On Apr 30, 8:45 pm, Benjamin <musiccomposit...@gmail.com> wrote: >> Does anyone know the reason that Collection.size returns >> Integer.MAX_VALUE when the the collection size is greater than that? >> The reason I'm asking is because we, the Python (programming language) >> developers, are considering imitating this with our sequences. It >> seems to be that this is akin to silently lying and could be quite >> confusing. Am I missing some practical benefit from this?
> In Java, if a method's signature declares that it returns a type, the > implementation cannot return an incompatible type. For reasons known > only to the Java 1.2 team, the Collections API uses 'int' as the > return type from Collection.size(), so the largest value that can > possibly be returned is Integer.MAX_VALUE.
The collections that are backed by an array cannot contain more elements than an int can count.
It seems rather consistent (but not necessarily wise) that all collections, array-backed or otherwise, have the same limits as arrays.