Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

setSize ArrayList, when will it come?

105 views
Skip to first unread message

Jan Burse

unread,
Aug 8, 2011, 6:47:12 PM8/8/11
to
Dear All

When will ArrayList have a setSize() method. Its lack
makes it practically impossible to consistently replace
Vector by ArrayList.

Any reason for brainlessly closing
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4147188
by oracle or who ever?

Bye

Knute Johnson

unread,
Aug 8, 2011, 7:53:43 PM8/8/11
to

It's because they hate you. They don't want you to have that method so
it will drive you crazy. It's never going to have that method. It's a
conspiracy of the highest order. They could add that method if they
wanted to, but they don't.

You know, you're not really paranoid if they're actually out to get you.

The sun's almost over the yard arm here, go someplace quiet, have a
beer, and chill.

--

Knute Johnson

Jan Burse

unread,
Aug 8, 2011, 8:30:51 PM8/8/11
to
Knute Johnson schrieb:

Lets make a facebook page (+) for ArrayList.setSize(). And have the
beer later when it is not raining any more in southern germany (*)
and northern switzerland.

(+)
http://www.facebook.com/groups/202554153131264/

(*)
http://www.mittelbayerische.de/nachrichten/panorama/artikel/regen_sommer_treibt_deutsche_i/690749/regen_sommer_treibt_deutsche_i.html

Arne Vajhøj

unread,
Aug 8, 2011, 8:50:12 PM8/8/11
to

That type of usage is not what ArrayList was intended for.

Those that need that functionality should simply implement
a few utility methods to do what they want.

Arne


Eric Sosman

unread,
Aug 8, 2011, 9:09:14 PM8/8/11
to
On 8/8/2011 6:47 PM, Jan Burse wrote:

The first sentence of the referenced bug reads

The ArrayList class in the new collections
framework is not useful (in the way that the
Vector class is) for implementing *sparse*,
variable-size arrays.

Part of this is true: ArrayList is not suitable for sparse
arrays. The silent implication that Vector *is* suitable for
sparse arrays, though, is false. Neither class is suitable
for sparse arrays. (Demonstration: Try inserting [0]->"A"
and [2173741823]->"Z" in either.)

Perhaps if you'd explain your use case in more detail
someone will have a suggestion you may find helpful.

Also, it seems to me you should apologize to the responders
and retract the word "brainlessly." More thought went into the
response than into the complaint.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Jan Burse

unread,
Aug 8, 2011, 10:16:05 PM8/8/11
to
Eric Sosman schrieb:

> Perhaps if you'd explain your use case in more detail
> someone will have a suggestion you may find helpful.

I have only referted to the meta use case in my post.
The software refactoring use case, is replacing a synchronized
Vector by an unsychronized ArrayList in regions where no synchronization
is necessary.

This meta use case could even have IDE support. For example
IntelliJ has IDE support for the detection of unnessary
use of StringBuffer and suggests the class StringBuilder.

Similarly an IDE could suggest ArrayList in certain places
where Vector is used. But it will not succeeded in such
cases where I did use setSize on my Vector.

I actually do use setSize for a kind of sparse Vector.
Sparse in the sense that my Vector will have a couple
of holes represented by null value elements. Which
is eventually abuse of the term "sparse", but the use
case is there.

The missing thought in the reponse is the design goal
of ArrayList. They are explicitly advertised in the
class comment as unsynchronized version of Vector. It
is brainless not to honor such a design goal in a response.

On the other hand it is very clever to have the possibility
of turning initial design goals into requests for
enhancement. Imagine the following situation:

You go into a restaurant and order pizza,
you get your pizza without cheese, you
ask the waiter what is wrong, and the waiter
says, oh this cannot be changed, cheese has
to be ordered separately.

Brainless are we, who accept this.

Best Regards

Joshua Cranmer

unread,
Aug 8, 2011, 10:39:10 PM8/8/11
to
On 8/8/2011 9:16 PM, Jan Burse wrote:
> The missing thought in the reponse is the design goal
> of ArrayList. They are explicitly advertised in the
> class comment as unsynchronized version of Vector. It
> is brainless not to honor such a design goal in a response.

An unsynchronized version does not necessarily imply that it need be a
full drop-in replacement for the class except with the synchronized
methods removed. Now you're going to argue that it's brainless that

Object a = new Vector();
Object b = new ArrayList();
System.out.println(a.getClass() == b.getClass());

returns false.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth

Jan Burse

unread,
Aug 9, 2011, 3:33:07 AM8/9/11
to
Joshua Cranmer schrieb:
>
> returns false.

And where is the drop-in manipulation, what was the before and after?
I understand what you are heading for, the morning star is not
evening star, yet they are the same.

But typical scenario would be:

Before:

Object a = new Vector();

Object b = new Vector();
System.out.println(a.getClass() == b.getClass());

returns true.

After:

Object a = new ArrayList();


Object b = new ArrayList();
System.out.println(a.getClass() == b.getClass());

returns true.


Jan Burse

unread,
Aug 9, 2011, 3:42:21 AM8/9/11
to
But:

Before:

Vector a = new Vector();
Vector b = new Vector();
b.setSize(3);
System.out.println(a.getClass() == b.getClass());

returns true.

After:

(Since b makes use of setSize(). I can only
replace a with an ArrayList)

ArrayList a = new ArrayList ();
Vector b = new Vector();
b.setSize(3);
System.out.println(a.getClass() == b.getClass());

returns false;

More than annoying, brainless.

Best Regards

Jan Burse schrieb:

Andreas Leitgeb

unread,
Aug 9, 2011, 3:56:36 AM8/9/11
to
Joshua Cranmer <Pidg...@verizon.invalid> wrote:
> On 8/8/2011 9:16 PM, Jan Burse wrote:
>> The missing thought in the reponse is the design goal
>> of ArrayList. They are explicitly advertised in the
>> class comment as unsynchronized version of Vector. It
>> is brainless not to honor such a design goal in a response.

> An unsynchronized version does not necessarily imply that it need be a
> full drop-in replacement for the class except with the synchronized
> methods removed.

Whether "An unsynchronized version" implies "drop-in replacement
except for the synchronisation" seems to be a bit fuzzy. To me
it kind of does. But even if it didn't, then including a simple
convenience method to simplify transition from Vector to ArrayList
wouldn't appear as a mistake to me - even though there *is* a clumsy
workaround for expanding and some different workaround for shrinking.

> Now you're going to argue that it's brainless that ...

Sorry to be so direct, but this "prognose" appeared entirely
brainless to me. I know, that your arguments usually aren't.
I guess this is just the type of answer you give to people
that you "identified" as trolls.

Patricia Shanahan

unread,
Aug 9, 2011, 4:33:25 AM8/9/11
to
On 8/8/2011 7:16 PM, Jan Burse wrote:
...

> I actually do use setSize for a kind of sparse Vector.
> Sparse in the sense that my Vector will have a couple
> of holes represented by null value elements. Which
> is eventually abuse of the term "sparse", but the use
> case is there.
...

If you only need small numbers of null elements, you could write a class
extending ArrayList that has setSize(). All you would do is loop adding
null elements or removing the tail elements until the ArrayList is the
required size.

Patricia

Jan Burse

unread,
Aug 9, 2011, 5:16:48 AM8/9/11
to
Patricia Shanahan schrieb:

If only so many fields in ArrayList would not be private
I could do that. But since for example in JDK 1.6.0_26
none of the fields are protected, everything is private.

What you suggest is theoretically sound but practically
impossible. Look see:

public class ArrayList<E> extends ...
{
private transient Object[] elementData;
private int size;
...
}

And using reflection overriding this protection,
is kind of ugly and eventually less performant.

Bye

Jan Burse

unread,
Aug 9, 2011, 5:18:14 AM8/9/11
to
Andreas Leitgeb schrieb:

> Sorry to be so direct, but this "prognose" appeared entirely
> brainless to me. I know, that your arguments usually aren't.
> I guess this is just the type of answer you give to people
> that you "identified" as trolls.

I guess you guys have nothing todo, except feeding each other
in a troll like manner.

Bye

Jan Burse

unread,
Aug 9, 2011, 5:32:42 AM8/9/11
to
Jan Burse schrieb:

Interestingly ArrayList has ensureCapacity() which
is public. Whereby in Vector ensureCapacityHelper() is
private.

You are right, one could do a half way efficient setSize()
with ensureCapacity() of ArrayList, by calling
ensureCapacity() and then looping with add() of null.

But the request and idea is to have an efficient setSize().
In the spirit of the Vector setSize(). Namely:

/* from vector */

public synchronized void setSize(int newSize) {
modCount++;
if (newSize > elementCount) {
ensureCapacityHelper(newSize);
} else {
for (int i = newSize ; i < elementCount ; i++) {
elementData[i] = null;
}
}
elementCount = newSize;
}

The last statement and the preceeding loop are crucial.
They require direct access to elementCount and elementData.
These fields correspond to size and elementData in
ArrayList.

But maybe in some VMs/Plattforms/Architectur a loop add()
after ensureCapacity() doesn't show any performance penalty
and would be feasible. Have to check.

Best Regards

BTW: Both Vector and ArrayList have a little glitch, an
unused variable, probably anyway optimized away by the
JIT, but nevertheless:

ArrayList:

/* Glitch: oldData not used anymore. */

public void ensureCapacity(int minCapacity) {
modCount++;
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
Object oldData[] = elementData;
int newCapacity = (oldCapacity * 3)/2 + 1;
if (newCapacity < minCapacity)
newCapacity = minCapacity;
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}
}

Vector:

/* Glitch: oldData not used anymore. */

private void ensureCapacityHelper(int minCapacity) {
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
Object[] oldData = elementData;
int newCapacity = (capacityIncrement > 0) ?
(oldCapacity + capacityIncrement) : (oldCapacity * 2);
if (newCapacity < minCapacity) {
newCapacity = minCapacity;
}
elementData = Arrays.copyOf(elementData, newCapacity);
}
}

Jan Burse

unread,
Aug 9, 2011, 5:35:00 AM8/9/11
to
Jan Burse schrieb:

>
> BTW: Both Vector and ArrayList have a little glitch, an
> unused variable, probably anyway optimized away by the
> JIT, but nevertheless:

Ok, they saw that:
http://bugs.sun.com/view_bug.do?bug_id=6812879

Patricia Shanahan

unread,
Aug 9, 2011, 7:00:44 AM8/9/11
to

I meant "loop adding null elements" literally.

while(size() < targetSize) {
add(null);
}

That uses only the public methods size and add. It would not be
efficient for a significantly sparse array, but should work fine if
there are only a few null elements. If there are large blocks of null
elements and efficiency matters, I would use neither Vector nor ArrayList.

Patricia

Joshua Cranmer

unread,
Aug 9, 2011, 11:36:55 AM8/9/11
to
On 8/9/2011 2:56 AM, Andreas Leitgeb wrote:

> Joshua Cranmer<Pidg...@verizon.invalid> wrote:
>> Now you're going to argue that it's brainless that ...
>
> Sorry to be so direct, but this "prognose" appeared entirely
> brainless to me. I know, that your arguments usually aren't.
> I guess this is just the type of answer you give to people
> that you "identified" as trolls.

I meant it as a rhetorical device to show that the argument, applied
strictly, leads to a rather untenable proposition. There are a fair
number of observable differences between Vector and ArrayList, even if
you exclude the part about synchronized methods.

I would also like to point out that in the time you have spent arguing
for this feature, you could have implemented a small class that had this
feature already.

Jan Burse

unread,
Aug 9, 2011, 12:30:26 PM8/9/11
to
Joshua Cranmer schrieb:

> There are a fair number of observable differences between Vector and
> ArrayList, even if you exclude the part about synchronized methods.
What are you refering to? Can you elaborate on your thoughts.

Roedy Green

unread,
Aug 9, 2011, 12:58:57 PM8/9/11
to
On Tue, 09 Aug 2011 00:47:12 +0200, Jan Burse <janb...@fastmail.fm>
wrote, quoted or indirectly quoted someone who said :

>When will ArrayList have a setSize() method. Its lack
>makes it practically impossible to consistently replace
>Vector by ArrayList.

This will not make you happy, but there is ArrayList.trimToSize

To grow an ArrayList, you must do it one slot at a time.

Further, I would hope ArrayList.addAll would be smart enough to grow
the array only once, if needed.
--
Roedy Green Canadian Mind Products
http://mindprod.com
Most of computer code is for telling the computer
what do if some very particular thing goes wrong.

Patricia Shanahan

unread,
Aug 9, 2011, 1:25:31 PM8/9/11
to
On 8/9/2011 9:58 AM, Roedy Green wrote:
...

> Further, I would hope ArrayList.addAll would be smart enough to grow
> the array only once, if needed.
...

It does the following:

1. Grow to the needed size.

2. Call the other collection's toArray method.

3. System.arraycopy the toArray result into the ArrayList's elementData.

This double copy is going to be faster than the one at a time approach
only if large numbers of nulls are being added. If that is the case, the
structure is probably too sparse for ArrayList to be a good choice.

Patricia

Joshua Cranmer

unread,
Aug 9, 2011, 5:30:53 PM8/9/11
to

Anything reflective is a dead giveaway, and I'm pretty sure that the two
classes have slightly different sizes, so you could observe differences
in memory characteristics.

More seriously, the Collections API tweaked some method names
differently, so there are a few methods in Vector which are kept around
for legacy use (addElement, e.g.).

In general, an ArrayList is a list that happens to be backed by an
array. A Vector is a synchronized, automatically-growing array that
leaks details about its array all over the place; it was shoehorned into
the Collections API upon introduction to allow for a more gradual,
correct migration.

Tom Anderson

unread,
Aug 9, 2011, 5:38:03 PM8/9/11
to

It would be nice if ArrayList used a loop to do the copy for small added
collections; it could cut over to the array method for larger addends.

Anyway, with List.addAll and Collections.nCopies, we can write:

<T> void setSize(List<T> list, int size) {
int change = size - list.size();
if (change > 0) {
list.addAll(Collections.nCopies(change, null));
}
else if (change < 0) {
list.subList(size, list.size()).clear();
if (list instanceof ArrayList) ((ArrayList)list).trimToSize();
}
// else do nothing
}

I haven't tried that, but it should work.

So, Jan, less whining, more coding, please.

tom

--
As Emiliano Zapata supposedly said, "Better to die on your feet than
live on your knees." And years after he died, Marlon Brando played him
in a movie. So just think, if you unionize, Marlon Brando might play
YOU in a movie. Even though he's dead. -- ChrisV82

Jan Burse

unread,
Aug 9, 2011, 6:31:33 PM8/9/11
to
Joshua Cranmer schrieb:

> Anything reflective is a dead giveaway, and I'm pretty sure that the two
> classes have slightly different sizes, so you could observe differences
> in memory characteristics.

In vector you can parametrisize the growing characteristics, either
by a constant increment or doubling the size. For this purpose there
is an extra field not found in ArrayList.

Array list always follows a 150% + 1 rule when growing the internal
data buffer. There is no extra field need to store some parameter.
But otherwise the two use the same data representation.

But I think it would not prevent me from using ArrayList instead
of Vector, that this parameter is missing. The 150% has its merits
over the strategies implemented in vector.

> More seriously, the Collections API tweaked some method names
> differently, so there are a few methods in Vector which are kept
> around for legacy use (addElement, e.g.).

Yes when replacing Vector by ArrayList, one has to rename the method
calls. For example instead of addElement() one needs to use add(), and
instead of elementAt().

But renaming methods does not prevent me from using ArrayList. As
long each legacy method has a new buddy, I don't see any problem
whatever with it.

> In general, an ArrayList is a list that happens to be backed by an
> array. A Vector is a synchronized, automatically-growing array that
> leaks details about its array all over the place; it was shoehorned
> into the Collections API upon introduction to allow for a more
> gradual, correct migration.

I don't see that Vector leaks more details than ArrayList about its
implementation. Could you make an example? I only see that the fields
of vector are protected, and well yes this means vector is not fully
encapsulated.

But I want to migrate to ArrayList whereever possible, so why should
I bother that Vector is not fully encapsulated. And if ArrayList has
the merrit that it is fully encapsulated the better.

But the protected fields would only be seen by a subclass, which could
sneak into some of your parameters of your methods, and spy on you or
cheat on you. Since vector is not final. That is a drawback. But same
problem with ArrayList.

Vector and ArrayList both implement the following protocolls:
AbstractList<E>
List<E>,
RandomAccess,
Cloneable,
Serializable

If only somebody would have had the brains to include setSize()
somewhere. List has the methods size(), add() and remove() in it.
setSize(n) could be neatly abstractly specified as having the
post condition size()=n and after.get(i)=before.get(i) for
i<before.size() and after.get(i)=null for i>=before.size() and
i<after.size().

And the AbstractList could implement abstractly the inefficient
setSize() that would use remove() and add(), when Random access is
present via index, or otherwise maybe with a backward iterator.
Backward iterator is also missing btw. And then concrete classes
could provide more efficient implementations if necessary.

Bye


Best Regards

Jan Burse

unread,
Aug 9, 2011, 6:36:03 PM8/9/11
to
Tom Anderson schrieb:

This is not efficient. You don't get it what the
problem is. I really really need a highly efficient
setSize() specialized, otherwise my stuff will not work.


Roedy Green

unread,
Aug 9, 2011, 6:55:45 PM8/9/11
to
On Wed, 10 Aug 2011 00:36:03 +0200, Jan Burse <janb...@fastmail.fm>

wrote, quoted or indirectly quoted someone who said :

>This is not efficient. You don't get it what the


>problem is. I really really need a highly efficient
>setSize() specialized, otherwise my stuff will not work.

ArrayList is not that complicated a class. You might write your own
in an afternoon. If you get stuck, you can always peek at Sun's
source.

I wrote a "SortedArrayList" that lazily kept ArrayLists sorted. You
might cannibalise it. (I don't think I ever got around to putting
generics in it though). see http://mindprod.com/products1.html#SORTED

Roedy Green

unread,
Aug 9, 2011, 7:02:11 PM8/9/11
to
On Tue, 09 Aug 2011 00:47:12 +0200, Jan Burse <janb...@fastmail.fm>

wrote, quoted or indirectly quoted someone who said :

>


>When will ArrayList have a setSize() method. Its lack
>makes it practically impossible to consistently replace
>Vector by ArrayList.

Have you checked if ArrayList is final? If not, you can extend it
with just a setSize method.

Even if it is final, you could implement a SizeableArrayList class
like this:

class SizeableArrayList implements List{

ArrayList inner;

void int setSize(int size )
{
// to change the size you reallocate and copy the ArrayList, or use //
//setToSize
}

Then create facade methods to implement List that pass parms on to
inner.

This is how I implemented LEInputDataStream to use InputDataStream
methods when InputDataStream was final.
see http://mindprod.com/products1.html#LEDATASTREAM

Knute Johnson

unread,
Aug 9, 2011, 7:03:40 PM8/9/11
to
On 8/9/2011 3:36 PM, Jan Burse wrote:
> This is not efficient. You don't get it what the
> problem is. I really really need a highly efficient
> setSize() specialized, otherwise my stuff will not work.

That's BS. Either it's sparse enough to use a Map or it's too large to
use ArrayList and that would make it too large to use Vector. Just use
arrays, they're a lot faster than a Vector or any List. Or just get a
faster computer.

I should have long since killed this thread.

--

Knute Johnson

Arne Vajhøj

unread,
Aug 9, 2011, 8:09:15 PM8/9/11
to
On 8/9/2011 12:58 PM, Roedy Green wrote:
> On Tue, 09 Aug 2011 00:47:12 +0200, Jan Burse<janb...@fastmail.fm>
> wrote, quoted or indirectly quoted someone who said :
>> When will ArrayList have a setSize() method. Its lack
>> makes it practically impossible to consistently replace
>> Vector by ArrayList.
>
> This will not make you happy, but there is ArrayList.trimToSize

Which seems completely irrelevant for the problem.

Arne

Arne Vajhøj

unread,
Aug 9, 2011, 8:11:22 PM8/9/11
to
On 8/9/2011 5:16 AM, Jan Burse wrote:
> Patricia Shanahan schrieb:
>> On 8/8/2011 7:16 PM, Jan Burse wrote:
>> ...
>>> I actually do use setSize for a kind of sparse Vector.
>>> Sparse in the sense that my Vector will have a couple
>>> of holes represented by null value elements. Which
>>> is eventually abuse of the term "sparse", but the use
>>> case is there.
>> ...
>>
>> If you only need small numbers of null elements, you could write a class
>> extending ArrayList that has setSize(). All you would do is loop adding
>> null elements or removing the tail elements until the ArrayList is the
>> required size.
>
> If only so many fields in ArrayList would not be private
> I could do that. But since for example in JDK 1.6.0_26
> none of the fields are protected, everything is private.

You can do it with only public methods.

Arne

Arne Vajhøj

unread,
Aug 9, 2011, 8:13:49 PM8/9/11
to
On 8/9/2011 7:02 PM, Roedy Green wrote:
> On Tue, 09 Aug 2011 00:47:12 +0200, Jan Burse<janb...@fastmail.fm>
> wrote, quoted or indirectly quoted someone who said :
>> When will ArrayList have a setSize() method. Its lack
>> makes it practically impossible to consistently replace
>> Vector by ArrayList.
>
> Have you checked if ArrayList is final? If not, you can extend it
> with just a setSize method.

It is not final.

But a helper method may actually be more useful, because
you can not assign from a standard ArrayList to the extended
class.

Arne

Joshua Cranmer

unread,
Aug 9, 2011, 9:36:38 PM8/9/11
to
On 8/9/2011 5:31 PM, Jan Burse wrote:
> And the AbstractList could implement abstractly the inefficient
> setSize() that would use remove() and add(), when Random access is
> present via index, or otherwise maybe with a backward iterator.
> Backward iterator is also missing btw. And then concrete classes
> could provide more efficient implementations if necessary.

Or you could use addAll and Collections.nCopies to implement setSize,
and use ListIterator to do reverse iteration.

Notice that ArrayList does not sport a sort method--it's on
Collections.sort. Just because the method is not on the list itself does
not mean it's not implemented. Implementing it on the class implies that
anyone who implements the same interface has to reimplement it or
forward it to the same implementation all over again.

Patricia Shanahan

unread,
Aug 10, 2011, 1:13:55 AM8/10/11
to
On 8/9/2011 3:36 PM, Jan Burse wrote:
...
> This is not efficient. You don't get it what the
> problem is. I really really need a highly efficient
> setSize() specialized, otherwise my stuff will not work.
...

How about supplying some numbers, such as the proportion of nulls and
the frequency distribution of adding various sizes of null blocks
relative to other activity. Without numbers, it is rarely possible to
get a performance issue.

Maybe even a benchmark showing typical activity for one of your
sparse-ish arrays?

Patricia

Jan Burse

unread,
Aug 10, 2011, 3:04:20 AM8/10/11
to
Joshua Cranmer schrieb:

> Or you could use addAll and Collections.nCopies to implement setSize,
> and use ListIterator to do reverse iteration.

No you didn't get it. I need a fast specialized
setSize() as argued in the bug reference. According
to the collections documentation, the nCopies method
does create an extra object:

http://download.oracle.com/javase/1.4.2/docs/api/java/util/Collections.html#nCopies%28int,%20java.lang.Object%29
... The newly allocated data object is tiny (it contains
a single reference to the data object). ...

But it is not only that this tiny object of type
Collections.CopiesList will be created. For addAll
call an Iterator AbstractList.Itr will be
created I guess, not sure.

But most likely given the abstract implementation of
addAll, the abstract way it treats its argument, and
in the particular situation that the argument is a
CopiesList.

So this is already two temporary objects on the heap.

Don't have a good feeling with this.

Bye

Jan Burse

unread,
Aug 10, 2011, 3:18:21 AM8/10/11
to
Knute Johnson schrieb:

> That's BS. Either it's sparse enough to use a Map or it's too large to
> use ArrayList and that would make it too large to use Vector. Just use
> arrays, they're a lot faster than a Vector or any List. Or just get a
> faster computer.
>
> I should have long since killed this thread.

I am refering to the number of created temporary objects
during the operation. See my other post. Using the
official existing API with nCopies and addAll will use
at least two additional temporary objects.

But there is one more disadvantage of the helper
going the official way approach. If you look at the
addAll you see that it will repeatedly call add().
So it will repeatedly enlarge the ArrayList.

So instead of jumping instantly to the desired size
as a setSize implementation can do. It will temporarly
create elementData arrays of different sizes until
it has reached the final size. Given the 150% + 1
expension rule, if I do a setSize(100) from start
it will create the following temporary elementData
array objects:

Nr Size
1 1
2 2
3 4
4 7
5 11
6 17
7 26
8 40
9 61
10 92
11 139

So there are 10 temporary array allocations, until
we reach the final array. Also the resultinhg elementData
has 39 excess place holders which I don't need.
A setSize() implementation might just create the
array right sized.

So now you can go about and kill whatever you like.

Bye

Robert Klemme

unread,
Aug 10, 2011, 4:21:38 AM8/10/11
to
On 09.08.2011 04:16, Jan Burse wrote:
> Eric Sosman schrieb:
>> Perhaps if you'd explain your use case in more detail
>> someone will have a suggestion you may find helpful.
>
> I have only referted to the meta use case in my post.
> The software refactoring use case, is replacing a synchronized
> Vector by an unsychronized ArrayList in regions where no synchronization
> is necessary.

This can be done in many cases, especially in all cases where Vector
instances are used through the List interface.

> I actually do use setSize for a kind of sparse Vector.
> Sparse in the sense that my Vector will have a couple
> of holes represented by null value elements. Which
> is eventually abuse of the term "sparse", but the use
> case is there.

If you really need a sparse type then you could use Map<Integer,E> or
implement your own version of List<E>.

> The missing thought in the reponse is the design goal
> of ArrayList. They are explicitly advertised in the
> class comment as unsynchronized version of Vector. It
> is brainless not to honor such a design goal in a response.

This is not true. Quote: "This class is roughly equivalent to Vector,
except that it is unsynchronized."
http://download.oracle.com/javase/6/docs/api/java/util/ArrayList.html

Note the "roughly". This is the only reference to "Vector" on that doc
page (apart from the link to Vector's page).

> On the other hand it is very clever to have the possibility
> of turning initial design goals into requests for
> enhancement. Imagine the following situation:
>
> You go into a restaurant and order pizza,
> you get your pizza without cheese, you
> ask the waiter what is wrong, and the waiter
> says, oh this cannot be changed, cheese has
> to be ordered separately.
>
> Brainless are we, who accept this.

Engineers at Sun might be less brainless than you think. In fact, they
probably used the chance to clean the interface. I believe the design
rationale here is that the size of a collection should only be modified
by methods which actually add or remove from the collection.

Frankly, all the efforts you spend for ranting about Sun / Oracle not
"fixing" a bug with prio "low" (and also the erroneous statement about
"sparse Vectors") could be better spent for any of these solutions:

1. Write your own version of ArrayList by inheriting AbstractList and
adding all the methods you need.

2. Writing a static helper method which uses ArrayList.ensureCapacity()
and a loop which invokes add() or removeRange() depending on whether the
new capacity is larger or smaller.

3. Do nothing. Synchronization overhead of Vector isn't too bad if it's
not used by multiple threads. After all the replacement is more like an
optimization. If you want to be sure, do the measurements.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Andreas Leitgeb

unread,
Aug 10, 2011, 4:50:52 AM8/10/11
to
Jan Burse <janb...@fastmail.fm> wrote:
> According to the collections documentation, the nCopies method
> does create an extra object: ...

How do Java developers change lightbulbs?
First they create an instance of ...

I'm afraid, you'll have to get used to that certain tasks (involving
classes of the standard library) are not done the straightforward way,
but by abstrahizing the problem "beyond recognition", and then throwing
some use-once instances at it. ;-)

Posting about perceived imperfections of Java here in cljp typically
has one of these outcomes:
- If you just missed something simple, you'll be told so. :-)
That's why it is often good to ask here.
- Otherwise, you're told of some roundabout way to do it - and will
be seen as a troll if you comment on the roundaboutedness per se.
One should always measure it. And measuring indeed often shows
that the roundabout way isn't all that bad after all. If it is,
then you're suggested to buy a stronger machine (for you and for
all those who will run your code).

What one can be entirely sure won't happen under any circumstances
for a deficiency posted here:
- The straightforward solution will be implemented in Java.
A feature request on oracle's site might have had little more than
zero-chance, if also avoiding the word "sparse" in the description.

Andreas Leitgeb

unread,
Aug 10, 2011, 5:14:19 AM8/10/11
to
Robert Klemme <short...@googlemail.com> wrote:
> On 09.08.2011 04:16, Jan Burse wrote:
>> The software refactoring use case, is replacing a synchronized
>> Vector by an unsychronized ArrayList in regions where no synchronization
>> is necessary.
> This can be done in many cases, especially in all cases where Vector
> instances are used through the List interface.

Originally, Vector wasn't a List. So, really old code isn't
very "likely" to use the List interface of Vector.

>> Which is eventually abuse of the term "sparse",

> If you really need a sparse type ...
didn't seem so.

Mayeul

unread,
Aug 10, 2011, 5:26:11 AM8/10/11
to
On 10/08/2011 09:18, Jan Burse wrote:
> Knute Johnson schrieb:
>> That's BS. Either it's sparse enough to use a Map or it's too large to
>> use ArrayList and that would make it too large to use Vector. Just use
>> arrays, they're a lot faster than a Vector or any List. Or just get a
>> faster computer.
>>
>> I should have long since killed this thread.
>
> I am refering to the number of created temporary objects
> during the operation. See my other post. Using the
> official existing API with nCopies and addAll will use
> at least two additional temporary objects.

Wow, a whole two objects? While manipulating a typical Java Collection?
That just simply won't do.

I actually have my doubts that it would be any inefficient at all until
I've measured it. Anyway, if I have read the thread correctly, there is
claim that we have here a tradeoff: ensure that a List's size will only
change on element insertion or deletion, a rather natural assertion. At
the cost of possibly, if measured so, forcing developers in your
situation to keep Vector or to get a more specialized implementation of
List.
How long do you plan to ignore this claim?

> But there is one more disadvantage of the helper
> going the official way approach. If you look at the
> addAll you see that it will repeatedly call add().

Your prediction proved wrong. Though I do not think my eyes are playing
tricks on me, I saw nothing of the sort.

> So it will repeatedly enlarge the ArrayList.
>
> So instead of jumping instantly to the desired size
> as a setSize implementation can do. It will temporarly
> create elementData arrays of different sizes until
> it has reached the final size. Given the 150% + 1
> expension rule, if I do a setSize(100) from start
> it will create the following temporary elementData
> array objects:
>
> Nr Size
> 1 1
> 2 2
> 3 4
> 4 7
> 5 11
> 6 17
> 7 26
> 8 40
> 9 61
> 10 92
> 11 139
>
> So there are 10 temporary array allocations, until
> we reach the final array. Also the resultinhg elementData
> has 39 excess place holders which I don't need.

Nope

> A setSize() implementation might just create the
> array right sized.

As does an addAll() of any size.

--
Mayeul

Jan Burse

unread,
Aug 10, 2011, 6:29:29 AM8/10/11
to
Robert Klemme schrieb:

>
> 2. Writing a static helper method which uses ArrayList.ensureCapacity()
> and a loop which invokes add() or removeRange() depending on whether the
> new capacity is larger or smaller.

removeRange is protected, so I cannot use it in a helper.

Jan Burse

unread,
Aug 10, 2011, 6:35:23 AM8/10/11
to
Mayeul schrieb:

>>
>> So there are 10 temporary array allocations, until
>> we reach the final array. Also the resultinhg elementData
>> has 39 excess place holders which I don't need.
>
> Nope

Yes that you are right, since array list does override
the addAll method, and there they do set the size correctly.

But a new object pops up there. The CopiesList is materialized.
Here you see the code:

public boolean addAll(Collection<? extends E> c) {
Object[] a = c.toArray();
int numNew = a.length;
ensureCapacity(size + numNew); // Increments modCount
System.arraycopy(a, 0, elementData, size, numNew);
size += numNew;
return numNew != 0;
}

So there are two temporary objects, the CopiesList, and the
Object[] a from above. No enumerator.

Not nice

Lew

unread,
Aug 10, 2011, 12:33:54 PM8/10/11
to
Jan Burse wrote:
> I am refering to the number of created temporary objects
> during the operation. See my other post. Using the
> official existing API with nCopies and addAll will use
> at least two additional temporary objects.
>
> But there is one more disadvantage of the helper
> going the official way approach. If you look at the
> addAll you see that it will repeatedly call add().
> So it will repeatedly enlarge the ArrayList.
>

'ArrayList#ensureCapacity()'?

> So instead of jumping instantly to the desired size
> as a setSize implementation can do. It will temporarly
> create elementData arrays of different sizes until
> it has reached the final size. Given the 150% + 1
> expension rule, if I do a setSize(100) from start
> it will create the following temporary elementData
> array objects:

'ArrayList#ensureCapacity()'?

> Nr Size
> 1 1
> 2 2
> 3 4
> 4 7
> 5 11
> 6 17
> 7 26
> 8 40
> 9 61
> 10 92
> 11 139
>
> So there are 10 temporary array allocations, until
> we reach the final array. Also the resultinhg elementData
> has 39 excess place holders which I don't need.
> A setSize() implementation might just create the
> array right sized.
>

'ArrayList#ensureCapacity()'?

> So now you can go about and kill whatever you like.

Your objection?

--
Lew

Robert Klemme

unread,
Aug 10, 2011, 1:27:16 PM8/10/11
to

Sorry about that. Then just use a loop with remove(int).

Actually how to do it was not my main point. You can still do it
yourself and efficiently so. Your complaint about a bug report which is
not handled to your liking is most likely futile - and also in the wrong
place. And please also keep in mind that the JDK is given to you for
free which kinda limits the handle you have on Oracle to force them to
do what you want. Maybe you need to sink Larry's ship - but even in
that case I reckon it's more likely that he gets after you than the
other way round. :-)

Cheers

Jan Burse

unread,
Aug 10, 2011, 2:22:28 PM8/10/11
to
Robert Klemme schrieb:

I guess Larry will give me some money, when he sees that
with the new setSize() his servers will be more green. There
should be a market for energy efficiency nowadays.

Bye

P.S.: I don't champion energy efficiency yet, one of the
applications I am working on is rather a black hole concerning
energy, compared to the video streaming which only eats up
3W or so (which still impresses me), when I run my application
energy consumption goes up by 16W or so.


Jan Burse

unread,
Aug 10, 2011, 2:22:39 PM8/10/11
to
Lew schrieb:

> 'ArrayList#ensureCapacity()'?
>
>> > So now you can go about and kill whatever you like.
> Your objection?

Although it is a public method, it wouldn't work,
since it does not set the size field of the ArrayList.
So you only have as a post condition after ensureCapacity(n):

elementData.length>=n

But not:

size=n

You would still need to invoke a couple of add() calls
after calling ensureCapacity().

But calling ensureCapacity() before calling the add()s
would prevent gradually enlarging the ArrayList, so that
we would have a solution with zero unnecessary temporary
objects.

gr8 step forward, many thanks.

But still not what I consider an efficient solution. You
will burn some CPU cycles in the add()s call for nothing,
since the elementData is already nulled from Array.newInstance()
as required by the JLS.

So not a green solution.

Best Regards

Jan Burse

unread,
Aug 10, 2011, 2:37:54 PM8/10/11
to
Jan Burse schrieb:

> I guess Larry will give me some money, when he sees that
> with the new setSize() his servers will be more green. There
> should be a market for energy efficiency nowadays.

Look see: Migrating from Vector to ArrayList is all
about energy saving. ArrayList does not use synchronized,
therefore it does use less CPU. But in my case the
migration is not perfect, since setSize() is missing.

Migrating Hashtable to HashMap didn't cause me any pain
so far. I saw the following speed-up and thus
energy saveing in some sense:

32-bit JDK Synchronized: 12'898 ms (*)
32-bit JDK Un-Schronized: 12'438 ms
64-bit JDK Synchronized: 7'769 ms
64-bit JDK Un-Synchronized: 7'308 ms

Now I am in the progress of migrating the Vectors to
ArrayList. But because ArrayList is mindlessly lacking
setSize() I am stuck. The bug report shows that I
am not the only one who has experienced this problem
so far.

Oracle is probably less pronounced in being green
compared to IBM. IBM does it under the cover of
its smarter planet initiative.
http://www.eweek.com/c/a/IT-Infrastructure/5-Steps-to-Green-IT/
For oracle I found also something:
http://www.oracle.com/us/products/applications/green/061929.html

Best Regards

(*)
http://www.facebook.com/photo.php?fbid=177557452312810

Andreas Leitgeb

unread,
Aug 10, 2011, 6:23:18 PM8/10/11
to
Jan Burse <janb...@fastmail.fm> wrote:
> I guess Larry will give me some money, when he sees that
> with the new setSize() his servers will be more green.

I'm all in favor of setSize() for easing mechanical Vector->
ArrayList transition, but I'm sure, some of the workarounds
for its lack could lead to even "greener" code:

1.) a good estimate on what size you'll end up using, and
create the ArrayList with that size. Eating up a few
kilobytes (or even up to a megabyte) of nulls too much
is probably still greener than re-allocating the buffer
each time a new largest-so-far index shows up. If the
original estimation turns out to be too small, then grow
it with addAll(...nCopies(...)) - it wouldn't be done
all that often, so the extra effort wouldn't matter.

2.) if no such limit can be reasonably predicted, then
chances are that a real sparse structure really fits
the bill better. (I didn't quite understand your argument
against using a Map - it didn't make sense to me when you
wrote it)

The ideal (semi-)sparse structure wouldn't even need a setSize(),
as it would grow the internal buffer as needed to allow .set() to
work with any (non-negative) index, and let .get() just return null
for indices beyond the current end instead of throwing
IndexOutOfBounds-Exceptions.

Joshua Cranmer

unread,
Aug 10, 2011, 7:16:57 PM8/10/11
to
On 8/10/2011 2:04 AM, Jan Burse wrote:
> Joshua Cranmer schrieb:
>> Or you could use addAll and Collections.nCopies to implement setSize,
>> and use ListIterator to do reverse iteration.
>
> No you didn't get it. I need a fast specialized
> setSize() as argued in the bug reference. According
> to the collections documentation, the nCopies method
> does create an extra object:
>
> http://download.oracle.com/javase/1.4.2/docs/api/java/util/Collections.html#nCopies%28int,%20java.lang.Object%29
>
> ... The newly allocated data object is tiny (it contains
> a single reference to the data object). ...
>
> But it is not only that this tiny object of type
> Collections.CopiesList will be created. For addAll
> call an Iterator AbstractList.Itr will be
> created I guess, not sure.

Okay, if you're so annoyed about the creation of several objects, you
can create your own version of a CopiesList that implements both
Collection and Iterator and implements everything on itself, for a
single overhead of around 12 bytes. If that is still too much, run a
garbage collection. You now have a few hundred extra kilobytes of data.
If you're still complaining, you're using the wrong language.

Joshua Cranmer

unread,
Aug 10, 2011, 7:22:31 PM8/10/11
to
On 8/10/2011 1:37 PM, Jan Burse wrote:
> Look see: Migrating from Vector to ArrayList is all
> about energy saving. ArrayList does not use synchronized,
> therefore it does use less CPU. But in my case the
> migration is not perfect, since setSize() is missing.

We have given *several* ways to re-emulate setSize using ArrayList.
Since moving from Vector to ArrayList causes savings in loss of
synchronized, perhaps the speed slowdown in your emulations is more than
made up for.

Jan Burse

unread,
Aug 10, 2011, 7:47:04 PM8/10/11
to
Andreas Leitgeb schrieb:

> 1.) a good estimate on what size you'll end up using, and
> create the ArrayList with that size. Eating up a few
> kilobytes (or even up to a megabyte) of nulls too much
> is probably still greener than re-allocating the buffer
> each time a new largest-so-far index shows up. If the
> original estimation turns out to be too small, then grow
> it with addAll(...nCopies(...)) - it wouldn't be done
> all that often, so the extra effort wouldn't matter.

> 2.) if no such limit can be reasonably predicted, then
> chances are that a real sparse structure really fits
> the bill better. (I didn't quite understand your argument
> against using a Map - it didn't make sense to me when you
> wrote it)

I never placed an argument against map. Where was this?
Just assume that the original code had setSize() for
whatever reason, and that the redesign of the original
code is not at stake.

But you are right, a redesign could of course be an option.

> The ideal (semi-)sparse structure wouldn't even need a setSize(),
> as it would grow the internal buffer as needed to allow .set() to
> work with any (non-negative) index, and let .get() just return null
> for indices beyond the current end instead of throwing
> IndexOutOfBounds-Exceptions.

Actually in the current application the Vector/ArrayList need
not necessarely be sparse. It could be, but it does not have
to be, both scenarios should possible coexist, sparse and non-sparse.
The current application has a high frequency of creating for an
initially Vector/ArrayList of size 0.

Then upon some event (imagine also high frequency) the size
might jump to n, and the element at position n-1 is set to
non null. And then upon some other event the size might jump
to m, and the element at position m-1 is set to non null. But
there is no guarantee that n and m are close. Also it can
happen that an event j arrives, and the element at position
j is set to non null, j being somewhere between 0 and m-1. So
that the array might become gradually less sparse.

Imagine the above process to last a very short time, until the
livecycle of the array ends. Before the livecycle ends, the
array might contain up to 100 or more non-null positions. But
there is no guarantee that they are close together or not.
But when there are 100 entries their indexes might span the
range of 1000. But it is very important that the access to
the elements is O(1).

The sparseness is only mentioned in the oracle bug. But actually
I think the sparseness is irrelevant. Just assume you have an
existing application with Vector that happily uses setSize(). So
what do you do?

Bye

Jan Burse

unread,
Aug 10, 2011, 7:52:28 PM8/10/11
to
Joshua Cranmer schrieb:

> On 8/10/2011 1:37 PM, Jan Burse wrote:
>> Look see: Migrating from Vector to ArrayList is all
>> about energy saving. ArrayList does not use synchronized,
>> therefore it does use less CPU. But in my case the
>> migration is not perfect, since setSize() is missing.
>
> We have given *several* ways to re-emulate setSize using ArrayList.
> Since moving from Vector to ArrayList causes savings in loss of
> synchronized, perhaps the speed slowdown in your emulations is more than
> made up for.
>

If the emulation eats up the speed gain of the remove the
synchronization, then there is no use in doing the migration
at all. The synchronization primitive in Java is already
highly optimized, it has a fastpath in the non-competitive
case. Removing synchronization gives some performance gain,
but it is in the range of 5-20% only nowadays. It depends of
course on how much synchronization the original application used.

So the chances are very high, that an emulation eats up
exactly these 5-20% percents.

Bye

Jan Burse

unread,
Aug 10, 2011, 7:55:43 PM8/10/11
to
Joshua Cranmer schrieb:

> Okay, if you're so annoyed about the creation of several objects, you
> can create your own version of a CopiesList that implements both
> Collection and Iterator and implements everything on itself, for a
> single overhead of around 12 bytes. If that is still too much, run a
> garbage collection. You now have a few hundred extra kilobytes of data.
> If you're still complaining, you're using the wrong language.
>

The language is fine, since it could have an efficient setSize().
Its not a problem of the Java language at all. And of course
I could create also a clone of ArrayList and add setSize there.

But I don't like the idea of copy/pasting the ArrayList implementation.
Once you start implementing your own primitive classes your project
reaches another dimension. But yes of course this is an option, if
the out of the box "collections" do not fit.

Bye

Jan Burse

unread,
Aug 10, 2011, 8:01:20 PM8/10/11
to
Patricia Shanahan schrieb:

You can assume the constraint that the element access
should stay at O(1). This narrows down the possible
data structures a little bit.

Bye

Andreas Leitgeb

unread,
Aug 10, 2011, 8:23:07 PM8/10/11
to
Jan Burse <janb...@fastmail.fm> wrote:
> But when there are 100 entries their indexes might span the
> range of 1000. But it is very important that the access to
> the elements is O(1).

In *that* case, I'd use a plain array, sized to say 2048 and
just index into it.

> Just assume you have an existing application with Vector
> that happily uses setSize(). So what do you do?

I'd try to find out, whether the current .size() is relevant to the
business logic.
If not, then most of the repeated setSize()ing maybe was just a
waste of cpu-time&energy, anyway.
If it is, then I might keep it in a separate variable or field,
and keep the array or ArrayList at some sufficiently large size.
If it is, and the collection itself gets passed to external
interfaces that care about the size ... or similar misfortunes,
then I'd give up and stay with Vector. (Yes, if ArrayList had
.setSize(), then this particular leaf of the decision tree would
fare better.)

Based on that you mentioned "no money (no time) for a rewrite", I
might just as well give up on conversion, anyway and stay with Vector.
If that is too slow, then *they* should spend some money (or time)
on making it faster.

Patricia Shanahan

unread,
Aug 10, 2011, 9:42:44 PM8/10/11
to

Does that include accesses that require a size increase?

Patricia

Joshua Cranmer

unread,
Aug 10, 2011, 10:43:33 PM8/10/11
to
On 8/10/2011 6:52 PM, Jan Burse wrote:
> So the chances are very high, that an emulation eats up
> exactly these 5-20% percents.

I honestly doubt that. But instead of arguing about it, why not *try*
implementing it and measuring the performance implications to verify for
certain?

Jan Burse

unread,
Aug 11, 2011, 3:19:50 AM8/11/11
to
Andreas Leitgeb schrieb:
> In*that* case, I'd use a plain array, sized to say 2048 and
> just index into it.

Actually this is a very good idea. But it only
works when the application did not use the ArrayList
as a point of reference for sharing, so that
by pointing to the ArrayList object all clients of
this object would automatically see any changes
on the object. This is not possible with an array.

So have to check how the ArrayList is used in
the application. There are chances that the ArrayList
itself sits in an object and that this parent object
can be used as a point for sharing instead of the
ArrayList. And then I could replace the ArrayList
by an array. If there is no such parent object, then not.

If there is no such parent object, I would need
to wrap the array in some object, and would end
up re-implementing ArrayList. Lets see.

But you are also right that if I know an the size
of the array, and could make the size fixed, then
no change of the array itself would happen. I would
not need to bother about having a shared reference.
But I guess the fixed size assumption is not possible
here. The array might or might not filled at all.

Bye

Jan Burse

unread,
Aug 11, 2011, 8:19:52 AM8/11/11
to
Joshua Cranmer schrieb:

> On 8/10/2011 6:52 PM, Jan Burse wrote:
>> So the chances are very high, that an emulation eats up
>> exactly these 5-20% percents.
>
> I honestly doubt that. But instead of arguing about it, why not *try*
> implementing it and measuring the performance implications to verify for
> certain?
>

I don't exclude doing this some time ahead.
Currently I was not yet able to do this
given the time and cost constraints around.

Bye

Jan Burse

unread,
Aug 11, 2011, 8:21:31 AM8/11/11
to
Patricia Shanahan schrieb:

>> You can assume the constraint that the element access
>> should stay at O(1). This narrows down the possible
>> data structures a little bit.
>
> Does that include accesses that require a size increase?
>
> Patricia

I don't have yet frequency figures, and thus cannot
give you much non-functional requirements on
setSize(). But you are right, if no fast path is
available, in some circumstances a semi-fast path
or even a slow path might also do.

Sorry for not being able to provide you more information.

Bye

Andreas Leitgeb

unread,
Aug 12, 2011, 11:06:39 AM8/12/11
to
Jan Burse <janb...@fastmail.fm> wrote:
> Andreas Leitgeb schrieb:
>> In*that* case, I'd use a plain array, sized to say 2048 and
>> just index into it.
> But I guess the fixed size assumption is not possible
> here. The array might or might not filled at all.

Is that now about an odd chance that the array/ArrayList might
grow beyond the initial estimate? Or is that about other code
(that sees the ArrayList) that may be relying on that the last
non-null element is at position "size()-1" ?

If the former: for an odd-chance event, doing
addAll(...nCopies(...))
is good enough.

If the latter: stick to Vector ;-/ Even if ArrayList.setSize()
was ever added, that would surely happen too late for your project.

0 new messages