Ordering.random()

1,065 views
Skip to first unread message

Willi Schönborn

unread,
Jul 14, 2010, 7:30:07 AM7/14/10
to guava-...@googlegroups.com
What's your opinion about a RandomOrdering which just returns a random
value between -1 and 1?
This can be useful in combination with another ordering (compound) where
the first ones define the two groups and the random ordering just
shuffles the groups.

Kevin Bourrillion

unread,
Jul 20, 2010, 7:35:25 PM7/20/10
to guava-...@googlegroups.com
[Sorry, this message got held as possible-spam (perhaps you are not a group member, Willi?).  I've just gone and approved a few messages that had accumulated and whitelisted their senders for all future messages.]

Comparators have rules -- they have to be reflexive, antisymmetric, and transitive (see Comparator documentation for details).  What you probably want is Ordering.arbitrary(), which does exist.  Note that it is identity-based, so won't recognize two equal Objects as equivalent (you could run objects through an Interner first if that is a concern).



--
guava-...@googlegroups.com.
http://groups.google.com/group/guava-discuss?hl=en
unsubscribe: guava-discus...@googlegroups.com

This list is for discussion; for help, post to Stack Overflow instead:
http://stackoverflow.com/questions/ask
Use the tag "guava".



--
Kevin Bourrillion @ Google
http://guava-libraries.googlecode.com

Tobias Thierer

unread,
Jul 20, 2010, 7:44:39 PM7/20/10
to guava-...@googlegroups.com
On Wed, Jul 21, 2010 at 9:35 AM, Kevin Bourrillion <kev...@google.com> wrote:
[Sorry, this message got held as possible-spam (perhaps you are not a group member, Willi?).  I've just gone and approved a few messages that had accumulated and whitelisted their senders for all future messages.]

Comparators have rules -- they have to be reflexive, antisymmetric, and transitive (see Comparator documentation for details). 

And in addition, sorting using a Comparator that violates this and *does* return a random value between -1 and +1 would almost certainly not shuffle randomly, i.e. it wouldn't generate a uniform distribution over all permutations of the original order.

Ordering.arbitrary() is nice in that it depends only on the IdentityHashCode (and rarely on the original order) of the objects you are ordering, not their value.

A disadvantage of Ordering.arbitrary() is that it can very occasionally throw AssertionError. Also, what might be an advantage or disadvantage is that if you order the same objects twice with the same Ordering.arbitrary(), you always get the same result, rather than a different random permutation.

I think the original author may want to consider Collections.shuffle() instead.

Tobias

Dimitris Andreou

unread,
Jul 20, 2010, 8:04:30 PM7/20/10
to guava-...@googlegroups.com


2010/7/21 Tobias Thierer <tob...@google.com>
I guess "very occasionally" was that a typo. Just in case, it would take at least 2^32 collisions in System.identityHashCode() comparisons, and it would also require imposing an arbitrary order to a collection where one of the objects (which identity hashcode must also be colliding with some of the others) is old enough to have survived all those 2^32 collisions.
 

I think the original author may want to consider Collections.shuffle() instead.

Tobias

--

Kevin Bourrillion

unread,
Jul 20, 2010, 8:04:39 PM7/20/10
to guava-...@googlegroups.com
On Tue, Jul 20, 2010 at 4:44 PM, Tobias Thierer <tob...@google.com> wrote:

A disadvantage of Ordering.arbitrary() is that it can very occasionally throw AssertionError.

"Very occasionally"?  That's an understatement!  Any two objects have maybe a 1 in 2^28 (?) probability of having colliding identity hash codes, and you need to encounter a new collision at least 2^32 times before this even has a chance of happening.

If anyone's looking for something fun to do this weekend, try to write a program that makes that AssertionError happen. :-)  Should be interesting.

 
Also, what might be an advantage or disadvantage is that if you order the same objects twice with the same Ordering.arbitrary(), you always get the same result, rather than a different random permutation.

(Note that Ordering.arbitary() is a singleton.)


Willi Schönborn

unread,
Jul 21, 2010, 4:50:50 AM7/21/10
to guava-...@googlegroups.com
On 21/07/10 01:35, Kevin Bourrillion wrote:
[Sorry, this message got held as possible-spam (perhaps you are not a group member, Willi?).
I thought i am. I may have used the wrong mail account though.

 I've just gone and approved a few messages that had accumulated and whitelisted their senders for all future messages.]

Comparators have rules -- they have to be reflexive, antisymmetric, and transitive (see Comparator documentation for details).  What you probably want is Ordering.arbitrary(), which does exist.  Note that it is identity-based, so won't recognize two equal Objects as equivalent (you could run objects through an Interner first if that is a concern).
I took a look at Ordering.arbitrary() but it would produce the same "strange" (random?) ordering
for the same collection when applied multiple times.

Wouldn't it be possible if Ordering.random() would return a new instance (in constrast to a singleton) which
keeps track of generated values (map of comparees to integer?). This would also result
in the same ordering when applied multiple times, but clients could use a new instance of Ordering.random()
to produce a new sorting. I hope my idea is clear.

Greetings
W

Willi Schönborn

unread,
Jul 21, 2010, 5:58:57 AM7/21/10
to guava-...@googlegroups.com
Collections.shuffle() shuffles the whole collection. I want to "group" a collection first (using another comparator) and the shuffle
the resulting groups internally.

Tobias

Tobias Thierer

unread,
Jul 21, 2010, 8:55:42 PM7/21/10
to guava-...@googlegroups.com
I'm not sure what you mean with "group"ing your collection, but is your collection is a List and each group is a sublist?

If so, you should be able to do

  Collections.shuffle(myList.subList(start, end));

Tobias

Willi Schönborn

unread,
Jul 22, 2010, 4:01:15 AM7/22/10
to guava-...@googlegroups.com
On 22/07/10 02:55, Tobias Thierer wrote:
I'm not sure what you mean with "group"ing your collection, but is your collection is a List and each group is a sublist?
No, what i mean is using e.g. a comparator which sorts by a boolean member (e.g. BusinessEntity.isPaying()) which
moves paying business entites to the top and non-paying to the bottom. In combination with


If so, you should be able to do

  Collections.shuffle(myList.subList(start, end));
This is no solutation for me, because i don't now the indices.

Tobias Thierer

unread,
Jul 25, 2010, 9:42:14 PM7/25/10
to guava-...@googlegroups.com
Then why don't you shuffle first, and then subsequently sort by BusinessEntity.isPaying() alone?

Tobias

Willi Schönborn

unread,
Jul 26, 2010, 11:15:17 AM7/26/10
to guava-...@googlegroups.com
On 26/07/10 03:42, Tobias Thierer wrote:
Then why don't you shuffle first, and then subsequently sort by BusinessEntity.isPaying() alone?
I'm so dumb.

Joshua O'Madadhain

unread,
Jul 26, 2010, 11:33:04 AM7/26/10
to guava-...@googlegroups.com

Alternatively, how about using 2 lists?

Joshua

On Jul 26, 2010 8:16 AM, "Willi Schönborn" <w.scho...@googlemail.com> wrote:
> On 26/07/10 03:42, Tobias Thierer wrote:
>> Then why don't you shuffle first, and then subsequently sort by
>> BusinessEntity.isPaying() alone?
> I'm so dumb.
>>
>> Tobias
>>
>> On Thu, Jul 22, 2010 at 6:01 PM, Willi Schönborn
>> <w.scho...@googlemail.com <mailto:w.scho...@googlemail.com>> wrote:
>>
>> On 22/07/10 02:55, Tobias Thierer wrote:
>>> I'm not sure what you mean with "group"ing your collection, but
>>> is your collection is a List and each group is a sublist?
>> No, what i mean is using e.g. a comparator which sorts by a
>> boolean member (e.g. BusinessEntity.isPaying()) which
>> moves paying business entites to the top and non-paying to the
>> bottom. In combination with
>>
>>>
>>> If so, you should be able to do
>>>
>>> Collections.shuffle(myList.subList(start, end));
>> This is no solutation for me, because i don't now the indices.
>>
>>>
>>> Tobias
>>>
>>> On Wed, Jul 21, 2010 at 7:58 PM, Willi Schönborn
>>> <w.scho...@googlemail.com

>>>> <mailto:guava-...@googlegroups.com>.
>>>> http://groups.google.com/group/guava-discuss?hl=en
>>>> unsubscribe: guava-discus...@googlegroups.com
>>>> <mailto:guava-discus...@googlegroups.com>


>>>>
>>>> This list is for discussion; for help, post to Stack
>>>> Overflow instead:
>>>> http://stackoverflow.com/questions/ask
>>>> Use the tag "guava".
>>>
>>> --
>>> guava-...@googlegroups.com

>>> <mailto:guava-...@googlegroups.com>.
>>> http://groups.google.com/group/guava-discuss?hl=en
>>> unsubscribe: guava-discus...@googlegroups.com
>>> <mailto:guava-discuss%2Bunsu...@googlegroups.com>


>>>
>>> This list is for discussion; for help, post to Stack Overflow
>>> instead:
>>> http://stackoverflow.com/questions/ask
>>> Use the tag "guava".
>>>
>>>
>>> --
>>> guava-...@googlegroups.com

>>> <mailto:guava-...@googlegroups.com>.
>>> http://groups.google.com/group/guava-discuss?hl=en
>>> unsubscribe: guava-discus...@googlegroups.com
>>> <mailto:guava-discus...@googlegroups.com>


>>>
>>> This list is for discussion; for help, post to Stack Overflow
>>> instead:
>>> http://stackoverflow.com/questions/ask
>>> Use the tag "guava".
>>
>> --
>> guava-...@googlegroups.com

>> <mailto:guava-...@googlegroups.com>.
>> http://groups.google.com/group/guava-discuss?hl=en
>> unsubscribe: guava-discus...@googlegroups.com
>> <mailto:guava-discuss%2Bunsu...@googlegroups.com>

Kevin Bourrillion

unread,
Jul 26, 2010, 11:34:36 AM7/26/10
to guava-...@googlegroups.com
I didn't think of it either. :-)  Perfect!

Dimitris Andreou

unread,
Jul 26, 2010, 11:36:50 AM7/26/10
to guava-...@googlegroups.com
Heh. Good old radix (shuffling N elements is the same as sorting on a random permutation of 1...N numbers, so there are two "digits"/columns on which to sort, your boolean one, and the random number). 

Nobody knows how many years it took for some random (pun intended) sorting machine operator (late 19th century, or early 20th) to realize that sorting the digits in right-to-left order was much better (the old way, sorting on the most important digit first, meant that piles with equal values for that digit had to be separated and sorted independently). 

(Based on Knuth's record of sorting history, at least).

2010/7/26 Willi Schönborn <w.scho...@googlemail.com>

Dimitris Andreou

unread,
Jul 26, 2010, 11:42:16 AM7/26/10
to guava-...@googlegroups.com


2010/7/26 Dimitris Andreou <jim.a...@gmail.com>

Heh. Good old radix (shuffling N elements is the same as sorting on a random permutation of 1...N numbers, so there are two "digits"/columns on which to sort, your boolean one, and the random number). 

Nobody knows how many years it took for some random (pun intended) sorting machine operator (late 19th century, or early 20th) to realize that sorting the digits in right-to-left order was much better (the old way, sorting on the most important digit first, meant that piles with equal values for that digit had to be separated and sorted independently). 

(It should also be mentioned, for readers at home, that this trick only works with a stable sort, such as mergesort, which luckily Java provides))

Dimitris Andreou

unread,
Jul 26, 2010, 11:48:52 AM7/26/10
to guava-...@googlegroups.com
This is the radix sort with left-to-right sorting, which was invented first. This partitions the elements into separate piles, that need to be sorted (here, "shuffled") separately, which would be problematic if the domain of the value was large (instead of just 2).

This whole discussion is sooo 19th century :P

2010/7/26 Joshua O'Madadhain <jr...@google.com>

Joshua O'Madadhain

unread,
Jul 26, 2010, 11:51:01 AM7/26/10
to guava-...@googlegroups.com

I would still expect this trick to work with an unstable sort.  You're not guaranteed to get the same random ordering, but why would that matter?

Joshua

>>>>> unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>


>>>>>
>>>>> This list is for discussion; for help, post to Stack Overflow instead:
>>>>> http://stackoverflow.com/questions/ask
>>>>> Use the tag "guava".
>>>>>
>>>>
>>>> --
>>>> guava-...@googlegroups.com.
>>>> http://groups.google.com/group/guava-discuss?hl=en
>>>> unsubscribe: guava-discus...@googlegroups.com
>>>>
>>>> This list is for discussion; for help, post to Stack Overflow instead:
>>>> http://stackoverflow.com/questions/ask
>>>> Use the tag "guava".
>>>>
>>>>
>>>> --
>>>> guava-...@googlegroups.com.
>>>> http://groups.google.com/group/guava-discuss?hl=en

>>>> unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>


>>>>
>>>> This list is for discussion; for help, post to Stack Overflow instead:
>>>> http://stackoverflow.com/questions/ask
>>>> Use the tag "guava".
>>>>
>>>
>>> --
>>> guava-...@googlegroups.com.
>>> http://groups.google.com/group/guava-discuss?hl=en
>>> unsubscribe: guava-discus...@googlegroups.com
>>>
>>> This list is for discussion; for help, post to Stack Overflow instead:
>>> http://stackoverflow.com/questions/ask
>>> Use the tag "guava".
>>>
>>>
>>> --
>>> guava-...@googlegroups.com.
>>> http://groups.google.com/group/guava-discuss?hl=en

>>> unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>

Dimitris Andreou

unread,
Jul 26, 2010, 12:00:21 PM7/26/10
to guava-...@googlegroups.com
I was referring to the right-to-left sorting trick in general, in this particular case it doesn't matter indeed. 

2010/7/26 Joshua O'Madadhain <jr...@google.com>

I would still expect this trick to work with an unstable sort.  You're not guaranteed to get the same random ordering, but why would that matter?

Reply all
Reply to author
Forward
0 new messages