Would an aggregation pattern fit into Guava?


Emily Soldal

Apr 20, 2012, 5:22:50 AM
to guava-...@googlegroups.com
For the past year and a bit I've found myself taking advantage of an aggregation pattern I would like to see in Guava, but I'm not entirely positive about how it should be implemented.

Iterable<Iterable<T>> aggregate = Iterables.aggregate(Iterable<T>, Equivalence<T>) 

or 

Iterable<Iterable<T>> aggregate = Iterables.aggregate(Iterable<T>, Comparator<T>)  

This would return an iterable view containing chunks of the underlying iterable whose elements evaluate as equivalent under the supplied Equivalence or Comparator. This has been phenomenally useful for me in reducing the amount of complexity I have to deal with.
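
To make the idea concrete, here is a minimal sketch in plain Java. The class `ChunkSketch` and its `chunk` method are hypothetical names, not Guava APIs, and `BiPredicate` stands in for Guava's `Equivalence`. Unlike the proposal, this version builds the chunks eagerly instead of returning a view, and it compares each element with its immediate predecessor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

final class ChunkSketch {
    /**
     * Splits the input into runs of consecutive elements that the predicate
     * treats as equivalent. Hypothetical helper, not part of Guava.
     */
    static <T> List<List<T>> chunk(Iterable<T> input,
                                   BiPredicate<? super T, ? super T> equivalent) {
        List<List<T>> chunks = new ArrayList<>();
        List<T> current = null;
        T previous = null;
        for (T element : input) {
            // Open a new chunk at the first element or wherever the run breaks.
            if (current == null || !equivalent.test(previous, element)) {
                current = new ArrayList<>();
                chunks.add(current);
            }
            current.add(element);
            previous = element;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> words =
                Arrays.asList("apple", "avocado", "banana", "blueberry", "apricot");
        // Equivalent when the first letters match; the input is not sorted first.
        System.out.println(chunk(words, (a, b) -> a.charAt(0) == b.charAt(0)));
        // prints [[apple, avocado], [banana, blueberry], [apricot]]
    }
}
```

Note that "apricot" lands in its own chunk because it is not adjacent to the other words starting with "a"; order and multiplicity of the input are preserved.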

Gregory Kick

Apr 20, 2012, 5:42:58 AM
to Emily Soldal, guava-...@googlegroups.com
So, Multimaps.index() is pretty close. Could your Equivalences and Comparators be traded for Function<T, K>s?
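
For comparison, a rough plain-Java stand-in for the kind of grouping Multimaps.index() performs could look like the sketch below. `IndexSketch.index` is a hypothetical helper written against `java.util` only; it is not Guava's implementation and returns a mutable `Map` where Guava returns an immutable multimap. It is only meant to show grouping by a key function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class IndexSketch {
    /**
     * Groups every value under the key computed by keyFunction, keeping keys
     * in first-seen order. Hypothetical stand-in, not Guava's code.
     */
    static <K, V> Map<K, List<V>> index(Iterable<V> values,
                                        Function<? super V, K> keyFunction) {
        Map<K, List<V>> byKey = new LinkedHashMap<>();
        for (V value : values) {
            byKey.computeIfAbsent(keyFunction.apply(value), k -> new ArrayList<>())
                 .add(value);
        }
        return byKey;
    }

    public static void main(String[] args) {
        List<String> words =
                Arrays.asList("apple", "avocado", "banana", "blueberry", "apricot");
        // Key each word by its first letter.
        System.out.println(index(words, w -> w.charAt(0)));
        // prints {a=[apple, avocado, apricot], b=[banana, blueberry]}
    }
}
```

All values with the same key end up together regardless of where they appeared in the input, so adjacency information is lost.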




--
Greg Kick
Java Core Libraries Team

Emily Soldal

Apr 20, 2012, 5:46:32 AM
to guava-...@googlegroups.com, Emily Soldal

I see what you mean, although I don't think that would work. How would you decide what K should be? 

Torbjorn Gannholm

Apr 20, 2012, 5:48:47 AM
to Emily Soldal, guava-...@googlegroups.com
On Fri, Apr 20, 2012 at 11:46 AM, Emily Soldal <em...@soldal.org> wrote:

I see what you mean, although I don't think that would work. How would you decide what K should be? 

The identityHashCode of the Comparator, perhaps?
 




Kevin Bourrillion

Apr 20, 2012, 5:49:25 AM
to Emily Soldal, guava-...@googlegroups.com
I always suspected in the back of my mind we'd need this, but over the years, every time it's come up so far, index() has turned out to be fine.  So can we have some more fleshed-out code examples that illustrate the need?





--
Kevin Bourrillion @ Google
Java Core Libraries Team
http://guava-libraries.googlecode.com

Emily Soldal

Apr 20, 2012, 5:56:15 AM
to guava-...@googlegroups.com, Emily Soldal
I can whip something up. I'm almost tempted to say this method should be called chunk() rather than aggregate, though. The reason is that I'd rather not do any sorting beforehand; if it's named chunk(), the name explains how it divides the underlying Iterable into sections along the boundaries defined by the comparator/equivalence.

Kevin Bourrillion

Apr 20, 2012, 6:04:43 AM
to Emily Soldal, guava-...@googlegroups.com

Emily Soldal

Apr 20, 2012, 6:10:49 AM
to guava-...@googlegroups.com, Emily Soldal
Oh, I like the idea of a Partition. It would also complement Iterables.partition(Iterable, int) nicely to be able to supply Iterables.partition(Iterable, Comparator) or Iterables.partition(Iterable, Partition.from(Comparator)).

This hinges on making a complementary Partition class, of course.

Kevin Bourrillion

Apr 20, 2012, 6:13:08 AM
to Emily Soldal, guava-...@googlegroups.com
imho, it just presents a further argument why the existing Iterables.partition() shouldn't be called partition at all. :-(

Emily Soldal

Apr 20, 2012, 7:05:00 AM
to guava-...@googlegroups.com
I'm quite happy with it being named partition and wouldn't mind seeing some of the following additions to it either:
Iterables.partition(Iterable,Comparator);
Iterables.partition(Iterable,Equivalence);
Iterables.partition(Iterable,Partition);
Iterables.partition(Iterable<? extends Comparable>);

But perhaps that's a bit off topic. What are the reasons for calling it something else? And if a Partition class is required for DisjointSet, it makes sense that it should work for this too, but I'm not entirely sure how it should behave. I don't think DisjointSet overlaps with this proposal in terms of functionality, because partition preserves both order and multiplicity.

Emily Soldal

Apr 30, 2012, 4:52:31 AM
to guava-...@googlegroups.com
*bump*

I'm happy to open an issue and complete this myself, name notwithstanding, of course.

Louis Wasserman

Apr 30, 2012, 5:37:44 AM
to Emily Soldal, guava-...@googlegroups.com
I guess I'd feel a lot happier with half a dozen real-world use cases where we could compare APIs side by side and see what the different code would look like.  This doesn't strike me as a trivial API design problem.

Emily Soldal

May 14, 2012, 4:24:44 AM
to guava-...@googlegroups.com
Places where you might find aggregation patterns in disguise:

* Where Multimaps.index().asMap().values() is used twice and where the actual map isn't used, only its indexing ability.
* Optimization of compound-key queries on large data; some form of equivalence or grouping will likely be used to reduce complexity in each query, eliminating as many variables as possible.
* Looking for sequences of like elements in a collection without concatenating all like elements; think of checks done in a loop looking for similarities between the current value and some previous value.
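
The last case, comparing the current value with the previous one inside a loop, is roughly what a lazy chunking iterator would encapsulate. The sketch below is one possible shape, assuming a Comparator decides the boundaries (a new chunk starts whenever compare(previous, current) != 0). `LazyChunkSketch` and its `chunk` method are hypothetical names, not an existing or proposed Guava signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LazyChunkSketch {
    /**
     * Returns an Iterable whose iterator yields runs of consecutive elements
     * that compare as equal. Each chunk is built only when next() is called.
     * Hypothetical helper, not part of Guava.
     */
    static <T> Iterable<List<T>> chunk(Iterable<T> input,
                                       Comparator<? super T> comparator) {
        return () -> new Iterator<List<T>>() {
            private final Iterator<T> source = input.iterator();
            private T pending;          // first element of the next chunk, if read ahead
            private boolean hasPending;

            @Override
            public boolean hasNext() {
                return hasPending || source.hasNext();
            }

            @Override
            public List<T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                List<T> chunk = new ArrayList<>();
                T last = hasPending ? pending : source.next();
                hasPending = false;
                pending = null;
                chunk.add(last);
                while (source.hasNext()) {
                    T candidate = source.next();
                    if (comparator.compare(last, candidate) != 0) {
                        // Boundary found: keep the element for the next chunk.
                        pending = candidate;
                        hasPending = true;
                        break;
                    }
                    chunk.add(candidate);
                    last = candidate;
                }
                return chunk;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> readings = Arrays.asList(1, 1, 3, 3, 3, 1);
        // Find the longest run of equal readings without sorting the input.
        int longest = 0;
        for (List<Integer> run : chunk(readings, Comparator.<Integer>naturalOrder())) {
            longest = Math.max(longest, run.size());
        }
        System.out.println(longest); // prints 3
    }
}
```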