Post Processing Results: ResultTransformer and GroupBy

4,131 views
Skip to first unread message

Samppa Saarela

unread,
Sep 27, 2011, 9:08:17 AM9/27/11
to quer...@googlegroups.com
In many cases one needs to load data with *-to-many relations. It is efficient to query one *-to-many relation at a time. Unfortunately normal constructor expressions cannot be used to bundle multiple rows into one.  They are only used to construct one DTO per row. 

I've been working on post processing "ResultSet" rows. To be more exact, on grouping multiple rows into a single Group that can be used as such or transformed into e.g. DTOs. A group can contain columns of multiple rows as single value, Set, List or Map. 

The basis of grouping is a new interface called ResultTransformer which is a callback for Projectable. It uses Projectable to query and transform results into something else. It's related to Hibernate's ResultTransformer.transformList, but more powerful. 

In future Querydsl releases we plan to make result transformation directly available in Projectable interface: Projectable.transform(ResultTransformer). For example

SQLQuery qry = newQuery().from(post)
.innerJoin(post._commentPostFK, comment)
.where(…);

Map<Integer, Group> results = qry.transform(GroupBy.create(post.id, post.name).withSet(comment.id)); 

Group group = results.get(1); // Post with id 1
Integer id = group.getOne(post.id);
String name = group.getOne(post.name);
Set<Integer> commentIds = group.getSet(comment.id);


But before integrating with Projectable we would like to get some community feedback on these features. What do you think of 1) ResultTransformer in general and 2) grouping provided by GroupBy. 

Do you suppose you might need this kind of features? Are these new interfaces and implementations intuitive and easy to grasp? Do you have some related use cases in mind that don't quite fit in?

You'll find more examples from Querydsl master: 

GroupByTest:

SQLQuery test: 


What do you think of GroupProcessor? It's quite a powerful extension point that allows analyzing (and filtering) all result rows in the first pass and then transforming the whole result into something else than the default Map of Groups utilizing this analyzed information. In GroupByTest I have implemented a POC of lazy batch fetching of Post.comments, but other use case that comes to mind is for example aggregating column values and providing results along with aggregated values. 

As with Querydsl's Expressions, GroupBy definitions are stateless with factories for required stateful query execution time processing.

Travell

unread,
Sep 27, 2011, 11:01:09 AM9/27/11
to quer...@googlegroups.com
So in QuertyDSL how would I for example be able to leverage this to aggregate several ResultSets?  I'd like to pass it to something that can do Group and OrderBy.  Maybe pagination too.

Samppa Saarela

unread,
Sep 28, 2011, 3:24:37 AM9/28/11
to quer...@googlegroups.com
Could you be more specific about your use cases?

SQL group by, order by and pagination (limit/offset) are already supported by SQL/Querydsl. However, SQL group by only returns one row with aggregated values per group. These new features don't add anything to those basic SQL-features but provide something extra that SQL itself doesn't support: namely 1) a change to create efficient generic result set post processors (ResultTransformer) and 2) as an application of that, a way to group multiple rows into groups containing not only single column values but also collections (Set, List, Map). 

So what GroupBy ResultTransformer can do, is transform basic ResultSets, e.g. superior and he's subordinates:

[
  ["John", "Mike"],
  ["John", "Matt"],
  ["John", "Mary"],
  ["Jane", "Jack"],
  ["Jane", "Jenny"]
]

into something like

{
  "John": { name:"John", subordinates: ["Mike", "Matt", "Mary"] },
  "Jane": { name:"Jane", subordinates: ["Jack", "Jenny"] }
}

Luis Fernando Planella Gonzalez

unread,
Sep 28, 2011, 8:05:05 AM9/28/11
to Querydsl
This is nice.
It would work like groups in report tools (like JasperReports).
I've already used many times things like:
String oldName = null;
while (rs.next()) {
String name = rs.get("name");
if (oldName == null || !oldName.equals(name)) {
// Break section, render a divider, etc...
} else {
// Use other columns related to that name
}
}

This would be perfect by nesting those iterations.
For me, this is definitively a nice feature.

I took a quick look on the test cases linked on the first post, but
couldn't get a deep understanding of the feature.

One question: does it support multiple levels of nesting?

Samppa Saarela

unread,
Sep 28, 2011, 12:11:06 PM9/28/11
to quer...@googlegroups.com
That example (oldName.equals(name)) is very efficient when one can rely on ordering by the "group id". Current implementation doesn't require any special ordering, but of course it introduces one extra pass to collect all rows. 

Current implementation doesn't support deeply nested groups. You can however use Map for second level grouping:

QTuple childTuple = new QTuple(child.id, child.name);
Map<Integer, Group> groups = GroupBy.create(parent.id, parent.name)
    .withMap(child.id, childTuple).transform(query);

Group group = groups.get(1);
Map<Integer, Tuple> children = group.getMap(child.id, childTuple);
Tuple child = children.get(...);
String childName = child.get(child.name);

I rarely use queries with nested *-to-many relations myself. I usually resort to lazy (batch) loading in such cases, but I do see that there are some good use cases for such. Defining the grouping could maybe look something like this (logically):

GroupBy.create(parent.id, parent.name)
    .withSet(new SubGroup(child.id, child.name)
        .withSet(new SubGroup(grandchild.id, grandchild.name)));

But keeping it in sync with how Groups are accessed (using Expressions themselves), this would be in practice something like this:

SubGroup<Group> grandchildrenGroup = new SubGroup(grandchild.id, grandchild.name);

SubGroup<Group> childrenGroup = new SubGroup(child.id, child.name).withSet(grandchildrenGroup);

GroupBy.create(parent.id, parent.name).withSet(childrenGroup);
...
Group parent =...;
Set<Group> children = parent.getSet(childrenGroup);
Set<Group> firstChildsChildren = children.iterator().next().getSet(grandchildrenGroup);

That might work technically, but is it too confusing? (Please keep in mind that the actual query defining where, from and joins is defined using normal Querydsl expressions. This ResultTransformer is an alternative for Projectable.list, iterate etc. methods.)

This kind of sub groups might actually be usable with a DTO projection pattern I had in mind: GroupBy....as(SomeDto.class). I got stuck with nested relations...

McKinley McKinley

unread,
Sep 28, 2011, 10:50:25 PM9/28/11
to quer...@googlegroups.com
Because Querydsl did not support these features, I have been using http://code.google.com/p/totallylazy/ and http://code.google.com/p/lambdaj/. Totallylazy has worked best for me. I would encourage you to build these features on top of a general functional library like these because you will save time and improve existing projects.

Here is an example of using Querydsl with totallylazy:

List<SurveyTemplate> results = query.list(SurveyTemplate.class, getAll());
List<List<SurveyTemplate>> questions = Sequences.sequence(results)
    .groupBy( new Callable1<SurveyTemplate, Integer>() {
        public Integer call(SurveyTemplate option) {
            return option.getQuestionId();
        }
    })
    .sortBy(compareQuestionPos())
        .map( new Callable1<Group<Integer, SurveyTemplate>, List<SurveyTemplate>>() {
            public List<SurveyTemplate> call(final Group<Integer, SurveyTemplate> question) {
                    return question.sortBy(compareOptionPos()).toList();
            }
         })
    .toList();

Of course, the new Callable1 business does not have to be inline and could be an inner class instead. Comparators can be inner classes too.

Here is an example calling a comparator:

return question.sortBy(compareOptionPos()).toList();

And here is the inner class comparator:

public static Comparator<SurveyTemplate> compareOptionPos() {
    return new Comparator<SurveyTemplate>() {
        public int compare(SurveyTemplate x, SurveyTemplate y) {
            return Numbers.compare(x.getOptionPosition(),
                    y.getOptionPosition());
        }
    };
}

Cheers,

McKinley

Timo Westkämper

unread,
Sep 29, 2011, 2:01:12 AM9/29/11
to quer...@googlegroups.com
Hi McKinley.

We try to keep external dependencies of Querydsl Core to a minimum, but these integrations could be done in extension modules.

The Transformer functionality is mostly meant for grouping and also other result processing, but if something can be done on the database level like ordering it should be used there.

Br,
Timo Westkämper

Luis Fernando Planella Gonzalez

unread,
Sep 29, 2011, 9:05:39 AM9/29/11
to Querydsl
Hi.
I'm still not 100% sure I understood everything with groups....
I'd need a clearer high level view of the rationale for it, and some
use cases.
For the case I mentioned earlier, I think that just an improved
projection would do it.
This would, of course, require the order to be consistent with groups.
Here's an sketch of what could be done (kind of requires DTO
projection):

//DTOs
class User {
Long id;
String name;
List<Blog> blogs;
}
class Blog {
Long id;
User user;
String name;
List<Post> posts;
}
class Post {
Long id;
Blog blog;
Date date;
String user;
String comments;
}
QUser u = ...;
QBlog b = ...;
QPost p = ...;

Group<User> usersGroup = Group
.create(User.class)
.projecting(u.id, u.name)
.by(u.name);
Group<Blog> blogsGroup = usersGroup.subGroup(SubGroup
.create("blogs", Blog.class) // "blogs" is the property in User
which holds the collection
.owner("user") //Optional: property in blog which holds the
reference to User
.projecting(b.id, b.name)
.by(b.name));
blogsGroup.details(Details
.create("posts", Post.class)
.owner("blog")
.projecting(p.id, p.date, p.user, p.comments));

CloseableIterator<User> users = query....
.orderBy(u.name, b.name)
.iterate(usersGroup);


Perhaps this is not exactly what is meant with ResultTransformer.
It seems like a valid (and common - faced it several times already)
use case, but maybe, it's complementary, not conflicting with
ResultTransformer...

Samppa Saarela

unread,
Sep 29, 2011, 12:28:58 PM9/29/11
to quer...@googlegroups.com
ResultTransformer is needed in order to combine multiple rows. Q-constructors (@QueryProjection) can only be used to convert columns of a single row. With Querydsl-collections one "column" may refer to a collection-typed bean property, but with SQL one column of a row cannot hold a collection of values. This is something one simply cannot do with FactoryExpressions: in your example data model, they can provide individual Posts but a Blog requires a Set<Post>.

Extended bean projection (with collection-valued properties) requires that those values are first collected into some intermediate data structure. One must have all User related data, Blogs and Posts, collected and ready before we can instantiate a User DTO. Group is just that data structure. 

I think that the current implementation could easily support bean projection that doesn't require deeper nesting. E.g. Blog with Posts:

With constructors 

    @QueryProjection
    public User(Integer id, String name) {...}

    public Blog(Integer id, User user, String name, List<Post> posts) {...}

    @QueryProjection
    public Post(Integer id, String text) {...}

We could quite easily combine Q-constructors and groupings like this:

  Map<Integer, Blog> blogs = 
    GroupBy.create(
            blog.id
            new QUser(blog.user.id, blog.user.name), 
            blog.name)
        .withList(new QPost(post.id, post.text)) 
        .as(Blog.class).transform(query)

Note that the GroupBy definition matches directly with Blog constructor (Integer, User, String, List<Post>).

I haven't yet had time to sketch up deep nesting (sub grouping), but once that is supported, similar approach could be used also for User withList( Blog withList( Post )). And if that kind of GroupBy definitions were generated by APT using e.g. @BeanProjection annotation, we might just end up with something like (using G-prefix to distinguish from regular Q-Constructors): 

List<User> users = 
  query.transform( 
    new GUser( user.id, user.name
      new GBlog(blog.id, blog.name, new QPost(post.id, post.text) ) ) ); 

What do you think? Getting closer? 

Luis Fernando Planella Gonzalez

unread,
Oct 2, 2011, 10:47:31 AM10/2/11
to Querydsl
> What do you think? Getting closer?
Yes...
And with generated types, this is much easier and readable than what I
was thinking....

Timo Westkämper

unread,
Oct 5, 2011, 5:13:50 AM10/5/11
to quer...@googlegroups.com
I modified the GroupBy support to work with normal FactoryExpression instances for projections. Here are some examples from Querydsl Collections :
   
    @Test
    public void First_Set_And_List() {              
        Map<Integer, Group> results = MiniApi.from(post, posts).from(comment, comments)
            .where(comment.post.id.eq(post.id))
            .transform(groupBy(post.id, post.name, set(comment.id), list(comment.text)));
       
        Group group = results.get(1);
        assertEquals(toInt(1), group.getOne(post.id));
        assertEquals("Post 1", group.getOne(post.name));
        assertEquals(toSet(1), group.getSet(comment.id));
        assertEquals(Arrays.asList("Comment 1"), group.getList(comment.text));
    }

    @Test
    public void Transform_Results() {       
        Map<Integer, Post> results = MiniApi.from(post, posts).from(comment, comments)
            .where(comment.post.id.eq(post.id))
            .transform(groupBy(post.id, QPost.create(post.id, post.name, post.user)));
       
        Post post = results.get(1);
        assertNotNull(post);
        assertEquals(1, post.getId());
        assertEquals("Post 1", post.getName());
    }
   
    @Test
    public void Transform_As_Bean() {
        Map<Integer, Post> results = MiniApi.from(post, posts).from(comment, comments)
            .where(comment.post.id.eq(post.id))
            .transform(groupBy(post.id, Projections.bean(Post.class, post.id, post.name)));
       
        Post post = results.get(1);
        assertNotNull(post);
        assertEquals(1, post.getId());
        assertEquals("Post 1", post.getName());
    }

I switched to static imports to get a more compact syntax. Here are the static imports

import static com.mysema.query.group.GroupBy.groupBy;
import static com.mysema.query.group.GroupBy.list;
import static com.mysema.query.group.GroupBy.map;
import static com.mysema.query.group.GroupBy.set;

This is really just a different syntax layer compared to the previous efforts and an integration of the transformation via FactoryExpressions.

Do you think the following overloading is intuitive :

    public static <K> ResultTransformer<Map<K, Group>> groupBy(Expression<K> key, Expression<?>... expressions) {
        return new GroupBy<K, Group>(key, true, expressions);
    }
   
    public static <K,V> ResultTransformer<Map<K, V>> groupBy(Expression<K> key, FactoryExpression<V> expression) {
        return new GroupBy<K, V>(key, false, expression);
    }

Or should we use different methods for this? Basically if you provide a single FactoryExpression argument after key this is a taken as a transformation, otherwise you get Group typed results.

It would also be possible to separate the grouping from the projection, e.g.

    Map<Integer, Post> results = MiniApi.from(post, posts).from(comment, comments)
        .where(comment.post.id.eq(post.id))
        .transform(GroupBy.by(post.id).as(QPost.create(post.id, post.name, post.user)));

       

Luis Fernando Planella Gonzalez

unread,
Oct 5, 2011, 7:54:26 AM10/5/11
to Querydsl
> Do you think the following overloading is intuitive :
>
>     public static <K> ResultTransformer<Map<K, Group>> groupBy(Expression<K>
> key, Expression<?>... expressions) {
>         return new GroupBy<K, Group>(key, true, expressions);
>     }
>
>     public static <K,V> ResultTransformer<Map<K, V>> groupBy(Expression<K>
> key, FactoryExpression<V> expression) {
>         return new GroupBy<K, V>(key, false, expression);
>     }
>
> Or should we use different methods for this? Basically if you provide a
> single FactoryExpression argument after key this is a taken as a
> transformation, otherwise you get Group typed results.
>
> It would also be possible to separate the grouping from the projection, e.g.
>
>     Map<Integer, Post> results = MiniApi.from(post, posts).from(comment,
> comments)
>         .where(comment.post.id.eq(post.id))
>         .transform(GroupBy.by(post.id).as(QPost.create(post.id, post.name,
> post.user)));

Any of the 2 approaches are fine.
I have, however, a slight preference to separating the group from
projection, as it is more "future proof".
For example, adding subgroups could be something like:
GroupBy.by(...).as(...).subGroup(SubGroup.by(...).as(...))

Samppa Saarela

unread,
Oct 5, 2011, 6:03:13 PM10/5/11
to quer...@googlegroups.com
Bridging the gap between GroupDefinitions and (Factory)Expressions is definitely good.

Does that support nested groups (set(list(map(...))))? It seems to me that it does, but I didn't see a test case for this.

You threw away some quite useful features. For example
  • a possibility for custom GroupDefinitions (SortedSet, SortedMap etc.) - all possible grouping types are currently enumerated in GroupBy constructor
  • a possibility for analyzing transformer that can analyze the results in the first pass and then apply that to the final results. I think it would be nice to have a way to support (lazy and) batch loading. See e.g. SIQ "Avalanche safe nesting".
Could those WRAPPED operations be WrappedExpressions functioning as factories for GroupDefinitions, instead of SimpleExpressions? That would get rid of those ifs in the constructor and allow extensions. Supporting batch loading while being stateless without something like GroupProcessor might be a bit trickier...

I did some clean up on GroupBy: The boolean asGroup -functionality was better modeled in the sub class that was there already. This implementation lacks Map-value transforming (Group as something), but that is anyway better modeled using constructor expressions, or for custom transformations Google Collections (Maps.transformValues).

Timo Westkämper

unread,
Oct 6, 2011, 2:24:41 AM10/6/11
to quer...@googlegroups.com

On Thursday, October 6, 2011 1:03:13 AM UTC+3, Samppa Saarela wrote:
Bridging the gap between GroupDefinitions and (Factory)Expressions is definitely good.

Does that support nested groups (set(list(map(...))))? It seems to me that it does, but I didn't see a test case for this.

I have not yet tested them. Probably they don't yet work.

You threw away some quite useful features. For example
  • a possibility for custom GroupDefinitions (SortedSet, SortedMap etc.) - all possible grouping types are currently enumerated in GroupBy constructor
  • a possibility for analyzing transformer that can analyze the results in the first pass and then apply that to the final results. I think it would be nice to have a way to support (lazy and) batch loading. See e.g. SIQ "Avalanche safe nesting".
Could those WRAPPED operations be WrappedExpressions functioning as factories for GroupDefinitions, instead of SimpleExpressions? That would get rid of those ifs in the constructor and allow extensions. Supporting batch loading while being stateless without something like GroupProcessor might be a bit trickier...

That would require new Expression types. I don't want to introduce them at this point.
Reply all
Reply to author
Forward
0 new messages