Consider the following:
User user = new User();
user.setName("user");
Ebean.save(user);
Role role1 = new Role();
role1.setName("role1");
Ebean.save(role1);
user.getValidRoles().add(role1);
Role role2 = new Role();
role2.setName("role2");
Ebean.save(role2);
user.getValidRoles().add(role2);
Ebean.save(user);
List<Integer> roleIds = new ArrayList<Integer>();
roleIds.add(role1.getId());
List<User> result =
Ebean.find(User.class).join("validRoles").where().in("validRoles.id",
roleIds).findList();
assertEquals(1, result.size());
User resultUser = result.get(0);
assertEquals(user, resultUser);
assertEquals(2, resultUser.getValidRoles().size());
It returns only the valid roles that are in the roleIds list. If I
remove join("validRoles") from the query, the validRoles collection is
lazy loaded (both roles) and the test passes. Surely this difference
indicates a bug? JPA would always return the entire list of
validRoles.
/Daryl
On Mar 3, 9:48 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> Ok, how about a different example ...
>
> Lets say you have customers and each customer can have 10000's of orders...
> Aka ... sometimes you don't want ALL beans on a given path.
>
> Does this example make sense?
Yes, in fact I thought that's the way JPA would work at first. Now
that I understand ORM better, I feel it's proper for the collection to
represent the contents of the database. In this case I would select
orders rather than customers.
Suppose I didn't have join("validRoles") in the query. Wouldn't the
auto fetch figure out I'm using validRoles and automatically join it
in thus changing the behavior? A bit of a shifting sand problem.
Interestingly, when I change the query to this (adding JoinConfig), it
works as I expect:
List<User> result = Ebean.find(User.class).join("validRoles", new
JoinConfig().query()).where().in("validRoles.id", roleIds).findList();
So at least in the name of internal consistency (and with JPA) it
should return everything.
I can work up a JPA example, but it will take a bit so I'll get to it
tonight. I believe OpenJPA will generate a subquery in the where
clause to handle the criteria. So the fetch join and the criteria join
are independent.
/Daryl
On Mar 3, 9:48 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> ... I want to filter the customer orders to just get the Order.Status.NEW
> ones since last week (for all the customers starting with 'A%'.
In a sense, this is a "partial object graph", yes? I assume I can add/
remove from the filtered list and save it without blowing away all the
records I didn't load? I don't have a problem with the concept, it's
just that in this case I don't *feel* like I asked for a partial
graph.
/Daryl
> So, lets say each customer had tens of thousands of Orders
> you have some work to do to find
> the list of unique customers first and then ...
Yes, I understand the challenge. As I said, when I first found that
JPA did NOT filter the list I was surprised - you always get the whole
list. Likewise, I was surprised that Ebean filtered the list. I
definitely see it being useful to filter the list. My main point here
is that I didn't *think* I was asking for a filtered list. Sometimes
you want it one way, sometimes the other, and it needs to be clear
what will happen. Perhaps the default should be to load it all to be
consistent with JPA and JoinConfig should be used to specify that it
should be filtered. Not sure that's the "Ebean way" to do it, but you
see what I mean.
> > JoinConfig().query() ...
>
> Hmmm, I'll have a look at that (I'd suggest that is a bug).
Yes, that's the second part of my complaint - eager does not produce
the same as lazy. It seems like it would be easier to make eager
return all than lazy return filtered, but that's your problem,
right? :-)
I'll try to put together a JPA example tonight.
/Daryl
I'd suggest what you wanted to do was more like:
List<Integer> myRoles = ...;
Query<User> subquery =
Ebean.find(User.class).select("id").where().in("validRoles.id",
myRoles);
Ebean.find(User.class)
.where().in("id", subquery)
.findList();
Personally I have a SQL bias and I'd suggest the reason you where
suprised is because the ORM translated the query into sql that
filtered the users (root level object) and not filtered the *ToMany
(validRoles).
That is, when you do the JPA query and see the generated SQL I'd
expect to see a subquery used to limit the users - and the question
becomes ... if you want to end up with a sql subquery... why not use
an subquery with the ORM query in the same way?
That said... lets make sure we are talking about the exact same thing
when we have the JPA query and the sql it generates... we can put that
side by side with the orm query - and hopefully that clarifies it.
> the second part of my complaint - eager does not produce the same as lazy
Yup - that is going to be a bug - but firstly I want to get to the
bottom of the *ToMany filtering. Specifically I want to check the sql
that the jpa query generates.
Cheers, Rob.
On Mar 4, 8:07 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> > I was surprised - you always get the whole list.
>
> I'd suggest what you wanted to do was more like:
... subquery illustration
>
> Personally I have a SQL bias and I'd suggest the reason you where
> suprised is because the ORM translated the query into sql that
> filtered the users (root level object)
I've been mulling the whole thing over since my last post and came to
the same conclusion. If my original query was an API for generating
SQL, I'd totally expect it to filter the children. I even put a
comment in my code that I might have to convert the criteria to a
subquery.
I'm thinking that people (like me) coming from JPA are going to be
confused, but I'm not willing to say your current implementation is
not the right one. If you do implement JPQL some day, though, I think
the difference will be glaring. Maybe people will use only one
approach or the other...
> That is, when you do the JPA query and see the generated SQL I'd
> expect to see a subquery used to limit the users -
Yes that's what I expected to see, and I'm sure I've seen it in some
cases, but I can't dig one up quickly. My first attempt was straight
JPA with a join fetch. This produced a double cartesian of the root
with the children, even with DISTINCT. Anyway, here's the test case:
User user = setup.insertUser("user1");
Role role1 = setup.insertRole("role1");
user.getValidRoles().add(role1);
Role role2 = setup.insertRole("role2");
user.getValidRoles().add(role2);
user.save();
List<Integer> roleIds = new ArrayList<Integer>();
roleIds.add(role1.getId());
Query query = ExecutionResources.getEntityManager().createQuery(
" select distinct o from User as o" +
" join o.validRoles as vr" +
" where vr.name = :name");
query.setParameter("name", "role2");
JpaUtils.addFetchFields(query, User.class, "validRoles");
List<User> result = query.getResultList();
assertEquals(1, result.size());
User resultUser = result.get(0);
assertEquals(user, resultUser);
assertEquals(2, resultUser.getValidRoles().size());
JpaUtils.addFetchFields takes advantage of fetch plan manipulation to
eager fetch validRoles. The result is 2 queries rather than the
subquery. OpenJPA tends to generate more queries than I think it
should.
SELECT DISTINCT t0.userid, ...
FROM public.users t0
INNER JOIN public.userroles t1 ON t0.userid = t1.userid
INNER JOIN public.roles t2 ON t1.roleid = t2.roleid
WHERE (t2.name = ?) [params=(String) role2]
SELECT DISTINCT t0.userid...
FROM public.users t0
INNER JOIN public.userroles t1 ON t0.userid = t1.userid
INNER JOIN public.userroles t3 ON t0.userid = t3.userid
INNER JOIN public.roles t2 ON t1.roleid = t2.roleid
INNER JOIN public.roles t4 ON t3.roleid = t4.roleid
WHERE (t2.name = ?)
ORDER BY t0.userid ASC [params=(String) role2]
It looks a bit whacky to me, double joining the child tables. Not sure
this is all that helpful.
> and the question
> becomes ... if you want to end up with a sql subquery... why not use
> an subquery with the ORM query in the same way?
Yup, I hear ya. Looks like that's where I'm going...
> Yup - that is going to be a bug - but firstly I want to get to the
> bottom of the *ToMany filtering. Specifically I want to check the sql
> that the jpa query generates.
Just so we're clear, the default join/fetch produces a filtered child
set, the default lazy load produces a full set and the
JoinConfig().query() produces a full set.
/Daryl
On Mar 5, 1:57 am, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> > " join o.validRoles as vr"
>
> If you get a chance ... can you change this join to a "fetch join"...
I can't change it since I need the alias in the where clause, but I
can add the fetch:
" select distinct o from User as o" +
" left join fetch o.validRoles" +
" left join o.validRoles as vr" +
" where vr.name = :name");
SELECT DISTINCT t0.userid, t0.archivedate, t0.isarchived,
t0.authorizationLevel, t0.dateofhire, t0.email,
t0.firstname, t0.javascript, t0.lastname, t0.lastworkpatternstart,
t0.logindisableduntil, t0.loginlockedout,
t0.mi, t0.password, t0.passwordmodifiedon,
t0.roletimecreditdisplayunit, t0.schedulesharepasscode,
t0.swapcontactinfo, t0.swappreferences, t0.timecardid, t0.usercode,
t0.userlogin, t3.usertypeid, t3.name,
t3.printorder, t3.scheduleable, t0.workpatternlength, t4.userid,
t5.roleid, t5.breaklength, t5.isdisplaceable,
t5.endtime, t5.isexclusivewithrooms, t5.iconfileextension,
t5.iconpdfsize, t5.iconwebsize, t5.name,
t5.notificationtargets, t5.printcode, t5.printorder, t5.isrequired,
t5.rolelength, t6.rolepublicviewpolicyid,
t6.name, t7.roletypeid, t7.descr, t7.name, t7.printcode,
t5.showascover, t5.starttime, t8.approvalpolicyid,
t8.name, t5.istimetrackable, t9.usertypeid, t9.name, t9.printorder,
t9.scheduleable
FROM public.users t0
LEFT OUTER JOIN public.userroles t1 ON t0.userid = t1.userid
INNER JOIN public.usertypes t3 ON t0.usertypeid = t3.usertypeid
LEFT OUTER JOIN public.userroles t4 ON t0.userid = t4.userid
LEFT OUTER JOIN public.roles t2 ON t1.roleid = t2.roleid
LEFT OUTER JOIN public.roles t5 ON t4.roleid = t5.roleid
LEFT OUTER JOIN public.rolepublicviewpolicies t6 ON
t5.rolepublicviewpolicyid = t6.rolepublicviewpolicyid
LEFT OUTER JOIN public.roletypes t7 ON t5.roletypeid = t7.roletypeid
LEFT OUTER JOIN public.approvalpolicies t8 ON t5.swapapprovalpolicyid
= t8.approvalpolicyid
LEFT OUTER JOIN public.usertypes t9 ON t5.usertypeid = t9.usertypeid
WHERE (t2.name = ?)
ORDER BY t4.userid ASC
[params=(String) XX2_]
So we're down to one query, however the test fails now because it
returns 2 (of the same) users. The validRoles is still fully
populated, of course.
> Yes. I also think JPQL didn't help with it's need for distinct.
It doesn't seem to be working in this case. I do know there is a bug
with DISTINCT in OpenJPA but I'm not exactly sure of the nature.
> I think we want to do both ... and we want to make this as easy for people
> to use and understand. Hmmm.
The docs describe join() as a "join fetch" and Ebean will
automatically add joins to support select/where/order clauses. I
interpret this to mean that the join fetch is independent of the joins
used to satisfy the select/where/order. So this:
query.where().in("validRoles", myRoles)
has no visible relational join so I think it's just a criteria for
selecting my users. This:
query.join("validRoles").where().in("validRoles", myRoles)
adds the join fetch which is simply a "directive" (rather than a
relational join) to eagerly populate the validRoles. So the mechanism
that joins users to validRoles is not visible which leads to my
confusion. I'm not telling you either query should not filter
validRoles, just where my thinking was. I think we are in agreement
that you should be able to achieve both filtered and full, the issue
is it needs to be clear and easy.
/Daryl
I can see something like subquery() being generally useful - not just
in the case of *toMany. As you've written subquery().in, something
like subquery().eq would be useful, too. Not sure how chaining should
work, if you want more than one criteria in the subquery. How would
things "switch back" to the root query (how would you add more
criteria, maybe even another subquery, on the root where).
/Daryl
On Mar 5, 5:00 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> I'd suggest that is a bug because you put the distinct in your query. Note
> there is a difference here with Ebean. With Ebean you don't need to specify
> the distinct as the persistence context is used during the building of the
> object graph ... and that by definition we can't have 2 (of the same) users.
> To me this is like JPA implementations are not using the persistence context
> during it's object graph construction which I find very strange
Yes, OpenJPA just slaps the DISTINCT on to the beginning of the query.
I haven't analyzed the query to determine why this doesn't work.
The results contain 2 copies of the user, but I'm sure the 2 are the
same instance. I'll modify the unit test when I get a chance to show
this.
> So now I'm thinking we can go two ways...
> Option 1. stick with the current interpretation ... and force people to
> use subqueries and the like
> Option 2. change to use the same interpretation as JPQL ... and provide
> some mechanism so that people that want to filter the many can do so.
I've really had a hard time trying to decide what side of this I want
to be on. I've decided I do not want the child list filtered. In SQL
you have complete control over all the result rows. You get to put
things together however you want. With Ebean ORM queries you are
asking for a collection of an entity. You are not responsible for the
"intermediate" child rows. You don't "see" them or "process" them. So
the "sense" is that you are only working with the root entity rows.
Therefore the criteria should refer to these rows only.
We talked about the case of Customers and Orders. I suggested
selecting Orders rather than Customers which leads to the problem of
organizing Orders by Customers. It seems like (a variation of?)
query.findMap() could take care of this:
query = Ebean.find(Order.class)...
query.setMapKey("customer");
Map<Customer, List<Order>> resultMap = query.findMap();
/Daryl
> We talked about the case of Customers and Orders. I suggested
> selecting Orders rather than Customers which leads to the problem of
> organizing Orders by Customers. It seems like (a variation of?)
> query.findMap() could take care of this:
>
> query = Ebean.find(Order.class)...
> query.setMapKey("customer");
> Map<Customer, List<Order>> resultMap = query.findMap();
This is something I could have made use of sometimes either.
In fact, today I query the Order.class and build the map manually then
by walking through the list in a first pass and fetch the customer
property.
Hmmm .... and when I think of, it often gets even more complicated and
ends in something like this:
Map<Customer, Map<DeliveryMethod, List<Order>>> resultMap = ....
Such a result is also easily built by walking through the list in a
first pass.
I think it is much like the gropu by feature in sql plus keeping the
detail data.
Which then might look like this (I bring OrderItem in the game now):
Map<Customer, Map<DeliveryMethod, Map<Order, List<OrderItems>>>>
resultMap = Ebean.query(OrderItems.class).findGroups("order.customer",
"order.deliveryMethod", "order") // el expression
Question is if Ebean can make any use of the information (for
performance) that it has to find a map or group, else the findGroups()
method can also be a utility method in your application.
With a little bit of reflection magic you can do that.
Ciao,
Mario
Let me state the behavior as I understand it so we're all clear. The
criteria on filterMany() affects the orders collection and not the
root. I could restrict the customers based on one property of orders
while filtering orders on a different property of orders. The behavior
will be the same for default eager, join query, and lazy loading.
Multiple filterMany can be applied for different child collections
(though not fluid).
I like it, I think "filterMany" is very clear.
> Daryl, Mario ... are you happy to go with Option 2 ?
Yes.
/Daryl
On Mar 7, 3:00 am, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> >> I like it, I think "filterMany" is very clear.
>
> Great. I think if we document it well with some good examples it should be
> reasonably
> clear.
One more thing. What will the behavior of Ebean.refreshMany(...) be?
Will it reload the collection as filtered or load the full list?
/Daryl
> Daryl, Mario ... are you happy to go with Option 2 ?
Yep, 2 seems good to me!
> Ebean.find(Customer.class)
> .where().startsWith("name","A")
> .*filterMany()*.eq("orders.status",
> Order.Status.NEW).gt("orders.orderDate", lastWeek)
Hmmm ... about the filterMany() api.
In the above statement it is not clear which relation is filtered. If
you mix filtering *ToMay and Root-Filtering you might get confused.
At least adding the relation name on filterMany() might be required:
Ebean.find(Customer.class)
.where().startsWith("name","A")
.*filterMany("orders")*.eq("orders.status",
Order.Status.NEW).gt("orders.orderDate", lastWeek)
... or what about that:
Query<Customer> custQuery = Ebean.find(Customer.class);
custQuery.where().startWith("name", "A");
Query<Order> custOrderQuery = custQuery.findMany("orders",
Order.class);
custOrderQuery.where().eg("status", Order.Status.NEW).gt("orderDate",
lastWeek);
But this might be too much work, no?
Ciao,
Mario
On Mar 7, 2:13 pm, Mario Ivankovits <mario.ivankov...@gmail.com>
wrote:
> Hmmm ... about the filterMany() api.
>
> In the above statement it is not clear which relation is filtered. If
> you mix filtering *ToMay and Root-Filtering you might get confused.
>
> At least adding the relation name on filterMany() might be required:
Right, if you did this:
Ebean.find(Customer.class)
.where()
.filterMany().eq("orders.status", Order.Status.NEW)
.startsWith("name", "A")
that would be ambiguous (assuming order.name). Maybe adding the
relation name in filterMany implies a new "namespace" for the paths
under it.
Ebean.find(Customer.class)
.where()
.filterMany("orders").eq("status", Order.Status.NEW)
.startsWith("name", "A")
So status and name are properties of order (not that name makes sense
for an order, but you get my point - "orders" doesn't proceed each
property, they are all relative to the "orders" property of customer.)
I imagine a lot of this will have to do with the implementation
details.
>
> ... or what about that:
>
> Query<Customer> custQuery = Ebean.find(Customer.class);
> custQuery.where().startWith("name", "A");
>
> Query<Order> custOrderQuery = custQuery.findMany("orders",
> Order.class);
> custOrderQuery.where().eg("status", Order.Status.NEW).gt("orderDate",
> lastWeek);
>
> But this might be too much work, no?
I'm not really clear what the next step for the above is, I'm
expecting to execute one query to get a list of customers with order
collections filtered. Can you show how this would be executed to get
the result? Plus, I'm lazy, too much work. :-)
/Daryl
... or what about that:
Query<Customer> custQuery = Ebean.find(Customer.class);
custQuery.where().startWith("name", "A");
Query<Order> custOrderQuery = custQuery.findMany("orders",
Order.class);
custOrderQuery.where().eg("status", Order.Status.NEW).gt("orderDate",
lastWeek);
But this might be too much work, no?
Yea, my example was not quite right - probably this makes more sense?
Query<Customer> custQuery = Ebean.find(Customer.class);
custQuery.where().startWith("name", "A");
ExpressionList<Order> custOrderQuery = custQuery.findMany("orders",
Order.class);
custOrderQuery.eg("status", Order.Status.NEW).gt("orderDate",
lastWeek);
List<Customer> list = custQuery.findList();
Point is, that I think the findMany() should return some sort of
"other" query which one should configure the filter instead of simply
"streamlining" it into a normal query. I think that makes it clearer
what the result of the criteria will be. For all the eq/gt etc
Next, what if you need to join custQuery and custOrderQuery? Then we
would need something like this:
custOrderQuery.eq("xyz", custQuery.getProperty("xyz"));
I would like that syntax, even for the normal "query.join" as then you
do not have to take care about the alias stuff.
In fact, in our application I've create a slim query wrapper which
simulates that syntax.
Ciao,
Mario
Yep, it is not required to fix this filter thing. I just thought about
it as additional feature ... but yes, nothing we have to address yet.
Ciao,
Mario
Yeah, that's a tough one. I'm not sure what I'd expect it to do.
/Daryl
On Mar 11, 2:53 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
Hi Rob, thanks for working on this. I've updated and rebuilt from
source.
> NB: At this point a refresh on the many property ... will apply/preserve the
> filter.
>
> // the filter is remembered and applied on refresh
> Ebean.refreshMany(customer, "orders");
I don't see this behavior. Maybe I'm not understanding things. Here's
my test. I have one user with 2 validRoles, role1 and role2.
List<Integer> roleIds = new ArrayList<Integer>();
roleIds.add(role1.getId());
List<User> userResults =
Ebean.find(User.class).join("validRoles").where().in("validRoles.id",
roleIds).findList();
assertEquals(1, userResults.size());
User resultUser = userResults.get(0);
assertEquals(user, resultUser);
assertEquals(2, resultUser.getValidRoles().size()); // used to return
filtered
Ebean.refreshMany(resultUser, "validRoles");
assertEquals(2, resultUser.getValidRoles().size());
userResults = Ebean.find(User.class).join("validRoles")
.where().in("validRoles.id", roleIds) // roleIds contains only role1
id
.filterMany("validRoles").eq("name", "role2")
.findList();
assertEquals(1, userResults.size());
resultUser = userResults.get(0);
assertEquals(user, resultUser);
assertEquals(1, resultUser.getValidRoles().size());
assertEquals(role2, resultUser.getValidRoles().get(0));
Ebean.refreshMany(resultUser, "validRoles");
assertEquals(1, resultUser.getValidRoles().size()); // expected 1 but
was 2
It fails on the last line. So I'm restricting user by role1 and
filtering validRoles by role2. It works until the refresh.
/Daryl
Confirmed.
/Daryl