Query management when dinamically combining statements on @OneToMany

domenico stragliotto

unread,

Apr 14, 2021, 5:15:34 AM4/14/21

to Ebean ORM

I've run into a couple of problems when trying to combine query statements, to create a dynamic api, since having a oneToMany association between entities is leading to some performance problems with the underlying generated queries.

1)

- .raw("mMarkedForDeletion is ?", false)

I had to rely on raw since i didn't find a way to add this condition with the orm methods.

Using the eq("mMarkedForDeletion", false) this created the condition "deleted = false" and for some reason unknown to me it was a LOT slower than using the "IS" condition.

Still i had a problem since it did not seem to being able to bind the parameter to the value ( now it's hardcoded but it should be a input parameter in the future) with the following exception: "javax.persistence.PersistenceException: Query threw SQLException:unexpected token: ? in statement"

I've had to resolve putting the whole sentence for making it work :

.raw("mMarkedForDeletion is false")

2)

- setDistinct(false)

I've tried using this, since looking at the underlying generated query it always added the distinct on the query and removing it drastically improved performance ( for how the db is constructed i already know i will not have duplicated rows even after the join with the oneToMany relation), but i see that even with this distinct is still alwasy added to the query.

3) .filterMany("mCustomAttributes")

.eq("mCustomAttributes.mFlag", true).eq("mCustomAttributes.mCompositeId.mUserId", userId)

I've tried to limit the number of fetcher subentities on the oneToMany since the associated rows could be a lot ( 1M+ in bad scenarios), but i find that after creating the first query, ebean starts querying the second table, the many side, pulling out batches of rows while passing the ids on the "in(?,?,?,...) (id1,id2,id3,...)" i think for fetching the subentities. but should not filtermany avoid this problem and fetch only the matched sub entities instead of all the rows of the second table?

If you need other details about the cases i'm here.

Thanks a lot for the help

Rob Bygrave

unread,

Apr 19, 2021, 9:37:16 PM4/19/21

to ebean@googlegroups

> .raw("mMarkedForDeletion is ?", false)

I think it would be good to know the database and database version in question. Ideally we see the query plan(s) [working and non working query plans] ... that gives us the ability to reproduce and see if this is a database query planner issue that can be resolved or how wide spread the issue would be.

> - setDistinct(false)

Can you show me any example query for this?

> .filterMany("mCustomAttributes")

I can have a look at this and see if we can adjust this such that the ToMany with filterMany can be sql joined. I'll have a look and see.

--

---
You received this message because you are subscribed to the Google Groups "Ebean ORM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ebean+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ebean/8767e3d8-f5cf-4eb4-932e-0b4f2c5e9dbbn%40googlegroups.com.

domenico stragliotto

unread,

Apr 20, 2021, 4:58:13 AM4/20/21

to Ebean ORM

> .raw("mMarkedForDeletion is ?", false)

We're using HSQLDB version 2.4.1

> - setDistinct(false)

We're using ebean 12.7.2.

I'll add an extract of the node to explain the case.

class Node {

...

@OneToMany(fetch = FetchType.LAZY)
@JoinColumn(name = "node_id", referencedColumnName = "node_id", insertable = false, updatable = false)
private List<ShareEbean> mShares;

@OneToMany(fetch = FetchType.LAZY)
@JoinColumn(name = "node_id", referencedColumnName = "node_id", insertable = false, updatable = false)
private List<NodeCustomAttributesEbean> mCustomAttributes;

...

}

Both ShareEbean and NodeCustomAttributesEbean have an embeddedId composed of two fields.

I'll paste a case that we were trying to solve before having to switch to solutions we are using to circumvent the problem

this.db.find(Node.class)
.where()

.raw("mMarkedForDeletion is ?", false)

.or()
.eq("mOwnerId", userId)
.eq("mShares.mComposedPrimaryKey.mTargetUserId", userId)
.endOr();

db is an instance of the Database class.

We then save the query in the class since the other conditions are optional and if necessary we append

this.query

.filterMany("mCustomAttributes")
.eq("mCustomAttributes.mFlag", true)
.eq("mCustomAttributes.mCompositeId.mUserId", userId)

Which is the filtermany condition causing the aforementioned problem

We then close and start the query with:

this.query.setDistinct(false).order().asc(Drive.Db.Node.NAME).findList()

Even without ordering the problem does not change.

But looking at the db we see the sent query always starts with "select distinct" , trying to examine the code we saw that distinct is added if setDistinct is true OR ebean evaluates the query needed that, and it seems in case of a join it always adds that, being an OR condition setDistinct(false) cannot override that calculation.

( ebean-core-12.7.2.jar!/io/ebeaninternal/server/query/CQueryBuilder.class row 594

this.distinct = query.isDistinct() || select.isSqlDistinct();

Shouldn't this condition be and AND instead of an OR ? )

On other note we are having now bit of a problem while unit testing ebean query. We have usually no problem mocking finds like:

when(mEbeanServerMock.find(Node.class)
.where()
.idEq(mNodeId)
.findOneOrEmpty()
).thenReturn(Optional.of(node));

but when trying to mock calls with ordering it give exception since :

when(mEbeanServerMock.find(Node.class)
.where()
.idEq(mNodeId)

.order.asc(Drive.Db.Node.NAME)
.findOneOrEmpty()
).thenReturn(Optional.of(node));

order ( or orderBy) returns null ( even when mocking ebeanServer with deepstubs) since orderBy is final and cannot be mocked thus preventing us for being able to mock and test server find with ordering.

Rob Bygrave

unread,

Apr 20, 2021, 5:59:56 AM4/20/21

to ebean@googlegroups

db.find(Node.class)
.where()
.eq("mShares.mComposedPrimaryKey.mTargetUserId", userId)

Run this query. Have a look at the SQL that is generated, noting the inclusion of the sql distinct.

The sql distinct is included because the predicate is on a cardinality ToMany path and the predicate is effectively a filter on Node [the root type]. This is unlike filterMany predicates which are a filter on the many path.

If you have a look at that, I think you will see that is why Ebean is adding the SQL distinct (the predicate is applying a filter on the "root type" / Node, but we have a cardinality many join so it must add SQL distinct).

This is the distinction between filterMany predicates [that apply to the ToMany] and all other predicates [that apply to the root type]. We can't just remove the sql distinct and get the same result.

Does that make sense?

Cheers, Rob.

To view this discussion on the web visit https://groups.google.com/d/msgid/ebean/6ff41e72-b203-4284-b7d7-9b6a62b191b1n%40googlegroups.com.

Rob Bygrave

unread,

Apr 21, 2021, 3:08:42 AM4/21/21

to ebean@googlegroups

> I've tried to limit the number of fetcher subentities on the oneToMany since the associated rows could be a lot ( 1M+ in bad scenarios), but i find that after creating the first query, ebean starts querying the second table, the many side, pulling out batches of rows while passing the ids on the "in(?,?,?,...) (id1,id2,id3,...)" i think for fetching the subentities. but

> ... should not filtermany avoid this problem and fetch only the matched sub entities instead of all the rows of the second table?

Currently filterMany() has the effect of automatically making that path a fetchQuery path. What would fix this issue would be a change to filterMany such that it didn't do this and then that path and the filterMany predicates would all be included in the sql (the origin query).

I've created a PR that does this:

https://github.com/ebean-orm/ebean/pull/2222

If you are able to build from source you could build a version of Ebean from that branch/PR and confirm that sorts the issue.

Note: I'm happy with the PR but it is a behaviour change so I am pondering what the impact of that will be. Effectively to get the current behaviour people might need to explicitly mark that path as fetchQuery so that seems ok and good but I'm just pondering that a little bit and wondering if people will have feedback on the PR/change.

Cheers, Rob.

domenico stragliotto

unread,

Apr 21, 2021, 8:55:23 AM4/21/21

to Ebean ORM

I get that, what i find an odd behaviour than is the behaviour of the setDistinct method since

- if i use setDistinct(true) the query will always have distinct independently of how ebean evaluates if the query needs it or not

- if i use setDistinct(false) the query will always have the value evalueted by ebean, since it's an OR clause

I have no way of forcing the removal of the distinct, which is my case and what i was expecting the setDistinct method was doing, forcing the value i choose on the query, since i know that even with the join i will not have duplicates and distinct is not needed and it's only degrading performance on my query.

Rob Bygrave

unread,

Apr 26, 2021, 8:42:34 PM4/26/21

to ebean@googlegroups

> - if i use setDistinct(false) the query will always have the value evalueted by ebean, since it's an OR clause

Yes, let me have a look. We should be able to treat this as a bug.

[That is change DefaultOrmQuery distinct field from boolean to Boolean ... and handle the case of explicit false being set/used].

Cheers, Rob.

To view this discussion on the web visit https://groups.google.com/d/msgid/ebean/dae06974-d0ba-40f4-ac94-15944a5b8032n%40googlegroups.com.

domenico stragliotto

unread,

Apr 27, 2021, 4:52:52 AM4/27/21

to Ebean ORM

Thanks a lot Rob, this could solve a lot of problems for us :D

Cheers, domenico

domenico stragliotto

unread,

Apr 27, 2021, 4:57:36 AM4/27/21

to Ebean ORM

Sorry for the late reply about this, sadly at the moment i cannot build from source because of our project build system so i have no way to test the issue.

I'll follow the PR and see when it could be released, i hope it does since it could help with performance on our queries since it's a metric we have to be really careful about.

Cheers, Domenico

Reply all

Reply to author

Forward