Hard to force the execution order for tables?

53 views
Skip to first unread message

Pablo Beltran

unread,
Jun 22, 2015, 10:06:18 PM6/22/15
to h2-da...@googlegroups.com
Hi,

Sometimes the execution order of tables can vary a lot by add-in a simple condition.

select * from
tableA a
INNER JOIN
tableB  b on b.id=a.id
WHERE
..

depending on the where conditions tableA is invoked first or not.

How to force H2 to use tableA prior tableB always?

Thanks!
Pablo.


Pablo Beltran

unread,
Jun 23, 2015, 6:08:31 AM6/23/15
to h2-da...@googlegroups.com
A more detailed example:

The query below:

select i.key, c.field, c.created , c.fromvalue , c.tovalue  from issues i, issuechanges c where  i.jql='project=DPO' and c.field = 'Sprint' and  c.issueid = i.id

has the following execution plan:

SELECT
    I.KEY,
    C.FIELD,
    C.CREATED,
    C.FROMVALUE,
    C.TOVALUE
FROM PUBLIC.ISSUES I
    /* function: JQL = 'project=DPO' */
    /* WHERE I.JQL = 'project=DPO'
    */
INNER JOIN PUBLIC.ISSUECHANGES C
    /* function: ISSUEID = I.ID
        AND FIELD = 'Sprint'
     */
    ON 1=1
WHERE (C.ISSUEID = I.ID)
    AND ((I.JQL = 'project=DPO')
    AND (C.FIELD = 'Sprint'))


Notice please that the execution order of the tables ISSUES -> ISSUECHANGES above  is reversed if a new condition is added:

select i.key, c.field, c.created , c.fromvalue , c.tovalue  from issues i, issuechanges c where  i.jql='project=DPO' and c.field = 'Sprint' and  c.issueid = i.id and c.created < now()

Execution plan:

SELECT
    I.KEY,
    C.FIELD,
    C.CREATED,
    C.FROMVALUE,
    C.TOVALUE
FROM PUBLIC.ISSUECHANGES C
    /* function: CREATED < NOW()
        AND FIELD = 'Sprint'
     */
    /* WHERE (C.FIELD = 'Sprint')
        AND (C.CREATED < NOW())
    */
INNER JOIN PUBLIC.ISSUES I
    /* function: ID = C.ISSUEID
        AND JQL = 'project=DPO'
     */
    ON 1=1
WHERE (C.CREATED < NOW())
    AND ((C.ISSUEID = I.ID)
    AND ((I.JQL = 'project=DPO')
    AND (C.FIELD = 'Sprint')))



How could the initial execution order be preserved?


Thanks in advance!
Pablo.

Pablo Beltran

unread,
Jun 23, 2015, 6:23:09 AM6/23/15
to h2-da...@googlegroups.com
I forgot to mention that there are not indexes on any column because they are plugged tables....


On Tuesday, June 23, 2015 at 4:06:18 AM UTC+2, Pablo Beltran wrote:

Sergi Vladykin

unread,
Jun 23, 2015, 7:07:31 AM6/23/15
to h2-da...@googlegroups.com
Hi!

Just use left join on foreign key, your conditions in where clause already will make sure that there will be no extra entries in result set.

Sergi

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To post to this group, send email to h2-da...@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.

Pablo Beltran

unread,
Jun 23, 2015, 10:30:32 AM6/23/15
to h2-database
Hi Sergi, I think this thread is related to other thread I started some days ago:


I plugged some tables without indexes and I think this brings some confusion to the H2 execution query optimizer because the engine is not aware about the optimized columns (indexed) in order to choose the better sequence to invoke the tables. In my opinion, supporting indexes on the plugged tables would be the right way because it would also prevent no expert users from write heavy queries furthermore it is the standard way for databases in general and the expected way by the H2 engine, of course

Anyway thanks!!
Pablo.

--
You received this message because you are subscribed to a topic in the Google Groups "H2 Database" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/h2-database/7O4chpOQ8vo/unsubscribe.
To unsubscribe from this group and all its topics, send an email to h2-database...@googlegroups.com.

Sergi Vladykin

unread,
Jun 23, 2015, 12:57:55 PM6/23/15
to h2-da...@googlegroups.com
Pablo,

Indexes are supported in custom tables - see org.h2.table.Table.getIndexes
Here you can return any Index implementations you want and optimizer will try to choose the best one.

Sergi

Pablo Beltran

unread,
Jun 23, 2015, 5:29:43 PM6/23/15
to h2-database
Yeap, this was my guess. However, it does not work and it's becoming a headache. I would say there is a bug in the Table class method below:


public PlanItem getBestPlanItem(Session session, int[] masks,
TableFilter filter, SortOrder sortOrder) {
PlanItem item = new PlanItem();
item.setIndex(getScanIndex(session));
item.cost = item.getIndex().getCost(session, null, null, null);
ArrayList<Index> indexes = getIndexes();
if (indexes != null && masks != null) {
for (int i = 1, size = indexes.size(); i < size; i++) {
Index index = indexes.get(i);
double cost = index.getCost(session, masks, filter, sortOrder);
if (cost < item.cost) {
item.cost = cost;
item.setIndex(index);
}
}
}
return item;
}

because the first Index defined on a custom Table is totally ignored. The initial value of the for loop should start at 0 rather than 1:

for (int i = 1, size = indexes.size(); i < size; i++) {

for (int i = 0, size = indexes.size(); i < size; i++) {

With the change above, it seems to work a bit better. The Index.getCost() method is then invoked with and without the masks bits in order to compare both costs (fullscan vs index). However there are still some issues which require further investigation because despite this fix the joined tables are not invoked in the expected order yet.




Sergi Vladykin

unread,
Jun 23, 2015, 5:55:06 PM6/23/15
to h2-da...@googlegroups.com
The scan index must be always at 0. As you can see the PlanItem is initialized with it, so starting with 1 is correct here.
I think you are trying to have your scan index sorted but this is wrong, data can be sorted but sort order must be exposed using separate index object in current H2 design. You need to have two indexes: one is a scan index which does not know about filtering, sorting and stuff at 0 and another one is order/filter/mask aware index at 1. Both indexes can share the same data without any duplication (I'd suggest to just create ScanIndex wrapper around your sorted index).

Sergi

Pablo Beltran

unread,
Jun 26, 2015, 4:14:05 AM6/26/15
to h2-da...@googlegroups.com
Fortunately, H2 allows to override the Table.getBestPlanItem. Then starting at 0 or at 1 depends on whether you use or not the default implementation. I was aware that H2 invokes the Index.getCost() with all masks set to 0 (no index -> full scan) at the first attempt and with the bit masks set for each column condition defined on the table (via TableFilter) later, therefore developers have to analyze the bit masks and return the cost accordingly because H2 will choose the lower cost Index for the execution plan. I also guessed that I must add a index (no mask aware) first, in order to make it work as you mentioned. However, despite I think this approach is really great!!, I missed some improvements when I looked at the "using a function as a table" implementation sources because the index relies on the related table to calculate the cost and the table returns a fixed cost (10000), therefore all the indexes have the same cost: then the first index found by H2 is used regardless the rest, because all of them return the same cost. This makes H2 looks like choosing indexes randomly for function tables. I have to clarify that I took the "function table" implementation as reference for my custom plugged tables, hence good and bad things were inherited. Overriding the getBestPlanItem was the solution because the bit masks allows to get the right index for the plan by ignoring the "cost" concept totally. I mean that the bit masks unveil the columns using a condition in the where clause, so you simply need to look for a index matching the columns with the bits enabled (1).That's all. It's working great and now H2 is choosing the right index always for any query :). 

Pablo.




On Tuesday, June 23, 2015 at 4:06:18 AM UTC+2, Pablo Beltran wrote:
Reply all
Reply to author
Forward
0 new messages