Very slow calculateCost

itineric

Dec 1, 2016, 11:15:38 AM
to H2 Database
Hi,

I have a database with many tables. The database is mostly empty: each table contains at most 10 rows.
I have a query that uses many tables (~50) with inner/left joins, always on indexed columns (PK/FK).
The query is really slow, but not during execution: the time is spent in prepareStatement.
I activated tracing on the database and found at least 10,000,000 calls to "potential plan item cost".
The same call also repeats as "calculate cost for plan [T1 ... T30]" (where T1 ... T30 lists all the aliases of my tables).
I had to stop the database when the trace file reached 1 GB.
The last cost printed in the trace file was "best plan item cost 1,358,740,342,666,069,800,000,000,000,000"... It seems really big for such an empty database.
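(For reference, a minimal sketch of enabling file tracing through the connection URL; the database path is just a placeholder, and tracing can presumably also be turned on later with SET TRACE_LEVEL_FILE.)

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class TraceLevelSketch {
        public static void main(String[] args) throws Exception {
            // TRACE_LEVEL_FILE=3 asks H2 to write DEBUG-level trace output
            // (including the plan-cost lines quoted above) to <db>.trace.db.
            String url = "jdbc:h2:./test;TRACE_LEVEL_FILE=3";
            try (Connection conn = DriverManager.getConnection(url)) {
                conn.prepareStatement("SELECT 1").executeQuery();
            }
        }
    }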

On other databases the query executes in less than 10 ms. On H2, prepareStatement alone takes between 5 and 6 seconds.

Is there a way to configure the optimization algorithm? Or a way to skip optimization entirely?

Regards,

Eric

Sergi Vladykin

Dec 1, 2016, 11:35:57 AM
to h2-da...@googlegroups.com
Hi!

If you know the correct join order for the tables in your query, you can enable the FORCE_JOIN_ORDER setting on your connection. With it, the H2 optimizer will not try to find the optimal join order; it will only pick the best indexes for the join order exactly as written in the query. That should be much faster.
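A minimal sketch of what that could look like, assuming the setting is passed in the JDBC URL (the in-memory database name and the two tables are just placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Statement;

    public class ForceJoinOrderSketch {
        public static void main(String[] args) throws Exception {
            // FORCE_JOIN_ORDER=TRUE keeps the join order exactly as written in the SQL;
            // the optimizer then only chooses indexes instead of permuting ~50 tables.
            String url = "jdbc:h2:mem:test;FORCE_JOIN_ORDER=TRUE";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE A(ID INT PRIMARY KEY)");
                st.execute("CREATE TABLE B(ID INT PRIMARY KEY, A_ID INT)");
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT A.ID FROM A INNER JOIN B ON B.A_ID = A.ID WHERE B.ID = ?")) {
                    ps.setInt(1, 1);
                    ps.executeQuery();
                }
            }
        }
    }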

Another option is to use a connection pool that caches prepared statements. That way the query is optimized only once and the prepared statement is reused afterwards.
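If no such pool is available, the caching can be sketched by hand; the class below is only an illustration (not part of H2's API) of keeping one PreparedStatement per SQL string on a connection:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;

    public class StatementCache {
        private final Connection connection;
        private final Map<String, PreparedStatement> cache = new HashMap<>();

        public StatementCache(Connection connection) {
            this.connection = connection;
        }

        public PreparedStatement prepare(String sql) throws SQLException {
            PreparedStatement ps = cache.get(sql);
            if (ps == null) {
                // The expensive prepare/optimize step runs once per distinct SQL text.
                ps = connection.prepareStatement(sql);
                cache.put(sql, ps);
            }
            return ps;
        }
    }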

Sergi


Christian MICHON

Dec 1, 2016, 12:57:13 PM
to H2 Database
Can you please share the database with the H2 community?

It's hard to tell what is happening without a concrete dataset or at least the complete DDL.

Please do so if the solution suggested above does not work for you.

itineric

Dec 2, 2016, 9:54:06 AM
to H2 Database
I have tested the FORCE_JOIN_ORDER option: prepareStatement is faster, but still not good enough. It now takes 1.5 s.

I had to obfuscate the database and cannot provide any data, but you will find the database creation script attached.

I have also attached the query I am having trouble with. The query is generated as SQL from Java code representing search filters.
One possible solution would be to rewrite the query, but I can't do that since it is generated from Java filters that can be combined in many different ways.
The main issue is that the optimization algorithm performs too many iterations over the joins. You will see that with the latest H2 version and trace level 3.

As I said before, the query is very fast on other DBMSs: the optimizer does its job, removes unnecessary joins, and so on.

Hope it helps,

Regards,

Eric

h2_database.sql
h2_query.sql

Noel Grandin

Dec 2, 2016, 9:58:07 AM
to h2-da...@googlegroups.com
Also, which version are you testing against?

One of our recent releases had a regression that made statement preparation slower than before.

itineric

Dec 2, 2016, 10:21:47 AM
to h2-da...@googlegroups.com
When I found the problem I was using version 1.4.182. I also tested version 1.4.193 and the result was pretty much the same. I only identified the cause with the latest version, since its debug output is more verbose during the optimization phase.



itineric

Dec 7, 2016, 11:02:55 AM
to H2 Database
Hello,

Does anyone have an idea how I could get better performance?

Regards,

Eric

Steve McLeod

Dec 8, 2016, 3:22:46 AM
to H2 Database
Your query is extreme. Hundreds of joins, dozens of nested selects. Way too big, and way too complicated. You'll never get any decent performance with a query like that, nor will you ever be able to analyse and understand the reasons for the performance problems.

The solution to your problem is to redesign your database. Read up on database normalisation.

If you need complicated ad hoc queries, first load your data from your normal schema into a star schema, as described in data warehouse textbooks.
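As a purely hypothetical sketch of that idea (every table and column name below is invented for illustration), a star schema keeps one wide fact table surrounded by small dimension tables:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class StarSchemaSketch {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:dw");
                 Statement st = conn.createStatement()) {
                // Small dimension tables describing the "who" and "what".
                st.execute("CREATE TABLE DIM_USER(USER_ID INT PRIMARY KEY, NAME VARCHAR)");
                st.execute("CREATE TABLE DIM_ACTION(ACTION_ID INT PRIMARY KEY, LABEL VARCHAR)");
                // One fact table holding the events; ad hoc queries join it to a few
                // dimensions instead of traversing ~50 normalised tables.
                st.execute("CREATE TABLE FACT_EVENT(EVENT_ID IDENTITY, USER_ID INT, " +
                        "ACTION_ID INT, OCCURRED_AT TIMESTAMP)");
            }
        }
    }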

Christian MICHON

Dec 8, 2016, 7:30:09 AM
to H2 Database
I beg to differ: I ran the DDL through DBVisualizer (references mode, circular view), and from the look of it this is a real database with obfuscated names.

Redesigning the database is not an option if you are not its original author; the query itself would have to be redesigned, which brings us back to the validity of the use case.

@Itineric: can you share the application type and the use case of your query?

itineric

Dec 8, 2016, 7:52:42 AM
to h2-da...@googlegroups.com
Yes, it is an obfuscated real database, as I mentioned when I attached the files.

The database is designed to trace all actions performed in an application. The database design is what it is, and it is mostly optimized to store and restore particular data quickly.
The use case for this query is that the application provides an API to query "everything". The query API provides classes and methods to filter the returned elements, so the SQL is generated to represent whatever was asked through the API.
I know the query is complex, and one solution would be to change the query itself (but, as I said, it is generated).
If other databases had similar problems I would look into that, but MySQL, PostgreSQL and Oracle handle it very fast. I would expect H2 to perform better than it does, especially when the database is empty! (I am not expecting the same performance from H2 as I get from Oracle, but I would like something acceptable.)



Christian MICHON

Dec 8, 2016, 8:17:48 AM
to H2 Database
Thanks for clarifying. So you have no control over the database DDL and no control over the query (as it is a generic query for an API).

My usual practice for long API queries is to split the internal elements into subqueries and only combine the results at the end, in the business-logic server language. Roda on Ruby is typically a good tool for this.
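A hypothetical Java sketch of that approach (the EVENT / EVENT_DETAIL tables and columns are invented for illustration): two small prepared statements replace one giant generated query, and the intermediate results are combined in application code.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class SplitQuerySketch {
        // Subquery 1: collect the ids matching one part of the filter.
        static List<Long> findEventIds(Connection conn, int actionType) throws SQLException {
            List<Long> ids = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT EVENT_ID FROM EVENT WHERE ACTION_TYPE = ?")) {
                ps.setInt(1, actionType);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        ids.add(rs.getLong(1));
                    }
                }
            }
            return ids;
        }

        // Subquery 2: fetch details for those ids; the combination happens in Java,
        // not inside one huge join.
        static List<String> loadDescriptions(Connection conn, List<Long> ids) throws SQLException {
            List<String> result = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT DESCRIPTION FROM EVENT_DETAIL WHERE EVENT_ID = ?")) {
                for (Long id : ids) {
                    ps.setLong(1, id);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            result.add(rs.getString(1));
                        }
                    }
                }
            }
            return result;
        }
    }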

If you cannot split this long generic query into manageable bits, then you have a good, reproducible slow-performance test case for the H2 maintainers.

My experience is that H2 is really good as a temporary database with decent performance. When used in ETL (extract-transform-load) data migrations, it is faster than the major DBs.

Yet if you need a stable DB engine for 24/7 operation, this might not be the best candidate until we move out of beta again. 1.3.176 was really good.

Noel Grandin

Dec 10, 2016, 1:38:48 PM
to h2-da...@googlegroups.com

I suspect that we have an O(n^2) problem here, triggered by the way TableView and ViewIndex tend to pass query information around as a SQL string, which causes us to unnecessarily reparse and re-optimize subtrees of queries.

Changing that, however, is a fair amount of work that I'm not likely to get to anytime soon.