Potential performance improvement

Carl Hasselskog

unread,

Apr 23, 2012, 4:01:46 AM4/23/12

to h2-da...@googlegroups.com

Hi,

I've noticed that when I make queries that return just a single row (or very few rows) then H2 spends a fair amount of time in JdbcResultSet.getColumnIndex(String columnLabel). I've profiled it and noticed that a big part of that is spent in StringUtils.toUpperEnglish(String s). It does this despite the fact that I've specified DATABASE_TO_UPPER=FALSE in the URL. A potential performance improvement could be to not call StringUtils.toUpperEnglish(String s) if DbSettings.databaseToUpper==false. I guess that would be more correct in a semantic sense as well.

Regards

Carl Hasselskog

Noel Grandin

unread,

Apr 23, 2012, 5:08:32 AM4/23/12

to h2-da...@googlegroups.com, Carl Hasselskog

Apparently the JDBC spec requires this.

You could probably optimise your code by using the ResultSet getters
that index by column position.

> --
> You received this message because you are subscribed to the Google
> Groups "H2 Database" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/h2-database/-/QqDpmEkg1gEJ.
> To post to this group, send email to h2-da...@googlegroups.com.
> To unsubscribe from this group, send email to
> h2-database...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/h2-database?hl=en.

Noel Grandin

unread,

Apr 23, 2012, 8:07:02 AM4/23/12

to h2-da...@googlegroups.com, Carl Hasselskog

I had an idea.

Have a look at the code in org.h2.jdbc.JdbcResultSet at line 2856.
Replace
if (columnCount >= 3) {
with
if (columnCount * result.getRowCount() >= 10) {
and tell us what your performance profile looks like, and if it's any
better.

You might need to play with the constant a little bit.

Basically, for small numbers of rows*columns, it's cheaper to call
equalsIgnoreCase() than it is to call toUpper()

On 2012-04-23 10:01, Carl Hasselskog wrote:

Carl Hasselskog

unread,

Apr 23, 2012, 4:21:05 PM4/23/12

to Noel Grandin, h2-da...@googlegroups.com

That's a cool idea! I also had another (which is not mutually exclusive to yours). Perhaps one can cache the column-mapping in JdbcPreparedStatement and re-use it for every execution of the same query. I don't know if there's any corner case where JdbcPreparedStatement might have different column-mappings in different executions but perhaps one can make it an optional feature?

Regards
Carl

Noel Grandin

unread,

Apr 23, 2012, 4:44:42 PM4/23/12

to h2-da...@googlegroups.com

Good idea. We already do that :-)

Carl Hasselskog

unread,

Apr 23, 2012, 4:52:32 PM4/23/12

to h2-da...@googlegroups.com

Haha, good! But what about the field columnLabelMap? From what I can that is re-calcalculated in every new instance of JdbcResultSet. Can't that be cached as well?

-----Original Message-----
From: h2-da...@googlegroups.com [mailto:h2-da...@googlegroups.com] On Behalf Of Noel Grandin
Sent: den 23 april 2012 22:45
To: h2-da...@googlegroups.com
Subject: Re: Potential performance improvement

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.

Noel Grandin

unread,

Apr 24, 2012, 2:48:51 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

That's a good idea. It would only work in the case of PreparedStatement
of course.

It should be possible to create a subclass of JdbcResultSet called
JdbcPreparedStatementResultSet, which stores the cache in the
JdbcPreparedStatement object, so that it is only calculated once for the
PreparedStatement.

Care to give it a bash?

Carl Hasselskog

unread,

Apr 24, 2012, 4:05:46 AM4/24/12

to h2-da...@googlegroups.com

Sure, I can take a look at it. Do you have any procedures on how to submit patches? Any special format you want it in?

-----Original Message-----
From: h2-da...@googlegroups.com [mailto:h2-da...@googlegroups.com] On Behalf Of Noel Grandin

Sent: den 24 april 2012 08:49
To: h2-da...@googlegroups.com
Cc: Carl Hasselskog
Subject: Re: Potential performance improvement

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.

Noel Grandin

unread,

Apr 24, 2012, 4:12:37 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

Unified diff is the best patch format.
Generally we like people to add unit tests, but in this case you're not
adding a new feature, so don't worry about that.
But you should at least make sure that the existing unit tests pass.

Oh, and we'll need a license statement with your patch to the effect
that you're contributing the patch under the terms of the H2 license:
http://www.h2database.com/html/license.html

Carl Hasselskog

unread,

Apr 24, 2012, 5:12:00 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

I just checked out the latest version and started running the tests (build.bat test) but many of the tests failed. It hasn't finished yet but I've attached the errors that have been logged so far. Any idea on what's causing this?

Regards

Carl

error.txt

test.out.txt

Noel Grandin

unread,

Apr 24, 2012, 5:17:25 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

Ha. That was my fault for not testing a recent patch properly.
Please update from SVN and it should be fixed.

Carl Hasselskog

unread,

Apr 24, 2012, 5:42:19 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

That solved most the problems but not all. Here's the new error-log. The Swedish message in the IOException says that the file is locked.

error.txt

test.out.txt

Noel Grandin

unread,

Apr 24, 2012, 5:46:49 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

Try adding the jars in the /ext folder into your classpath.

Carl Hasselskog

unread,

Apr 24, 2012, 5:52:35 AM4/24/12

to h2-da...@googlegroups.com

Still get the same errors (I added it to the -cp argument in Build.bat, I assume that was what you meant).

-----Original Message-----
From: Noel Grandin [mailto:noelg...@gmail.com]

Sent: den 24 april 2012 11:47
To: h2-da...@googlegroups.com
Cc: Carl Hasselskog
Subject: Re: Potential performance improvement

test.out.txt

error.txt

Noel Grandin

unread,

Apr 24, 2012, 6:08:14 AM4/24/12

to h2-da...@googlegroups.com, Carl Hasselskog

No, I don't use build.bat, I do all my work inside Eclipse, so I run
stuff directly.

Honestly, sorry, I have no idea what your problem is, Thomas wrote all
of that build stuff.

Thomas Mueller

unread,

Apr 26, 2012, 1:18:22 AM4/26/12

to h2-da...@googlegroups.com

Hi,

But what about the field columnLabelMap? From what I can that is re-calcalculated in every new instance of JdbcResultSet. Can't that be cached as well?

It could be cached in theory, but I don't think it would be worth the effort. If you care a lot about performance, I suggest to use column indexes.

> The Swedish message in the IOException says that the file is locked.

That could mean another build is currently running. Could you verify this using "jps -l"?

> java.lang.NoClassDefFoundError: org/apache/lucene/search/Searcher

That means Lucene is not available, or (more likely) a different, incompatible version is available on the classpath. Could you run

built.bat clean

and then

built.bat

before running the test? Or use Eclipse.

Regards,

Thomas

Carl Hasselskog

unread,

Apr 26, 2012, 5:16:34 AM4/26/12

to h2-da...@googlegroups.com

I’ve identified the problem. It was Window’s indexing service that locked the files. Moving it to a non-indexed dir solved it. I’ll run the benchmark now and then we can see if the caching improves anything.

Regards

Carl

Carl Hasselskog

unread,

Apr 26, 2012, 7:05:45 AM4/26/12

to h2-da...@googlegroups.com

I’ve now created a patch for this (see attachment). Feel free to use it under whatever license suits you.

When running then benchmark the difference was negligible. However, after reading the benchmark code I couldn’t find any part of that actually would benefit from this patch (since all read are by column-index not by name). Perhaps that it something that could be added (e.g. by making queryReadResult read by column-name)? My guess would be that getting by column-name would be more common so that could potentially make the tests more similar to a real-world scenario.

Regards

Carl

cachedColumnMappingPatch.diff

Noel Grandin

unread,

Apr 26, 2012, 7:22:06 AM4/26/12

to h2-da...@googlegroups.com, Carl Hasselskog

Cool, thanks for your work.

Can you take measurements, then apply this patch to the test code, and report on the difference?

patch.txt

Noel Grandin

unread,

Apr 26, 2012, 7:33:44 AM4/26/12

to h2-da...@googlegroups.com, Carl Hasselskog

Hmm, I ran the measurements myself, and I'm seeing an improvement, but it's on the order of 0.1%

What might be more interesting is to see how much of a difference it makes to your own application, since your tests indicated that it was showing up on the performance profile.

Carl Hasselskog

unread,

Apr 26, 2012, 8:18:21 AM4/26/12

to h2-da...@googlegroups.com

I debugged your test and the column-label map is never used because the query only has two columns.

Regards

Carl

From: Noel Grandin [mailto:noelg...@gmail.com]
Sent: den 26 april 2012 13:34
To: h2-da...@googlegroups.com
Cc: Carl Hasselskog
Subject: Re: Potential performance improvement

Hmm, I ran the measurements myself, and I'm seeing an improvement, but it's on the order of 0.1%

Noel Grandin

unread,

Apr 26, 2012, 8:40:53 AM4/26/12

to h2-da...@googlegroups.com, Carl Hasselskog

OK, with that fixed, the patch now makes a small but consistent difference in the test (0.1%)
So I've committed it, it will be in the next release.

--

You received this message because you are subscribed to the Google Groups "H2 Database" group.

Carl Hasselskog

unread,

Apr 26, 2012, 8:42:27 AM4/26/12

to h2-da...@googlegroups.com

Great, thanks! How big was the difference for the particular queries that were effected?

Reply all

Reply to author

Forward