Hi,
We have been happily using H2 for a number of years now for our development teams, where our final deployment is on an Oracle or DB2 database – thank you!
What we have recently noticed is some performance issues in JUnit testing against H2 versus Oracle, so over the last few days I have spent some time investigating and would like to report the problem.
We use the same JDBC access across all the databases that our product supports and the recent performance issues come down to some testing using large Blobs.
The code that we use to retrieve Blobs from the database does not use the perhaps more typical getBinaryStream; instead it calls Blob.getBytes(1, blob.length()).
We do this because Blob.getBytes and Clob.getSubString are the fastest options on DB2 and Oracle, and we share the code across all the databases we support.
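For illustration, the single-call access pattern described above can be sketched like this (a standalone example using the JDK's SerialBlob in place of a driver Blob; the class and method names here are made up for the example):

```java
import java.sql.Blob;
import javax.sql.rowset.serial.SerialBlob;

public class BlobReadExample {
    // Reads an entire Blob into a byte array with a single Blob.getBytes
    // call, the pattern that performs best for us on DB2 and Oracle.
    static byte[] readAll(final Blob blob) throws Exception {
        // JDBC Blob positions are 1-based; length() returns a long,
        // so the cast assumes the blob fits in an int-sized array.
        return blob.getBytes(1, (int) blob.length());
    }

    public static void main(final String[] args) throws Exception {
        // SerialBlob (part of the JDK) stands in for a database Blob here.
        final byte[] data = { 1, 2, 3, 4, 5 };
        final byte[] copy = readAll(new SerialBlob(data));
        assert copy.length == 5 && copy[4] == 5;
    }
}
```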
H2’s performance here was nearly twice as slow as Oracle’s in both the embedded and server scenarios.
I looked into the H2 code, in particular the getBytes method of the JDBCBlob class, and found that if a byte buffer is used to read from the InputStream, performance improves by 400%.
The code I used is below; it reads the full blob back from the stream in one go, but you may wish to reduce this to some other buffer size, like the length() method in the same class uses.
public byte[] getBytes(final long pos, final int length) throws SQLException {
    try {
        debugCode("getBytes(" + pos + ", " + length + ");");
        checkClosed();
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final InputStream in = value.getInputStream();
        try {
            IOUtils.skipFully(in, pos - 1);
            final byte[] buf = new byte[length];
            // read() may return fewer bytes than requested,
            // so loop until the buffer is full or the stream ends
            int off = 0;
            while (off < length) {
                final int n = in.read(buf, off, length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
            out.write(buf, 0, off);
        } finally {
            in.close();
        }
        return out.toByteArray();
    } catch (Exception e) {
        throw logAndConvert(e);
    }
}
I believe the getSubString method of JDBCClob could benefit from the same consideration, using a buffered Reader read.
I appreciate any consideration given to including this in a future release.
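The analogous buffered-read change for Clobs might look roughly like this (an illustrative sketch only, not H2's actual code; the class and method names are made up):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ClobBufferedRead {
    // Reads up to "length" characters from the Reader using a char
    // buffer instead of reading one character at a time.
    static String readSubString(final Reader reader, final int length) throws IOException {
        final StringBuilder sb = new StringBuilder(length);
        final char[] buf = new char[4096];
        int remaining = length;
        while (remaining > 0) {
            final int n = reader.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                break; // reader exhausted before "length" characters
            }
            sb.append(buf, 0, n);
            remaining -= n;
        }
        return sb.toString();
    }

    public static void main(final String[] args) throws Exception {
        assert readSubString(new StringReader("hello world"), 5).equals("hello");
    }
}
```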
Thanks,
Ronan
Thanks for the patch! I will implement a similar solution, though not
exactly this one. Reason: the value of the length parameter could be
larger than the blob itself. Therefore, I will still use a
ByteArrayOutputStream, but copy the data in blocks of 4 KB. According
to my test, this is about 3 times faster than the current single-byte
loop.
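The block-copy approach described above can be sketched roughly as follows (a simplified standalone version under my own naming, not the code actually committed to the trunk):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlockCopy {
    // Copies up to "length" bytes from the stream in 4 KB blocks.
    // Collecting into a ByteArrayOutputStream keeps the result correct
    // even when "length" is larger than the remaining stream data.
    static byte[] readBlocks(final InputStream in, final int length) throws IOException {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] buf = new byte[4096];
        int remaining = length;
        while (remaining > 0) {
            final int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                break; // stream ended early: return what was read, don't fail
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(final String[] args) throws Exception {
        final byte[] data = new byte[10000];
        // Requesting more bytes than exist still succeeds.
        assert readBlocks(new java.io.ByteArrayInputStream(data), 20000).length == 10000;
        assert readBlocks(new java.io.ByteArrayInputStream(data), 5000).length == 5000;
    }
}
```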
Actually my solution is already committed to the trunk.
Regards,
Thomas