Performance of Blob getBytes


Ronan Mulvaney

Feb 7, 2011, 5:26:46 PM
to h2-da...@googlegroups.com

Hi,

We have been happily using H2 for a number of years now across our development teams, where our final deployment is on an Oracle or DB2 database – thank you!

What we have recently noticed is a performance problem when running our JUnit tests against H2 versus Oracle, so over the last few days I have spent some time investigating it and would like to report the problem.

We use the same JDBC access code across all the databases that our product supports, and the recent performance issues come down to tests that use large Blobs.

The code that we use to retrieve Blobs from the database does not use the perhaps more typical getBinaryStream; instead it uses Blob.getBytes(1, blob.length()).

The reason we do this is that Blob.getBytes and Clob.getSubString perform best on DB2 and Oracle, and we share the code across all databases.

H2's performance here was nearly twice as slow as Oracle's, in both the embedded and the server scenario.
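To illustrate, the shared retrieval pattern is roughly the following sketch, here run against an in-memory SerialBlob as a stand-in for a blob fetched over a real connection (the class and method names are illustrative, not our production code):

```java
import java.sql.Blob;
import javax.sql.rowset.serial.SerialBlob;

public class BlobReadPattern {

    // Whole-blob read in a single getBytes call, mirroring the shared
    // JDBC access pattern described above; the cast assumes the blob
    // length fits in an int.
    static byte[] readAll(Blob blob) throws Exception {
        return blob.getBytes(1, (int) blob.length()); // JDBC positions are 1-based
    }

    public static void main(String[] args) throws Exception {
        byte[] data = {10, 20, 30, 40};
        Blob blob = new SerialBlob(data); // in-memory stand-in for a DB blob
        System.out.println(readAll(blob).length); // prints 4
    }
}
```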


I looked into the H2 code, in particular the getBytes method of the JdbcBlob class, and found that if a byte buffer is used to read from the InputStream there, performance improves by 400%.

The code I used is below; it reads the full blob back from the stream in one buffer, but you may wish to reduce this to some other buffer size, as the length() method in the same class does.

  public byte[] getBytes(final long pos, final int length) throws SQLException {
    try {
      debugCode("getBytes(" + pos + ", " + length + ");");
      checkClosed();
      final ByteArrayOutputStream out = new ByteArrayOutputStream();
      final InputStream in = value.getInputStream();
      try {
        IOUtils.skipFully(in, pos - 1);
        final byte[] buf = new byte[length];
        // InputStream.read may return fewer bytes than requested,
        // so loop until `length` bytes are read or the stream ends
        int off = 0;
        while (off < length) {
          final int n = in.read(buf, off, length - off);
          if (n < 0) {
            break;
          }
          off += n;
        }
        out.write(buf, 0, off);
      } finally {
        in.close();
      }
      return out.toByteArray();
    } catch (Exception e) {
      throw logAndConvert(e);
    }
  }


I believe the JdbcClob method getSubString could benefit from the same treatment, using a buffered Reader read.
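The character-data analogue would look roughly like this sketch, which pulls chars from a Reader in 4 KB blocks instead of one at a time (names and buffer size are illustrative, not the actual H2 code):

```java
import java.io.CharArrayReader;
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.Reader;

public class BufferedSubString {

    // Read up to `length` chars from the clob's Reader in 4 KB blocks
    // rather than char by char.
    static String readChars(Reader reader, int length) throws IOException {
        CharArrayWriter out = new CharArrayWriter();
        char[] buf = new char[4096];
        int remaining = length;
        while (remaining > 0) {
            int n = reader.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                break; // reader ended before `length` chars
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Reader r = new CharArrayReader("hello world".toCharArray());
        System.out.println(readChars(r, 5)); // prints "hello"
    }
}
```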


I would appreciate any consideration given to including this in a future release.

Thanks,

Ronan




Thomas Mueller

Feb 12, 2011, 4:50:18 AM
to h2-da...@googlegroups.com
Hi,

Thanks for the patch! I will implement a similar solution, though not
exactly this one. The reason is that the value of the length parameter
could be larger than the blob itself. Therefore, I will still use a
ByteArrayOutputStream, but copy the data in blocks of 4 KB. According
to my test, this is about 3 times faster than the current single-byte
loop.
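A minimal sketch of that block-copy approach: 4 KB blocks into a ByteArrayOutputStream, stopping at end of stream so a length larger than the blob is harmless (the names here are illustrative, not the committed H2 code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlockCopy {

    // Copy at most `length` bytes from `in` in 4 KB blocks; if `length`
    // exceeds the stream, the loop simply stops at end of stream.
    static byte[] readBlocks(InputStream in, int length) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int remaining = length;
        while (remaining > 0) {
            int n = in.read(buf, 0, Math.min(buf.length, remaining));
            if (n < 0) {
                break; // end of stream before `length` bytes
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        // request more bytes than the stream holds: result is just the data
        byte[] copy = readBlocks(new ByteArrayInputStream(data), 20000);
        System.out.println(copy.length); // prints 10000
    }
}
```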

Actually my solution is already committed to the trunk.

Regards,
Thomas
