Some observations regarding large table/DB scalability

452 views
Skip to first unread message

wburzyns

unread,
Mar 29, 2011, 3:56:48 PM3/29/11
to H2 Database
Hi there,

I'd like to share some observations about H2 scalability. As inspired
by the http://stackoverflow.com/questions/784173/what-are-the-performance-characteristics-of-sqlite-with-very-large-database-files
I decided to carry out some testing as my own project requires
scalability that goes far beyond a few dozen millions of records.

In my DB I've got a single table that is created as follows: "CREATE
TABLE (ID INTEGER PRIMARY KEY
AUTO_INCREMENT, CONTENT BLOB NOT NULL, DESCRIPTION VARCHAR NOT NULL,
STAMP BIGINT DEFAULT NEXT VALUE FOR GLOBAL_STAMP_SEQUENCE)". Both blob
& varchar aren't very big: blob is around 1 kB, varchar is a few
hundred characters. The DB URL is nothing unusual - just
"jdbc:h2:<path>".

Here are the results:
No. of rows in the DB: 8,33M 16,55M 41,18M
Add* [s] 16,00 16,59 16,15
Retrieve** (by PK) [s] 5,1 4 5,5
Close right after open [s] 24,41 48 117

*) Insertion of 4106 records. The operation itself includes some
expensive non H2-related operations. Their time is approximately
constant.

**) Retrieval of 20 rows. The operation itself includes some expensive
non H2-related operations. Their time is approximately constant.


From the above one can see that in this case the H2 scales very well -
there is no noticeable performance drop as the table/DB grows bigger.
The only issue observed is the closing time - it looks like it
linearly depends on the number of records in the DB (approx. 2,9 s per
million of records on my machine).


Greets,
wburzyns

Thotheolh

unread,
Mar 29, 2011, 8:51:24 PM3/29/11
to H2 Database
Thank you for the experiment. What version of H2 are you using by the
way ? Do you mean by scaling as in putting H2 databases as clusters or
are they all sitting in a single JVM in a machine ?

The results are interesting and very encouraging for people (like me)
who use H2 heavily in applications and software developments.

Regards,
Thotheolh.

wburzyns

unread,
Mar 30, 2011, 2:35:09 AM3/30/11
to H2 Database
The results are obtained using H2 v. 1.3.153.

For the purpose of this topic by "scalability" I mean the ability of a
single DB instance to "handle growing amounts of work in a graceful
manner".


Regards,
wburzyns

Thomas Mueller

unread,
Mar 30, 2011, 1:39:56 PM3/30/11
to h2-da...@googlegroups.com
Hi,

Most likely the problem is a bug in the LOB mechanism, which results
in very slow connection closing if there are a lot of CLOBs or BLOBs
(I know I already wrote that in another mail). I will fix this in the
next release.

> CONTENT BLOB NOT NULL

Could you try with BINARY instead of BLOB? I know it's not the exactly
the same, but just to verify this is really the problem.

Closing the database is supposed to take at most 0.2 second by default
(database setting MAX_COMPACT_TIME). Closing the last connection will
close the database, but closing any other connection should take very
close to 0 seconds.

Regards,
Thomas

wburzyns

unread,
Apr 6, 2011, 7:34:38 AM4/6/11
to H2 Database
To everybody interested: as of H2 v. 1.3.154 the long close problem
doesn't occur anymore.


wburzyns
Reply all
Reply to author
Forward
0 new messages