Tunning H2 for simple key-value store

537 views
Skip to first unread message

Lukas Zapletal

unread,
Jun 15, 2009, 4:52:42 PM6/15/09
to H2 Database
Hello,

I am using H2 in a project that uses it as a "simple key-value
reliable store". Since there are some open-source key-value database
products written in Java I think H2 is the most reliable one.
Opinions?

Anyway what settings do you suggest to operate in this very specific
mode:

- there are no multi-command transactions (only set or get)
- each table has two columns (key and value)
- I am operating in embedded mode (running multiple threads)

Thanks

Lukas Zapletal

unread,
Jun 15, 2009, 5:00:27 PM6/15/09
to H2 Database
Maybe:

SET UNDO_LOG 0 - turns off undo log (I do not need rollbacks)
SET LOCK_MODE ?
SET LOG ?

LZ

David Brown

unread,
Jun 15, 2009, 5:59:27 PM6/15/09
to h2-da...@googlegroups.com
I have found with simple tables, a lot of space gets wasted in the
database store. Don't know if there is any way to improve this with
settings.

2009/6/15 Lukas Zapletal <lukas.z...@gmail.com>:

bob mcgee

unread,
Jun 18, 2009, 1:49:33 PM6/18/09
to H2 Database
I would also be interested in something similar -- I am looking at
tuning H2 for object,key,value stores to hold user-defined document/
object properties.

For simple key-value store, a dedicated, hash-table based system will
outperform H2 or any other relational DBMS, because the DBMS is
required to provide a lot more advanced features (referential
integrity, transactions, etc). Lookup for these should always be ~O
(1) for large numbers of items with a good hash function, rather than
the O(log n) of conventional B-tree indexes.

If H2 is required, and there is always a 1:1 key:value mapping (single-
valued mappings), the following might be of use:
For non-numeric keys, particularly strings: use hashed index option
for the key column, and may the key column your table's primary key,
OR apply a UNIQUE (hashed) constraint.
Define the table as 'CREATE MEMORY TABLE...' rather than 'CREATE
TABLE' to allow the complete indices to reside in RAM (much faster
lookups).

If a key may return multiple values (my use case, for multi-value
properties, such as authors or graph relations, etc) you're kind of
out of luck. Clustered indices are designed for that sort of thing,
but AFAIK H2 doesn't support that yet; maybe when the new page store
scheme is implemented?

Perhaps Thomas Mueller can suggest settings (or post them on the site)
for optimizing queries on large tables (>1M rows) with very simple
column structures, since the defaults do not seem to work that well
for that case?

Cheers,
Bob McGee
On Jun 15, 5:59 pm, David Brown <d3z...@gmail.com> wrote:
> I have found with simple tables, a lot of space gets wasted in the
> database store.  Don't know if there is any way to improve this with
> settings.
>
> 2009/6/15 Lukas Zapletal <lukas.zaple...@gmail.com>:

Thomas Mueller

unread,
Jun 21, 2009, 7:25:32 AM6/21/09
to h2-da...@googlegroups.com
Hi,

> SET UNDO_LOG 0 - turns off undo log (I do not need rollbacks)
> SET LOCK_MODE ?
> SET LOG ?

I wouldn't disable the undo log, unless you have very large
transactions. Also, I wouldn't change the other settings unless you
fully understand what those mean - see the documentation for more
information.

> Clustered indices are designed for that sort of thing, but AFAIK H2 doesn't support that yet; maybe when the new page store scheme is implemented?

Yes, that's the plan. Clustered indices should be easy to add once the
"page store" is done.

> settings ... for optimizing queries on large tables

Sorry, I don't think there are settings that would help.

Regards,
Thomas

Lukas Zapletal

unread,
Jun 23, 2009, 9:20:15 AM6/23/09
to H2 Database
Ok, to wrap up:

For 1:1 key:value mapping I should use unique hashed constriant and
memory table to keep indexes in the memory.

One two questions about MEMORY TABLE (http://www.h2database.com/html/
grammar.html#createtable)

a) Are data within the MEMORY TABLE safe? What if anything bad occurs?
If I understand correctly H2 should start up and recreate index
automaticaly, there should be no data loss.

b) Where can I manage memory consumation of the MEMORY TABLE index? I
mean indicies will not be stored in the main cache (CACHE_SIZE
setting), right? If I use DB in the embedded mode I think I will not
like H2 to eat all the heap. :-(

Thank you very much for the support and help.

LZ

bob mcgee

unread,
Jun 24, 2009, 6:00:44 PM6/24/09
to H2 Database
For the simplest, fastest 1:1 key, value store, use "CREATE MEMORY
TABLE IF NOT EXISTS store(key VARCHAR PRIMARY KEY HASH, value
VARCHAR);"

If you are using numeric keys, you should compare performance for
hashed and standard primary keys to see which one is faster for you.
More on this far, far below, if you want the nitty-gritty.

Note that you do not need to specify an additional UNIQUE or NOT NULL
constraint on the key, since the primary key constraint includes
this. Similar, you do not need additional indexes on the key column,
although you may want an index on ht

In answer to your questions:
a)Yes. You're perfectly safe, but recreating the index & rebuilding
from the log after an unclean shutdown may be very slow for large
tables. The "NOT PERSISTENT" table option is the unsafe one, where
everything is in memory only. It's really, really fast though.

b)AFAIK you cannot control how much heap a MEMORY table uses, except
by limiting how many rows it has. Bear in mind that (unless you know
a lot about VM guts), java objects use more memory than you'd expect,
so always increase the max memory more than you expect it to need (see
below). The MEMORY_FREE() and MEMORY_USED() SQL functions will be
helpful in estimating how big your table can get. Using a very large
cache_size (yes, that was right) should give similar results to using
a memory table.

With either memory tables or large cache sizes, remember to give your
Java app or H2 server instance plenty of memory! Use the "-
Xmx512M" (where 512 is how many MB or RAM to use at max) command line
option, and make sure this is significantly larger than your cache
size. Further instructions on the command-line switch:
http://javahowto.blogspot.com/2006/06/6-common-errors-in-setting-java-heap.html

Cheers,
Bob McGee

Thomas Mueller

unread,
Jun 27, 2009, 10:18:56 AM6/27/09
to h2-da...@googlegroups.com
Hi,

>> CREATE MEMORY TABLE


> You're perfectly safe, but recreating the index & rebuilding
> from the log after an unclean shutdown may be very slow for large
> tables.

> b)AFAIK you cannot control how much heap a MEMORY table uses, except
> by limiting how many rows it has.

Exactly. The indexes are kept fully in memory, and with the indexes
some of the data. It doesn't really work for large tables, except if
you have a lot of memory, and don't mind if opening the database is
slow.

> Using a very large
> cache_size (yes, that was right) should give similar results to using
> a memory table.

This would be the preferred option.

Regards,
Thomas

Reply all
Reply to author
Forward
0 new messages