Load database file from file to in-memory database?

3,414 views
Skip to first unread message

ilm...@gmail.com

unread,
Aug 12, 2007, 2:02:40 PM8/12/07
to H2 Database
Is it possible to load an in-memory database from file (similar the
hsqldb jdbc:hsqldb:res: loading from JAR-files)? See
http://hsqldb.sourceforge.net/doc/guide/ch04.html#N108E0.

This capability is extremely useful for integration-testing, quickly
initializing the database between test executions.

Thanks, Trond Arve

Thomas Mueller

unread,
Aug 13, 2007, 4:26:00 PM8/13/07
to h2-da...@googlegroups.com
Hi,

I suggest to load it from a SQL script. This is very portable. There
is a feature request for 'Read-only databases inside a jar (splitting
large files to speed up random access)'. I will move it up. But first
I like to have a look at MVCC and/or row level locking.

Thomas

ilm...@gmail.com

unread,
Aug 14, 2007, 2:58:22 AM8/14/07
to H2 Database
Hi Thomas

Thank you for the reply. SQL scripts works fine. However, I believe
the jdbc:hsqldb:res: database load functionality is different from
both SQL scripts and read-only databases. You can initialize an in-
memory database (both definition and content) from a JAR file,
subsequently you can execute SQL statements and change the data. The
initial database JAR is typically created using SQL scripts before
archiving. The use case for this functionality is integration testing;
an initialized and ready-to-go in-memory database is much faster than
file based databases and cleaning up between test runs. See this blog
entry for more details: http://www.brodwall.com/johannes/blog/2004/11/08/using-hsqldb-for-in-memory-dao-tests/

I understand that there are other changes in the pipeline with higher
priority :)

-T

On Aug 13, 10:26 pm, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:


> Hi,
>
> I suggest to load it from a SQL script. This is very portable. There
> is a feature request for 'Read-only databases inside a jar (splitting
> large files to speed up random access)'. I will move it up. But first
> I like to have a look at MVCC and/or row level locking.
>
> Thomas
>

Thomas Mueller

unread,
Aug 16, 2007, 12:10:49 AM8/16/07
to h2-da...@googlegroups.com
Hi,

> database load functionality is different from
> both SQL scripts and read-only databases. You can initialize an in-
> memory database (both definition and content) from a JAR file,
> subsequently you can execute SQL statements and change the data.

I will add a feature request for 'support compatibility for jdbc:hsqldb:res:'.
But you can initialize a database from a script in a jar file now, using:

InputStream in = getClass().getResourceAsStream("script.sql");
RunScript.execute(conn, new InputStreamReader(in));

I will also add a code sample:

package org.h2.samples;

import java.io.*;
import java.sql.*;
import org.h2.tools.RunScript;

public class InitDatabaseFromJar {

public static void main(String[] args) throws Exception {
new InitDatabaseFromJar().createScript();
new InitDatabaseFromJar().initDb();
}

private void createScript() throws Exception {
Class.forName("org.h2.Driver");
Connection conn = DriverManager.getConnection("jdbc:h2:mem:test");
Statement stat = conn.createStatement();
stat.execute("CREATE TABLE TEST(NAME VARCHAR)");
stat.execute("INSERT INTO TEST VALUES('Hello World')");
stat.execute("SCRIPT TO 'script.sql'");
conn.close();
}

void initDb() throws Exception {
Class.forName("org.h2.Driver");
Connection conn = DriverManager.getConnection("jdbc:h2:mem:test");
InputStream in = getClass().getResourceAsStream("script.sql");
if(in == null) {
System.out.println("Please add the file script.sql to the
classpath, package " + getClass().getPackage().getName());
} else {
RunScript.execute(conn, new InputStreamReader(in));
ResultSet rs = conn.createStatement().executeQuery("SELECT
* FROM TEST");
while(rs.next()) {
System.out.println(rs.getString(1));
}
conn.close();
}
}
}

Thomas

dan.hatfield

unread,
Aug 20, 2007, 11:19:13 AM8/20/07
to H2 Database
Thomas,
I'm attempted to load a portion of the database for starters from a
CSV file (approximately 82MB).
I did this like so:

private void createScript() throws Exception {
Class.forName("org.h2.Driver");
Connection conn =
DriverManager.getConnection("jdbc:h2:mem:test");
Statement stat = conn.createStatement();

stat.execute("create table bene_uom(SEQ_UNIQ_KEY BIGINT,LINE
VARCHAR(3),UNIT_OF_MEASURE_TYPE VARCHAR(3),UNIT_OF_MEASURE_VALUE
VARCHAR(5))");
stat.execute("insert into bene_uom select * from CSVREAD('d:\
\eclipse\\workspace\\accum\\bene_uom.csv')");
System.gc();
Thread.currentThread().sleep(10000);
conn.close();
}

It loaded...but required about 1GB of memory...I was simply monitoring
the memory in Windows.
Any thoughts on this? Is this what'd you expect?

On Aug 16, 12:10 am, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:

dan.hatfield

unread,
Aug 20, 2007, 11:48:16 AM8/20/07
to H2 Database
Whoops wrong then..meant to post that here:
http://groups.google.com/group/h2-database/browse_thread/thread/cf05e23a2da90cb5

> > Thomas- Hide quoted text -
>
> - Show quoted text -

Thomas Mueller

unread,
Aug 21, 2007, 1:39:34 PM8/21/07
to h2-da...@googlegroups.com
Hi,

The problem is, this will try to insert everything using one big
transaction. Large transactions are kept in-memory by default so far,
but this can be changed using:

SET MAX_MEMORY_UNDO 100000

or using a database URL of the form
jdbc:h2:~/test;MAX_MEMORY_UNDO=100000. Even then a part of the
transaction is kept in-memory.

There is an easier solution however:

create table bene_uom AS
SELECT
CAST(KEY AS BIGINT) AS SEQ_UNIQ_KEY,
CAST(LINE AS VARCHAR(3)) AS LINE,
...
FROM CSVREAD('...\\bene_uom.csv', 'KEY,LINE,..')

This will be executed in one step without using much memory. The
primary key, indexes and constraints can be created later on.

I hope this helps,
Thomas

dan.hatfield

unread,
Aug 21, 2007, 2:07:30 PM8/21/07
to H2 Database
Hi Thomas,
Thanks for responding.
I'm getting essentially the same amount of memory usage with this
script..

private void createScript() throws Exception {
Class.forName("org.h2.Driver");
Connection conn =
DriverManager.getConnection("jdbc:h2:mem:test");
Statement stat = conn.createStatement();

stat.execute("create table bene_uom AS SELECT CAST(KEY AS BIGINT) AS
SEQ_UNIQ_KEY, CAST(LINE AS VARCHAR(3)) AS LINE, CAST(UOMTYPE AS
VARCHAR(3)) AS UNIT_OF_MEASURE_TYPE,CAST(UOMVALUE AS VARCHAR(5)) AS
UNIT_OF_MEASURE_VALUE FROM CSVREAD('d:\\eclipse\\workspace\\accum\
\bene_uom.csv', 'KEY,LINE,UOMTYPE,UOMVALUE')");
System.gc();
Thread.currentThread().sleep(10000);
conn.close();
}

At the point the thread goes to sleep, it has and holds (after the
System.gc() call) about 1GB of memory...


On Aug 21, 1:39 pm, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:

Thomas Mueller

unread,
Aug 21, 2007, 2:16:24 PM8/21/07
to h2-da...@googlegroups.com
Hi,

> DriverManager.getConnection("jdbc:h2:mem:test");

Well, this URL means 'in-memory database'...
You probably want a persistent database, URL: jdbc:h2:~/test

See also http://www.h2database.com/html/features.html#database_url

The CREATE AS .. SELECT ... memory optimization has been implemented
in the latest release, so using an older release wouldn't work.

Thomas

dan.hatfield

unread,
Aug 21, 2007, 3:09:29 PM8/21/07
to H2 Database
Can you clarify the performance difference between embedded and in
memory?
This is actually one of about 5 tables I'd like to load into memory.
All told it is about 500MB of pure data..
With sqlite, it amounts to about 1GB of memory once I load the tables
and index them.
I was hoping for somewhat similar memory requirements from H2, knowing
that probably it can't be quite as efficient at storing tables in
memory as sqlite3 is.
But I guess I'm surprised it would take about 1GB to load about 80MB
of data.....I assumed I must be doing something wrong...

My working assumption is that this is the cost of loading the table
and memory and nothing to do with a transaction right? Since this is
after a System.gc() has run....

I'm using H2 Version 1.0 / 2007-08-02.

On Aug 21, 2:16 pm, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:


> Hi,
>
> > DriverManager.getConnection("jdbc:h2:mem:test");
>
> Well, this URL means 'in-memory database'...
> You probably want a persistent database, URL: jdbc:h2:~/test
>

> See alsohttp://www.h2database.com/html/features.html#database_url

Thomas Mueller

unread,
Aug 21, 2007, 3:39:20 PM8/21/07
to h2-da...@googlegroups.com
Sorry I thought you don't actually need everything in memory... I
should have read the topic...

Java does use quite a lot of memory for simple objects (two bytes for
each character in a String for example, plus a large overhead for each
String). H2 uses object caches (identical Strings are only stored once
in many cases). But otherwise, I am sorry, I don't think I have a
simple solution to save memory.

I will add a feature request for data compression in in-memory
database. The main algorithms are already there (LZF compression, and
an in-memory file system using jdbc:h2:inmemory:x), I just need to
combine that. This will slow things down, but I think it can still be
faster than persistent databases.

Thomas

dan.hatfield

unread,
Aug 21, 2007, 4:06:43 PM8/21/07
to H2 Database
Great, thanks, Thomas. Appreciate your clarifications.

On Aug 21, 3:39 pm, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:


> Sorry I thought you don't actually need everything in memory... I
> should have read the topic...
>
> Java does use quite a lot of memory for simple objects (two bytes for
> each character in a String for example, plus a large overhead for each
> String). H2 uses object caches (identical Strings are only stored once
> in many cases). But otherwise, I am sorry, I don't think I have a
> simple solution to save memory.
>
> I will add a feature request for data compression in in-memory
> database. The main algorithms are already there (LZF compression, and
> an in-memory file system using jdbc:h2:inmemory:x), I just need to
> combine that. This will slow things down, but I think it can still be
> faster than persistent databases.
>
> Thomas
>

Reply all
Reply to author
Forward
0 new messages