Re: Create a in-memory Database and save it to the filesystem

3,571 views
Skip to first unread message

Ryan How

unread,
Aug 3, 2012, 9:35:11 PM8/3/12
to h2-da...@googlegroups.com
Can you do a script to, then create it again as a file based database off the script?

On 3/08/2012 10:14 PM, Robin wrote:
Hi,

For those who are interested in my use case, you may read this part or skip it: 
I need to create a huge DB (300 mb right now, and this is only some test data. Not at least close to all the Data). To create this DB I read a CSV file with information that need to be parsed for my needs. I handle all the DB action with ORMLite (http://ormlite.com/). The creation of this DB on the filesystem needs increadibly long (like forever, I stopped after half an hour). The in-memory will handle it in ~5 minutes. So it will be really important for me to create the DB in-memory. The DB needs to be reloaded every 1 or 2 weeks, over the weekend, so it should be finished within a day. But the created Data has to be used from other programms, so it won't be very clever to use it in-memory.

is there a possibility to save a h2 in-meory DB to the filesystem and read it like a regular h2 file DB ? The best would be to do all this with ORMLite, but this isn't really that important to me.

With best regards Robin

 
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To view this discussion on the web visit https://groups.google.com/d/msg/h2-database/-/eQfeJqHE07EJ.
To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/h2-database?hl=en.

Ryan How

unread,
Aug 3, 2012, 9:40:44 PM8/3/12
to h2-da...@googlegroups.com
Or what about creating it as  a file based db, but on a ram disk, then copy to actual filesystem once created?

Robin

unread,
Aug 4, 2012, 7:20:19 AM8/4/12
to h2-da...@googlegroups.com
Hi,

thank you for your reply.

Can you do a script to, then create it again as a file based database off the script?
This was the only solution I could imagine as a Work-Around. But I think it isn't a nice solution, as well as I'm not sure about the performance?

Or what about creating it as  a file based db, but on a ram disk, then copy to actual filesystem once created?
You called my attention. Are you able to provide any example or a further explanation on how to do this, because this would be more or less the solution I was looking for.

With best regards Robin
To unsubscribe from this group, send email to h2-database+unsubscribe@googlegroups.com.

Brian

unread,
Aug 4, 2012, 1:04:15 PM8/4/12
to h2-da...@googlegroups.com
First I'd recommend trying to optimize the on-disk approach - that way you don't have to deal with/worry about persistence.

See http://www.h2database.com/html/performance.html#fast_import

Tips:
- nio can work wonders depending on your platform (try with and without to see)
- CACHE_SIZE is critical
- Disabling logging (SET LOG) and adjusting isolation level (SET LOCK_MODE) during import helps - be sure to turn them back on after.

jdbc:h2:nio:dbname;LOG=0;CACHE_SIZE=262144;LOCK_MODE=0;

If it still doesn't perform well - the in memory database can be spooled off to disk with the SCRIPT TO:
SCRIPT TO 'file.gz' COMPRESSION GZIP;
and restored with:
RUNSCRIPT 'file.gz' COMPRESSION GZIP;  (Or using org.h2.tools.RunScript)

Robin

unread,
Aug 4, 2012, 1:19:53 PM8/4/12
to h2-da...@googlegroups.com
Hi Brian,

thank you very much for your answer. I will try these Options if they will help me, but I'm off the Job for a Week, so my respond on this can take up to August the 14.

First I'd recommend trying to optimize the on-disk approach - that way you don't have to deal with/worry about persistence.
Persistence doesn't matter in this case, because I create a new DB and overwrite the old data (while writing this, maybe I should consider only updating not completely rewriting everything ... but this won't change the Problem, because everything needs to be rechecked - not sure how much faster this will be)

If it still doesn't perform well - the in memory database can be spooled off to disk with the SCRIPT TO:
SCRIPT TO 'file.gz' COMPRESSION GZIP;
and restored with:
RUNSCRIPT 'file.gz' COMPRESSION GZIP;  (Or using org.h2.tools.RunScript)
Sry maybe I'm just to dumb to get this, but I think I don't get the point. Your trying to write the DB to a *.gz, but where the DB-Data comes from? That's my problem at this POV.

With best regards Robin

Brian

unread,
Aug 4, 2012, 2:19:57 PM8/4/12
to h2-da...@googlegroups.com
It sounded from the original post as if you'd already created an in memory/embedded database with something like jdbc:h2:mem: and populated it via CSVREAD().  After it's populated, you can use SCRIPT TO statements from inside your database to send the data in the in-memory database out to a file which can later be used to restore the data into another H2 database that is file based.


"the created Data has to be used from other programms, so it won't be very clever to use it in-memory"
I hear:
- It needs to be persisted and made available - either by using a file or allowing others to connect to it?  Based on the next comment, I presume a file that you can pass around or others will come get.

"is there a possibility to save a h2 in-memory DB to the filesystem and read it like a regular h2 file DB"
- Not directly.  You can:
 a.) Backup from in memory to a file, then restore that file into a new file based (Non - in memory) database that you create. then either give others access to the file, or...
 b.) Make the in memory database, or subsequently restored-to-file-based-database accessible over the network from h2 jdbc clients by using org.h2.tools.Server instead of embedded mode H2
 c.) depending on the complexity of the data and how you want to share it ... use sql statement CSVWRITE()


Also - one more thing I remembered - if you're doing CREATE TABLE AS SELECT STUFF FROM CSVREAD(); - H2 creates enourmously long varchar column types by default (cuz it doesn't know how long the data _could_ be).  It's often more appropriate to pre-create your table with sane datatypes/lengths and then INSERT INTO YOURTABLE (FIELDLIST) SELECT FIELDLIST FROM CSVREAD(), then index/analyze the table according to your subsequent operations on it.

Christoph Läubrich

unread,
Aug 5, 2012, 3:29:48 AM8/5/12
to h2-da...@googlegroups.com
Or what about creating it as  a file based db, but on a ram disk, then copy to actual filesystem once created?
> You called my attention. Are you able to provide any example or a further explanation on how to do this,
> because this would be more or less the solution I was looking for.
On modern Linux kernels you can simply place your file under /dev/shm
To view this discussion on the web visit https://groups.google.com/d/msg/h2-database/-/ZlyM-Gc9xjsJ.

To post to this group, send email to h2-da...@googlegroups.com.
To unsubscribe from this group, send email to h2-database...@googlegroups.com.

Thirion

unread,
Aug 6, 2012, 2:45:13 AM8/6/12
to h2-da...@googlegroups.com, lae...@googlemail.com
And on Windows you can use DataRam's RAMDisk.  

The free version can create disks up to 4Gig.  It can also auto backup to a physical file at customizable intervals and reload it on startup.  Quite nice.

Robin

unread,
Aug 6, 2012, 4:46:25 PM8/6/12
to h2-da...@googlegroups.com, lae...@googlemail.com
Hi,

thank you very much for you replies.

It sounded from the original post as if you'd already created an in memory/embedded database with something like jdbc:h2:mem: and populated it via CSVREAD().  After it's populated, you can use SCRIPT TO statements from inside your database to send the data in the in-memory database out to a file which can later be used to restore the data into another H2 database that is file based.
Hi, actually I don't use a function like CSVREAD(). I read the CSV with OPENCSV (http://opencsv.sourceforge.net/) and the I parse the rows in the DB with ORMLite (http://ormlite.com/). But basically, you got the situation :).  

The SCRIPT TO, would be the better solution than these two solutions? Right? Or would it be better to create a RAMDisk? 
On modern Linux kernels you can simply place your file under /dev/shm
 I'm sorry, maybe I should have mentioned, that Linux is no alternative, because I need to read Windows Active Directory, as well as NTFS-Rights. I think it will be quite hard to manage this on Linux, even though I have already solved these problem on Windows. How ever, this is really interesting for me personally because I use Linux for my self. Thank you :)
 
And on Windows you can use DataRam's RAMDisk.  

The free version can create disks up to 4Gig.  It can also auto backup to a physical file at customizable intervals and reload it on startup.  Quite nice.
I think these solutions would be great if you will have a quite small DB (<4gb), but what if it's bigger? I will give it a try if I won't be able to do it with SCRIPT TO. Thank you very much for your Reply

"the created Data has to be used from other programms, so it won't be very clever to use it in-memory"
I hear:
- It needs to be persisted and made available - either by using a file or allowing others to connect to it?  Based on the next comment, I presume a file that you can pass around or others will come get.
You'r way of thinking is absolutely right, but in this case I am able to just override the Data. The other Programms, just want to read from the Database, so they don't care about a change or inconsistent data. And I want to recreate the DB as cron (aka. planed Task? in Windows?), on the Weekend. 

"is there a possibility to save a h2 in-memory DB to the filesystem and read it like a regular h2 file DB"
- Not directly.  You can:
 a.) Backup from in memory to a file, then restore that file into a new file based (Non - in memory) database that you create. then either give others access to the file, or...
That would be the best solution I think :) 
 b.) Make the in memory database, or subsequently restored-to-file-based-database accessible over the network from h2 jdbc clients by using org.h2.tools.Server instead of embedded mode H2
This is not suitable for my case, because my boss, don't wants to spend a Database Server for Data that is accessed, maybe every three months or twice a week, but not 100 times/minute.
 c.) depending on the complexity of the data and how you want to share it ... use sql statement CSVWRITE() 
Maybe this could be a overkill, because I have like 4gig of Data or something like this 

Also - one more thing I remembered - if you're doing CREATE TABLE AS SELECT STUFF FROM CSVREAD(); - H2 creates enourmously long varchar column types by default (cuz it doesn't know how long the data _could_ be).  It's often more appropriate to pre-create your table with sane datatypes/lengths and then INSERT INTO YOURTABLE (FIELDLIST) SELECT FIELDLIST FROM CSVREAD(), then index/analyze the table according to your subsequent operations on it.
Hi, I don't use the CSVREAD, and also have except for one collumn (because it doesn't fit here), the Varchar size of 256 (or 255 not sure).
Thank you very much for you time, I will totally try the SCRIPT TO operation. I think this will be exactly what I was looking for (as long as it has a quite good performance)

With best regards Robin
Reply all
Reply to author
Forward
0 new messages