Should "SHUTDOWN DEFRAG" reduce the DB File size when all tables are empty?


Kenton Garner

Jan 9, 2015, 10:07:30 AM
to h2-da...@googlegroups.com

URL: "jdbc:h2:../data/audit;COMPRESS=TRUE;AUTOCOMMIT=FALSE;TRACE_LEVEL_FILE=4;lock_timeout=60000;MVCC=FALSE;DB_CLOSE_ON_EXIT=TRUE;MV_STORE=TRUE;AUTO_SERVER=TRUE;AUTO_SERVER_PORT=7899"
Version 1.4.184 - svn revision 6003


I know you must be tired of questions regarding db file size.  I am just having difficulty understanding expected behavior versus potential issues.

The db file is 3GB in size. My application has been running all night.
I have been running pseudo compact routines every 10 minutes.

pseudo compact:


              stmt.addBatch( "set EXCLUSIVE 1" );
              stmt.addBatch( "set retention_time 0" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "set retention_time 45000" );
              stmt.addBatch( "set EXCLUSIVE 0" );
              stmt.executeBatch();
              conn.commit();


Note: my pseudo compact code appears to have no effect (it runs extremely fast, which might indicate that it is not actually doing anything).

I paused my input and let everything quiesce.

My DB has two tables. 
At this point both tables are empty - "select count(*) from table X" returns 0 for both tables.

I then ran SHUTDOWN DEFRAG;
It ran for about 2 minutes and the DB size shrank to 1.3GB - still quite large for an empty database.

I re-ran SHUTDOWN DEFRAG two more times, each taking about 1.5 minutes, with no change to the file size.

I do not have any specific indexes on my tables; however, I do have primary keys defined, which create default indexes.
I was thinking the index table space may have been taking this extra space.
I dropped the primary keys and attempted the shutdown defrag again with no change. 

So am I seeing expected behavior, or is this a sign of a potential issue?









Thomas Mueller

Jan 14, 2015, 1:48:29 AM
to H2 Google Group
Hi,

Could you send me the database file?

If not, could you run the Recover tool and check the created *.sql and *.txt files? For an empty database, they should contain no data.
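For reference, the Recover tool can be invoked from the command line against the directory containing the database file. A sketch, with the directory and database name taken from the JDBC URL in the first message (jdbc:h2:../data/audit) and the jar name assumed:

```shell
# Run H2's Recover tool against the directory containing audit.mv.db;
# it writes audit.h2.sql (and a .txt dump) next to the database file.
java -cp h2.jar org.h2.tools.Recover -dir ../data -db audit
```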

Regards,
Thomas



--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To post to this group, send email to h2-da...@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.

Kenton Garner

Jan 16, 2015, 10:46:48 AM
to h2-da...@googlegroups.com
I can send you the file if you need it; it is quite large (1.3GB).

I ran the Recover Tool. 
Again, the only two tables that I created are empty ( no rows ).
However the .sql file indicates that there is lots of data in the INFORMATION_SCHEMA.LOB_BLOCKS table.

I am not sure what to think about that.  Can you explain what I am seeing?

Thanks Kent

Thomas Mueller

Jan 17, 2015, 8:58:47 AM
to H2 Google Group
Hi,

That means there are many CLOB or BLOB values, which are probably not bound to any table.

I would be interested in the database (or the *.sql file), and even more interested in a reproducible test case.

Regards,
Thomas




Kenton Garner

Jan 19, 2015, 4:30:02 PM
to h2-da...@googlegroups.com
Thomas,
I noticed that after the weekend run my DB file size was up to 6GB, and again the tables were empty.
I ran the recovery and again had lots of LOB_BLOCKS.

I was trying to come up with a test case. As I have mentioned, I add file contents with one application, two other applications read that data, and the last one deletes the row.
I was trying to find an easy case that was not as complicated as my real test.

So I decided to try just using the H2 Console. I started with an empty database.
I created my test table and then inserted about 300 characters of random text into my CLOB field.

My table create statement from java...
       stmt.execute( "CREATE TABLE IF NOT EXISTS CDS_FWD_DATA("
           + "FILENAME VARCHAR(255) NOT NULL, "
           + "DATE TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
           + "ACQUIRE_TYPE VARCHAR(30) NOT NULL, "
           + "PUBLISH_COUNT INT, "
           + "DATA CLOB, "
           + "PRIMARY KEY ( FILENAME, DATE) )" );


I queried the table and the row and its data were present.
I ran the recovery tool and the SQL output indicated the CLOB data was in the LOB_BLOCKS table.

I then deleted the row from my database, committed, queried to show it was deleted and then ran "shutdown defrag".
I then ran the recovery tool.  The LOB_BLOCKS data was still in the database.

This appears to be relatively simple to reproduce unless I misunderstood the issue.
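The console steps above can be sketched roughly as the following SQL, using the table from my create statement (the REPEAT filler stands in for the random text; column values are just example data):

```sql
-- Insert ~300 characters into the CLOB column ('MANUAL' is an example value)
INSERT INTO CDS_FWD_DATA (FILENAME, ACQUIRE_TYPE, DATA)
    VALUES ('test.txt', 'MANUAL', REPEAT('x', 300));
COMMIT;
SELECT COUNT(*) FROM CDS_FWD_DATA;  -- row is present

DELETE FROM CDS_FWD_DATA;
COMMIT;
SELECT COUNT(*) FROM CDS_FWD_DATA;  -- table is empty again

SHUTDOWN DEFRAG;
-- then run org.h2.tools.Recover and inspect the generated .sql file for LOB_BLOCKS inserts
```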

-Kent

Kenton Garner

Jan 19, 2015, 5:19:58 PM
to h2-da...@googlegroups.com
Update...

I updated my code to the latest 185 release then re-ran the test.

I was running the recover tool in between inserting and deleting, and things did act differently in this version.
Things appear to clean up OK after deletes now.
However, I do not know why, but I had a couple of instances where I had multiple copies of the LOB_BLOCKS and only one row - I swear.
Deletes seemed to fix it though.

Perhaps this is a non-issue now.  I will keep an eye on it.

Thanks.

Kenton Garner

Jan 20, 2015, 9:25:33 AM
to h2-da...@googlegroups.com
Thomas,

Ran overnight with the 185 version.
This morning the tables were empty and the DB file was 1.4GB.

Ran shutdown defrag and DB file size reduced in half to 565MB.

Ran the recovery tool, and the majority of the 12GB ".h2.sql" file is inserts into INFORMATION_SCHEMA.LOB_BLOCKS again.
I am not sure what kind of test case I can give you.

Kenton Garner

Jan 20, 2015, 4:01:33 PM
to h2-da...@googlegroups.com
Thomas,

With the new 185 version I reverted back to large VARCHAR() data types instead of CLOBs.
It seems to be working better than anything I have seen in the past.

When I run the recovery tool it produces a basic .h2.sql file that contains no extra data inserts.

In addition to that, my pseudo compact code (below) is actually working now (this is great news)...


              stmt.addBatch( "set EXCLUSIVE 1" );
              stmt.addBatch( "set retention_time 0" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "checkpoint" );
              stmt.addBatch( "set retention_time 45000" );
              stmt.addBatch( "set EXCLUSIVE 0" );
              stmt.executeBatch();
              conn.commit();


I will keep testing of course, but my earlier issue seems to be limited to the CLOB datatype.


Kenton Garner

Jan 20, 2015, 4:48:31 PM
to h2-da...@googlegroups.com
Thomas,

I will quit updating this post - I am just hexing myself.

I let my test continue and it is back to where it was when I was using the CLOB datatype.

Tables are empty and the h2.sql recovery file is full of inserts of LOB_BLOCKS data.

Perhaps this problem is not as consistent as I once thought. Urrg.

Daniele Renda

Jan 22, 2015, 3:01:18 AM
to h2-da...@googlegroups.com
Hi Kenton,
I think I have a similar problem with LOBs in my db. Argh!

I hope that all these reports can help the H2 guys resolve the bug.

Thanks

Kenton Garner

Jan 22, 2015, 10:40:20 AM
to h2-da...@googlegroups.com
Thomas,
I can build the code here, so if you can think of any particular debug lines you would like to add, I can easily test them for you. That may be easier than trying to provide a test case.
Kent


Kenton Garner

Jan 27, 2015, 3:19:34 PM
to h2-da...@googlegroups.com
Just wanted to reiterate my offer to add debug lines or run particular tests if it would help. I am not even slightly familiar with the LOB block code, but I can run tests and send you logs if that would help.
-Kent

Thomas Mueller

Jan 29, 2015, 12:59:30 PM
to h2-da...@googlegroups.com
Hi,

I'm sorry for the delay.

I'm not sure what you would need to do to debug the problem on your side, I'm sorry.

I think it's easier if you have a simple, reproducible test case: then I can run it on my side. Generally if "checkpoint" reduces the size of the database, then it might be simply due to the delayed re-use of disk space. In this case, I would need to analyze what is going on - just log statements alone wouldn't help in this case. If the problem is LOB data that is not removed, then that's a different issue, but even then, just log statements are much harder to analyze than a reproducible test case.

Regards,
Thomas

Kenton Garner

Jan 29, 2015, 3:25:51 PM
to h2-da...@googlegroups.com
I will see what I can do. Checkpoint only works when there are not all these orphaned LOB_BLOCKS; checkpoint does nothing once it gets into this state.

It does not seem to matter if I am using a CLOB or a very large varchar datatype.  I still get lots of orphaned LOB_BLOCKS.

"Delayed re-use" is a good question. I do not know what the algorithm is, but I can shut down for days and the file continues to retain these orphaned LOB_BLOCKS.

For a test case, I do not know what you are looking for.
It won't be a "run these two queries and boom" case. I will have to write an application that writes/reads/deletes millions of rows.
I never know when the problem has occurred. I only know that sometimes I can run it for a period of time, the database defrags fine, and the recovery shows no orphaned blocks; other times there will be lots and lots of orphaned blocks.

Daniele Renda

Jan 29, 2015, 6:18:39 PM
to h2-da...@googlegroups.com

This topic is very interesting. I agree with Kenton: in my case too, it is not easy to understand when the problem happens. Sometimes everything goes fine, sometimes not.

The thing we all agree on, I think, is that at some point orphaned LOB blocks remain, and they should be erased by H2.

Thanks


Thomas Mueller

Feb 17, 2015, 2:01:03 PM
to h2-da...@googlegroups.com
Hi,

A short update: I can reproduce the orphaned lob block problem with a simple unit test now. One lob block gets orphaned when running "shutdown immediately". I'm working on a fix for this.

Regards,
Thomas

Daniele Renda

Feb 17, 2015, 3:22:41 PM
to h2-da...@googlegroups.com

Hi Thomas, do you mean that the problem appears only when "shutdown immediately" is used?

Thanks

Thomas Mueller

Feb 19, 2015, 1:30:40 PM
to H2 Google Group
Hi,

I found other cases where the problem occurs, but with "shutdown immediately" it is easier to reproduce. 

There are two classes of leaks. First, temporary lobs are not removed when using result sets that contain lob objects, if the lob objects are not explicitly closed (this is probably what you see). The second class is a shutdown while storing a lob is in progress. This is very rare, but possible (I have a test case for that now as well).
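For the first class (lob objects read from result sets but never closed), the fix on the application side amounts to calling Clob.free() once the value has been consumed. A minimal stdlib-only sketch of the pattern, using javax.sql.rowset.serial.SerialClob as a stand-in for a lob returned by ResultSet.getClob (the class name LobFreeDemo and the helper readAndFree are hypothetical, not H2 code):

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

public class LobFreeDemo {

    // Read a CLOB's full contents, then release it explicitly so the
    // database can reclaim the temporary lob storage behind it.
    static String readAndFree(Clob clob) throws SQLException {
        try {
            return clob.getSubString(1, (int) clob.length());
        } finally {
            clob.free(); // the step that is easy to forget
        }
    }

    public static void main(String[] args) throws SQLException {
        // SerialClob stands in for a lob fetched from a result set
        Clob clob = new SerialClob("hello lob".toCharArray());
        System.out.println(readAndFree(clob));
    }
}
```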

I will fix both problems.

Regards,
Thomas

Daniele Renda

Feb 19, 2015, 1:31:54 PM
to h2-da...@googlegroups.com
Thank you very much!
Daniele Renda

Kenton Garner

Feb 19, 2015, 2:14:03 PM
to h2-da...@googlegroups.com
Thanks Thomas,

btw, when you speak of lobs in the cases you have found - the fixes will relate to the large VARCHAR() types as well, correct?
I only mention it because I saw the behavior in both cases (CLOBs and VARCHAR(MAX_INT)).

In my case, I close all of my CLOB objects in try/finally statements; however, the finally may not be called prior to the database closing if the offending JVM does not own the embedded database.

Thanks again, Kent.

Thomas Mueller

Feb 19, 2015, 3:15:48 PM
to H2 Google Group
Hi,

According to my tests, it only affects CLOB and BLOB data types. I'm not aware of a problem with other data types... Except that there are reports that disk space in general is not re-used as quickly when using the MVStore (compared to PageStore); but that's a different topic.

Regards,
Thomas

Kenton Garner

Feb 19, 2015, 4:01:59 PM
to h2-da...@googlegroups.com
The test application I sent you defaulted to using a large VARCHAR - I get the same results with either a CLOB or the large VARCHAR. Source snippet:    dataType = "VARCHAR(" + Integer.toString( Integer.MAX_VALUE ) + ")";

By same results, I mean that I see the orphaned LOB_BLOCKS that I reported earlier in this thread. 
As far as not "re-used as quickly", do you have any way to quantify that? 

For instance, I can let my database go idle for a day+ and the LOB_BLOCKS will remain in the database file when the tables are otherwise empty.  I have seen this to be true even after a shutdown compact or defrag. 

And remember, I am trying to run that pseudo compact code with the checkpoint calls at regular intervals.

Thanks for looking into this.

Thomas Mueller

Feb 20, 2015, 1:44:12 AM
to h2-da...@googlegroups.com
Hi,

If you have entries in LOB_BLOCKS, then it is either a CLOB or a BLOB value. Such values can also occur for example when using PreparedStatement.setCharacterStream; it doesn't necessarily mean the table has a CLOB / BLOB column. 

When using the MVStore, with the current released version, some of the temporary LOB_BLOCKS are never cleaned up. The fix I made now cleans them up when stopping and starting the database (unless they are removed earlier). To find out if this solves the problem for you, you can try it out with the latest trunk (you would need to build H2 yourself).
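For anyone wanting to try the trunk fix, H2's source ships with its own build script; a rough sketch, with script and directory names as found in the 1.4.x source tree:

```shell
# From the root of the checked-out H2 source tree
cd h2
./build.sh jar    # on Windows: build.bat jar; produces the h2 jar under bin/
```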

> As far as not "re-used as quickly", do you have any way to quantify that?

Yes. With the MVStore, by default, space is re-used after 45 seconds (actually a bit longer than that, as the live entries first have to be copied somewhere else). This is the default retention time (see "set retention_time" in the docs).

> For instance, I can let my database go idle for a day+ and the LOB_BLOCKS will remain in the database file when the tables are otherwise empty. 

Is it possible for you (in theory) to close and re-open the database from time to time? 

Regards,
Thomas

Kenton Garner

Feb 20, 2015, 10:00:14 AM
to h2-da...@googlegroups.com
Ahhh, yes, I do use a PreparedStatement with setCharacterStream and a regular statement with getCharacterStream.

As far as closing and re-opening - that is tough. I have multiple applications sharing the database. I have had problems closing the database while the JVM is still active and the other applications maintain connections. I can try shutdown immediate from the JVM that owns the embedded database and see if I can force one of the other JVMs to acquire a new instance of the database.

Let me experiment with the latest trunk and see what happens.



ZHENCAI WANG

Nov 13, 2015, 2:08:31 PM
to H2 Database
Hi Thomas,

I have a table with blob data. Each time, I open a new connection, write blob data, delete it, and then close the connection. I tried both 1.4.185 and the latest 1.4.190. I still see the database size growing. When I open the database with the H2 Console, I do not see a LOB_BLOCKS table. But when I used Recover on the database, I saw a huge number of inserts into LOB_BLOCKS. Does that "invisible" data in LOB_BLOCKS cause the database growth? How can we solve it? Thank you for your help.

/Zhencai

Thomas Mueller

Nov 17, 2015, 1:06:52 AM
to h2-da...@googlegroups.com
Hi,

Could you please post a simple reproducible test case? I did test this and didn't find a problem with the latest version.

Regards,
Thomas


ZHENCAI WANG

Nov 18, 2015, 9:17:02 PM
to H2 Database
Hi Thomas, 
Please look at the attached test case with 1.4.190. It runs for 20 seconds inserting, reading, and then deleting data, and then runs recovery on the database. The blob data is in the h2.sql file.
Thank you,
Zhencai


BlobDB.zip

Thomas Mueller

Nov 19, 2015, 1:44:24 AM
to h2-da...@googlegroups.com
Hi,

Thanks a lot for the test case! Yes, I see there is a LOB remaining. I think I know the problem, but will need some more time to test this.

Regards,
Thomas
