Continuous Increase in H2 db size after dropping and loading same data repeatedly


chandra...@gmail.com

Jul 8, 2013, 1:31:09 AM
to h2-da...@googlegroups.com

I have an H2 database named temp.h2.db which is accessed by two applications: the first through embedded mode and the second through server mode. Through the second application I load data into the database. But even when I drop the previous values and load the same data repeatedly, the db size increases: from about 200 MB, it grew to about 2 GB. This happens even when I drop all tables and load a fresh set of data (almost the same size as the previous set). Is this a bug?

I tried LOG_SIZE_LIMIT=32, but it didn't help.

Could this be due to open transactions? How can I determine which open transactions exist? I ran the Recover tool on the db twice and it gave me some statistics. How do I interpret these:

---- Statistics ----
-- page count: 954655, free: 776787
-- page data bytes: head 13860963, empty 26006602, rows 35134291 (66% full)
-- free 91%, 876991 page(s)
-- data leaf 3%, 36622 page(s)
-- data node 0%, 268 page(s)
-- btree leaf 3%, 36567 page(s)
-- btree node 0%, 570 page(s)
-- free list 0%, 52 page(s)
-- stream trunk 0%, 8 page(s)
-- stream data 0%, 3574 page(s)

and the next set is:

---- Statistics ----
-- page count: 235708, free: 164636
-- page data bytes: head 13268512, empty 24936708, rows 33759452 (66% full)
-- free 67%, 159364 page(s)
-- data leaf 14%, 35139 page(s)
-- data node 0%, 267 page(s)
-- btree leaf 14%, 35338 page(s)
-- btree node 0%, 568 page(s)
-- free list 0%, 15 page(s)
-- stream trunk 0%, 9 page(s)
-- stream data 2%, 5005 page(s)

Would setting MAX_COMPACT_TIME help in any way? Thanks in advance for your help.
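(For reference, a minimal Recover invocation that produces a .sql file containing such a statistics section; the directory and database name here are illustrative:)

    java -cp h2-1.3.165.jar org.h2.tools.Recover -dir /path/to/dbdir -db temp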

Noel Grandin

Jul 8, 2013, 2:53:09 AM
to h2-da...@googlegroups.com, chandra...@gmail.com
This is almost certainly because of open transactions.
Unfortunately there is no easy way of finding them, short of auditing your own code to check that each connection either uses auto-commit or calls commit()/rollback() in a timely manner.
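(A minimal sketch of that discipline, with an illustrative URL and table; the table TEST is assumed to exist already:)

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class BatchLoad {
        public static void main(String[] args) throws SQLException {
            // Illustrative embedded URL; adjust the path for your setup.
            try (Connection conn = DriverManager.getConnection("jdbc:h2:/path/to/temp", "sa", "")) {
                conn.setAutoCommit(false);
                try (PreparedStatement ps =
                        conn.prepareStatement("INSERT INTO TEST(ID, NAME) VALUES(?, ?)")) {
                    for (int i = 0; i < 100000; i++) {
                        ps.setInt(1, i);
                        ps.setString(2, "row " + i);
                        ps.addBatch();
                        if (i % 1000 == 0) {
                            ps.executeBatch();
                            conn.commit(); // commit often, so the transaction log can be truncated
                        }
                    }
                    ps.executeBatch();
                    conn.commit();
                } catch (SQLException e) {
                    conn.rollback(); // never leave a transaction open on failure
                    throw e;
                }
            } // closing the connection releases any remaining uncommitted work
        }
    }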

Thomas Mueller

Jul 8, 2013, 2:31:13 PM
to H2 Google Group
Hi,


Please don't use _both_ the Google Group and StackOverflow at the same time for the same question. Use one or the other.

Did you run the Recover tool _before_ you restored the database? To find out what is using the disk space, you need to run it _before_ that (while the database file is still large).

Regards,
Thomas




chandra...@gmail.com

Jul 9, 2013, 8:11:42 AM
to h2-da...@googlegroups.com
I posted in both to get a wider reach; I'll use only one in the future.

I had stopped both applications and then ran the Recover tool to generate an SQL file with the same name as the DB. This was done while the database file was large. One thing I noticed was that the SQL file was also very large, close to the size of the .db file.


Thomas Mueller

Jul 9, 2013, 12:01:40 PM
to H2 Google Group
Hi,

OK, could you post the 'Statistics' part of that file?

Regards,
Thomas




chandra...@gmail.com

Jul 10, 2013, 11:50:41 PM
to h2-da...@googlegroups.com
There were actually two files, since I ran it twice. The statistics are as below:
 
---- Statistics ----
-- page count: 954655, free: 776787
-- page data bytes: head 13860963, empty 26006602, rows 35134291 (66% full)
-- free 91%, 876991 page(s)
-- data leaf 3%, 36622 page(s)
-- data node 0%, 268 page(s)
-- btree leaf 3%, 36567 page(s)
-- btree node 0%, 570 page(s)
-- free list 0%, 52 page(s)
-- stream trunk 0%, 8 page(s)
-- stream data 0%, 3574 page(s)

and the next set is:

---- Statistics ----
-- page count: 235708, free: 164636
-- page data bytes: head 13268512, empty 24936708, rows 33759452 (66% full)
-- free 67%, 159364 page(s)
-- data leaf 14%, 35139 page(s)
-- data node 0%, 267 page(s)
-- btree leaf 14%, 35338 page(s)
-- btree node 0%, 568 page(s)
-- free list 0%, 15 page(s)
-- stream trunk 0%, 9 page(s)
-- stream data 2%, 5005 page(s)

 

P.S.: I had already posted this; I guess you missed it. :)

Thomas Mueller

Jul 11, 2013, 5:20:22 AM
to H2 Google Group
Hi,

I thought the statistics were taken _after_ compacting; sorry about that.

The statistics show that most of the pages are free (free 91% and 67%). You don't seem to have open transactions (the stream trunk and stream data page counts are low).

When (temporarily) more space is needed, the database file grows by more than is strictly needed (to avoid many file resize operations). While the database is open, the file does not shrink, even if you drop all tables; empty (free) pages are re-used, but the file stays the same size. It only shrinks when the database is closed (that is, when all connections are closed). Shrinking means that pages which are still in use are moved to the front of the database file, so that all free pages end up at the end; at the very end, the database file is truncated.

How do you close the database? Do you use "shutdown immediately"? In that case, the database file is not truncated. If you simply close all connections, or if you use "shutdown" or "shutdown compact", then the file should be truncated automatically.
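(For reference, the statements being discussed; these can be executed over any open connection:)

    SHUTDOWN COMPACT;     -- close the database and fully compact the file
    SHUTDOWN;             -- close normally; the file is truncated
    SHUTDOWN IMMEDIATELY; -- close as fast as possible; the file is not truncated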

Regards,
Thomas



chandra...@gmail.com

Jul 25, 2013, 5:24:35 AM
to h2-da...@googlegroups.com
Thanks Thomas. Actually, H2 isn't shut down completely: even when one of the applications holding it is shut down, the other can still access it through embedded mode, I believe. But as you suggested, I did try "shutdown" and also "shutdown compact". The results were inconsistent: sometimes the size of the DB decreased, but other times it stayed the same.

Another method I tried was to recover the SQL, delete the existing DB, and recreate a fresh one from backup.sql through RUNSCRIPT. The recover and delete went smoothly, but I am unable to get the DB back from the backup SQL. It gives me this error:
ERROR>Exception in thread "main" org.h2.jdbc.JdbcSQLException: Sequence "SYSTEM_SEQUENCE_0324E8F5_AEC9_4E4F_83E1_580BF62F0865" already exists; SQL statement:
ERROR>.
I found this mentioned elsewhere too, but no solution was given. Could you please help me here? The RunScript command used is:
java -cp h2-1.3.165.jar org.h2.tools.RunScript -url dburl -user user -password password -script backup.sql
Thanks in advance.

Thomas Mueller

Jul 29, 2013, 4:31:24 PM
to H2 Google Group
Hi,

First, why don't you use the most recent version of H2?

Second, about the message "Sequence SYSTEM_SEQUENCE_0324E8F5_AEC9_4E4F_83E1_580BF62F0865 already exists": what is your question? The error message is pretty clear, I think.

Regards,
Thomas

davide.cavestro

Aug 7, 2013, 5:57:15 AM
to h2-da...@googlegroups.com
I have a similar problem with a legacy data import library that commits every N batch inserts and also at the end of every table import. Before importing data I drop all the objects. Still, the db size increases on every import within the same JVM instance. At shutdown, it seems like H2 tries to remove (session-related) uncommitted data; at least I see long calls to LobStorage.removeAllForTable() and LobStorage.removeLob(). Then the DB size decreases. Please note I'm using the LOG=1 parameter in the JDBC URL (for performance reasons).

*So is there any way to query the database and get the list of pending transactions?* I've seen there is an INFORMATION_SCHEMA.IN_DOUBT table with columns TRANSACTION and STATE: could they be used to detect pending transactions? I haven't found a way to disable auto commit from the H2 console yet, so I have no easy way to see whether that table is populated with running transactions, at least until I write a small standalone test case.
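(The queries in question would look like this; as noted in the reply below, IN_DOUBT only covers prepared two-phase-commit transactions:)

    SELECT * FROM INFORMATION_SCHEMA.IN_DOUBT;
    SELECT ID, USER_NAME, SESSION_START, STATEMENT FROM INFORMATION_SCHEMA.SESSIONS;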



Thomas Mueller

Aug 9, 2013, 7:16:17 AM
to h2-da...@googlegroups.com
Hi,

Could you provide a simple, reproducible test case?

Do you actually use LOBs?

INFORMATION_SCHEMA.IN_DOUBT is for in-doubt transactions when using 2-phase commit (prepared, but not committed). 

There is currently no way to measure how large a pending transaction is; this is hard to implement right now, but should be quite simple with the MVStore.

> I've not found yet a way to disable auto commit from h2 console

There is a "auto commit" checkbox in the toolbar.

Regards,
Thomas


davide.cavestro

Aug 9, 2013, 12:42:12 PM
to h2-da...@googlegroups.com
Hi Thomas,
thanks for pointing me to the "auto commit" checkbox in the toolbar... I
completely missed it.

I actually don't use LOBs (could they be internally used by H2 in order to
save the transaction log?)

In order to produce a reproducible test case I have to figure out how; I'll look into it over the weekend.
The relevant code is legacy, and I only had clear evidence of the problem after changing the LOG parameter from 0 to 1. But in the meantime we also developed an alternative H2 driver (actually a JDBC driver that wraps the original one) that simply deletes all the db files at first access if the database is corrupted (this is acceptable in my usage scenario).

BTW I think having a way to access the list of pending transactions would be a valuable enhancement (it could be something like a read-only view with a minimal subset of the contents of PostgreSQL's pg_stat_activity: http://www.postgresql.org/docs/9.2/static/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW).

Cheers
Davide



davide.cavestro

Aug 9, 2013, 12:46:48 PM
to h2-da...@googlegroups.com
Sorry... in my previous post I mentioned the raw h2 driver wrapper we
developed but didn't complete the sentence: maybe it could be the cause of
the strange behavior... I'll check if the problem persists using the
original driver.

Cheers
Davide




Noel Grandin

Aug 10, 2013, 7:59:51 AM
to h2-da...@googlegroups.com
On Fri, Aug 9, 2013 at 6:42 PM, davide.cavestro
<davide....@gmail.com> wrote:
> I actually don't use LOBs (could they be internally used by H2 in order to
> save the transaction log?)

No, the LOB machinery is pretty much only used for LOBs. Note that this
includes CLOB and TEXT columns, so maybe you have some of those?


> BTW I think having a way to access the list of pending transactions would be
> a valuable enhancement (it could be something like a readonly view with a

Yeah, I've tried to implement this before.
Unfortunately, the problem is that H2 normally runs using a single global lock for all its data, so the query that wants to display the metadata is blocked by the query currently executing.
Maybe when we switch to using MVStore as our default engine, and also make MULTI_THREADED=TRUE our default, it will become more feasible.

davide.cavestro

Aug 12, 2013, 11:15:16 AM
to h2-da...@googlegroups.com
Thanks for the valuable info you provided.
I have no CLOB nor TEXT columns. These are the column types actually in use: BOOLEAN, INT, DOUBLE, CHAR, VARCHAR, LONGVARCHAR, TIMESTAMP.

I confirm the db size keeps growing even with the original H2 driver... so I still have to figure out a way to produce a standalone reproducible test case without sharing and adapting the entire legacy library.

As a further detail, I've noticed that the legacy library uses a proprietary DataSource for connection pooling that at JVM shutdown calls Connection.close() on the pooled connections. This in turn calls LobStorage.removeAllForTable(). I don't know how many times it gets called; I simply see that if I take a thread dump in this phase, it is working on those calls. At the end the DB size slightly decreases.
So, for example:
* when I start the JVM with an empty DB and launch an import, I get a DB size of less than 6 GB
* after the second import (where every import drops all objects as a first step) the DB size goes to 7667 MB
* after the JVM shutdown, the db size decreases from 7667 MB to 7444 MB (but it remains well above 6 GB)

Below is a partial call stack for a Connection.close() call, taken at JVM shutdown time for the Tomcat main thread on a JRockit JVM:


H2 version is 1.3.169.

Cheers
Davide



Noel Grandin

Aug 12, 2013, 11:36:49 AM
to h2-da...@googlegroups.com, davide.cavestro



On 2013-08-12 17:15, davide.cavestro wrote:
> Follows a partial call stack for a /Connection.close()/ call. I took it at
> JVM shutdown time for the Tomcat main thread on a JRockit JVM :
>


The DB size issue is probably down to fragmentation and fill-factor in the internal data structures.
I don't see that we'll be able to improve things much here.

The shutdown issue is more interesting.
It's spending time cleaning up the temporary LOBs that get created when we need to pass them around, or when other things are cast into LOBs.

Is it possible that something in your code is either casting to a LOB when it retrieves data, or passing data in or out of H2 using InputStream or OutputStream? e.g. reading a column via getBinaryStream() on a ResultSet.

It doesn't look like we clean those things up anywhere except during a shutdown(), which is a potential problem if you manage to generate a lot of them.

davide.cavestro

Aug 12, 2013, 11:56:56 AM
to h2-da...@googlegroups.com
I confirm that values for CHAR, VARCHAR and LONGVARCHAR columns are set and read through character streams.
Basically we have a proprietary ORM that maps every String-related column read and write to ResultSet.getCharacterStream() and PreparedStatement.setCharacterStream() calls, respectively.



Noel Grandin

Aug 13, 2013, 5:39:22 AM
to h2-da...@googlegroups.com, davide.cavestro

Hmmm, sorry, I don't have a quick fix for this.

It appears that the problem is not so much with getCharacterStream() as it is with setCharacterStream().

If you could make your ORM use setString() calls, the shutdown - and in fact performance as a whole - should improve.
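(A minimal sketch of the substitution being suggested, assuming the value is already available in memory as a String:)

    import java.io.StringReader;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    class StringBinding {
        // Before: stream-based binding; H2 may create a temporary LOB for the value.
        static void bindAsStream(PreparedStatement ps, int col, String value) throws SQLException {
            ps.setCharacterStream(col, new StringReader(value), value.length());
        }

        // After: direct binding, fine for strings that easily fit in memory.
        static void bindAsString(PreparedStatement ps, int col, String value) throws SQLException {
            ps.setString(col, value);
        }
    }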

davide.cavestro

Aug 13, 2013, 11:56:59 AM
to h2-da...@googlegroups.com
I've tried replacing setCharacterStream() with setString(), but the db size still continues to increase. Is there any way to get an idea of the actual fragmentation of the internal data structures?

Now it's time to try reproducing the issue with a test case: today I'm feeling lucky :-)



Noel Grandin

Aug 13, 2013, 12:03:18 PM
to h2-da...@googlegroups.com
On Tue, Aug 13, 2013 at 5:56 PM, davide.cavestro
<davide....@gmail.com> wrote:
> I've tried replacing setCharacterStream() with setString() but the db size
> still continues to increase. Is there any way to get an idea of actual
> fragmentation of internal data-structures?
>

Sorry, no, no idea.
But I thought we were chasing a bug where shutdown was taking too
long, in which case making this change should make a difference.
Did you measure your new shutdown time with this change?

davide.cavestro

Aug 13, 2013, 12:27:48 PM
to h2-da...@googlegroups.com
I didn't measure it, but it was certainly lower. I'll give you more data
asap.



Thomas Mueller

Aug 13, 2013, 1:25:47 PM
to H2 Google Group
Hi,

How large is the database exactly, and how does it grow exactly? Are you completely sure it doesn't stop growing? If yes, could you kill the process while the database is about 10 times as big as it should be, then run the Recover tool, and post the statistics part of the output? And then, could you shrink the file (for example by export to SQL script / import from SQL script), run Recover again, and post the statistics of this new file?
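(One way to do the export/import cycle with the tools shipped in the H2 jar; paths and names here are illustrative:)

    java -cp h2-1.3.165.jar org.h2.tools.Script -url jdbc:h2:/path/to/temp -user sa -script backup.sql
    java -cp h2-1.3.165.jar org.h2.tools.RunScript -url jdbc:h2:/path/to/fresh -user sa -script backup.sql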

Regards,
Thomas



davide.cavestro

Aug 13, 2013, 4:05:08 PM
to h2-da...@googlegroups.com
I'll be more precise tomorrow, as at the moment I don't have access to the test environment.

The massive data import is the first stage of a multi-step ETL process. It collects data from an Oracle DB available on the LAN, populating a local H2 db.

That premised: after the first import the H2 database weighs ~5 GB (certainly less than 6 GB); after the second it is ~7 GB (hence it is not duplicated). I'm sure the db size reaches 11 GB after some imports, but I don't know after how many.
Now I want to pursue both of the following tasks:
* prepare a smaller Oracle DB (with less data) in order to run several imports in less time (at the moment a single import takes more than an hour); this way I can provide you with precise measurements and eventually proceed with the tests you suggested
* try to reproduce the problem with a standalone test case

Many thanks
Davide



davide.cavestro

Aug 14, 2013, 6:31:47 AM
to h2-da...@googlegroups.com
Hi,
just to keep you up to date before continuing with the tasks mentioned in my previous message.

Yesterday I launched the 3rd import process on the test environment, and it complained with "pageStore: Transaction log could not be truncated; size: 1768 MB", but the import process continued (it was configured to continue on errors). Getting this error message seems strange: the JDBC URL was jdbc:h2:file:/path/to/mirror/db;MODE=PostgreSQL;CACHE_SIZE=65536;FILE_LOCK=SOCKET;LOG=1;MAX_LOG_SIZE=1024;AUTO_SERVER=TRUE;AUTO_SERVER_PORT=9092; and the 1st import completed with no problems at all (with the same JDBC URL).
I don't know the db size at that point because I wasn't there. The scheduled nightly import then ran with no problems, but now the database is ~17 GB.
So the database is not 10 times the expected size (I'd prefer to prepare the smaller db first); anyway, I ran the recovery tool on it. Here are the statistics after this run:

I can only notice some quite strange negative numbers, but I am not aware of their precise meaning...

BTW now I'm trying to export and reimport the data just to rerun the recovery and give you the 2nd statistics: I've started the H2 console, and I'm still waiting for it to open the db session (it actually started more than 30 minutes ago). I don't understand what the H2 console thread is doing: it uses some CPU with no disk access, with the following stack (probably immutable):

I've also tried sampling it with jvisualvm, but the number of invocations of java.io.DataInputStream.readInt() doesn't change. Nonetheless it keeps using a CPU core (not at 100%).

Further annotations (just in case):
1) Between the 2nd and 3rd import I stopped Tomcat. I checked whether the database was accessible through the H2 console (launching the H2 jar from the CLI); it was OK. Then I started Tomcat but forgot to kill the H2 console JVM. I launched the import but it failed because the DB was locked by the console JVM instance. I simply killed that JVM and relaunched the 3rd import.
2) The first time I used the recovery tool on the 17 GB db I had to kill it because I didn't have enough free space on the disk. Then I copied the db to another machine and relaunched the recovery process, which gave the statistics inserted above.



davide.cavestro

Aug 14, 2013, 10:01:51 AM
to h2-da...@googlegroups.com
Now I've run the drop/data-import process 5 times, copying data from a smaller local Postgres DB (the first import produces an H2 db of less than 800 MB).

All the imports were executed within the same JVM instance. These are the size measurements taken after every import (with the Tomcat JVM still running):

#  Size (MB)
1  755
2  1267
3  1779
4  2547
5  3059

So in this scenario the db size trend seems almost linear.

Then I shut down Tomcat, and the db size slightly decreased to 2888 MB. Then I ran the recovery tool on the db. These are the relevant statistics:

---- Transaction log ----
---- Statistics ----
-- page count: 1478666, free: 1274362
-- page data bytes: head 66033313, empty 510471785, rows 773802742 (63% full)
-- free 0%, 2 page(s)
-- data leaf 44%, 659330 page(s)
-- data node 0%, 4466 page(s)
-- data overflow 0%, 2652 page(s)
-- btree leaf 24%, 366293 page(s)
-- btree node 0%, 4723 page(s)
-- free list 0%, 91 page(s)
-- stream trunk 0%, 869 page(s)
-- stream data 29%, 440237 page(s)
Then I exported the DB to a SQL script and used RunScript to reimport it into a new database, whose size in turn decreased to 471 MB. Then I relaunched the Recover tool, which produced the following stats:
---- Transaction log ----
---- Statistics ----
-- page count: 241351, free: 80334
-- page data bytes: head 12427119, empty 95612024, rows 147534873 (63% full)
-- free 0%, 1 page(s)
-- data leaf 51%, 124792 page(s)
-- data node 0%, 853 page(s)
-- data overflow 0%, 311 page(s)
-- btree leaf 14%, 34583 page(s)
-- btree node 0%, 460 page(s)
-- free list 0%, 15 page(s)
-- stream trunk 0%, 159 page(s)
-- stream data 33%, 80174 page(s)
It seems that the row count of the huge db is nearly 5 times that of the shrunk one. OTOH, on a rough check (comparing the number of rows for the bigger tables), the shrunk db seems to contain the same data.

Would it be compatible with a scenario of pending transactions/sessions?


Thomas Mueller

Aug 14, 2013, 12:04:51 PM
to H2 Google Group
Hi,

It seems like there is a long running transaction, because "-- stream data 29%, 440237 page(s)" means there is 860 MB of transaction log. Each page is 2048 bytes by default (unless you changed the page size).
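(Worked out: 440237 pages × 2048 bytes/page = 901,605,376 bytes ≈ 860 MB.)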

Could you verify that you don't keep a connection (with autocommit disabled and with uncommitted changes) open for a long time? When this is fixed, let's see how the numbers change. Maybe there is still a problem, but maybe not. Maybe this unclosed / uncommitted connection has a temporary table that is removed when the connection is closed, committed, or rolled back.

Regards,
Thomas





davide.cavestro

Aug 14, 2013, 12:16:50 PM
to h2-da...@googlegroups.com
I ran the drop/import process on a smaller data set, actually the same one used in the tests I replied to Thomas about some hours ago (please note those tests used PreparedStatement.setCharacterStream()).

I did some raw measurements launching Tomcat from the Eclipse IDE: the entire drop/import process took 20-25% less time using PreparedStatement.setString() instead of PreparedStatement.setCharacterStream().
Using PreparedStatement.setCharacterStream(), Tomcat takes ~7 seconds to go down, both after a single drop/import and after two sequential ones, but in the latter case Eclipse complains that "Terminate failed". With a single drop/import I got no error messages, but with two drop/imports it seems systematic. From Eclipse I have no way to get more details on the error cause; however, I've seen that the Tomcat JVM process disappears after 7 seconds.
Using PreparedStatement.setString(), Tomcat takes ~4 seconds to go down, both after a single drop/import and after two sequential ones.

Using PreparedStatement.setCharacterStream(), a Tomcat shutdown after a single drop/import reduces the DB size from 755 MB (Tomcat alive) to 584 MB (after Tomcat shutdown). Shutting down Tomcat after 2 drop/imports within the same JVM instance, the DB size of 1267 MB (Tomcat alive) is reduced to 1156 MB (Tomcat stopped).
Using PreparedStatement.setString(), a Tomcat shutdown after a single drop/import reduces the DB size from 755 MB (Tomcat alive) to 556 MB (after Tomcat shutdown). Shutting down Tomcat after 2 drop/imports within the same JVM instance, the DB size of 1267 MB (Tomcat alive) is reduced to 1096 MB (Tomcat stopped).

With PreparedStatement.setCharacterStream(), running Recover on the database after two drop/imports within the same JVM instance I got the following statistics:

With PreparedStatement.setString(), running Recover on the database after two drop/imports within the same JVM instance I got the following statistics:

PS: about the negative numbers I've seen in the statistics for the 17 GB db I posted some hours ago... could it be that the row count is kept in an integer variable, and I simply exceeded Integer.MAX_VALUE?



Thomas Mueller

Aug 14, 2013, 12:48:49 PM
to H2 Google Group
Hi,

About setCharacterStream: I would use setString for strings that easily fit in memory.

About the negative numbers: why don't you post which numbers were negative? I guess it's a bug in the Recover tool (using int instead of long).

Regards,
Thomas




davide.cavestro

Aug 14, 2013, 5:20:16 PM
to h2-da...@googlegroups.com
Sorry for spreading info across so many replies. I'll try to recap here.

About setCharacterStream: it was introduced several years ago for compatibility with buggy JDBC drivers (maybe it could safely be removed in most cases).

I repost the statistics here for your convenience:
---- Statistics ----
-- page count: 8763710, free: 4148621
-- page data bytes: head 263443992, empty 1859617598, rows -1552023894 (-225% full)
-- free 23%, 2078475 page(s)
-- data leaf 27%, 2375979 page(s)
-- data node 0%, 17345 page(s)
-- data overflow 0%, 56 page(s)
-- btree leaf 20%, 1801340 page(s)
-- btree node 0%, 22636 page(s)
-- free list 0%, 533 page(s)
-- stream trunk 0%, 4857 page(s)
-- stream data 28%, 2462486 page(s)

About the pending transaction, I suspect you are right: it's likely the cause of most of the issues. However, the code is not mine, and it seems OK to me: it's single threaded, it acquires a single connection at the beginning, sets autocommit to false, and then starts adding batches; it commits every N batch inserts and at the end of every table. At the end it restores the old autocommit (though I don't remember whether it actually closes the connection at the end... I'll check).
I'm going to add some logging (just to see if there's some corner case I didn't notice) and eventually debug. It would indeed be very useful to have some sort of API (even H2-internal) exposing the connection's transactional status (pending statements and so on).



Noel Grandin

Aug 15, 2013, 2:33:39 AM
to h2-da...@googlegroups.com, Thomas Mueller, davide.cavestro
   
On 2013-08-14 18:48, Thomas Mueller wrote:

About the negative numbers: why don't you post which numbers were negative? I guess it's a bug in the Recover tool (using int instead of long).


I have fixed the bug in the Recover tool where it was reporting negative numbers.

davide.cavestro

Aug 19, 2013, 3:44:19 AM
to h2-da...@googlegroups.com
I replied to you some days ago from Gmail but I'm not sure you received it, since I don't see the relevant message on Nabble, so I repost it here. In the meantime Noel fixed the Recover tool issue caused by the max int size overflow.

"Sorry for spreading info across so many replies. I'll try to recap here.

About setCharacterStream: it was introduced several years ago for compatibility with buggy JDBC drivers (maybe it could safely be removed in most cases).

About the negative numbers, I shared them at
http://h2-database.66688.n3.nabble.com/h2-Continuous-Increase-in-H2-db-size-after-dropping-and-loading-same-data-repeatedly-tp4026836p4027139.html

About the pending transaction, I suspect you are right: it's likely the cause of most of the issues. However, the code is not mine, and it seems OK to me: it's single threaded, it acquires a single connection at the beginning, sets autocommit to false, and then starts adding batches; it commits every N batch inserts and at the end of every table. At the end it restores the old autocommit (though I don't remember whether it actually closes the connection at the end... I'll check).
I'm going to add some logging (just to see if there's some corner case I didn't notice) and eventually debug. It would indeed be very useful to have some sort of API (even H2-internal) exposing the connection's transactional status (pending statements and so on)."




Noel Grandin

Aug 19, 2013, 4:30:17 AM
to h2-da...@googlegroups.com, davide.cavestro

On 2013-08-19 09:44, davide.cavestro wrote:
> didn't notice) and eventually debug. It would indeed be very useful if I had
> some sort of API (even H2 internal) exposing the connection transactional
> status (pending statements and so on)./
>

I've added an UNDO_LOG_SIZE column to the INFORMATION_SCHEMA.SESSIONS
metadata table.
That should allow you to monitor sessions and see which ones are going
rogue.
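(A check against the new column might look like this:)

    SELECT ID, USER_NAME, STATEMENT, UNDO_LOG_SIZE FROM INFORMATION_SCHEMA.SESSIONS;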


davide.cavestro

Aug 19, 2013, 12:03:57 PM
to h2-da...@googlegroups.com
Hi Noel,
I replaced the custom legacy DataSource dedicated to connection pooling with org.h2.jdbcx.JdbcDataSource, and logged the contents of INFORMATION_SCHEMA.SESSIONS every time a table copy completes. I also disabled the insertion of indexes and foreign keys, and replaced DROP ALL OBJECTS with a single DROP statement per table.
The db size still *continues to increase* (please note I had to switch to the svn trunk contents in order to get the additional metadata column, so these tests may be affected by regressions or behavior changes).

Anyway, I noticed some things that seem strange to me; I'll try to give you as much detail as I can.

Issue 1

The UNDO_LOG_SIZE is always 0, but after the 3rd table (hence the 3rd time I query the metadata table), the INFORMATION_SCHEMA.SESSIONS contents seem cached (they don't change anymore for that import session).

Here are the contents of INFORMATION_SCHEMA.SESSIONS (where the columns are ID | USER_NAME | SESSION_START | STATEMENT | STATEMENT_START | UNDO_LOG_SIZE)

*with a pooling DataSource*

Issue 2

I also have to say that before disabling connection pooling there was another issue: at the end of the 1st import I got an additional session with a null statement. From then on, that session and the metadata-query one remained unchanged throughout the whole 2nd import (within the same JVM instance).

*without a pooling DataSource*

Issue 3

The 2nd import generates in the trace file a bunch of error messages I never noticed before disabling connection pooling. I paste some of them here:

Since they always seem related to connection-closing methods, I suspect they were not triggered at all with connection pooling enabled, but their cause could lead to other side effects (such as the db size increase?).

Cheers
Davide



Noel Grandin

Aug 19, 2013, 1:44:00 PM
to h2-da...@googlegroups.com
This looks to me like you have code that is allocating connections
from a pool, making some changes, not committing those changes, and
not bothering to close the connection.

Thomas Mueller

Aug 19, 2013, 3:17:52 PM
to H2 Google Group
Hi,

Please note it's not the *size* of an uncommitted transaction that is a problem. The problem is, once you have an uncommitted transaction (which could mean there is only one row changed in the uncommitted transaction), then the transaction log is not truncated starting from that point on. So the *starting* time of the uncommitted transaction is the problem.

To get more details about transactions, you could run the Recover tool with the "-transactionLog" option. This option is not yet documented, because I thought it was not generally useful; but I will document it now.
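(An invocation with that option; the directory and database name here are illustrative:)

    java -cp h2.jar org.h2.tools.Recover -dir /path/to/dbdir -db mirror -transactionLog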

Regards,
Thomas




davide.cavestro

Aug 20, 2013, 6:44:30 AM
to h2-da...@googlegroups.com
Noel Grandin wrote
> This looks to me like you have code that is allocating connections
> from a pool, making some changes, not committing those changes, and
> not bothering to close the connection.

When using the connection pooling DataSource, the connection is certainly reused and potentially not closed until JVM shutdown. But I disabled it, and I still see the db size increase. Anyway, I'd expect there's no need to close a connection in order to commit changes; otherwise connection pooling would become somewhat cumbersome.
About not committing the changes: I've read that code many times, and it seems the changes are always committed. Shouldn't I see a SESSIONS row with UNDO_LOG_SIZE greater than 0 if there were an open connection with uncommitted changes at that stage?
Moreover, at connection close time there are those error messages I mentioned some days ago replying to Thomas... today, during the second import, I saw a bunch of errors I've never seen before:

It seems it is reading a record out of its bounds.
I ignored them and ran a 3rd import, and I got the NullPointerExceptions and a ClassCastException trying to cast PageStreamData to PageDataNode.

Today I also sometimes ran imports (2nd or 3rd) with *no error messages at connection close time and no db size increase*.
So it seems that importing the same data from the same origin under the same (known) conditions results in different behaviors.

Hence, assuming the errors and the db size increase are related, I still don't know the cause.
Worse, I don't know how to systematically reproduce the behavior... some sort of entropy is really not welcome now :-)

PS: I'll reply directly to Thomas with the results of running the Recover tool with the -transactionLog parameter (no results). Please tell me if I should take a different approach with replies.




davide.cavestro

Aug 20, 2013, 7:03:43 AM
to h2-da...@googlegroups.com
I agree with you.

Running the Recover tool with the "-transactionLog" option gave me no evidence. Is there some particular section of the .sql file I should examine?

Below is an excerpt from the Recover output, run on a database that had no problems on the 2nd import (no errors in the trace file and db size almost unchanged) but had problems on the 3rd one (namely, errors in the trace file and an increased db size).

So, if the errors in the trace file and the db size increase / uncommitted transactions are related, is there something I could do to detect the cause (even by debugging)?



davide.cavestro

Aug 20, 2013, 12:42:27 PM
to h2-da...@googlegroups.com
Hi,
I realized that I ran my last tests with the old PreparedStatement.setCharacterStream() instead of PreparedStatement.setString().
It seems that after removing both the setCharacterStream() usage and connection pooling, the db size doesn't increase.
So I have shared at https://gist.github.com/davidecavestro/6283789 a standalone test case that *sometimes reproduces* the issue.
It creates a database in the temp folder and then drops and inserts some data.
It uses a DataSource that proxies H2 JdbcConnection instances with a custom class.
Relaunching it two or three more times, you can see it generates some errors in the trace file, and the database size starts to increase.




Thomas Mueller

Aug 20, 2013, 1:46:35 PM
to H2 Google Group
Hi,

> Running the Recover tool with the "-transactionLog" option gave me no evidence.

Well, what did you look at? Could you upload the .sql file, or at least the interesting part (the summary and the transaction log)? Obviously of a case where the database has grown very large.

Regards,
Thomas



davide.cavestro

Aug 20, 2013, 4:08:06 PM
to h2-da...@googlegroups.com
Hi Thomas,
I suspect that if you read mailing list messages from a mobile device you miss preformatted contents... I posted the head and tail of the .sql file in my previous post, but reading it from Gmail on my smartphone the relevant portion is missing.

I re-paste it here without pre tags:

CREATE ALIAS IF NOT EXISTS READ_BLOB FOR "org.h2.tools.Recover.readBlob";
CREATE ALIAS IF NOT EXISTS READ_CLOB FOR "org.h2.tools.Recover.readClob";
CREATE ALIAS IF NOT EXISTS READ_BLOB_DB FOR
"org.h2.tools.Recover.readBlobDb";
CREATE ALIAS IF NOT EXISTS READ_CLOB_DB FOR
"org.h2.tools.Recover.readClobDb";
-- pageSize: 2048 writeVersion: 3 readVersion: 3
-- head 1: writeCounter: 1211977 log 2413:0/0 crc 662344123 (ok)
-- head 2: writeCounter: 1211977 log 2413:0/0 crc 662344123 (ok)
-- log 2413:0/0
-- page 3: free list
-- 3 1111111 1111111111 - 1111111111 1111111111 - 1111111111 1111111111 -
1111111111 1111111111 - 1111111111 1111111111

...
... contents omitted
...

---- Transaction log ----
---- Statistics ----
-- page count: 319162, free: 88377
-- page data bytes: head 13245977, empty 102539677, rows 154789962 (63% full)
-- free 20%, 64423 page(s)
-- data leaf 41%, 132117 page(s)
-- data node 0%, 941 page(s)
-- data overflow 0%, 314 page(s)
-- btree leaf 0%, 1673 page(s)
-- btree node 0%, 21 page(s)
-- free list 0%, 20 page(s)
-- stream trunk 0%, 438 page(s)
-- stream data 37%, 119212 page(s)

Please give me a hint about what I should search for (the entire .sql file weighs some hundreds of MB and moreover could contain privacy-sensitive data).




Thomas Mueller

Aug 21, 2013, 1:56:59 AM
to h2-da...@googlegroups.com
Hi,

> I suspect if you read mailing list messages from a mobile device you miss preformatted contents... I posted the head and tail of the .sql file in my previous post

Well, I think you didn't post the content of the transaction log (not before, and not now).

In your "test case", you have used "MAX_LOG_SIZE=1024". You are aware what this means, right? See http://h2database.com/html/grammar.html#set_max_log_size "Sets the maximum size of the transaction log, in megabytes." Of course the database may grow larger than 1 GB if you set the transaction log size to 1 GB.

Anyway, I made a test case to illustrate the problem. http://h2database.com/p.html#ad8161dfc7ccd07b7aef95142046e99e

I guess this isn't a problem in the database; the problem is in your code. If you think that's not the case, please provide a simple test case (one class without dependencies), just like I did above. I'm sorry, but I have very little time for reading long emails and test cases; if the emails get too large, I tend to ignore them. I stopped reading your test case as soon as I saw "MAX_LOG_SIZE=1024", for example.

Regards,
Thomas


Noel Grandin

unread,
Aug 21, 2013, 3:36:14 AM8/21/13
to h2-da...@googlegroups.com, Thomas Mueller

On 2013-08-21 07:56, Thomas Mueller wrote:
> Hi,
>
>> I suspect if you read mailing list messages from a mobile device you
>> miss preformatted contents... I posted the head and tail of the .sql
>> file in my previous post
>
> Well, I think you didn't post the content of the transaction log (not
> before, and not now).

Davide seems to be using some other interface (other than Google Groups or straight email) to post to this newsgroup, so quite often his posts are incomplete.

Also, I have modified the SESSIONS metadata table - I replaced the UNDO_LOG_SIZE column with a "CONTAINS_UNCOMMITTED" column, which should be more reliable as an indicator of uncommitted transactions.
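As a quick sketch of how to use it (column names may differ between H2 versions; earlier builds exposed UNDO_LOG_SIZE instead):

    SELECT ID, CONTAINS_UNCOMMITTED FROM INFORMATION_SCHEMA.SESSIONS;

Any row with CONTAINS_UNCOMMITTED = TRUE points at a session that is still holding an open transaction.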

Davide Cavestro

unread,
Aug 21, 2013, 10:08:06 AM8/21/13
to h2-da...@googlegroups.com, Thomas Mueller
Hi Noel,
I've always used http://h2-database.66688.n3.nabble.com/ to post to the newsgroup before: from that interface my messages appeared to be OK.
I simply was not aware that there was this kind of issue. From now on I'm going to write directly from Google Groups.

About the CONTAINS_UNCOMMITTED column: its value is always false.
About the MAX_LOG_SIZE parameter: removing it from the JDBC URL, hence using the default value of 16 MB, I get the message pageStore: Transaction log could not be truncated; size: 16 MB in the trace file. The strange thing is that, even when debugging, I'm sure I commit every 10 records, and with the real-world data this issue always arises at row 332991 of the 65th table, when the db file size is ~68 MB.
The error message appears only once, and only on the first import.
About the standalone test case I posted: it is so long because I had to introduce a proxy for the connection in order to reproduce the issue. I can try to simplify it, but I can assure you this issue is costing me a lot of time too.

Davide

Davide Cavestro

unread,
Aug 22, 2013, 6:42:17 AM8/22/13
to h2-da...@googlegroups.com, Thomas Mueller
Before giving up definitively, could you please tell me if getting the
pageStore: Transaction log could not be truncated; size: 16 MB
warning in the trace file indicates an error?
I feel very dumb, but I got it in cases where I wouldn't expect it... and if it indicates a problem, then it could reveal the cause of the increasing db size issue.

So I've attached a simple standalone test case that uses a single connection with auto commit to create and populate 20 tables with 7001 rows each.
It always reproduces the warning mentioned above, approximately at row 5150 of the last table.


Attachment: H2IncreasingSizeTest.java
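The attachment itself isn't reproduced here, but a minimal sketch of the shape it describes might look like this (class name and database path are illustrative, not taken from the actual attachment):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

// Single embedded connection, auto-commit on: create and populate
// 20 tables with 7001 rows each, as described above.
public class IncreasingSizeSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./data/test")) {
            for (int t = 0; t < 20; t++) {
                try (Statement st = conn.createStatement()) {
                    st.execute("CREATE TABLE T" + t + " (ID INT PRIMARY KEY, VAL VARCHAR(255))");
                }
                try (PreparedStatement ps =
                        conn.prepareStatement("INSERT INTO T" + t + " VALUES (?, ?)")) {
                    for (int r = 0; r < 7001; r++) {
                        ps.setInt(1, r);
                        ps.setString(2, "row-" + r);
                        ps.executeUpdate(); // auto-commit: one transaction per insert
                    }
                }
            }
        }
    }
}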

Noel Grandin

unread,
Aug 22, 2013, 7:14:27 AM8/22/13
to h2-da...@googlegroups.com, Davide Cavestro, Thomas Mueller

On 2013-08-22 12:42, Davide Cavestro wrote:
> Before giving up definitively, could you please tell me if getting the
> pageStore: Transaction log could not be truncated; size: 16 MB
> warning in the trace file indicates an error?

No, it's just a warning. It's a useful one, because it indicates that the transaction log is taking up more space than we expect, and that there might be a problem somewhere else.

I think we've got to the bottom of this - it is related to the use of setCharacterStream(), which generates temporary LOBs that clog up the transaction log.
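In sketch form, the pattern in question looks like this (an assumed shape, not the poster's code; the DOCS table is hypothetical):

import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class ClobInsertSketch {
    // Binding a character stream makes H2 stage the value as a temporary LOB;
    // in the affected builds those temporaries could keep the transaction log
    // from being truncated.
    static void insertBody(Connection conn, int id, String body) throws SQLException {
        try (PreparedStatement ps =
                conn.prepareStatement("INSERT INTO DOCS (ID, BODY) VALUES (?, ?)")) {
            ps.setInt(1, id);
            ps.setCharacterStream(2, new StringReader(body), body.length());
            ps.executeUpdate();
        }
    }
}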

Thomas and I are working through some patches; we should have something in a couple of days.

Davide Cavestro

unread,
Aug 22, 2013, 7:27:17 AM8/22/13
to h2-da...@googlegroups.com, Davide Cavestro, Thomas Mueller
I still don't understand why the transaction log is taking up more space than expected, even when using setString() and autocommit: that's what happens with the test case I attached to my previous message.

Noel Grandin

unread,
Aug 22, 2013, 9:03:55 AM8/22/13
to h2-da...@googlegroups.com, Davide Cavestro, Thomas Mueller
Yeah, I don't really understand that part either.
Sorry, but I'm also out of enthusiasm for chasing this bug.

Noel Grandin

unread,
Aug 23, 2013, 3:09:34 AM8/23/13
to h2-da...@googlegroups.com, Davide Cavestro, Thomas Mueller
By the way, nice work producing a test case. Sorry we weren't able to track it all the way down.


Thomas Mueller

unread,
Aug 23, 2013, 12:49:48 PM8/23/13
to H2 Google Group
Hi,

About the message "Transaction log could not be truncated": yes, it's indeed a very good test case. I'm analyzing the problem now. I think the problem is somewhere in the PageStore or PageLog class. Possibly it's "only" an incorrect warning, but solving it may well solve the original problem (the transaction log growing without bounds).

Regards,
Thomas



pe...@hotmail.com

unread,
Aug 24, 2013, 10:53:33 AM8/24/13
to h2-da...@googlegroups.com
I don't know if it is related to your problem, but I have a memory problem with transactions too.

I have a program that reads some web pages, performs some analysis on them, and then saves the results in a local H2 database.

Every 50 or 100 pages, the program would flush the session and commit the transaction, and yet after reading 1500 pages or so (depending on the average size of the saved data) I would get an out-of-memory error in Java.  To solve my problem I actually had to close the session every 50 pages and create a new one and a new transaction.

Not very efficient, but I didn't care much about efficiency in a program meant to run for a few hours once a month or so.
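In sketch form, that workaround looks roughly like this (assuming Hibernate, since "flush the session" suggests it; the entity objects and session-factory wiring are hypothetical):

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

class BatchedSaveSketch {
    // Discard the session every 50 saves instead of merely flushing it,
    // so entities cached in the session become eligible for GC.
    static void saveAll(SessionFactory factory, Iterable<Object> entities) {
        Session session = factory.openSession();
        Transaction tx = session.beginTransaction();
        int count = 0;
        for (Object entity : entities) {
            session.save(entity);
            if (++count % 50 == 0) {
                tx.commit();
                session.close();            // drop the first-level cache
                session = factory.openSession();
                tx = session.beginTransaction();
            }
        }
        tx.commit();
        session.close();
    }
}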

I still don't understand why my memory usage was blowing up, though.

Noel Grandin

unread,
Aug 24, 2013, 11:38:25 AM8/24/13
to h2-da...@googlegroups.com
On Sat, Aug 24, 2013 at 4:53 PM, <pe...@hotmail.com> wrote:
> I don't know if it is related to your problem but I have a memory problem
> with transactions too.

Weird. I run my stuff for weeks at a time with no problem.

But we can find out easily: add the option -XX:+HeapDumpOnOutOfMemoryError to your Java startup options. Let it trigger a heap dump, and then run it through a heap-dump analysis tool like the one in NetBeans or the one in VisualVM.
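For example (the jar name and dump path are illustrative):

    java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar myapp.jar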

Davide Cavestro

unread,
Aug 24, 2013, 12:06:03 PM8/24/13
to h2-da...@googlegroups.com
That makes me think that - just in case there is a common cause - we could obtain some useful data by debugging the test case with a breakpoint at the line that produces the transaction log warning and generating a heap dump at that point.
Maybe that could highlight unwanted transaction-log-related object instances that are somehow kept referenced and hence not GCed.


