Automatic COMPACT does not seem to work


Ulrich

Jun 13, 2022, 5:48:08 AM
to H2 Database
Hi all,

I am running H2 2.1.212 and I wonder why the automatic compact feature does not shrink the database size.

My application is continuously collecting data (around 100 inserts per second). Most of the data is removed once a day. The data is kept in 4300 tables.

The database file size is around 5 GB. When running a manual SHUTDOWN COMPACT it is compacted to 40 MB.

I expected that the automatic compact algorithm of the MV_STORE would be able to keep the database smaller than it is. Are there any known circumstances under which the compact algorithm does not run or does not give good results?

I already tried to use the latest git code but the results are the same.

Here is some info from the INFORMATION_SCHEMA.SETTINGS table with the relevant parameters:

AUTO_COMPACT_FILL_RATE: 90
info.FILL_RATE: 19
info.CHUNKS_FILL_RATE: 27
info.CHUNKS_FILL_RATE_RW: 27
info.FILE_SIZE: 4916121600
info.CHUNK_COUNT: 490
info.PAGE_COUNT: 1594967
info.PAGE_COUNT_LIVE: 440730
info.PAGE_SIZE: 4096
info.CACHE_MAX_SIZE: 16
info.CACHE_SIZE: 15
info.CACHE_HIT_RATIO: 85
info.TOC_CACHE_HIT_RATIO: 99
info.LEAF_RATIO: 44
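For anyone wanting to reproduce this, values like the above can be pulled with a query along these lines (a sketch: the column names SETTING_NAME/SETTING_VALUE are the H2 2.x schema; older 1.4.x versions used NAME/VALUE):

```sql
-- Show MVStore/compaction-related runtime statistics (H2 2.x schema).
SELECT SETTING_NAME, SETTING_VALUE
FROM INFORMATION_SCHEMA.SETTINGS
WHERE SETTING_NAME LIKE 'info.%'
   OR SETTING_NAME = 'AUTO_COMPACT_FILL_RATE'
ORDER BY SETTING_NAME;
```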

Thanks!
Ulrich

Andreas Reichel

Jun 14, 2022, 12:02:08 AM
to h2-da...@googlegroups.com
Greetings.

My understanding is that there is no "auto compact" while the database is running.
Instead, the database is compacted only during shutdown, and only within the defined shutdown time period (a longer period gives more time to compact).

And yes, the H2 database consumes a lot of file space, especially when data is constantly written and deleted. I made exactly the same observation.

Best regards
Andreas
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/ae5bf92d-f61a-4cac-981a-9e1b51d7d49fn%40googlegroups.com.

Ulrich

Jun 14, 2022, 1:58:28 AM
to H2 Database
The manual is not clear about this.

Having a look at the source code, I guess it should compact while the database is open.

I have a second project where compacting works fine, but that project only has rare database changes.

Noel Grandin

Jun 14, 2022, 2:12:03 AM
to H2 Database

In general, compacting should work while the db is running, but it's fairly conservative, so if you have running queries or open transactions, that can keep a lot of stuff alive that should be compacted.
Also, if you are adding stuff to the db while compacting, even if the compacter frees up a ton of space, there will likely be trailing entries in the file that prevent the file from being truncated.

If you can temporarily pause all your queries or operations while you run the compacter, that might help.

Possibly we should look into an option that does that (pausing) automatically.

Ulrich

Jun 14, 2022, 2:59:18 AM
to H2 Database
Thanks for the clarification!

I will try to pause operations on the database and see what happens.
What do you mean by "while you run the compacter"? Is there a specific command to start the compacter? I thought it would start automatically, or when I run SHUTDOWN COMPACT.

Andreas Reichel

Jun 14, 2022, 5:04:26 AM
to h2-da...@googlegroups.com


On Tue, 2022-06-14 at 08:11 +0200, Noel Grandin wrote:
In general, compacting should work while the db is running, but it's fairly conservative

Noel,

Not arguing; you know that we do love H2 and are grateful.
But I have never seen any online defragmentation doing anything.

We have 300 MB of defragmented content blown up to 20 GB. The databases are idle during the night and most of the day.
Heavy delete/write/query activity happens only in the early morning, but the file never shrinks, only grows, until SHUTDOWN DEFRAG.
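For reference, the shutdown-time compaction mentioned here is issued as plain SQL; roughly (see the H2 SQL grammar for the exact set of options):

```sql
-- Close the database and compact the file before closing (slower shutdown):
SHUTDOWN COMPACT;

-- Close the database and fully rewrite/defragment the file:
SHUTDOWN DEFRAG;
```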

Best regards
Andreas


Andrei Tokar

Jun 15, 2022, 9:31:05 PM
to H2 Database
I wonder what your URL looks like and which parameters you are using, specifically RETENTION_TIME and WRITE_DELAY?
Are any LOBs involved?
With default settings, auto compaction should take place in the background. Whether or not it can keep up with the rate of updates is a different story, but the fact that an idle database is not compacted at all is worrisome.
BTW, the fact that the file is not shrinking does not necessarily mean that there is no compaction going on, just as the fact that the JVM heap is not shrinking is not an indication that GC is absent.

Ulrich

Jun 21, 2022, 8:28:42 AM
to H2 Database
My URL parameters are: LOCK_TIMEOUT=86400000;WRITE_DELAY=10000;MAX_MEMORY_ROWS=1000000;MAX_OPERATION_MEMORY=50000000;ANALYZE_AUTO=0
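Put together as a full connection string (the database path here is just a placeholder), that would look something like:

```
jdbc:h2:./data/mydb;LOCK_TIMEOUT=86400000;WRITE_DELAY=10000;MAX_MEMORY_ROWS=1000000;MAX_OPERATION_MEMORY=50000000;ANALYZE_AUTO=0
```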

I'll check if the compact algorithm works (better) with the default parameters.

Andrei Tokar

Jun 21, 2022, 10:25:48 PM
to H2 Database
I would rather try adding RETENTION_TIME=0.
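RETENTION_TIME can be supplied either in the connection URL or at runtime via SQL; a sketch (the value is in milliseconds, and the path below is a placeholder):

```sql
-- At runtime, for the current database:
SET RETENTION_TIME 0;

-- Or as a connection URL parameter:
-- jdbc:h2:./data/mydb;RETENTION_TIME=0
```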

Ulrich

Jun 22, 2022, 9:17:39 AM
to H2 Database
The documentation says:
Using a lower value might be dangerous, unless the file system and hard disk flush the buffers earlier.

Is it safe to use RETENTION_TIME=0? What are the consequences?
Do I have to fear that the database file becomes corrupted or that transactions are not consistent?

Andrei Tokar

Jun 22, 2022, 9:22:44 PM
to H2 Database
The documentation is outdated. At the time, garbage determination was inexact, and the retention time was used as additional safety. This has not been the case since 1.4.197 or 1.4.198; premature storage release is no longer possible.

Ulrich

Jun 24, 2022, 4:23:54 AM
to H2 Database
Using RETENTION_TIME=0, I see that a moderate compaction is applied :-) That's completely sufficient!
I'll now check how it behaves in the long term.

RETENTION_TIME=0 also speeds up the database significantly.
Is there any use case for setting RETENTION_TIME>0? Should the default value be changed in H2?

Andrei Tokar

Jun 24, 2022, 3:47:05 PM
to H2 Database
Yes, it makes sense now to make RETENTION_TIME=0 the default. The use case for a positive value is that it theoretically improves your chances of recovery in case of an abrupt shutdown (more history has been kept), but that's about it. This setting does not change the way compaction works; it just delays marking chunks of storage as available for reuse. The database might be a little faster, because there is less garbage to shuffle around pretending to be data. The current default (45 sec) seems way too high.

Andreas Reichel

Jun 26, 2022, 12:38:58 AM
to h2-da...@googlegroups.com
Thank you for the advice, Andrei.

I have set RETENTION_TIME=0 and the DB behaves much more gracefully now: before, a compacted 2 GB database file grew to 15 GB and never shrank, even when idle during the day.
Now the same file grows only to about 2.5 GB (increasing in proportion to the data actually added each day).

It is a very welcome improvement, which deserves more promotion in my opinion.
All the best

Andreas 