Purge old .journal files for voron


vlko

May 12, 2016, 6:58:47 AM
to RavenDB - 2nd generation document database
Hi

We are testing a move from Esent to Voron (with incremental backup turned on), and while playing with it we found a strange issue (Server #30115). We added a lot of data, then deleted it, and after setting up replication we found 36 GB of .journal files for a 7 GB database. At first I thought it was something like SQL, where you need to do a backup to purge the transaction log, but even after a backup and an incremental backup the files are still there.

I tried searching, but found no post here in the forum, and I'm not able to find any info on how to have incremental backup without keeping GBs of old transaction logs. Or is there something I'm doing wrong?

vlko

May 12, 2016, 7:09:15 AM
to RavenDB - 2nd generation document database
Sorry,

the files have already been purged :) It looks like .journal files are not purged right after a backup, but some time later.
[murphy_irony]Maybe purging is related to posting to this group, to embarrass myself first:)[/murphy_irony]

Oren Eini (Ayende Rahien)

May 13, 2016, 7:46:14 AM
to ravendb
Your hunch is correct. What is actually going on is that if incremental backup is turned on, we don't delete the journal files.
When you do a backup, we mark them as to be deleted, but we only delete them on the next flush to disk, and that is mostly controlled by the number of writes you do.
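
To make the ordering concrete, here is a toy model of the behaviour described above. It is only a sketch for illustration: the class and method names (`ToyJournalStore`, `incremental_backup`, `flush`) are made up and are not Voron's actual code or API. It shows only that a backup marks journals as deletable and that they disappear on a later flush.

```python
class ToyJournalStore:
    """Toy model only; NOT Voron's implementation. All names are hypothetical."""

    def __init__(self):
        self.journals = []   # journal numbers currently "on disk"
        self.marked = set()  # journals that a backup has marked for deletion
        self._next = 0

    def write(self, count=1):
        """Simulate writes that create `count` new journal files."""
        for _ in range(count):
            self.journals.append(self._next)
            self._next += 1

    def incremental_backup(self):
        """A backup marks the existing journals as deletable; it deletes nothing."""
        self.marked.update(self.journals)

    def flush(self):
        """The next flush removes whatever an earlier backup marked."""
        self.journals = [j for j in self.journals if j not in self.marked]
        self.marked.clear()


store = ToyJournalStore()
store.write(3)
store.incremental_backup()
print(len(store.journals))  # 3: marked, but still present
store.write(1)              # more writes eventually lead to a flush
store.flush()
print(len(store.journals))  # 1: only the journal written after the backup
```

In this model, a quiet database keeps its marked journals around until enough writes arrive to cause a flush, which matches the delay observed earlier in the thread.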


Andrej Krivulčík

May 20, 2016, 6:41:48 AM
to RavenDB - 2nd generation document database
(OP's coworker here).

It appears that the problem persists. The incremental backup is turned on and working. The .ravendb-incremental-dump files appear in the backup directory as scheduled (every 2 hours). However, the .journal files are not removed even after a heavy import (hundreds of thousands of documents, 2 GB worth of new .journal files).

Server Build #30140
Client Build #30140
The database has Replication and Periodic Export bundles enabled.
The replication is currently disabled.

Rough timeline:

2016-05-12: incremental backup performed, original question asked, later the .journal files got removed. Several new journal files were created (small import, these files are 65 MB in size)
2016-05-18: largish import, 196 journal files (10.9 GB)
2016-05-19: another import, 127 journal files (7.31 GB)
2016-05-20 (today): latest import, 43 journal files (2.12 GB). This import (and at least one of the previous ones) failed with ScratchBufferSizeLimitException (will post separate question shortly)
2016-05-20 10:17: imported 30k documents (replaced existing documents), 8 journal files (25.6 MB)
2016-05-20 10:18: changed the incremental backup interval to 10 minutes

All the time, incremental backup was running and the backup files were being created. I can provide the full file list if that's relevant.

The latest backup files (times are UTC).

-a---         5/20/2016   6:27 AM        741 2016-05-20-06-27-0.ravendb-incremental-dump
-a---         5/20/2016   8:27 AM        744 2016-05-20-08-27-0.ravendb-incremental-dump
-a---         5/20/2016   9:53 AM  155701533 2016-05-20-09-41-0.ravendb-incremental-dump
-a---         5/20/2016   9:59 AM   96791980 2016-05-20-09-59-0.ravendb-incremental-dump
-a---         5/20/2016  10:18 AM    2593509 2016-05-20-10-18-0.ravendb-incremental-dump
-a---         5/20/2016  10:18 AM    2594243 2016-05-20-10-18-1.ravendb-incremental-dump

However, the journal files are still present:

-a---         5/12/2016  11:30 AM      65536 0000000000000000000.journal
-a---         5/12/2016  11:30 AM     778240 0000000000000000001.journal
-a---         5/12/2016  11:30 AM     262144 0000000000000000002.journal
...
-a---         5/20/2016   9:59 AM   67108864 0000000000000000391.journal
-a---         5/20/2016   9:59 AM   67108864 0000000000000000392.journal
-a---         5/20/2016   9:59 AM   67108864 0000000000000000393.journal

The journal files from the latest import:
-a---         5/20/2016  10:18 AM    1048576 0000000000000000394.journal
-a---         5/20/2016  10:18 AM     131072 0000000000000000395.journal
-a---         5/20/2016  10:18 AM    1028096 0000000000000000396.journal
-a---         5/20/2016  10:18 AM     524288 0000000000000000397.journal
-a---         5/20/2016  10:18 AM    1056768 0000000000000000398.journal
-a---         5/20/2016  10:18 AM    2097152 0000000000000000399.journal
-a---         5/20/2016  10:18 AM    4194304 0000000000000000400.journal
-a---         5/20/2016  10:18 AM   16777216 0000000000000000402.journal

There were long periods of inactivity in that interval, so the database had a chance to flush everything to disk etc.
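
For anyone wanting to track this over time rather than eyeballing directory listings, a minimal sketch that counts the .journal files in a directory and sums their sizes. The function name `journal_usage` is made up for this example, and the directory you pass is an assumption: point it at wherever your database keeps its journals.

```python
from pathlib import Path


def journal_usage(directory):
    """Return (file_count, total_bytes) for *.journal files directly in `directory`."""
    sizes = [
        p.stat().st_size
        for p in Path(directory).glob("*.journal")
        if p.is_file()
    ]
    return len(sizes), sum(sizes)


# Hypothetical usage: "." stands in for the real journal directory.
count, total = journal_usage(".")
print(f"{count} journal files, {total / 2**20:.1f} MiB")
```

Running it before and after a backup, and again after some further writes, shows whether the journal count actually drops.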

Is there something that I'm doing wrong?

Thanks
Andrej

Oren Eini (Ayende Rahien)

May 20, 2016, 8:48:35 AM
to ravendb
You are confusing incremental export (ravendb-incremental-dump) with incremental backup.
Export is a text-based format, which is very small.
Backup is a binary copy of the db state.


Andrej Krivulčík

May 20, 2016, 9:47:44 AM
to RavenDB - 2nd generation document database
Thanks for the hint, that's the reason.