Extremely slow incremental backup with a huge number of files


Lv Haijiao

Apr 22, 2019, 10:11:51 PM
to bareos-users
Hi Community Users

We use Bareos to back up a file server that has a huge number of files in some folders.

The target directory holds 1,620,775 files (1.598 TB). With incremental backup enabled, Bareos took more than 14 hours to complete an incremental backup even though only 7.6 GB of data was actually backed up; most of that time (about 99%) was spent on comparison.

Scheduled time: 14-Apr-2019 08:39:40
Start time: 14-Apr-2019 08:39:43
End time: 14-Apr-2019 23:14:48
Elapsed time: 14 hours 35 mins 5 secs
Priority: 10
FD Files Written: 4,138
SD Files Written: 4,138
FD Bytes Written: 7,670,521,154 (7.670 GB)
SD Bytes Written: 7,671,391,963 (7.671 GB)

We understand that an incremental backup normally spends some time on comparison, but this is far too long. For the same folder and volume, Bacula needs only about 3 hours for an incremental backup.
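For scale, the job report above can be turned into a per-file rate. The sketch below is a rough calculation only: it assumes every one of the 1,620,775 files was examined once during the job, and it treats Bacula's "about 3 hours" as exactly 3 hours.

```python
from datetime import datetime

# Timestamps copied from the job report above.
FMT = "%d-%b-%Y %H:%M:%S"
start = datetime.strptime("14-Apr-2019 08:39:43", FMT)
end = datetime.strptime("14-Apr-2019 23:14:48", FMT)

elapsed_s = (end - start).total_seconds()  # 14 h 35 min 5 s
total_files = 1_620_775                    # files in the target directory

# Assumption: each file is examined once per incremental run.
bareos_rate = total_files / elapsed_s
# Assumption: "about 3 hours" is taken as exactly 3 hours.
bacula_rate = total_files / (3 * 3600)

print(f"elapsed: {elapsed_s:.0f} s")
print(f"Bareos: {bareos_rate:.1f} files/s")
print(f"Bacula: {bacula_rate:.1f} files/s")
```

Under those assumptions the Bareos job handled roughly 31 files per second, against roughly 150 files per second for the Bacula run.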

We would much appreciate it if anyone could share how to fine-tune Bareos incremental backup performance.


BTW, our environment:

- Bareos 18.2.5
- File server: Windows Server 2008 Datacenter (x64), NTFS


Thanks!

Andreas Rogge

Apr 23, 2019, 5:00:04 AM
to bareos...@googlegroups.com
On 23.04.19 at 04:11, Lv Haijiao wrote:
> Hi Community Users
>
> We use Bareos to back up a file server that has a huge number of files in some folders.
>
> The target directory holds 1,620,775 files (1.598 TB). With incremental backup enabled, Bareos took more than 14 hours to complete an incremental backup even though only 7.6 GB of data was actually backed up; most of that time (about 99%) was spent on comparison.
How did you find out that this was "comparison" and what comparison are
you talking about?

[...]
> We understand that an incremental backup normally spends some time on comparison, but this is far too long. For the same folder and volume, Bacula needs only about 3 hours for an incremental backup.
Did you try Bacula or Bacula Enterprise? Did you use the exact same
configuration for Bacula?
Because I doubt that Bacula will be 5 times faster with the same
configuration.

> We would much appreciate it if anyone could share how to fine-tune Bareos incremental backup performance.
Do you have accurate enabled?
Can you show your fileset (especially the accurate flags and
wildcard/regexp includes and excludes)?

Best Regards,
Andreas
--
Andreas Rogge andrea...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221-630693-86
http://www.bareos.com

Registered office: Köln | District court (Amtsgericht) Köln: HRA 29646
General partner: Bareos Verwaltungs-GmbH
Managing directors: S. Dühr, M. Außendorf, J. Steffens, Philipp Storz


Lv Haijiao

Apr 25, 2019, 10:16:28 AM
to Andreas Rogge, bareos...@googlegroups.com
Thanks for your reply, Andreas.

Yes, we have accurate enabled. Here's the fileset we have configured. Is there anything we can fine-tune? Thanks!

*show fileset
FileSet {
  Name = "WFCN00105-SBMLV-file"
  Include {
    Options {
      Signature = MD5
      IgnoreCase = Yes
      Exclude = Yes
      Wild Dir = "[A-Z]:/RECYCLER"
      Wild Dir = "[A-Z]:/$RECYCLE.BIN"
      Wild Dir = "[A-Z]:/System Volume Information"
      Wild File = "[A-Z]:/pagefile.sys"
      Drive Type = "fixed"
    }
    File = "E:/"
  }
}



On Tue, Apr 23, 2019 at 5:00 PM, Andreas Rogge <andrea...@bareos.com> wrote:

Andreas Rogge

Apr 25, 2019, 11:17:15 AM
to Lv Haijiao, bareos...@googlegroups.com
On 25.04.19 at 16:15, Lv Haijiao wrote:
> Yes, we have accurate enabled. Here's the fileset we have configured.
> Is there anything we can fine-tune? Thanks!
[...]
The fileset looks good.
Do you run with PostgreSQL or MySQL?

Did you already try an estimate (i.e. "estimate <jobname>" from
bconsole)? Does it also take 14 hours?

Lv Haijiao

May 5, 2019, 9:40:23 PM
to Andreas Rogge, bareos...@googlegroups.com
Hi, Andreas

We run with MySQL. We also tried an estimate, and it seems the estimate takes about as long as the actual backup.

Please let us know if any configuration file or data would help narrow down this performance issue, which has bothered us for a long time.

Thanks!
    



On Thu, Apr 25, 2019 at 11:17 PM, Andreas Rogge <andrea...@bareos.com> wrote:

Andreas Rogge

May 6, 2019, 3:05:24 AM
to Lv Haijiao, bareos...@googlegroups.com
On 06.05.19 at 03:40, Lv Haijiao wrote:
> Hi, Andreas
>
> We run with MySQL. We also tried an estimate, and it seems the estimate
> takes about as long as the actual backup.
Estimate traverses the Fileset on the bareos-fd just like it would for a
backup. It doesn't touch the file's contents at all.
So if this takes 14 hours, you either have lots and lots of files, a
really slow filesystem (what is the filesystem type?) or a fileset that
slows traversal down a lot (many wildcard or regexp includes/excludes).
Can you rerun the estimate (you don't have to wait for it to finish) and
look at the utilization of the machine running the fd? If the fd takes
one cpu and maxes that out it is probably the fileset.
If you see a lot of i/o waiting, then it is probably the filesystem or
drives.
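One way to separate plain filesystem traversal cost from Bareos itself is to time a metadata-only walk of the same tree. The sketch below is not what bareos-fd does internally; `scan_tree` is a hypothetical helper that lists directories and stats each entry without reading any file contents, which is roughly the kind of work an estimate performs.

```python
import os
import time


def scan_tree(root):
    """Walk root, stat every entry, and return (files, dirs, seconds)."""
    files = 0
    dirs = 0
    pending = [root]  # directories still to be listed
    started = time.perf_counter()
    while pending:
        current = pending.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        # Metadata only: file contents are never opened.
                        entry.stat(follow_symlinks=False)
                        if entry.is_dir(follow_symlinks=False):
                            dirs += 1
                            pending.append(entry.path)
                        else:
                            files += 1
                    except OSError:
                        continue  # entry vanished or is unreadable
        except OSError:
            continue  # directory vanished or is unreadable
    return files, dirs, time.perf_counter() - started
```

Calling `scan_tree("E:/")` on the file server and dividing the file count by the elapsed seconds gives a baseline rate. If this walk alone takes hours, the filesystem or drives are the more likely bottleneck; if it finishes quickly, the fileset options deserve a closer look. Results from a second run may be faster because of caching.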

Concerning MySQL: do you realize that PostgreSQL is the preferred DBMS
for Bareos? Would you shed some light on how the decision to go with MySQL
was made? (This is off-topic for your question, but I'm trying to find
out why so many people still use MySQL.)

Bartek R

May 10, 2019, 12:30:00 PM
to Andreas Rogge, Lv Haijiao, bareos-users
Hi,

@andreas

This is my fileset for Bareos 18.2.5:

FileSet {
  Name = "LinuxAll"
  Description = "Backup all regular filesystems, determined by filesystem type."
  Include {
    Options {
      sparse = yes
      aclsupport = yes
      accurate = mcspiug
      verify = pin5
      Signature = MD5 # calculate md5 checksum per file
      One FS = No     # change into other filesystems
      FS Type = btrfs
      FS Type = ext2  # filesystems of given types will be backed up
      FS Type = ext3  # others will be ignored
      FS Type = ext4
      FS Type = reiserfs
      FS Type = jfs
      FS Type = xfs
      FS Type = zfs
      FS Type = vfat
    }
    File = /
  }
  # Things that usually have to be excluded
  Exclude {
    File = /home/bartek/.cache
    File = /home/basia/.cache
    File = /var/lib/bareos
    File = /var/lib/pgsql
    File = /var/lib/mysql
    File = /var/lib/libvirt/images
    File = /var/lib/docker
    File = /var/opt/gitlab
    File = /var/log
    File = /var/cache
    File = /volumes
    File = /proc
    File = /tmp
    File = /var/tmp
    File = /.journal
    File = /.fsck
  }
}

You have suggested that having too many wildcard or regexp paths can slow down backups on a filesystem with a large number of files. Do you think my fileset could also be affected by this issue?
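A quick way to see how many wildcard or regexp directives a fileset contains is to count them. The sketch below is a rough text scan, not a Bareos parser: `count_pattern_directives` is a hypothetical helper, and it assumes directive names are matched without regard to case or spaces and that one directive appears per line.

```python
# Directive names that carry wildcard or regexp patterns,
# written lowercase and without spaces for comparison.
PATTERN_DIRECTIVES = {
    "wild", "wilddir", "wildfile",
    "regex", "regexdir", "regexfile",
}


def count_pattern_directives(fileset_text):
    """Count Wild*/Regex* directive lines in a fileset definition."""
    count = 0
    for line in fileset_text.splitlines():
        line = line.split("#", 1)[0]  # ignore anything after a comment marker
        if "=" not in line:
            continue
        key = line.split("=", 1)[0]
        normalized = "".join(key.split()).lower()  # drop spaces, fold case
        if normalized in PATTERN_DIRECTIVES:
            count += 1
    return count


# Options copied from the two filesets posted in this thread (abridged).
windows_options = '''
Wild Dir = "[A-Z]:/RECYCLER"
Wild Dir = "[A-Z]:/$RECYCLE.BIN"
Wild Dir = "[A-Z]:/System Volume Information"
Wild File = "[A-Z]:/pagefile.sys"
'''
linux_options = '''
FS Type = ext4
File = /var/log
'''

print(count_pattern_directives(windows_options))  # 4
print(count_pattern_directives(linux_options))    # 0
```

By this count the LinuxAll fileset contains no wildcard or regexp directives at all; its Exclude block lists plain paths only.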

> Can you rerun the estimate (you don't have to wait for it to finish) and
> look at the utilization of the machine running the fd? If the fd takes
> one cpu and maxes that out it is probably the fileset.
> If you see a lot of i/o waiting, then it is probably the filesystem or
> drives.

This is exactly what I see when trying to make a full backup of my home NAS. It runs a low-power Intel CPU without AES acceleration.

Kind Regards,
Bart
