Poor Ceph/RADOS/BareOS transfer performance


Fox Whittington

Mar 19, 2016, 4:50:44 AM
to bareos-users
In short: We have BareOS DIR & SD on one fast new host with 40TB of local storage.

Our entire network is at least 1GbE, with much of it 10GbE and 40GbE.

Our example test client transfers at about 80 to 90MB/sec on average for the duration of its 4TB backup to the local storage pool.

Using the RADOS backend from BareOS to Ceph, we are getting about 3MB/sec average.

Using the RADOS benchmark from the same BareOS host, we get about 68MB/sec to Ceph.
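(The benchmark here is the stock rados bench tool, run roughly as below; the pool name and flags are only illustrative. Note that rados bench defaults to 4MB objects and 16 concurrent writers, so it is not a like-for-like comparison with small-block writes.)

  # 60 second write benchmark against our test pool (example invocation)
  rados bench -p bareos 60 write -t 16 -b 4194304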

Other clients of our 12-node Ceph cluster get pretty good performance from other hosts, around the same 68MB/sec.

Have others found that performance to Ceph over RADOS is impractically bad when using BareOS? There is no way we can run huge backups at 3MB/sec, which makes Ceph useless for us with BareOS.


Thanks

Marco van Wieringen

Mar 19, 2016, 5:23:53 AM
to bareos...@googlegroups.com
Currently some of the backend drivers are more proof of concept than anything else.
I know of some people testing the rados backend, as we got some enhancement requests
which we implemented recently. What is probably killing you is the fact that, by
default, the block size used on any BAREOS file device is the POSIX default, i.e.
512 * 126 = 64,512 bytes. I don't know what chunk size the benchmark test uses, but
setting the block size on the rados device definition in your bareos-sd.conf will
probably crank the speed up. There are some disadvantages to setting the block size
on the device, however; the docs have quite an extensive explanation of how to tune
the block size, which is essentially the same for every type of device. I would also
set things up as disk to disk to rados backups, i.e. use the local storage as primary
storage and use copy/migration jobs to move the data to the rados storage. Another
thing people are trying is the rados striper functionality, for which you need the
current experimental branch, as my initial implementation had some errors.
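As a rough sketch (resource names and sizes below are only examples, and the block size chapter in the docs should be read before copying them), the rados device in bareos-sd.conf would get something like:

Device {
  Name = CephRadosStorage                  # example name
  Device Type = rados
  Archive Device = "Rados Device"
  Device Options = "conffile=/etc/ceph/ceph.conf,poolname=bareos"
  Media Type = RadosFile
  Label Media = yes
  Random Access = yes
  Automatic Mount = yes
  # write large blocks instead of the 64,512 byte default
  Maximum Block Size = 4194304
}

For disk to disk to rados, the primary (local disk) pool points at the rados pool via Next Pool, and a copy job moves the data over, roughly like this (trimmed to the directives that matter here):

Pool {
  Name = LocalDiskPool        # primary pool on local storage (example)
  Pool Type = Backup
  Storage = File
  Next Pool = CephPool        # target pool for the copy/migration jobs
}

Job {
  Name = "copy-to-ceph"       # example copy job
  Type = Copy
  Selection Type = PoolUncopiedJobs
  Pool = LocalDiskPool
  Messages = Standard
}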


--
Marco van Wieringen marco.van...@bareos.com
Bareos GmbH & Co. KG Phone: +49-221-63069389
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: Stephan Dühr, M. Außendorf, J. Steffens,
P. Storz, M. v. Wieringen

Sven Röllig

Mar 20, 2016, 10:17:03 AM
to bareos-users
Hi,
Same problem with rados here.

I use Ceph via the normal SD config. I write 18 parallel streams to Ceph over that config.

The write speed is ~8Gbit/s into the Ceph cluster.

Sven

Fox Whittington

Jun 29, 2016, 3:35:16 PM
to bareos-users
Thanks for the reply. I have tried most of that without much luck. We don't have enough local space for one of our massive jobs. Is it possible to get BareOS to spool to local disk before the data goes through rados to Ceph? We have configured spooling, but it seems to be ignored.

Marco van Wieringen

Jun 29, 2016, 4:40:58 PM
to bareos...@googlegroups.com

It should be possible for every device type. I have it configured
myself for disk (VTL) and tape storage.

e.g. in the bareos-sd, in the Device section, something like:


Spool Directory = ...
Maximum Job Spool Size = ...

I haven't tested it with Ceph, but as it's generic I would expect it to just work.
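A trimmed sketch with placeholder values (the path and sizes are only examples):

Device {
  ...
  Spool Directory = /var/lib/bareos/spool      # local fast disk used for spooling
  Maximum Spool Size = 300000000000            # cap on total spool space used on that disk
  Maximum Job Spool Size = 100000000000        # cap per job
}

and on the director side the job (or its JobDefs) needs data spooling switched on:

  SpoolData = yes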

Fox Whittington

Jul 6, 2016, 5:21:46 PM
to bareos-users
Thanks, for some reason spooling is working now, but when it despools, the transfer rate is as poor as ever. No help there.

Below is our test config example for this storage and this client:

Running Jobs:
Writing: Full Backup job clienthost.example.com JobId=1415 Volume="Vol-0657"
pool="CephPool" device="CephStorage" (Rados Device)
spooling=0 despooling=1 despool_wait=0
Files=13,621,842 Bytes=3,296,515,371,142 AveBytes/sec=3,628,711 LastBytes/sec=3,419,745
FDReadSeqNo=159,816,491 in_msg=122797073 out_msg=5 fd=16
====

Device status:

Device "CephStorage" (Rados Device) is mounted with:
Volume: Vol-0657
Pool: CephPool
Media type: RadosFile
Total Bytes=58,054,993,153 Blocks=899,910 Bytes/block=64,511
Positioned at File=13 Block=2,220,418,304
==
====

Used Volume status:
Vol-0657 on device "CephStorage" (Rados Device)
Reader=0 writers=1 reserves=0 volinuse=1
====

Data spooling: 1 active jobs, 299,999,945,874 bytes; 2 total jobs, 300,000,008,124 max bytes/job.
Attr spooling: 1 active jobs, 1,610,400,986 bytes; 2 total jobs, 1,610,400,986 max bytes.


SD:

Device {
Name = CephStorage
Archive Device = "Rados Device"
Device Options = "conffile=/etc/ceph/ceph.conf,poolname=bareos"
Spool Directory = /u0/BareOS/spool
Maximum Spool Size = 300000000000
Device Type = rados
Media Type = RadosFile
Label Media = yes
Minimum block size = 2097152
Maximum block size = 4194304
Random Access = yes
Automatic Mount = yes
}


DIR:

JobDef:

JobDefs {
Name = "Ceph-Linux-weekly-all"
Type = Backup
Level = Incremental
Client = dirhost.example.com-fd
FileSet = "Linux All"
Schedule = "WeeklyCycle"
SpoolData = yes
Allow Duplicate Jobs = no
Storage = CephFile
Messages = Standard
Pool = CephPool
Priority = 10
Write Bootstrap = "/var/lib/bareos/%c.bsr"
}


Spool Dir:

-rw-r----- 1 bareos bareos 299999946807 Jun 21 03:05 clienthost.example.com-sd.data.1340.clienthost.example.com.2016-06-12_18.52.27_05.CephStorage.spool
-rw-r----- 1 bareos bareos 299999945874 Jul 5 21:09 clienthost.example.com-sd.data.1415.clienthost.example.com.2016-06-25_21.00.01_32.CephStorage.spool

Stephan Dühr

Jul 7, 2016, 5:05:10 AM
to bareos...@googlegroups.com
Hi,

What do you mean by "normal SD config"? Do you use CephFS on the SD machine?

Regards,

Stephan
--
Stephan Dühr stepha...@bareos.com
Bareos GmbH & Co. KG Phone: +49 221-630693-90
http://www.bareos.com

Sitz der Gesellschaft: Köln | Amtsgericht Köln: HRA 29646
Komplementär: Bareos Verwaltungs-GmbH
Geschäftsführer: S. Dühr, M. Außendorf,
J. Steffens, Philipp Storz, M. v. Wieringen

Fox Whittington

Sep 4, 2016, 8:15:56 AM
to bareos-users
Thanks, my settings are:
Minimum block size = 2097152
Maximum block size = 4194304

Unfortunately there was no improvement.
Do you offer enterprise support? Being able to use Ceph in practical terms is very important for us, and I think in general going forward. RAID is becoming fairly impractical when you have boxes with 12x 6TB disks that can still only lose two, and that take forever to rebuild at that size.

Thanks.



Alexander Kushnirenko

Sep 29, 2017, 3:43:29 AM
to bareos-users
Hi,

We have tried two setups:
1. Bareos (16.2.4) + Ceph 10.2.7 with librados, and
2. Bareos (16.2.4) + Ceph 12.2.0 with libradosstriper support. In the default binary Bareos packages, libradosstriper is usually not compiled in (a quick way to check is sketched below).
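
One way to check is to look at what the rados SD backend library is linked against, e.g. (the path is only an example and may differ per distribution):

  # shows libradosstriper only if the backend was built with striper support
  ldd /usr/lib/bareos/backends/libbareossd-rados.so | grep -i radosstriper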

We are experiencing very much the same problem: backup speed of about 3MB/s, while rados benchmarking gives 95MB/s.

Did you manage to resolve this issue? We have two running setups, so perhaps we can help test some options.

Thanks,
Alexander.

Alexander Kushnirenko

Oct 17, 2017, 5:21:48 AM
to bareos-users
We managed to significantly (about 10 times) increase Bareos -> rados speed. Details are here: https://groups.google.com/d/msg/bareos-users/hnLJrH60GHU/TJlb6j47BAAJ
