performance?


Randy Rue

Dec 12, 2016, 3:39:44 PM
to s3...@googlegroups.com
Hello All,

After an initial test on a CentOS 7 VM with 4 CPUs and 16GB of RAM I'm
testing s3ql on a physical box, a Dell r910 with 24 cores and 768GB of
RAM. Two 10Gb NICs in an ALB bond0 pair.

I'm running s3ql with its cache on a 300GB RAMDISK and compression set
to zlib-1, cachesize at 320GB, against a SwiftStack cluster.

I'm loading it with multiple rsync streams coming via NFS (v3, auth_sys)
from multiple cluster nodes.

I'm watching metrics with htop, iftop, and a running "watch" of
s3qlstat, df and free.

When the streams start, I'm getting incoming throughput to the machine
with bursts up to 190MB/s and an average of 120MB/s. CPU cores are all
working but none pegged or stuck, and the overall load average is 6-8.
Memory doesn't seem to be a factor; without the RAMdisk we're only using ~5GB.

s3qlstat shows the cache at ~35GB. "Dirty" cache starts at zero and
climbs steadily to the full size of the cache, and when they match
(cache is all dirty), input slows to half (60MB/s). Cache size and dirty
cache then slowly drop together along with throughput, currently at 26GB and
38MB/s.

What kind of performance can I hope for with s3ql, and what do I need to
know to tune for it?

What other information should I be providing?

Let me know,

Randy

Nikolaus Rath

Dec 12, 2016, 4:44:03 PM
to s3...@googlegroups.com
On Dec 12 2016, Randy Rue <rand...@gmail.com> wrote:
> What kind of performance can I hope for with s3ql, and what do I need
> to know to tune for it?
>
> What other information should I be providing?

Please run contrib/benchmark.py. If you have any questions after
studying the output, please post them here.

Best,
-Nikolaus
--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«

Randy Rue

Dec 12, 2016, 5:21:28 PM
to s3ql
Will do!

Randy Rue

Dec 13, 2016, 2:57:17 PM
to s3ql
Hello All,

I used a 1GB file created from the output of /dev/urandom.
The /cache directory is a 330GB RAM disk.

benchmark.py outputs this:
[root@fast-dr contrib]# ../benchmark.py --authfile /etc/s3ql.authinfo --cachedir /cache swift://tin/fast_dr/ ./1GB.random
Preparing test data...
Measuring throughput to cache...
Cache throughput with   4 KiB blocks: 14188 KiB/sec
Cache throughput with   8 KiB blocks: 17977 KiB/sec
Cache throughput with  16 KiB blocks: 33760 KiB/sec
Cache throughput with  32 KiB blocks: 53328 KiB/sec
Cache throughput with  64 KiB blocks: 141503 KiB/sec
Cache throughput with 128 KiB blocks: 169001 KiB/sec
Measuring raw backend throughput..
Backend throughput: 24012 KiB/sec
Test file size: 1024.00 MiB
compressing with lzma-6...
lzma compression speed: 1712 KiB/sec per thread (in)
lzma compression speed: 1712 KiB/sec per thread (out)
compressing with bzip2-6...
bzip2 compression speed: 4285 KiB/sec per thread (in)
bzip2 compression speed: 4307 KiB/sec per thread (out)
compressing with zlib-6...
zlib compression speed: 15118 KiB/sec per thread (in)
zlib compression speed: 15123 KiB/sec per thread (out)

With 128 KiB blocks, maximum performance for different compression
algorithms and thread counts is:

Threads:                              1           2           4           8          24
Max FS throughput (lzma):     1712 KiB/s   3424 KiB/s   6849 KiB/s  13698 KiB/s  24011 KiB/s
..limited by:                       CPU         CPU         CPU         CPU      uplink
Max FS throughput (bzip2):    4285 KiB/s   8570 KiB/s  17140 KiB/s  23888 KiB/s  23888 KiB/s
..limited by:                       CPU         CPU         CPU      uplink      uplink
Max FS throughput (zlib):    15118 KiB/s  24005 KiB/s  24005 KiB/s  24005 KiB/s  24005 KiB/s
..limited by:                       CPU      uplink      uplink      uplink      uplink


Questions:
* I see discussion of tuning for block size, but no mention of it when formatting or mounting the file system. Do you mean specifying a block size in my call to rsync?
* I've mounted the volume with mount.s3ql calling for 24 uplink threads, is this correct?
* I've tried several different combinations of arguments attempting to set the cache size and maximum cache entries but in every case, cache entries top out at about ~4K and cache size seems to float around 30-35GB, presumably shifting as the size of those 4K entries changes?
* I get better throughput (up to 120MB/s average incoming to the server, with bursts to 200MB/s) until eventually all of the cache is dirty. Then throughput drops to half or less. This makes sense, if the cache is all hot it seems like I'd then switch to "write-through" traffic and be limited to the back end connection, and unless my cache is TBs in size I'll eventually have that problem in any case. But I do have a cache volume mounted of 300+GB and would like to make use of it. What's limiting my cache entries and size?
* In every case (depending on the number of rsync clients and their network connections) I get up to about 200MB/s and no more despite the server having bonded 10Gb connections and the back end swift cluster having multiple 10Gb connections. Where's my bottleneck?

Thanks in advance,

Randy

Nikolaus Rath

Dec 13, 2016, 3:29:26 PM
to s3...@googlegroups.com
It means the blocksize that is used by applications when they issue
write(2) and read(2) requests to the kernel. This is distinct from the
rsync --block-size argument (which specifies the block size for rsync's
delta-transfer algorithm). Many applications use a hardcoded blocksize
that you cannot change (e.g. the "cp" program).
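
As a rough illustration (dd here is just a stand-in for an application
whose write size you can choose; /mnt/s3ql is a placeholder for the
S3QL mount point and the file name is made up):

dd if=/dev/zero of=/mnt/s3ql/blocksize-test bs=4k count=262144    # 4 KiB writes - slowest line of the cache benchmark
dd if=/dev/zero of=/mnt/s3ql/blocksize-test bs=128k count=8192    # 128 KiB writes - fastest line

Both commands write 1 GiB; only the size handed to each write(2) call
differs, which is exactly what the benchmark varies.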

> * I've mounted the volume with mount.s3ql calling for 24 uplink threads, is
> this correct?

There is no "correct" value. It means S3QL will try to do up to 24
uploads in parallel.

> * I've tried several different combinations of arguments attempting to set
> the cache size and maximum cache entries but in every case, cache entries
> top out at about ~4K and cache size seems to float around 30-35GB,
> presumably shifting as the size of those 4K entries changes?

You need to be more specific. Please post a concrete set of options that
you used, the results that you got, and the results that you'd rather have.

> * I get better throughput (up to 120MB/s average incoming to the server,
> with bursts to 200MB/s) until eventually all of the cache is dirty. Then
> throughput drops to half or less. This makes sense, if the cache is all hot
> it seems like I'd then switch to "write-through" traffic and be limited to
> the back end connection, and unless my cache is TBs in size I'll eventually
> have that problem in any case. But I do have a cache volume mounted of
> 300+GB and would like to make use of it. What's limiting my cache entries
> and size?

The number of available file descriptors and the --max-cache-entries
argument.

> * In every case (depending on the number of rsync clients and their network
> connections) I get up to about 200MB/s and no more despite the server
> having bonded 10Gb connections and the back end swift cluster having
> multiple 10Gb connections. Where's my bottleneck?

I assume you mean what is limiting the upload speed to the server to 24
MB/s? That's not something that benchmark.py can determine. Do you get
more than 24 MB/s when you use a different swift client? If not, then
the server is to blame.
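
For example, with the standard python-swiftclient CLI something along
these lines gives you a raw upload number to compare against (the auth
URL, user and key are placeholders for whatever is in your authinfo
file; fast_dr is assumed to be the container from your swift:// URL):

time swift -A https://<auth-host>/auth/v1.0 -U <user> -K <key> upload fast_dr 1GB.random

If that also tops out around 24 MiB/s, the limit is on the Swift side
rather than in S3QL.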

Randy Rue

Dec 13, 2016, 6:21:49 PM
to s3ql
Hi Nikolaus,
 
>> * I've mounted the volume with mount.s3ql calling for 24 uplink threads, is
>> this correct?
> There is no "correct" value. It means S3QL will try to do up to 24
> uploads in parallel.

Forgive me for asking the question imprecisely. Perhaps a better question is "Is this consistent with best practice?" or "Do you have a recommendation?"

>> * I've tried several different combinations of arguments attempting to set
>> the cache size and maximum cache entries but in every case, cache entries
>> top out at about ~4K and cache size seems to float around 30-35GB,
>> presumably shifting as the size of those 4K entries changes?
> You need to be more specific. Please post a concrete set of options that
> you used, the results that you got, and the results that you'd rather have.

My most recent attempt mounted using:
/usr/bin/mount.s3ql --nfs --compress zlib-6 --authfile /etc/s3ql.authinfo --log syslog --threads 24 --cachedir /cache --cachesize 346030080 --allow-other swift://tin/fast_dr/ /fast_dr

I've also tried it with the argument "--max-cache-entries 40960" included. In both cases, while I have multiple rsyncs running from a mounted NFS client, "watch s3qlstat" shows the cache grow to ~4,000 entries and no more. Cache size grows to 38-40GB at the most, and dirty cache eventually grows to take up all of the cache. At that point whatever throughput I was seeing starts to drop.

At that point I've also tried changing the cache size on the fly using "s3qlctrl cachesize /fast_dr/ 781250000" (that's 100GB) and saw the cache size float upward maybe 5GB and then return to ~30GB. Cache entries didn't change.

I would like to see more of the 300GB cache volume used to see if this will help overall performance or at least postpone the point where all of the cache is dirty and write-throughs start.

Related:
>> What's limiting my cache entries and size?
> The number of available file descriptors and the --max-cache-entries
> argument.

Am I missing something about using the max-cache-entries argument? It doesn't seem to make a difference.

Forgive me if this isn't an s3ql question, but what determines the number of available file descriptors?


>> * In every case (depending on the number of rsync clients and their network
>> connections) I get up to about 200MB/s and no more despite the server
>> having bonded 10Gb connections and the back end swift cluster having
>> multiple 10Gb connections. Where's my bottleneck?
> I assume you mean what is limiting the upload speed to the server to 24
> MB/s? That's not something that benchmark.py can determine. Do you get
> more than 24 MB/s when you use a different swift client? If not, then
> the server is to blame.

I mean "How can I find out why my total write speed is only as high as 200MB/s when the NFS client, the s3ql server and the swift cluster all have multiple 10Gb connections?"

I'll test some rsyncs from the client to a non-s3ql target on the s3ql system as a data point for incoming speed, and some swift calls from the s3ql server to the swift cluster.

Life is Good,

Randy

Nikolaus Rath

Dec 14, 2016, 3:24:08 PM
to s3...@googlegroups.com
On Dec 13 2016, Randy Rue <rand...@gmail.com> wrote:
> Hi Nikolaus,
>
>
>> > * I've mounted the volume with mount.s3ql calling for 24 uplink threads, is
>> > this correct?
>> There is no "correct" value. It means S3QL will try to do up to 24
>> uploads in parallel.
>
>
> Forgive me for asking the question imprecisely. Perhaps a better question
> is "Is this consistent with best practice" or "Do you have a
> recommendation?"

From the output of benchmark.py you can tell that when using zlib
compression, increasing the number of threads above 2 does not give any
benefits. So if you use zlib compression (as in your example below),
using more than 2 threads is more likely to have negative effects than
positive ones.
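
As a sketch, that would be your mount command from below with only the
thread count changed:

/usr/bin/mount.s3ql --nfs --compress zlib-6 --authfile /etc/s3ql.authinfo \
    --log syslog --threads 2 --cachedir /cache --cachesize 346030080 \
    --allow-other swift://tin/fast_dr/ /fast_dr

(With lzma, by contrast, your benchmark output shows throughput still
scaling at 24 threads, where the uplink becomes the limit.)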


>> > * I've tried several different combinations of arguments attempting to set
>> > the cache size and maximum cache entries but in every case, cache entries
>> > top out at about ~4K and cache size seems to float around 30-35GB,
>> > presumably shifting as the size of those 4K entries changes?
>>
>> You need to be more specific. Please post a concrete set of options that
>> you used, the results that you got, and the results that you'd rather
>> have.
>
>
> My most recent attempt mounted using:
> /usr/bin/mount.s3ql --nfs --compress zlib-6 --authfile /etc/s3ql.authinfo
> --log syslog --threads 24 --cachedir /cache --cachesize 346030080
> --allow-other swift://tin/fast_dr/ /fast_dr

This should result in a message of the form "Detected xxxx available
file descriptors". What's xxxx in your case?

> In both cases, while I have multiple rsyncs running from a mounted NFS
> client, "watch s3qlstat" shows the cache grow to ~4,000 entries and no
> more. Cache size grows to 38-40GB at the most, and dirty cache eventually
> grows to take up all of the cache. At that point whatever throughput I was
> seeing starts to drop.
>
> At that point I've also tried changing the cache size on the fly using
> "s3qlctrl cachesize /fast_dr/ 781250000" (that's 100GB) and saw the cache
> size float upward maybe 5GB and then return to ~30GB. Cache entries didn't
> change.

Looks like you're limited by the number of cache entries.


> I've also tried it with the argument "--max-cache-entries 40960"
> included.

Did that increase the number of cache entries to more than ~4000?

> I would like to see more of the 300GB cache volume used to see if this will
> help overall performance or at least postpone the point where all of the
> cache is dirty and write-throughs start.
>
> Related:
>
>> > What's limiting my cache entries and size?
>> The number of available file descriptors and the --max-cache-entries
>> argument.
>
>
> Am I missing something about using the max-cache-entries argument? It
> doesn't seem to make a difference.
>
> Forgive me if this isn't a s3ql question, but what determines the number of
> available file descriptors?

The first is a resource limit; you can check (and modify) it with
"ulimit". In addition to this the kernel imposes a limit,
cf. http://unix.stackexchange.com/questions/84227/limits-on-the-number-of-file-descriptors
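
A quick way to check both on the S3QL host (plain shell and procfs,
nothing S3QL-specific; 65536 is only an example value):

ulimit -n                   # per-process soft limit in the current shell
ulimit -Hn                  # corresponding hard limit
cat /proc/sys/fs/file-max   # system-wide maximum
ulimit -n 65536             # raise the soft limit before starting mount.s3ql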

>> > * In every case (depending on the number of rsync clients and their network
>> > connections) I get up to about 200MB/s and no more despite the server
>> > having bonded 10Gb connections and the back end swift cluster having
>> > multiple 10Gb connections. Where's my bottleneck?
>> I assume you mean what is limiting the upload speed to the server to 24
>> MB/s? That's not something that benchmark.py can determine. Do you get
>> more than 24 MB/s when you use a different swift client? If not, then
>> the server is to blame.
>
>
> I mean "How can I find out why my total write speed is only as high as
> 200MB/s when the NFS client, the s3ql server and the swift cluster all have
> multiple 10Gb connections?"

Well, benchmark.py already told you why. Even when you use 128k blocks,
S3QL cannot process more than 169001 KiB/sec. That limitation is a
combination of your CPU/disk/mainboard, the kernel, and Python.

In addition to that, S3QL is not able to send data to the server any
quicker than 24005 KiB/s - even if it does not do any processing at
all. So I would investigate that first.

> I'll test some rsyncs from the client to a non-s3ql target on the s3ql
> system as a data point for incoming speed, and some swift calls from the
> s3ql server to the swift cluster.

The latter would be more important.

Randy Rue

Dec 14, 2016, 3:53:46 PM
to s3...@googlegroups.com
> Did that increase the number of cache entries to more than ~4000?

No.

Roberto Martins

Nov 16, 2021, 7:07:37 AM
to s3ql
Hello! Just to let you know that I learned a lot from this conversation and it helped me to increase S3QL performance almost 10x by understanding the results from benchmark.py.

Thank you all!
