> Hi Nikolaus,
>
>> [...] is there a reason to not just bump the hardcoded buffer size?
>>
>> Adding an option typically means that 50% of the people go with the
>> default (even when it's sub-optimal), 40% pick a value that's even
>> worse, and 10% actually benefit. So I'd like to avoid options whenever
>> possible.[...]
> If changing the hard-coded BUFSIZE is an option, that's what I would
> prefer, too. That would increase the required memory for S3QL, though.
> So if low-memory systems are of any concern, increasing the BUFSIZE
> would be bad for them.
I'm not too concerned about that. I think any system where a 4 MB memory
increase is a concern probably can't run S3QL anyway.
> In this use case I use S3QL as a target for Bareos backups. Each
> backup is a single file that can grow to hundreds of GB. Thus I
> chose a max-obj-size of 3 GB, no compression (Bareos does that already)
> and a cache size of 100 GB.
[...]
> So there are relatively few objects/data blocks, but they are 1 GB on
> average. This is quite a different use case than the default
> max-obj-size of 10 MB.
> Before bumping the BUFSIZE we definitely should benchmark with the
> default max-obj-size, too.
I don't see why any of this would affect the optimum buffer size (as
long as it's under the maximum object size), but sure, why not :-).
> Looking at contrib/benchmark.py, I can probably change this to also
> benchmark different BUFSIZEs for the upload so that we can get some
> data from different configurations.
Again, no objections, but I think this may be more work than
required. I'd start by just measuring the effects of a few different
hardcoded sizes (e.g. 512 kB, 1 MB, 2 MB, 4 MB, 8 MB).
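
Just to illustrate what I have in mind, something along these lines should
be enough (rough sketch only, not meant as final benchmark code; the paths
and test file are placeholders you'd adjust to your setup):

#!/usr/bin/env python3
# Rough sketch: copy a large test file onto a mounted S3QL file system
# with different buffer sizes and report the resulting throughput.
# SRC and DST_DIR are placeholder paths, not anything from S3QL itself.

import os
import time

SRC = '/tmp/testfile'      # pre-generated test data (placeholder)
DST_DIR = '/mnt/s3ql'      # mounted S3QL file system (placeholder)
BUFSIZES = [512 * 1024, 1024**2, 2 * 1024**2, 4 * 1024**2, 8 * 1024**2]

size = os.path.getsize(SRC)
for bufsize in BUFSIZES:
    dst = os.path.join(DST_DIR, 'bench-%d' % bufsize)
    start = time.time()
    with open(SRC, 'rb', buffering=0) as src, \
         open(dst, 'wb', buffering=0) as out:
        while True:
            buf = src.read(bufsize)
            if not buf:
                break
            out.write(buf)
        out.flush()
        os.fsync(out.fileno())  # make sure the data has actually been written
    elapsed = time.time() - start
    os.unlink(dst)
    print('%5d kB buffer: %.1f MiB/s'
          % (bufsize // 1024, size / elapsed / 1024**2))

That only exercises the write path through the cache, of course, so the
numbers would still need to be taken with a grain of salt compared to
changing BUFSIZE itself, but it should be enough to see whether the buffer
size makes any measurable difference at all.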