Apologies! I used to use Pine back in the day but got out of practice 8-).
Anyway, I ran some benchmarks -- I made an empty filesystem and copied a portion of my .cache/bazel into it, a mix of somewhat larger files and a few directories with loads of small files. Then I deleted it, re-copied it, and re-deleted it.
    journal_mode   synchronous   run     create   delete
    OFF            OFF           run 1   4:15     0:57
    OFF            OFF           run 2   4:15     0:58
    WAL            OFF           run 1   4:28     1:03
    WAL            OFF           run 2   4:30     0:59
    WAL            NORMAL        run 1   4:29     0:59
    WAL            NORMAL        run 2   4:30     (not run)
(I didn't run the last delete since the WAL/NORMAL and WAL/OFF speeds seemed to be identical anyway.)
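In case anyone wants to reproduce this, here's a minimal sketch of how those three configurations map onto SQLite pragmas from Python -- the function name and DB path are just illustrative, not the actual s3ql code:

    import sqlite3

    def open_metadata_db(path, journal_mode, synchronous):
        # Open a SQLite DB and apply the pragmas benchmarked above.
        conn = sqlite3.connect(path)
        conn.execute(f'PRAGMA journal_mode = {journal_mode}')
        conn.execute(f'PRAGMA synchronous = {synchronous}')
        return conn

    # The three configurations from the table:
    # open_metadata_db('metadata.sqlite', 'OFF', 'OFF')
    # open_metadata_db('metadata.sqlite', 'WAL', 'OFF')
    # open_metadata_db('metadata.sqlite', 'WAL', 'NORMAL')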
I was surprised to see identical speeds with WAL/OFF and WAL/NORMAL. Interestingly, WAL with synchronous=NORMAL is slower than OFF/OFF, but only by about 5%, for this write-intensive load. WAL plus synchronous=NORMAL is supposed to provide stronger guarantees of a consistent DB: if the system gets interrupted, on the next DB open sqlite can either complete or roll back the transactions in the WAL, making sure the DB is in a consistent state.

That said, I've used s3ql on a few USB drives (one with a habit of having the cable fall out a few times, and a USB3 interface that had a habit of dropping off now and then with older kernels). The worst I had was running a sqlite repair and fsck, and losing (unsurprisingly) the last several seconds of whatever I was copying in. In other words, I've already found sqlite robust enough with the "OFF/OFF" settings, plus of course s3ql has the failsafe of all those metadata backups just in case.
I did notice that if I ran "find" on the test set, it took 0.9 seconds in WAL mode but 0.3 seconds in OFF mode -- the docs do note that WAL can be slower for read-intensive loads, and for walking a directory tree it's running at a third of the speed!
I suspect that explains the "faster" rsync performance I observed -- I had one rsync copying stuff in and another walking a directory tree copying stuff out; if the reader was walking the tree at a third of the speed, I suppose the write-intensive rsync would proceed faster, despite total filesystem IOPS being lower.
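For reference, the read-side comparison can be reproduced with a plain tree walk; here's a rough Python equivalent of that "find" run (the mountpoint path is made up):

    import os, time

    def time_walk(root):
        # Walk the tree and count entries, roughly what "find" was doing above.
        start = time.monotonic()
        count = 0
        for dirpath, dirnames, filenames in os.walk(root):
            count += len(dirnames) + len(filenames)
        print(f'{count} entries in {time.monotonic() - start:.2f}s')

    # time_walk('/mnt/s3ql/test-set')   # hypothetical mountpoint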
I'll post back in a bit, I've got another patch cooked up. I decided to look into why, when running fsck, the search for temporary files was taking over 2 hours. It turned out my s3ql-data directory held 810,000 directories (the "100" through "999" directories, 2 layers deep), but only about 30,000 of them had any data in them. With s3ql unmounted, I went into the s3ql-data directory and ran "find -type d -exec rmdir {} \+" (and let that run overnight; I imagine it took a while). That cut the time for find to walk the tree from over 2 hours to about 10 minutes -- and to under a minute on a re-run, since I apparently got the directory count low enough to fit in the directory entry cache. s3ql currently creates these directories as needed but never removes them when empty; this patch adds that.
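A rough sketch of the idea (not the actual patch; the function and argument names are made up): after unlinking an object file in the backend, walk back up the fanout directories and rmdir any that are now empty:

    import errno, os

    def prune_empty_dirs(path, stop_dir):
        # Remove now-empty fanout directories above 'path', stopping at 'stop_dir'.
        # Sketch only; not the real s3ql code.
        parent = os.path.dirname(os.path.abspath(path))
        stop_dir = os.path.abspath(stop_dir)
        while parent != stop_dir:
            try:
                os.rmdir(parent)  # succeeds only if the directory is empty
            except OSError as exc:
                if exc.errno in (errno.ENOTEMPTY, errno.EEXIST):
                    break  # something else still lives here, stop climbing
                raise
            parent = os.path.dirname(parent)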