From what I read, the default transfer settings to local hard drives use multi-threaded streaming and sparse files. This tanks performance and makes reading the file afterward take forever due to the extremely high fragmentation. Moving the files afterward to clean up the fragmented mess takes longer than the download itself, and that is with two locally attached SATA3-speed drives.
(The log does mention the flags needed but isn't very clear on when to use them: "Writing sparse files: use --local-no-sparse or --multi-thread-streams 0 to disable".)
With multi-threaded streams and sparse files off (--multi-thread-streams 0), speeds improve a lot.
I think if rclone made a best-effort attempt to detect whether the local drive is an SSD or an HDD, it would improve the user experience and likely lead to less hard drive wear and tear. From what I read, people still wanted multi-threaded downloading but not sparsely allocated files. For hard drives you should probably have just one thread writing to disk and multiple threads downloading into a cache. For SSDs it is the opposite: you probably want 20 or more threads writing to the SSD to get the queue depth as high as possible, which might be more than is reasonable for download threads, while HDDs really don't like that kind of access.
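To make the "one thread to disk, multiple threads downloading into a cache" idea concrete, here is a minimal Python sketch. It is not how rclone works internally; `download_chunk` is a hypothetical stand-in for a ranged network read, and the cache here is an unbounded in-memory dict, which a real implementation would need to cap.

```python
import io
import queue
import threading


def download_chunk(source: bytes, offset: int, size: int) -> bytes:
    """Hypothetical stand-in for a ranged network read."""
    return source[offset:offset + size]


def copy_with_single_writer(source: bytes, dest, chunk_size: int, workers: int = 4) -> None:
    """Fetch chunks in parallel, but write them to `dest` strictly in order."""
    offsets = list(range(0, len(source), chunk_size))
    todo = queue.Queue()
    for index, offset in enumerate(offsets):
        todo.put((index, offset))

    done = {}  # chunk index -> bytes; this is the in-memory cache (unbounded here)
    cond = threading.Condition()

    def worker() -> None:
        while True:
            try:
                index, offset = todo.get_nowait()
            except queue.Empty:
                return
            data = download_chunk(source, offset, chunk_size)
            with cond:
                done[index] = data
                cond.notify_all()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    # Only this thread touches the destination, so the disk sees one sequential stream.
    for index in range(len(offsets)):
        with cond:
            while index not in done:
                cond.wait()
            data = done.pop(index)
        dest.write(data)

    for t in threads:
        t.join()
```

For example, `copy_with_single_writer(data, io.BytesIO(), chunk_size=4096)` downloads with 4 worker threads while the destination only ever receives appends in file order.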
Users might think their hard drive is slow or that there is some other problem. While there is a log message saying which flag to use, most bulk storage goes to hard drives, so I'm not sure it makes sense to cause extreme fragmentation by default. The message doesn't really tell you that if you have a hard drive you should definitely choose one of the options for best performance.
Windows has the MediaType flag for the disk type. If it is unknown and the user doesn't specify the target disk type, rclone should probably ask what type of local drive is being targeted. There are probably different ideal settings for a single HDD, SSD, NVMe SSD, and arrays of HDDs/SSDs. The problem is that even after the transfer is complete it can take 24+ hours to transfer the extremely fragmented data to another disk, which fixes the fragmentation, or even longer to defragment the drive in place. Checking this MediaType flag, the targeted local drives all advertise as HDDs in Windows.
Default transfer settings to local hard drives uses multithreaded streaming and sparse files from what I read. This tanks performance and makes reading the file afterward take forever due to the extremely high fragmentation.
We've just re-worked the multi-thread streaming for the latest beta (shortly to be released as v1.64), and one of the consequences of this is that it will do the downloads in 64M chunks rather than in chunks of 1/4 of the size of the file. I think this should improve the fragmentation of the files. I'd be interested if you could give it a try.
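As a rough illustration of the difference between the two layouts, the sketch below computes the (offset, length) ranges for "1/4 of the file per stream" versus fixed 64M chunks. The function names are made up for this example, and it only models the arithmetic described above, not rclone's actual scheduling.

```python
MIB = 1024 * 1024


def quarter_file_chunks(size: int, streams: int = 4):
    """Older layout as described above: one chunk per stream, each about size/streams."""
    if size == 0:
        return []
    chunk = -(-size // streams)  # ceiling division
    return [(off, min(chunk, size - off)) for off in range(0, size, chunk)]


def fixed_chunks(size: int, chunk: int = 64 * MIB):
    """Newer layout as described above: fixed-size chunks, 64M in this sketch."""
    return [(off, min(chunk, size - off)) for off in range(0, size, chunk)]


one_gib = 1024 * MIB
print(len(quarter_file_chunks(one_gib)), "chunks of", quarter_file_chunks(one_gib)[0][1] // MIB, "MiB")
print(len(fixed_chunks(one_gib)), "chunks of", fixed_chunks(one_gib)[0][1] // MIB, "MiB")
```

For a 1 GiB file this gives 4 ranges of 256 MiB in the first case and 16 ranges of 64 MiB in the second, so concurrent writers stay much closer together in the file.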
I don't understand why sparse files are causing a performance problem for you, unless you are using EXFAT or VFAT in which case they will cause a problem as sparse files aren't supported and the OS writes the whole file first.
Edit: I retested to double check, and the download speed improved quite a bit with the beta and default settings. I updated the results to the better second test of the beta version. I ended the test a bit early as it was pretty clear that the read speed would not improve and the partial file on disk was already very fragmented.
On initial testing of the beta version it does seem to download faster and no longer pegs the hard drive. I think I messed up and ran two tests at once, which obviously ruined the download tests for each settings type.
I did notice with my previous run of the stable version that, between files, the verification also took forever compared to the download, which I thought was odd. It is likely the read being slowed down after the transfer finishes.
Benchmark of single HDDs: all single drives, no RAID; I just manage file replicas manually.
Drives used: WD Gold 18TB and Seagate Exos X16 16TB, all in good health, with 0% fragmentation before testing,
connected via USB 3.2 5/10 Gbps controllers to SATA (QNAP TR-004, Sabrent DS-SC5B)
on an Intel or ASMedia root controller, with no external hubs.
I thought the connection was causing the problems, but benchmarking the drives as shown below shows they have plenty of bandwidth, and swapping enclosures, controllers, ports, and cables didn't improve things.
I read the GitHub issue thread. I will let the download finish with the different settings and compare how many fragments the new beta creates versus the current latest stable release. There is definitely a big improvement for downloading, although disabling threaded writing seems to work about the same when downloading large files.
The increased chunk size does not seem to reduce the fragmentation significantly with the beta defaults, and this still causes reads afterwards to run extremely slowly. My first assumption is that Windows is interleaving the IO from the 4 writing threads, causing the low chunk count to balloon into the crazy 180k fragments for some reason.
The penalty after downloading is persistent on disk as well. With extremely large files it may be impossible to defragment in place, and the solution is to move the files serially to another disk, which reassembles the fragments (very slowly) and brings the speed back up. Since I had moved the files to local disk, it was impossible to redownload them with the correct settings, and I just had to let the slow local-to-local copy run with a tool called FastCopy.
I think this is because the new code doesn't write the end of the file first, it only writes blocks within a few multiples of --multi-thread-chunk-size. I was hoping since that number is relatively small that Windows would coalesce the writes and write them in sequence, but obviously not.
I do agree that finding out whether the drive is an SSD is not done the same way on each OS, which makes things complicated. Windows has the media type, which is easy; Linux has the rotational flag (0 = SSD), which is maybe easy; and macOS also seems to have a media type that can be accessed. Still, detection would possibly prevent a hard drive from dying early (they get very clicky-clacky and warmer during a 100% busy, purely random workload) and would greatly improve default performance when downloading to slow HDDs. This detection would obviously break for RAID, network mounts, crappy USB enclosures, and many other cases.
Sanity check: the baseline default config is still very fragmented. I don't think the USN journal will have much traffic, because in the mode it is in generally only new files create an entry, and the way it is allocated keeps it far away from the data part of the NTFS volume. Verifying took forever, possibly because they are all trying to read very random fragments.
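For the Linux case, a best-effort check could look something like the sketch below. It reads the kernel's `/sys/block/<device>/queue/rotational` flag (1 = rotational, 0 = non-rotational); the function names are hypothetical, the device name has to be supplied by the caller, and, as noted above, the answer can be wrong behind RAID controllers or USB enclosures.

```python
from pathlib import Path


def parse_rotational(text: str) -> str:
    """Map the contents of the sysfs 'rotational' file to a media type."""
    value = text.strip()
    if value == "1":
        return "hdd"
    if value == "0":
        return "ssd"
    return "unknown"


def media_type(device: str, sys_block: Path = Path("/sys/block")) -> str:
    """Best-effort media type for a block device such as 'sda'; 'unknown' on any error."""
    try:
        return parse_rotational((sys_block / device / "queue" / "rotational").read_text())
    except OSError:
        return "unknown"
```

Returning "unknown" instead of guessing leaves room for the fallback suggested earlier in the thread, where the user is asked or sets a flag.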
Very fast, ironically, even though the download was slower:
Runtime: 1m20s
Write: 50MB/s
Verify: instant (is it being skipped? There was no read-from-disk step, but hashes were still displayed)
Read: 160MB/s
Fragmentation: 1 frags/file (1GB)
For the verify speed, are you measuring the time that it takes rclone to calculate the checksum? That would explain why some verifications are instant: if rclone writes the file sequentially, it calculates the checksum as it goes along.
You could simulate this on its own with rclone hashsum md5 /path/to/file. If you want to ignore the CPU time taken by the MD5 calculation then try rclone hashsum crc32 /path/to/file. I'd be interested to see what those transfer rates are for the files above if you've still got them.
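The reason a sequential write can make the verify step look instant can be sketched like this: when chunks arrive in file order, the hash can be updated with the same bytes that are being written, so nothing has to be read back from disk. This is only an illustration of the idea, not rclone's code, and `write_and_hash` is a made-up name.

```python
import hashlib
import io


def write_and_hash(chunks, dest, algo: str = "md5") -> str:
    """Write chunks to `dest` in order and hash the same bytes on the way through."""
    h = hashlib.new(algo)
    for chunk in chunks:
        dest.write(chunk)
        h.update(chunk)  # only valid because chunks arrive in file order
    return h.hexdigest()


out = io.BytesIO()
digest = write_and_hash([b"hello ", b"world"], out)
print(digest == hashlib.md5(b"hello world").hexdigest())
```

With out-of-order multi-thread writes this trick is not available, so the file has to be read back afterwards, and that read is where the fragmentation hurts.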
This looks like the most promising result. 295 fragments means the fragments are about 3M which is better. The read speed is pretty close to the maximum also. I'd just like to understand that Verify time better.
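The "about 3M" figure is just the file size divided by the fragment count. A quick check, assuming the test file is the 1GB one mentioned in the results (taken here as 1 GiB):

```python
def average_fragment_mib(file_bytes: int, fragments: int) -> float:
    """Average fragment size in MiB for a file split into `fragments` pieces."""
    return file_bytes / fragments / (1024 * 1024)


one_gib = 1024 ** 3  # assumption: the 1GB test file
print(round(average_fragment_mib(one_gib, 295), 2))
```

That works out to roughly 3.5 MiB per fragment, far smaller than a 64M chunk but large enough that sequential reads are not dominated by seeks.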
Perhaps setting --local-no-sparse is forcing Windows to do write coalescing, as otherwise it would have to write 0s to the file. Was that test done with both --multi-thread-streams 0 and --local-no-sparse or just --local-no-sparse? If the latter it should have been doing multi-thread downloads which you can check in the -vv log.
Doing the verify while writing definitely speeds things up over having to read the file again later. The read speed during verify doesn't seem to be optimal in the multi-threaded download cases: when the fragmentation is lower, FastCopy can copy the files much faster than the effective verify read speed in rclone.
I checked --local-no-sparse on gigantic 4TB files in the current release version, to test how it behaves with large files that need to be pre-allocated instead of the tiny 1GB one. It results in some interesting behavior on Windows 10. So --local-no-sparse on 1.63.1 definitely allocates 0s.
Control-C can't cancel the operation, killing rclone becomes impossible (access denied error in Task Manager), and the allocation of the empty TB-sized file continues, likely because it is passed off to a system call that won't return for a very, very long time.