ANN: jcp does rsync, but 3x faster


Jason E. Aten

Mar 21, 2025, 1:04:45 AM
to golang-nuts
I've open sourced jcp, my rsync-like file transfer library and CLI.

By using Go's fabulous multicore support, jcp can do diff-only filesystem syncs
up to 3x faster than rsync (which is a single-threaded C program).
It uses a parallelized version of the FastCDC algorithm with a 
Gear table to ship only the changes, even in binaries.
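
For readers unfamiliar with the idea, here is a minimal, single-threaded sketch of Gear-hash content-defined chunking, the building block that FastCDC refines. It is not jcp's code: the table seed, the size limits, the mask, and the names gearTable and cutPoints are all assumptions chosen for illustration. The point it shows is that chunk boundaries depend on content rather than on offsets, so an insertion near the start of a file disturbs only the nearby chunks.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// gearTable maps each byte value to a pseudo-random 64-bit number.
// The fixed seed is an illustrative assumption; sender and receiver
// only need to agree on the same table.
var gearTable = func() [256]uint64 {
	var t [256]uint64
	r := rand.New(rand.NewSource(1))
	for i := range t {
		t[i] = r.Uint64()
	}
	return t
}()

const (
	minSize = 64                 // hypothetical minimum chunk size in bytes
	maxSize = 4096               // hypothetical maximum chunk size in bytes
	mask    = uint64(0xFF) << 56 // cut when the top 8 hash bits are all zero
)

// cutPoints returns the end offset of every chunk in data.
func cutPoints(data []byte) []int {
	var cuts []int
	start := 0
	for start < len(data) {
		end := start + maxSize
		if end > len(data) {
			end = len(data)
		}
		cut := end // fall back to the size limit (or end of input)
		var h uint64
		for i := start; i < end; i++ {
			// Gear rolling hash: shifting ages out old bytes, so h
			// depends only on the most recent 64 bytes.
			h = (h << 1) + gearTable[data[i]]
			if i+1-start >= minSize && h&mask == 0 {
				cut = i + 1
				break
			}
		}
		cuts = append(cuts, cut)
		start = cut
	}
	return cuts
}

// chunkSet hashes every chunk of data and returns the set of digests.
func chunkSet(data []byte) map[[32]byte]bool {
	set := make(map[[32]byte]bool)
	prev := 0
	for _, c := range cutPoints(data) {
		set[sha256.Sum256(data[prev:c])] = true
		prev = c
	}
	return set
}

func main() {
	r := rand.New(rand.NewSource(2))
	orig := make([]byte, 1<<16)
	for i := range orig {
		orig[i] = byte(r.Intn(256))
	}
	// Insert three bytes at the front; fixed-size blocks would all shift.
	edited := append([]byte("abc"), orig...)

	have := chunkSet(orig)
	reused, total := 0, 0
	for digest := range chunkSet(edited) {
		total++
		if have[digest] {
			reused++
		}
	}
	fmt.Printf("%d of %d chunks in the edited file already exist in the original\n", reused, total)
}
```

Only the chunks whose digests are missing on the receiving side would need to be shipped. FastCDC proper adds normalized chunking with two masks, and a parallel version would have to split the input across goroutines; both are left out of this sketch.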

https://github.com/glycerine/jcp

From the README:

This project (jcp) was written to harden and polish my RPC system, https://github.com/glycerine/rpc25519 , whose high-performance and novel Peer/Circuit/Fragment paradigm is featured here. In this evolution of RPC, peers communicate fragments of infinite data streams over any number of persistent circuits. Since the roles are peer-to-peer rather than client-server, any peer can run the code for any service (as here, in the jcp case, either end can give or take a stream of filesystem updates).

Enjoy.
- Jason

G

Mar 28, 2025, 2:24:30 PM
to golang-nuts
is it able to use a local storage as backup?
Thanks

Jason E. Aten

Mar 28, 2025, 6:55:30 PM
to golang-nuts
Yes? The question is a bit confusing. jcp copies from 

host1 filesystem -> over the network -> (jsrv running on) host2 filesystem,

and while (for testing mostly) you can leave off the host: prefix on 
both giver and taker, to copy things from local disk
over the TCP/UDP network stack, and back to the same host's local storage -- 
this is going to be wildly less efficient than using tar or cp to do local disk copies.

jeff.ko...@gmail.com

Mar 29, 2025, 8:28:05 AM
to golang-nuts
I read G's question as whether jcp can efficiently update incremental backups from primary to secondary local storage, e.g. daily backup of a home dir to an attached memory stick or an alternate folder on the same filesystem.

Jason E. Aten

Mar 29, 2025, 10:43:33 AM
to golang-nuts
Right. So the answer to that is: It can, and does, but it is missing
the obvious optimization of skipping the network stack for
local-disk-to-local-disk transfer.

It might still be fast (enough); you'd have to benchmark it to see.
See below for demonstration. 

It's pretty convenient to try local-to-local, because when 
jcp detects it is doing a local disk-to-local disk transfer, 
it starts the receiver for you (the jsrv part is run in-process,
on a goroutine). It already did this for testing convenience,
but now (I added this small feature in response to this question)
it also automatically turns off the encryption+decryption part of the
transport, since there's no point in wasting cycles doing
encryption just to decrypt it a moment later so it can be written
to disk unencrypted.  Now jcp certainly doesn't aim to provide encrypted
backups. That is a much bigger lift, and there are a lot of
specialized backup programs out there that do that already (e.g. plakar.io).
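
The decision described above can be sketched roughly as follows. This is not jcp's actual code: it assumes, based on the "host:" prefix convention and the "no ':' in src/target" log message, that an endpoint without a colon is local, and the names isLocalTransfer, transportPlan, and planFor are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isLocalTransfer reports whether neither endpoint names a remote host.
// Assumption: a remote endpoint is written "host:path", so a purely
// local endpoint contains no ':'.
func isLocalTransfer(src, dst string) bool {
	return !strings.Contains(src, ":") && !strings.Contains(dst, ":")
}

// transportPlan is a hypothetical summary of the setup choices.
type transportPlan struct {
	InProcessServer bool // run the receiver on a goroutine in this process
	Encrypt         bool // encrypt traffic on the wire
}

// planFor starts the receiver in-process and skips encryption only
// when both endpoints are local.
func planFor(src, dst string) transportPlan {
	local := isLocalTransfer(src, dst)
	return transportPlan{InProcessServer: local, Encrypt: !local}
}

func main() {
	fmt.Printf("%+v\n", planFor("source_from_here", "target_to_here"))
	fmt.Printf("%+v\n", planFor("source_from_here", "host2:target_to_here"))
}
```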

Demonstration:

~/go/src/github.com/glycerine/jcp (master) $ jcp source_from_here target_to_here

no ':' in src/target: starting local rsync server to receive files...

(001)version.go                [==============================] 100%  684.0 B     38.3 MB/s   00:00 ETA

jcp.go:474 [pid 9073] 2025-03-29 09:20:47.705 -0500 CDT giver total file sizes: 43_651_011

jcp.go:475 [pid 9073] 2025-03-29 09:20:47.705 -0500 CDT bytes read = 53_715 ; bytes sent = 12_412_130 (out of 43_651_011). (28.4%) ratio: 3.5x speedup

~/go/src/github.com/glycerine/jcp (master) $ jcp source_from_here target_to_here

no ':' in src/target: starting local rsync server to receive files...

jcp.go:474 [pid 9300] 2025-03-29 09:21:18.856 -0500 CDT giver total file sizes: 43_649_411

jcp.go:475 [pid 9300] 2025-03-29 09:21:18.856 -0500 CDT bytes read = 693 ; bytes sent = 8_491 (out of 43_649_411). (0.0%) ratio: 5140.7x speedup

~/go/src/github.com/glycerine/jcp (master) $
