rsync, s3fs and S3's "Eventual Consistency"


marcin_gut

Jul 6, 2008, 1:51:44 PM
to s3fs-devel
What impact does S3's "eventual consistency" paradigm have on typical
uses of rsync and s3fs? I'm not that familiar with the internals of
the rsync protocol, but I know that, by default, files are transferred
to a temporary file and then renamed to the actual destination
filename in two separate steps. Wouldn't this potentially break if the
temporary file hasn't reached a "consistent" state before rsync
attempts to rename it? What would happen in that case? Would the
rename fail, causing the rsync session to abort? Would using the
--inplace rsync flag prevent this?
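
For concreteness, here's a rough sketch of the two modes I mean (the
mount point and paths are made up):

    # default: rsync writes the data to a temporary file in the
    # destination directory, then renames it over the target
    rsync -a /local/backup/ /mnt/s3bucket/backup/

    # --inplace: rsync writes directly into the destination file,
    # skipping the temp-file-and-rename step
    rsync -a --inplace /local/backup/ /mnt/s3bucket/backup/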

Thanks for any insights,
M

Randy Rizun

Jul 7, 2008, 9:58:19 PM
to s3fs-...@googlegroups.com
Hi-

Great question(s)- I suspect s3's eventual consistency could have an
impact on rsync...

In my usage of rsync (I use rsync -a ...) I have yet to observe a
problem. rsync rightly seems to be pretty paranoid/robust about not
trusting what's really happening at the far-end destination; I would
expect rsync to complain (and abort) if it ever observed an anomaly
due to s3's eventual consistency, although, like I said, I have been
using rsync over s3fs heavily and have not observed any such
anomaly...

Not sure about the --inplace flag; without thinking about it too much,
I would suspect it would be less robust than not using it...

not much there but I hope that helps!

s3fs user

Jul 13, 2008, 11:59:21 PM
to s3fs-devel
I worry about the consistency problem as well because I use s3fs for
backups. While I haven't seen any problems myself, I don't trust s3fs.
I suggest you do what I do: occasionally re-fetch your bucket into a
temporary directory and diff it against your local copy, after
clearing the s3fs cache first. That way you know that your old files
are consistent and will stay that way until the next time you modify
them.
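
In case it helps, here's roughly what I mean. All paths are just
examples, and I'm assuming the s3fs cache lives in a directory you can
simply empty (e.g. one you passed via -o use_cache; adjust to your
setup):

    rm -rf /tmp/s3fs_cache/*                    # clear the local s3fs cache
    mkdir -p /tmp/verify
    cp -a /mnt/s3bucket/backup/. /tmp/verify/   # re-fetch everything from S3
    diff -r /local/backup /tmp/verify           # compare against the source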

By the way, Rizun, I also use "rsync -a". Is this the suggested method
between a local directory and an s3fs mount? I've considered using
"rsync -aW" to send the whole file whenever the timestamp+size don't
match, disabling rsync's delta-transfer algorithm entirely. For my
workload it would be faster, and as a side effect it would fix any
consistency problems, if they exist.
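
That is, something like this (paths are just examples):

    rsync -aW /local/backup/ /mnt/s3bucket/backup/   # -W = --whole-file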

I'd be interested to hear from users who use s3fs+rsync from multiple
computers. Anybody attempted this?

rri...@gmail.com

Jul 14, 2008, 2:26:10 PM
to s3fs-devel
Hi-

Just looked up the -W option...

s3fs works on a brute-force "all-or-nothing" basis; that is, amazon s3
has no incremental update capability for individual objects, so s3fs
has to (re)upload the entire object (file) even if only one byte has
changed.

So, having said that, not using rsync's -W option (which means rsync
does a delta update) really has no net effect: behind the scenes, s3fs
is going to download the entire object, do the incremental update
locally, and then re-upload the entire object. In other words, the net
effect is that rsync over s3fs operates in a -W-like fashion anyway...!
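
If you want to see this for yourself, a tiny (made-up) experiment:
flip one byte in a large local file and the next sync will still move
the full object over the wire:

    # change a single byte at the start of a large local file
    printf 'x' | dd of=/local/backup/big.bin bs=1 seek=0 conv=notrunc
    # sync it; s3fs fetches and re-uploads the whole object
    rsync -a /local/backup/ /mnt/s3bucket/backup/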

I guess one way to test for the existence of an "eventual consistency"
problem is to re-run rsync a second time and see if it "does nothing"
(assuming, of course, that (a) there were no errors during the first
rsync run and (b) the source folder hasn't changed)
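
For example (paths are made up; -i is rsync's --itemize-changes and -n
is a dry run, so the second pass should print nothing if everything
"took"):

    rsync -a /local/backup/ /mnt/s3bucket/backup/     # first pass
    rsync -ain /local/backup/ /mnt/s3bucket/backup/   # second pass: expect no output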

hope that makes sense!