Changing backend providers via bucket copy - feasible?


Xomex

May 13, 2020, 2:18:05 AM
to s3ql
If this is a naive question (or answered elsewhere) please excuse...
I wish to change backend storage providers, but it is not practical for me to copy the files between two mounted S3QL filesystems due to bandwidth and cost limitations for data to/from my home.
Is it feasible to copy my data at the bucket level from provider A to B (using a third-party data copying service) and then connect S3QL to the new data location? Will an intact S3QL filesystem magically appear at the new location?
Any advice to help make this work?
Regards,

Daniel Jagszent

May 13, 2020, 10:42:09 AM
to s3ql
Hello Xomex,
> [...] Is it feasible to copy my data at the bucket level from provider
> A to B (using a 3rd party data copying service) and then connect s3ql
> to the new data location.[...]
You need to use https://www.rath.org/s3ql-docs/contrib.html#clone-fs-py
to copy an S3QL file system from one storage backend to a different
storage backend. Merely copying the data from one bucket to another does
not work: S3QL uses backend-specific metadata (e.g. S3 metadata is
slightly different from OpenStack Swift metadata).

Unfortunately, clone-fs.py does not do server-to-server copies. You might
be better off running clone-fs.py inside a throwaway VM with a decent
internet connection.



Marcin Ciesielski

May 13, 2020, 12:11:33 PM
to s3ql
Do you guys know if clone-fs.py can pick up where it was interrupted? Or, basically, copy across only what has not yet been copied, like rclone or rsync do?

Daniel Jagszent

May 13, 2020, 4:56:28 PM
to s3ql
Hi Marcin,
> [...] Do you guys know if clone-fs.py can pick up where it was
> interrupted? Or, basically, copy across only what has not yet been
> copied, like rclone or rsync do? [...]
It currently cannot do that; it blindly copies over all data.

If you know some Python, you could easily tinker with it, though.
Either do not put the keys you want to skip into the queue ( https://github.com/s3ql/s3ql/blob/d2baf00236df69878e19466fc1c27f8ff0c89b4e/contrib/clone_fs.py#L168 ),
or check whether the key already exists in the destination backend and skip the write call. A sketch of the first option follows.
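A very rough, untested sketch of the first option, assuming `queue` is the work queue that clone_fs.py's copy threads consume and that both backends expose list(), as the s3ql backend API does:

    # Option one: only enqueue keys that are missing from the destination.
    # Note: this holds all destination keys in memory, which is fine for a
    # sketch but may matter for very large buckets.
    existing = set(dst_backend.list())
    for key in src_backend.list():
        if key not in existing:
            queue.put(key)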

This would be a very naive implementation of the second option. Replace line 122 with this:

    try:
        dst_backend.lookup(key)
    except NoSuchObject:
        dst_backend.perform_write(do_write, key, metadata)

It does not check whether the metadata is equal (a robust implementation should most definitely do this).
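A slightly more robust variant would compare the metadata too. A rough, untested sketch, relying on lookup() returning the stored metadata and raising NoSuchObject (per the s3ql backend API):

    # Write only if the object is missing or its stored metadata differs.
    # NoSuchObject lives in s3ql.backends.common.
    try:
        dst_metadata = dst_backend.lookup(key)
    except NoSuchObject:
        dst_backend.perform_write(do_write, key, metadata)
    else:
        if dst_metadata != metadata:
            dst_backend.perform_write(do_write, key, metadata)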

