Replication


Arkadi Colson

May 18, 2020, 4:48:48 AM
to mog...@googlegroups.com

Hi

We are using the network plugin in MogileFS and have set the replication to 1 for each datacenter (network), so the replica count is 2. This works very well. When doing a rebalance from device X in datacenter 1 to device Y in datacenter 1, we see that the file is fetched from device X in datacenter 2 and thus goes over the link between the datacenters. Is this normal behavior?

What exactly happens in the background? Will the file first be deleted from device X in datacenter 1, so that we temporarily have only 1 copy, or will the file first be replicated, so that we temporarily have 3 copies, with the excess copy deleted afterwards?

--

Best regards
Arkadi Colson

dormando

May 20, 2020, 4:34:11 AM
to mog...@googlegroups.com
It's been a long time so grain of salt:

It's supposed to make an extra copy, then it lets the replication policy
kill off one of the original now excess ones afterward. Otherwise
rebalance wouldn't work on things with only one copy to begin with.

IIRC it never got smart enough to ensure it copied from a local datacenter
when possible.
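
To make the copy-then-prune order concrete, here is a rough Python model of it. This is not MogileFS code: the device names, the `DEVICE_DC` map, and both functions are made up for illustration, and it assumes a policy of one copy per datacenter as in the original question.

```python
# Hypothetical model only: device name -> datacenter. Not taken from MogileFS.
DEVICE_DC = {"dc1-X": "dc1", "dc1-Y": "dc1", "dc2-X": "dc2"}


def excess_devices(copies, draining, per_dc=1):
    """Return the devices that can be dropped while keeping per_dc copies per datacenter."""
    by_dc = {}
    for dev in sorted(copies):
        by_dc.setdefault(DEVICE_DC[dev], []).append(dev)
    excess = []
    for devs in by_dc.values():
        # Put the device being drained first so it is the one that gets dropped.
        devs.sort(key=lambda d: d != draining)
        excess.extend(devs[: max(0, len(devs) - per_dc)])
    return excess


def rebalance(copies, source, dest):
    """Add the new copy first, then prune; record the copy count after each step."""
    history = [len(copies)]
    copies = set(copies) | {dest}  # step 1: make the extra copy
    history.append(len(copies))
    for dev in excess_devices(copies, draining=source):  # step 2: policy prunes
        copies.remove(dev)
    history.append(len(copies))
    return copies, history


final, counts = rebalance({"dc1-X", "dc2-X"}, source="dc1-X", dest="dc1-Y")
print(sorted(final), counts)  # ['dc1-Y', 'dc2-X'] [2, 3, 2]
```

The copy count goes 2, 3, 2 and never drops below the policy minimum. With a single-copy file the same order gives 1, 2, 1, which is why deleting first would not work there.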

Daniel Frett

May 29, 2020, 12:11:25 PM
to mogile
In addition, whichever tracker happens to be processing the rebalance request will read the file from the source device and write it to the destination device. So, if the source and destination are in one data center and the tracker is in a different data center, the file contents will be streamed between the data centers twice.

Streaming the data to/from the tracker becomes more painful when you start dealing with multi-gigabyte files being stored in MogileFS.
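
A small Python sketch of the arithmetic, assuming the tracker relays the whole file body as described above. The function names and datacenter labels are hypothetical and only serve to count link crossings.

```python
def inter_dc_transfers(source_dc, dest_dc, tracker_dc):
    """Count how many times the file body crosses a datacenter boundary
    when the tracker reads from the source and writes to the destination."""
    read_hop = source_dc != tracker_dc   # source -> tracker
    write_hop = tracker_dc != dest_dc    # tracker -> destination
    return int(read_hop) + int(write_hop)


def inter_dc_bytes(size_bytes, source_dc, dest_dc, tracker_dc):
    """Bytes sent over the inter-datacenter link for one file."""
    return size_bytes * inter_dc_transfers(source_dc, dest_dc, tracker_dc)


GIB = 1024 ** 3

# Everything local: nothing crosses the link.
print(inter_dc_transfers("dc1", "dc1", "dc1"))  # 0
# Source and destination in dc1, tracker in dc2: the file crosses twice.
print(inter_dc_transfers("dc1", "dc1", "dc2"))  # 2
# Source copy picked from dc2, tracker and destination in dc1: one crossing.
print(inter_dc_transfers("dc2", "dc1", "dc1"))  # 1
# A 4 GiB file relayed through a remote tracker moves 8 GiB over the link.
print(inter_dc_bytes(4 * GIB, "dc1", "dc1", "dc2") // GIB)  # 8
```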

I always had an idea of making the storage nodes a bit smarter to accept a command to "replicate" a file from a source location(s) which would read directly from another storage node, and report back success similar to how the file checksumming is performed on storage nodes.

