Multiple masters as mountpoint


Matt

Mar 1, 2019, 7:58:07 AM
to XtreemFS
Hi,

I really like this software; it looks solid, simple and clear, and the docs are good, so I would like to use it.

The only question I have is how I can make it so that every storage node can access all data. It seems you replicate from one machine where you actually "see" your files, and they are not available on other machines.

Is this possible?

That is, can I have multiple masters as mount points?

Thanks for the great work!

Matt

Robert Schmidtke

Mar 4, 2019, 3:13:23 AM
to XtreemFS
Hi Matt,

I'm not sure I understand your question. Each OSD (which I assume is what you mean by "storage node") stores parts or all of your data, depending on your replication policy. Data is organized logically in volumes. These volumes can be mounted by clients, just like any other file system.
Now, each node running an OSD can also function as a client if you like, so they can mount the volume(s) as well.
Say you have 3 nodes, each running an OSD, and one of them additionally running the DIR and MRC. You set up a replication factor of 3 for your volume, so that all data is replicated across all nodes. Other nodes (e.g. your laptop) mount this volume to gain access to the storage provided by your 3 OSDs.
If you want each of the 3 OSDs to also have access to the shared storage, they can mount the volume as well.
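To make that concrete, here is a rough command-line sketch of such a setup. The hostnames, volume name and mount path are placeholders I made up for illustration, and the xtfsutil flags are quoted from memory of the 1.5.x guide, so please double-check them there:

    # on the MRC node: create a volume
    mkfs.xtreemfs mrc-host/myVolume

    # on each client (your laptop, or the OSD nodes themselves): mount it
    mount.xtreemfs dir-host/myVolume /mnt/xtreemfs

    # on the mounted volume: quorum-based read/write replication, 3 replicas for new files
    xtfsutil --set-drp --replication-policy WqRq --replication-factor 3 /mnt/xtreemfs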

I'm not sure what you mean by "multiple masters as mountpoints".

I hope this helps a bit. If there is still confusion about the question, please come back to me and we'll hopefully be able to sort this out.

Cheers
Robert

Matt

Mar 6, 2019, 2:08:18 PM
to XtreemFS
Hi Robert,

My apologies for the late response, and thank you for the complete explanation.

Let's say I have 2 locations and, starting with 2 nodes, one at each location: could clients at each location mount the local OSD server and write to it, with clients on the other "side" seeing the written files as well through their own local OSD?

I hope this makes it a little bit more clear!

Thanks again for the great software!

Cheers,

Matt


On Monday, March 4, 2019 at 09:13:23 UTC+1, Robert Schmidtke wrote:

Robert Schmidtke

Mar 7, 2019, 2:56:10 AM
to XtreemFS
Hi Matt,

thanks for the clarification. I would suggest the following:

On location L1, run the DIR, the MRC and one OSD.
On location L2, run one OSD.
Create a volume with read-write replication and a replication factor of 2 (note that you need at least 3 nodes for quorum-based fault-tolerance, see http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc6.1).
Use dcmap or vivaldi as OSD and replica selection policy: http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc5.3.3
This way, each client talks to the closest OSD (clients mounting the volume at L1 will talk to the OSD at L1, clients mounting the volume at L2 will talk to the OSD at L2).
- If you use dcmap, check http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc7.3.2 for how to configure distances.
- If you use vivaldi, the distances should be figured out automatically.
You may want to enable asynchronous writes as well: http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc4.3.4
Note however, again, that this setup does not fully ensure fault-tolerance, but is only a way of sharing data across two locations.
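Putting these steps together, a rough sketch of the commands might look like the following. Hostnames, the volume name and the mount path are placeholders, and the mount option for asynchronous writes is cited from memory, so verify it against section 4.3.4:

    # create and mount the volume (DIR/MRC running at L1)
    mkfs.xtreemfs l1-mrc/myVolume
    mount.xtreemfs --enable-async-writes l1-dir/myVolume /mnt/xtreemfs

    # read/write replication with one replica per location
    xtfsutil --set-drp --replication-policy WqRq --replication-factor 2 /mnt/xtreemfs

    # prefer nearby OSDs and replicas, e.g. with vivaldi (dcmap works the same way)
    xtfsutil --set-osp vivaldi /mnt/xtreemfs
    xtfsutil --set-rsp vivaldi /mnt/xtreemfs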

If you want to extend your setup, say, add OSDs to L1 and L2, you can add replicas to ensure fault-tolerance, e.g. by having 2 replicas in L1 and 2 replicas in L2 (if you run 2 OSDs per location, on different machines).
If you have 2 OSDs per location, and you set the replication factor to 3 and use dcmap or vivaldi as replica selection policies, you should end up with two replicas on the "closer" location, and one on the "remote" location.

I hope this helps.

Cheers
Robert

Matt

Mar 7, 2019, 2:31:45 PM
to XtreemFS
Hi Robert,


Thanks this helps a lot!

Sounds good! With 2 nodes, is it possible to detect the master during a split brain? Just curious.

The power of XtreemFS is indeed that adding nodes lets you expand in all kinds of ways; in this case, even starting with small nodes, you get a real cluster that directly expands storage as well.

I'm a fan!

Cheers,

Matt


On Thursday, March 7, 2019 at 08:56:10 UTC+1, Robert Schmidtke wrote:

Robert Schmidtke

Mar 8, 2019, 2:56:03 AM
to XtreemFS
Hey Matt,

with 2 nodes you will not be able to reach a quorum in the presence of a split brain (since the quorum size is still 2, each partition on its own is one node short).
Also, since the MRC is necessary for some metadata operations, you will not be able to complete several operations. You can check the architecture section of the following paper to learn more about the dependencies between DIR, MRC and OSD: https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.1304

As a heads-up, I have to tell you that the replication features of the DIR and MRC are currently experimental, and we do not recommend using them for production setups, as there are some race conditions during failover.
This means that in practice, the DIR and MRC are single points of failure. You can, however, back up their databases and restore them later. The OSDs are fully fault-tolerant and their replication works properly, so data loss is rather unlikely.
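If it helps, here is a hedged sketch of how such an MRC database backup could be taken with the bundled xtfs_mrcdbtool. The host, port and paths are placeholders, and the exact flags (and whether the admin password is required) are from memory, so please verify against the tool's man page:

    # dump the MRC database to an XML file (the path is interpreted on the MRC host)
    xtfs_mrcdbtool -mrc pbrpc://mrc-host:32636 dump /var/lib/xtreemfs/mrc-backup.xml

    # restore it later into a fresh MRC
    xtfs_mrcdbtool -mrc pbrpc://mrc-host:32636 restore /var/lib/xtreemfs/mrc-backup.xml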

Cheers
Robert

ygor.go...@gmail.com

Apr 16, 2019, 5:26:44 PM
to XtreemFS
Hi Robert.

I'm testing this tool, and thanks for the support.

I have a question.

I created a single server running the DIR, MRC and OSD, and I want to make sure that when (or if) this server goes offline, another server takes over its role, so the network doesn't fail and the OSDs keep running without an outage of the XtreemFS service. Can you help me with this and explain how I can do it? I'm trying to set this up by reading the documentation, but I'm facing some difficulties.

Thanks in advance.

Robert Schmidtke

Apr 17, 2019, 4:41:42 AM
to XtreemFS
Hi,

there is a failover mechanism for the DIR and MRC (http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc6.3). However, this does not currently always work, as there is a race condition during the failover. Therefore we do not recommend using DIR/MRC replication in production.
Replication for files via multiple OSDs is well-tested and works: http://xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc6.1
You can check the XtreemFS repository for example DIR/MRC replication configurations; the configuration files for the replication plugin can be found here: https://github.com/xtreemfs/xtreemfs/tree/master/contrib/server-repl-plugin/src/main/resources/config
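For orientation, the MRC side roughly looks like this: the MRC config references the plugin, and the plugin's properties file lists the replica hosts. The key names, ports and paths below are reproduced from memory of those example files, so treat them as assumptions and compare against the repository:

    # added to /etc/xos/xtreemfs/mrcconfig.properties
    babudb.plugin.0 = /etc/xos/xtreemfs/server-repl-plugin/mrc.properties

    # /etc/xos/xtreemfs/server-repl-plugin/mrc.properties (one copy per MRC replica)
    babudb.repl.backupDir = /var/lib/xtreemfs/server-repl-mrc
    babudb.repl.sync.n = 2
    babudb.repl.participant.0 = mrc-host-1
    babudb.repl.participant.0.port = 35676
    babudb.repl.participant.1 = mrc-host-2
    babudb.repl.participant.1.port = 35676
    babudb.repl.participant.2 = mrc-host-3
    babudb.repl.participant.2.port = 35676

The DIR is configured analogously via dir.properties, with its own replication port.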

If you go into more detail about the difficulties you're facing, I'm sure I can help you. However, as this inquiry is quite general, for now I can only refer you to the relevant sections of the documentation and source code; I hope you understand.

Cheers
Robert

ygor.go...@gmail.com

Apr 23, 2019, 3:49:28 PM
to XtreemFS
Hi Robert,

Thanks in advance for your support.

I'm facing this issue in the MRC log:

dir.log from 10.10.0.249


 FleaseMessage ( type=MSG_PREPARE cell=replication v=0 b=(1556048527120;-1112305736) lease=127.0.0.1:35678/1556048590234(Tue Apr 23 15:43:10 AMT 2019) prevb=(0;0) ts=1556048530234(Tue Apr 23 15:42:10 AMT 2019) addr=localhost/127.0.0.1:35678 mepoch=-1) could not be sent to '/10.10.0.192:35678', because sending RPC failed: reconnecting to the server '/10.10.0.192:35678' was blocked locally to avoid flooding.

All your steps were followed exactly as described, and the replica isn't working.

Can you help?

Robert Schmidtke

Apr 24, 2019, 3:33:21 AM
to XtreemFS
Hi,

could you please share some more information on your setup (e.g. number of nodes with (pseudo-) IP addresses), and the steps you take to arrive at that message? You can also attach logs and configuration files (make sure you mask any confidential data!). This will make it easier for me to understand and diagnose the problems you are facing.

Cheers
Robert