Transport endpoint is not connected

Matthew Hess

Jul 14, 2016, 6:36:22 PM
to XtreemFS
I'm having a real problem keeping a mount active.

I have 3 data centers each with 1 mrc, 1 dir, and 2 osds. I'm running WqRq and 3 replicas.

mount command:
# mount.xtreemfs -o allow_other --vivaldi-enable --vivaldi-enable-dir-updates <fqdn>/vol /path/to/mount

I'm running rsync -avP /source/path /path/to/mount/ and after a while it spews a bunch of messages like the one below and then exits with a failure:
rsync: recv_generator: mkdir "/path/to/mount/subdir/anotherdir/athirddir/whatever.file" failed: Transport endpoint is not connected (107)

xtfsutil on the client mount shows in part:

Owner                root
Group                root
Type                 volume
Available/Used Space 47.2383919 TB / 331 MB
Num. Files/Dirs      1542 / 1153
Access Control p.    POSIX (permissions & ACLs)
OSD Selection p.     1000,3002
Replica Selection p. default
Default Striping p.  STRIPING_POLICY_RAID0 / 1 / 128kB
Default Repl. p.     WqRq with 3 replicas
Snapshots enabled    no

The source data is about 14G across 311188 files.

With the default level of logging, no errors are being returned on any of the OSDs, MRCs, or DIRs, and nothing interesting is showing up in the client logs either.
The mount just dies, and I'm forced to umount /path/to/mount and run the mount command above again to get it moving once more.
I've checked the performance of all the systems, and none of them show high CPU load, I/O wait, or anything else unusual.


Robert Schmidtke

Jul 15, 2016, 4:42:12 AM
to XtreemFS
Hi Matthew,

First of all, let me point out that for WqRq to be fault-tolerant you will need at least 3 OSDs: the quorum of 2 OSDs is still 2, which cannot be reached when one OSD fails (see http://www.xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc6.1.1).
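
(As a quick worked example: with majority quorums the write quorum for N replicas is floor(N/2) + 1, so N = 2 gives a quorum of 2 and no OSD may fail, while N = 3 gives a quorum of 2 and one OSD may fail without blocking writes.)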

Regarding your problem: it is quite hard to diagnose without any logs at all, even if you suspect there is nothing valuable in them. You can increase the log level to DEBUG on all services (DIR, MRC, OSD) as well as on the client, and attach the logs to your reply. This would greatly help us help you, as we currently do not have enough information to guide us to the problem and its solution.
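
A rough sketch of what that would look like (file paths and the numeric level are from memory, so please double-check against the comments in your installed configs):

# on each DIR/MRC/OSD host, raise debug.level in the service config and restart the service, e.g. in
# /etc/xos/xtreemfs/dirconfig.properties, mrcconfig.properties and osdconfig.properties:
debug.level = 7

# on the client, mount in the foreground with debug output so the log ends up on the console:
mount.xtreemfs -f -d DEBUG -o allow_other <fqdn>/vol /path/to/mount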

I know I probably don't have to ask, but is the connectivity between your data centers solid? And what version of XtreemFS are you running?

Robert

Matthew Hess

Jul 15, 2016, 8:56:13 AM
to XtreemFS
Yeah, I should have included this info up front, sorry.


> I have 3 data centers each with 1 mrc, 1 dir, and 2 osds

That makes 3 MRCs, 3 DIRs, and 6 OSDs in total.

Running 1.5.1-1.2

I've turned on debug logging and figure this may be a timeout somewhere along the line. I've had more success (stability-wise) with:
mount.xtreemfs -o allow_other --enable-async-writes --log-level DEBUG --vivaldi-enable --vivaldi-enable-dir-updates --retry-delay 10 --connect-timeout 60 --request-timeout 20

Connectivity between the data centers is as stable as the internet will allow, overall.

Robert Schmidtke

Jul 15, 2016, 10:20:46 AM
to xtre...@googlegroups.com
If you like, you can give the unstable packages a try, as they contain some stability fixes we've added over the past months. We're currently in the process of building a stable 1.6 release, but for the time being you can have a look at: http://download.opensuse.org/repositories/home:/xtreemfs:/unstable/

We're happy about any kind of feedback on your experiences.

If you have any specific questions or want us to clarify something, please don't hesitate to ask.


Matthew Hess

Jul 15, 2016, 12:08:16 PM
to XtreemFS
My main concern is how to speed this up. I'm using an async mount, but performance tuning seems like a bit of a dark art and is heavily dependent on various timers, lease settings, and so on. Documentation on tuning seems limited.

I'd also like to note that issuing a service xtreemfs-mrc reload does not pick up changes to the datacenter map; it seems a restart of each MRC is required for datacenter map changes to take effect. Is this expected behavior?

I should also mention that, due to expansion needs, I'm actually layering XtreemFS on top of local GlusterFS systems in each datacenter. Each OSD has a Gluster client mount that houses its objs directory, and local performance has been fantastic. It's the WAN portion where the slowness is nearly an application killer for me. Each datacenter has a 40 Gbps uplink, but there is some contention with other DC traffic.

Robert Schmidtke

Jul 18, 2016, 5:40:18 AM
to XtreemFS
Hi Matthew,

As of now the reload is a no-op; you would indeed need to issue a restart for the changes to take effect. Have you had a look at object caching (http://www.xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc4.3.5) and at the metadata-cache-size and enable-async-writes mount options? As I'm still not quite sure about your exact setup, it is hard for me to give concrete advice. Any feedback is greatly appreciated.
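
For example (the cache size below is only an illustration, not a recommendation; and restart rather than reload each MRC to pick up datacenter map changes):

# on each MRC host
service xtreemfs-mrc restart

# on the client
mount.xtreemfs -o allow_other --enable-async-writes --metadata-cache-size 100000 <fqdn>/vol /path/to/mount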

Robert

Matthew Hess

Jul 18, 2016, 10:17:06 AM
to XtreemFS
My setup:

dc1:
1 mrc, 1 dir, 2 osd
dc2:
1 mrc, 1 dir, 2 osd
dc3:
1 mrc, 1 dir, 2 osd

all 3 MRC systems are set up as replicas
all 3 DIR systems are set up as replicas

clients exist within all 3 dcs.

I'm using dcmap to create a replication requirement across all 3 data centers, plus Vivaldi ordering.

current client mount command:

mount.xtreemfs -o allow_other --enable-async-writes --log-level DEBUG --vivaldi-enable --vivaldi-enable-dir-updates --retry-delay 3 --connect-timeout 60 --request-timeout 10 dir-server-in-dc1/xtfs /local/path/

I'm finding that, while running the client with -f in debug mode, I get occasional timeouts talking to an OSD (for no apparent reason), which leads to retries behind the scenes. The client is eventually able to reconnect to the OSD; however, from the user's perspective this shows up as delay and slowness. The issue seems specific to the OSD service, since I'm able to ssh from the client into the particular OSD that is timing out at that moment.

Robert Schmidtke

Jul 18, 2016, 11:15:50 AM
to XtreemFS
Hi Matthew, thanks for sharing your setup.

I have to warn you that the DIR and MRC replication features are currently experimental; there are issues during failover of one of those components which can lead to race conditions and eventual loss of the service, requiring manual intervention to bring it back up.

That being said, there must be an issue with either the connection or the host/service itself. Since you're saying the ssh connection does not break (which I assume you have open the entire time), it is probably the machine. Have you checked the memory consumption of the JVM and the load on the machine? I know you said in an earlier message that everything seems to be in order, but there must be some correlation between the timeouts and some system property.

Do you know which OSDs your replicas sit on? If two of them are in the same DC and the third one is in another DC, there might be a delay because of replication. Is read-only replication an option for you? That would allow replicate-on-close (http://www.xtreemfs.org/xtfs-guide-1.5.1/index.html#tth_sEc6.2), but it is weaker with respect to data safety.
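
If I remember the xtfsutil syntax correctly, switching a volume's default replication policy to read-only replication with full replicas would look roughly like this (the policy name and the --full flag are from memory, so please verify against xtfsutil --help before running it):

xtfsutil --set-drp --replication-policy ronly --replication-factor 3 --full /path/to/mount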

Robert

Matthew Hess

Jul 18, 2016, 12:22:30 PM
to XtreemFS
I'm using dcmap to ensure a copy of each file in each of the 3 data centers (3 replicas).

dc1=dc1-osd1,dc2-osd1,dc3-osd1,dc1-client1,dc2-client1,dc3-client1
dc2=dc1-osd2,dc2-osd2,dc3-osd2,dc1-client2,dc2-client2,dc3-client2
(all ipv4/32 naturally)
etc.

One of the project requirements is that data remain available for read/write if a data center explodes. I will be adding more OSDs in yet more data centers (2 more planned) in order to maintain the ability to write 3 replicas if we do lose a DC. I haven't done that yet because I must first prove this to be a viable solution.

Hannes Diedrich

Oct 20, 2016, 8:25:33 AM
to XtreemFS
Hello,

I have been having the same issue since switching the mount to grid-ssl authentication.
Mounts randomly disconnect on different nodes and raise the "Transport endpoint is not connected" error.
I was wondering whether this could be fixed with a FUSE option like "reconnect". Do you have any experience with that?

My setup:
4 nodes with:
1x DIR + MRC + OSD
3x OSD
xtreemfs version: 1.5.1.94
Mount points are on every node.
Mount command:
mount.xtreemfs --pkcs12-file-path=/path/to/client_ca.p12   --pkcs12-passphrase - --grid-ssl pbrpcg://<hidden>/vol /path/to/mount/point/

I should add that our ssh connections are not stable (they get disconnected after a random amount of idle time).
However, this was never a problem before using grid-ssl. Could these issues be connected at all?

Best,
Hannes

robbi....@gmail.com

Oct 20, 2016, 10:11:30 AM
to XtreemFS
I've been having this issue too. It seems to go away if /etc/fuse.conf is set to allow other users and -o allow_other is used when mounting with mount.xtreemfs:

 mount.xtreemfs -d DEBUG -f --enable-async-writes --vivaldi-enable --vivaldi-enable-dir-updates -o allow_other dirserver/volume /vol
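
For completeness: the allow_other option only works for non-root users if /etc/fuse.conf contains the following line (it is usually shipped commented out):

user_allow_other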

Hannes Diedrich

Oct 20, 2016, 11:05:35 AM
to XtreemFS, robbi....@gmail.com
Thanks for the response, but I actually do use the allow_other option (with /etc/fuse.conf set up accordingly). I forgot to write that down, sorry!

Robbi Hatcher

Oct 20, 2016, 11:12:41 AM
to Hannes Diedrich, XtreemFS

I just made this change today. I found that a few clients had been set up correctly and others hadn't. Once they were all set up the same, the error went away on all clients. I'll keep you posted if it comes back in my setup.

Hannes Diedrich

Oct 25, 2016, 6:30:38 AM
to XtreemFS, diedric...@gmail.com, robbi....@gmail.com
Hello!

I have noticed a few things over the last few days:
The error only appeared on the node that was running a certain script.
The script downloads files from the internet, then resorts and renames them.
In my original implementation, the script downloads the data directly to the XtreemFS volume and does the renaming and resorting there (including copy, move, and remove commands).

Now I have changed the program so that the downloader works locally (in the system memory of the node) and then copies the resorted and renamed files to XtreemFS.
This way the "Transport endpoint is not connected" error does not occur.
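
(For anyone running into the same thing, the workaround boils down to something like the following; /dev/shm/staging is just a stand-in for whatever local scratch space you use:)

# stage the downloads, renames and moves on local storage first ...
wget -P /dev/shm/staging/ <urls>
# ... then push the finished layout to the XtreemFS mount in a single pass
rsync -a /dev/shm/staging/ /path/to/mount/dataset/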
Although I'm really happy that the script finally works, I am very curious what the reason for this could be.
I guess it is connected to the problem I reported earlier in the following thread:
https://groups.google.com/forum/#!topic/xtreemfs/6zU7MbEgz4I
Or is it not?
I am happy about any idea!
Best,
Hannes