April 18, 2017 at 9:19 AM
Dan,
Are they running NFS over TCP or UDP? Have you whitelisted the Isilon cluster (*all* IPs) in iptables? We had trouble with SNMP a few years ago -- Nagios would send an SNMP packet to a dynamic SmartConnect IP, and snmpd on the receiving node would send the reply from a static node IP. Since the reply's source address didn't match the address the request went to, iptables didn't recognize it as a reply, and SNMP stayed broken until we whitelisted it in iptables. iptables LOG rules and tcpdump on the problem Linux client should help; tcpdump on the Isilon nodes should help too.
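To make the SNMP case above concrete, here is a toy model in Python of how a stateful firewall decides whether an inbound packet is a reply. It is only a sketch: real conntrack tracks far more state, and the addresses, port numbers, and the names `Flow`, `ToyConntrack`, and `expected_reply` are all made up for illustration.

```python
from typing import NamedTuple, Set


class Flow(NamedTuple):
    """Addressing of one packet (protocol omitted for brevity)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def expected_reply(request: Flow) -> Flow:
    """The reply a stateful firewall expects: the request with both ends swapped."""
    return Flow(request.dst_ip, request.dst_port, request.src_ip, request.src_port)


class ToyConntrack:
    """Greatly simplified stand-in for a connection-tracking table."""

    def __init__(self) -> None:
        self._expected: Set[Flow] = set()

    def track_outgoing(self, request: Flow) -> None:
        # Remember which inbound packet would count as the reply.
        self._expected.add(expected_reply(request))

    def is_known_reply(self, packet: Flow) -> bool:
        return packet in self._expected


# Hypothetical addresses (documentation ranges), not taken from any real cluster.
CLIENT = "192.0.2.10"
SMARTCONNECT_DYNAMIC_IP = "198.51.100.50"
NODE_STATIC_IP = "198.51.100.11"

table = ToyConntrack()
request = Flow(CLIENT, 40000, SMARTCONNECT_DYNAMIC_IP, 161)
table.track_outgoing(request)

# Reply sourced from the same IP the request was sent to: matches.
symmetric_reply = Flow(SMARTCONNECT_DYNAMIC_IP, 161, CLIENT, 40000)
# Reply sourced from the node's static IP instead: no match.
asymmetric_reply = Flow(NODE_STATIC_IP, 161, CLIENT, 40000)

print(table.is_known_reply(symmetric_reply))   # True
print(table.is_known_reply(asymmetric_reply))  # False
```

The second packet is what the client's firewall sees when a node answers from a different address than the one that was queried, which is why whitelisting every cluster IP sidesteps the problem.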
Troubleshooting on the Isilon side would be simpler if you addressed an individual node directly, but doing so might also sidestep your issue. I believe Isilon support has a custom tool that basically runs tcpdump cluster-wide and dumps into log files to help with troubleshooting. Unfortunately I don't recall the name of the binary.
Regards,
Chris
April 17, 2017 at 3:26 PM
Hi all -
we've got a cluster with 7.2.1.2, and a bunch of RHEL6 clients.
Recently two of those clients have had multiple "mount disappears" events - good old "NFS server not responding still trying". In some cases we've had no choice but to do hard reboots.
Once, I got in and tried disabling iptables - the mount returned immediately.
The two clients are on different LAN segments, and the problem has occurred with mounts against (at least) two different Isilon nodes. One was very busy, the other not so much.
In one case, the export is mounted only by that client, which was hammering on it. In the other case, it was our central shared software repository, not heavily accessed at all.
Neither the iptables configs nor the Isilon have had any significant configuration changes in the recent past.
One common thread is that the two clients are running recent kernels:
kernel-2.6.32-696.1.1.el6
and
kernel-2.6.32-696.el6
These are not our only examples of these kernels, but most of our machines aren't on this yet, and the other machines running these kernels aren't heavily used.
Anyone else run into this? I asked around on our campus and nobody had run into it (yet?).
thanks
danno
May 10, 2017 at 2:21 PM
FWIW,
I have further narrowed this down to the iptables conntrack INVALID match. I'm DROPping INVALID packets; removing that rule fixes my problem.
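For anyone wanting to check their own clients for the same kind of rule, here is a rough Python sketch that scans `iptables-save`-style text for rules that DROP packets in the INVALID state. The sample ruleset is hypothetical (the actual rules from the affected clients were not posted), and `find_invalid_drop_rules` is just an illustrative helper name.

```python
import shlex
from typing import List


def find_invalid_drop_rules(iptables_save_text: str) -> List[str]:
    """Return the '-A ...' rules that match state INVALID and jump to DROP."""
    matches = []
    for line in iptables_save_text.splitlines():
        line = line.strip()
        if not line.startswith("-A "):
            continue  # skip table headers, chain policies, COMMIT, comments
        tokens = shlex.split(line)
        states = set()
        target = None
        for i, tok in enumerate(tokens[:-1]):
            if tok in ("--state", "--ctstate"):
                # Both the older 'state' match and the 'conntrack' match
                # take a comma-separated list of states.
                states.update(tokens[i + 1].split(","))
            elif tok in ("-j", "--jump"):
                target = tokens[i + 1]
        if "INVALID" in states and target == "DROP":
            matches.append(line)
    return matches


# Hypothetical ruleset for illustration only.
SAMPLE = """\
*filter
:INPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -m state --state INVALID -j DROP
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
"""

for rule in find_invalid_drop_rules(SAMPLE):
    print(rule)
```

On a real client the input would be the output of `iptables-save` run as root, rather than the `SAMPLE` string.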
grumble grumble
thanks
danno