Always on Node 1

GAZ

Apr 10, 2013, 10:22:52 AM
to isilon-u...@googlegroups.com
We had a node failure yesterday that took down our Isilon cluster.  Every connection and 99.5% of the data traffic comes in through Node 1.  EMC came and installed this several months ago, but I suspect it isn't working as it should.  Shouldn't the passive nodes come into play when a node becomes unresponsive?  Support restarted the node and it is working now, but again every connection is on Node 1 and it is taking all the traffic.  I guess this three-node cluster has two passive nodes.  It seems like a waste of money.  We're on OneFS v6.5.5.16.

Keith Nargi

Apr 10, 2013, 10:28:43 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com

What is your SmartConnect setup? Connections should be balanced across all of the nodes if SmartConnect and your DNS are set up correctly.

You should have a delegation record (NS) in DNS that points to the SmartConnect service IP address.  If this is set up correctly, the cluster will hand out IP addresses of the nodes in your cluster.  A simple nslookup should show you whether this is working.  If you get different IP addresses each time you issue the lookup, it's set up correctly.
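
For anyone who would rather script the repeated lookup than run nslookup by hand, here is a minimal Python sketch. It is only an illustration: the function name `tally_lookups` and the zone name in the usage comment are made up, and a caching resolver on the client can make every answer look the same even when SmartConnect is rotating addresses.

```python
import socket
from collections import Counter
from typing import Callable


def tally_lookups(name: str, attempts: int = 10,
                  resolve: Callable[[str], str] = socket.gethostbyname) -> Counter:
    """Resolve `name` repeatedly and count how often each address is returned.

    `resolve` defaults to the system resolver; it is a parameter only so the
    function can be exercised without touching the network.
    """
    counts: Counter = Counter()
    for _ in range(attempts):
        counts[resolve(name)] += 1
    return counts


# Usage (hypothetical zone name, substitute your own SmartConnect zone):
#   for address, hits in tally_lookups("isilon.example.com").items():
#       print(address, hits)
```

If the tally shows a single address no matter how many attempts you make, the delegation or the connection policy is worth a closer look.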

Matt Dey

Apr 10, 2013, 10:30:29 AM
to isilon-u...@googlegroups.com
In your IP address pools, what is your connection policy set to, and do you have SmartConnect Advanced licensed?

The connection policy should help distribute the clients out to all the nodes.  I believe SmartConnect Advanced is needed to rebalance the IPs if a single node goes down.

Erik Weiman

Apr 10, 2013, 10:35:08 AM
to isilon-u...@googlegroups.com
Also make sure you are not connecting to the SmartConnect service IP. You need to use the DNS name.


GAZ

Apr 10, 2013, 10:47:45 AM
to isilon-u...@googlegroups.com


In reply to Keith's message of Wednesday, April 10, 2013 9:28:43 AM UTC-5:

To summarize the setup, we have followed those simple setup steps in our DNS environment and created a delegated sub-domain with the cluster IP as the authoritative name server for that sub-zone. This is set up according to the instructions provided by Isilon support, but it doesn't work.  The cluster IP does not respond to NS requests, so I don't see the point.

GAZ

Apr 10, 2013, 10:52:11 AM
to isilon-u...@googlegroups.com


In reply to Matt Dey's message of Wednesday, April 10, 2013 9:30:29 AM UTC-5:

I have no idea what that is, but in the web interface I have a license warning:
"You do not have SmartPools licensed. These defaults apply to all files. For more information about SmartPools, please contact an Isilon sales representative or Isilon Smart Partner reseller."

GAZ

Apr 10, 2013, 10:57:58 AM
to isilon-u...@googlegroups.com


In reply to Erik Weiman's message of Wednesday, April 10, 2013 9:35:08 AM UTC-5:

We only use the virtual cluster IP, as it is assigned to the DNS object.  I am guessing that is the same thing as the SmartConnect service IP.  None of the other IPs assigned to the cluster are ever used by clients.

Keith Nargi

Apr 10, 2013, 11:04:39 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
That is your problem right there.  You need to troubleshoot the delegation and get that working to balance connections evenly across the nodes in your cluster. If all of your end users are connecting to a single node IP address, then you most definitely will have 2 passive nodes from a connectivity perspective, even though all of the nodes are participating in reading and writing data.
You should get the DNS piece squared away so that the cluster responds to DNS requests.
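
To make the difference concrete, here is a small Python simulation. It is purely conceptual and is not how OneFS implements anything: the pool addresses come from the documentation range, the client count is arbitrary, and the helper name `connections_per_address` is invented for the illustration.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pool: one address per node in a three-node cluster.
POOL = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]


def connections_per_address(client_count, pick_address):
    """Count how many clients land on each address, given a picker function."""
    return Counter(pick_address() for _ in range(client_count))


# Case 1: every client is pointed at one fixed address.
fixed = connections_per_address(300, lambda: POOL[0])

# Case 2: each client's lookup is answered with the next address in rotation.
rotation = cycle(POOL)
balanced = connections_per_address(300, lambda: next(rotation))

print("fixed address:", dict(fixed))
print("round-robin:  ", dict(balanced))
```

With a fixed address, all 300 simulated clients land on the first node; with rotating answers they split evenly across the three.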


Matt Dey

Apr 10, 2013, 11:14:21 AM
to isilon-u...@googlegroups.com
Like Keith said, if the IP you are querying isn't responding to DNS, that's a problem.  You might be querying the wrong IP.  I believe only the service IP will respond to DNS requests; all the other IPs will probably just time out if you try to query them.  On a 3-node cluster you should have a minimum of 4 IPs: one for the SmartConnect name, then at least one for each node.

Matt Dey

Apr 10, 2013, 11:16:51 AM
to isilon-u...@googlegroups.com
SmartPools is totally different from SmartConnect.  SmartPools is for the disks; SmartConnect is for the network.

GAZ

Apr 10, 2013, 11:40:47 AM
to isilon-u...@googlegroups.com


In reply to Keith's message of Wednesday, April 10, 2013 10:04:39 AM UTC-5:


You, and this forum, are always very helpful.  I will escalate with support. 

GAZ

Apr 10, 2013, 11:50:48 AM
to isilon-u...@googlegroups.com


In reply to Matt Dey's message of Wednesday, April 10, 2013 10:14:21 AM UTC-5:

When I send an NS request to that IP, it doesn't answer; I get the error "query refused."  When I check whether it is listening on the DNS port, it is:

(new-object Net.Sockets.TcpClient).Connect("isilon01",53)

Each node has two IPs in our setup.  I can access the CIFS shares by browsing to those IPs, but I only do that for troubleshooting.

This was set up by the vendor, who I suspect may not have done everything correctly, but that DNS setup is so simple I could do it with my eyes closed.  I will escalate the ticket.

J. Lasser

Apr 10, 2013, 12:17:54 PM
to isilon-u...@googlegroups.com
The SmartConnect server is NOT a full-fledged DNS server. The only
record types it will reply with are A records. You can check that it's
actually handing out different addresses by querying the A record for
your SmartConnect zone name. If that's always handing out the same
address, you should probably switch your policy to round-robin and try
again.

You should also log onto node 1 and look at your connections with
'netstat -lan'. Find your client connections, and confirm what IP
they're talking to. If they're talking directly to the SmartConnect
Service IP, you've probably got a DNS configuration problem. In many
cases, that problem is an A record for the SmartConnect zone name,
which will prevent the delegated lookup from occurring. (Windows DNS
servers tend to create those A records by default, as do some
automated Unix DNS management systems.) I'd do a nonrecursive query
against your domain server for an A record for the SmartConnect zone
name and see if you get a reply. If you do, you'll have to remove that
A record.
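
The nonrecursive check described above can also be scripted. What follows is a rough Python sketch using only the standard library, under several assumptions: the function names are invented, the server IP and zone name in the usage comment are placeholders, and it only reads the answer count from the reply header, so it cannot tell an A record from some other answer type or from a cached answer. A tool such as `dig +norecurse` gives a fuller picture.

```python
import socket
import struct


def build_a_query(name: str, query_id: int = 0x1234,
                  recursion_desired: bool = False) -> bytes:
    """Build a DNS query for the A record of `name` (RD bit clear by default)."""
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def answer_count(response: bytes) -> int:
    """Return the ANCOUNT field from a DNS response header."""
    return struct.unpack("!H", response[6:8])[0]


def server_answers_directly(server_ip: str, name: str, timeout: float = 3.0) -> bool:
    """Send the nonrecursive query over UDP; True if the reply carries answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(name), (server_ip, 53))
        response, _ = sock.recvfrom(512)
    return answer_count(response) > 0


# Usage (placeholder values, substitute your domain DNS server and zone name):
#   server_answers_directly("192.0.2.53", "isilon.example.com")
```

If the domain server answers the nonrecursive query itself instead of returning a referral, a conflicting record (or a cached answer) on that server is a likely suspect.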

Jon
Jon Lasser j...@lasser.org 206-326-0614
. . . and the walls became the world all around . . . (Maurice Sendak)

Andrew Stack

Apr 10, 2013, 2:32:40 PM
to isilon-u...@googlegroups.com
Just curious... how many IPs have you assigned to each node?  For example, on a node you have two interfaces.  In our environment each interface gets one IP for NFS and another for CIFS, so 4 IPs per node.


Andrew Stack
Sr. Storage Administrator
Genentech

GAZ

Apr 10, 2013, 2:32:50 PM
to isilon-u...@googlegroups.com
This was helpful.  It was set up wrong.  There is an A record for the SmartConnect IP.  I can't change that during production hours.

Greg Zook

Apr 10, 2013, 2:39:05 PM
to isilon-u...@googlegroups.com

 
We have two IPs per node.

GAZ

Apr 12, 2013, 8:49:28 AM
to isilon-u...@googlegroups.com
The A record in DNS was preventing the delegated sub-domain from working.  I deleted the A record and made sure my delegated sub-domain had the same FQDN as the A record, which was the hostname in Active Directory.  Now when I ping the name, it returns a different one of the pool IPs each time.  I am seeing connections balanced out across three nodes.

Thanks to the group for the help.

Luc Simard

Apr 12, 2013, 11:13:46 AM
to isilon-u...@googlegroups.com
You clearly have a configuration issue. I strongly recommend you go through the support channels and review your cluster DNS configuration with them. The SmartConnect best practices white paper is available from support.emc.com, or you can contact your support representative or Isilon/EMC partner.

