[Rocks] [Rocks-Discuss] HOST NOT RESOLVABLE error

Roger.W....@aphis.usda.gov

May 4, 2010, 3:49:01 PM
to npaci-rocks...@sdsc.edu
I'm seeing a series of errors that appear to be related, although I
don't quite know how. The first error occurs during insert-ethers. As I
power on each compute node to install it, I see (HOST NOT RESOLVABLE) in
red in the terminal window, yet the nodes seem to install properly. If I
then run insert-ethers remove="compute-0-0", I again get an error
indicating that it can't resolve the host name, but the node DOES get
removed. The big problem occurs right after this: all other nodes are
then reported as "down" by ganglia, and upon reboot of the entire
cluster, I get the message "Host name lookup failure, error resolving
local host". I'm not sure where to look to resolve this issue. Does
anybody have a guess where to start?
-Roger


Ian Kaufman

May 4, 2010, 3:53:34 PM
to Discussion of Rocks Clusters
Sounds like your networking and/or DNS are messed up.

Ian

--
Ian Kaufman
Research Systems Administrator
UC San Diego, Jacobs School of Engineering ikaufman AT ucsd DOT edu

Bart Brashers

May 4, 2010, 4:02:26 PM
to Discussion of Rocks Clusters

The .local domain is handled by the DNS server running on your
frontend. Check whether it's running (service named status) and whether
it returns any values.
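
To see whether names actually resolve, something like the Python sketch
below could be run on the frontend. It is only an illustration:
check_hosts is a hypothetical helper (not part of Rocks), and it uses
whatever resolver the system is configured with, so it tests
/etc/hosts as well as named.

```python
import socket


def check_hosts(names, resolve=socket.gethostbyname):
    """Return {name: ip_or_None} for each host name.

    `resolve` defaults to the system resolver; it is a parameter only
    so the function can be exercised without a live DNS server.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError.
            results[name] = None
    return results
```

For example, check_hosts(["compute-0-0.local"]) maps the name to None
if it cannot be resolved. The host name here is just the one from the
original post; substitute your own frontend and node names.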

What IP range/mask did you specify for your compute nodes, and what IP
address/mask did you give for the public NIC on your frontend?
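
One thing those two answers can be checked for is whether the private
(compute) network and the public network overlap. That overlap is an
assumed possible cause, not something confirmed in this thread. A
minimal Python sketch, where networks_overlap is a hypothetical helper
and every address is made up for illustration:

```python
import ipaddress


def networks_overlap(private_ip, private_mask, public_ip, public_mask):
    """Return True if the two IP/netmask pairs describe overlapping networks."""
    # strict=False lets us pass a host address (e.g. 10.1.1.1) rather
    # than the network address; the host bits are masked off.
    private = ipaddress.ip_network(f"{private_ip}/{private_mask}", strict=False)
    public = ipaddress.ip_network(f"{public_ip}/{public_mask}", strict=False)
    return private.overlaps(public)


# Illustrative values only; replace with your own frontend settings.
print(networks_overlap("10.1.1.1", "255.255.0.0",
                       "192.168.5.20", "255.255.255.0"))
```

If this prints True for your real settings, the two interfaces share
address space, and that would be worth sorting out before looking at
named itself.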

Bart

