[Rocks-Discuss] 2 compute nodes require a password for ssh


Jonathan Gough

Oct 16, 2008, 10:19:09 AM
to Discussion of Rocks Clusters
Dear rocks admins,

Two nodes that have never given me any problems suddenly stopped letting me ssh
in. I didn't realize this until they stopped accepting jobs from PBS.


Things I have tried:

rocks sync users
I shut down both nodes and restarted, then synced again.

Still nothing. Any ideas?

--
Jonathan D. Gough
Assistant Professor
Department of Chemistry and Biochemistry
Long Island University, Brooklyn

Jeremy Mann

Oct 16, 2008, 10:24:56 AM
to Discussion of Rocks Clusters
Jonathan, check whether the /home directory is being automounted on
those two nodes.

--
Jeremy Mann
jer...@biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672
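A minimal way to run this check from the shell, for anyone hitting the same symptom. The username here is illustrative; on a Rocks cluster you would run this on the compute node itself, or wrap the call in `ssh compute-X-Y`:

```shell
# Check whether a user's automounted home directory is actually mounted.
# The username "jonathan" is illustrative -- substitute your own.
check_home_mount() {
    user="$1"
    # autofs-mounted homes show up in `mount` output once triggered
    if mount | grep -q " /home/$user "; then
        echo "mounted"
    else
        echo "not mounted"
    fi
}

check_home_mount jonathan
```

If this reports "not mounted" even after you `cd /home/<user>` (which should trigger the automount), autofs on that node is the likely culprit.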

Jonathan Gough

Oct 16, 2008, 10:51:20 AM
to Discussion of Rocks Clusters
The home directory is not mounting, that's for sure.

--
Jonathan D. Gough
Assistant Professor
Department of Chemistry and Biochemistry
Long Island University, Brooklyn

Paul Kopec

Oct 16, 2008, 10:52:05 AM
to Discussion of Rocks Clusters
Make sure /home (any user) is NOT mounted on the frontend before you
execute rocks sync users.
i.e. umount /home/<user> first

I can't remember if this applies to /share/apps as well; better to
umount that too before issuing rocks sync users.

At least that is what I do in ROCKS 4.3. :)

______________________________________________________
Paul Kopec
Project Manager
University of Michigan
Dept. of Human Genetics
1241 E. Catherine Street
5928 Buhl Building
Ann Arbor, MI 48109-0618
734-763-5411
pko...@umich.edu


Mike Hanby

Oct 16, 2008, 10:54:12 AM
to Discussion of Rocks Clusters

What error do you get when you try to ssh to the nodes: a timeout, or a
message about port 22 not accepting connections? Can you ping the nodes? Is
it just your user who can't ssh in, or is root unable to as well?

Have you hooked up a monitor / keyboard to the nodes to ensure that they
are actually successfully booting?

When you say you rebooted the nodes, did you hit the reset switch /
power button, or log in locally as root and issue the shutdown command?

As Johnny 5 would say, "need more input!"

Jonathan Gough

Oct 16, 2008, 10:57:56 AM
to Discussion of Rocks Clusters
I can log in as user jonathan, but it requires a password, and it says /home
isn't found.
I can log in as root without a password.

--
Jonathan D. Gough
Assistant Professor
Department of Chemistry and Biochemistry
Long Island University, Brooklyn

Bart Brashers

Oct 16, 2008, 11:25:48 AM
to Discussion of Rocks Clusters
This means that auto-mounting isn't working on the node. Log in as root
and check it, and possibly start it:

# ssh compute-X-Y service autofs status

If it's not running:

# ssh compute-X-Y service autofs start

You can also try "service autofs restart".

Bart



Jeremy Mann

Oct 16, 2008, 11:33:10 AM
to Discussion of Rocks Clusters
SSH as root to that node and look at the contents of the
/etc/auto.home file. You should have entries for each of your users:

jeremy bcf.local:/export/home/user/&

If you do not, you can repopulate that file using dbreport, copy it to
that node and restart autofs.

--
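The auto.home map Jeremy describes can also be rebuilt by hand in a pinch. A minimal sketch; the frontend hostname "frontend.local" and the /export/home path are illustrative, and on a working frontend the Rocks tools regenerate this file for you:

```shell
# Sketch: build /etc/auto.home entries for a list of users.
# "frontend.local" and /export/home are illustrative; adjust to your site.
make_auto_home() {
    host="$1"; shift
    for user in "$@"; do
        # one line per user: <key> <tab> <server>:<exported path>
        printf '%s\t%s:/export/home/%s\n' "$user" "$host" "$user"
    done
}

make_auto_home frontend.local jonathan jeremy
```

Redirect the output to /etc/auto.home on the broken node and restart autofs there to test the theory before fixing the sync path properly.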

Greg Bruno

Oct 16, 2008, 12:30:30 PM
to Discussion of Rocks Clusters
On Thu, Oct 16, 2008 at 8:33 AM, Jeremy Mann <jerem...@gmail.com> wrote:
> SSH as root to that node and look at the contents of the
> /etc/auto.home file. You should have entries for each of your users:
>
> jeremy bcf.local:/export/home/user/&
>
> If you do not, you can repopulate that file using dbreport, copy it to
> that node and restart autofs.

an easier way to do the same thing: on the frontend, execute:

# rocks sync config

that regenerates all files under the control of 411 (of which
/etc/auto.home is one), broadcasts a message to all compute nodes that
tells the compute nodes to get the files under 411's control, then
restarts autofs on all compute nodes.

- gb

Jonathan Gough

Oct 16, 2008, 1:30:29 PM
to Discussion of Rocks Clusters
I did a reboot already.

How do I reinstall?

--
Jonathan D. Gough
Assistant Professor
Department of Chemistry and Biochemistry
Long Island University, Brooklyn

Bart Brashers

Oct 16, 2008, 1:46:15 PM
to Discussion of Rocks Clusters
# ssh compute-0-0 '/boot/kickstart/cluster-kickstart'

or

# rocks set host pxeboot compute-0-0 action=install

and then PXE boot the node (just reboot it, if PXE is first in the BIOS
boot order).

or

Punch the power button / reset button, or unplug the node. Rocks
re-installs by default if a node goes down ungracefully.

Bart


Jonathan Gough

Oct 16, 2008, 1:52:19 PM
to Discussion of Rocks Clusters
That did it! Thanks for the patience.

--
Jonathan D. Gough
Assistant Professor
Department of Chemistry and Biochemistry
Long Island University, Brooklyn

Mike Hanby

Oct 16, 2008, 4:36:56 PM
to Discussion of Rocks Clusters
cool, so no more need to do:
make -C /var/411
make -C /var/411 force
cluster-fork 411get --all

rocks sync config <--- much nicer :-)

