[Rocks-Discuss] Compute node reboot


mike vandewege

Jan 30, 2012, 5:49:43 PM
to Discussion of Rocks Clusters
Hello gurus,

I had all of the nodes and the FE working fine before the weekend. However, I
rebooted the front end, and after the reboot I couldn't ssh into any compute
node. I'd get the error:

ssh: connect to host compute-0-0 port 22: Connection refused

So I manually KVM'ed (verb?) into compute-0-0 and saw the error message:

Unable to locate partition mapper/nvidia_afjdheif3 to use for . Press 'OK'
to reboot system.

Which I did, but after the restart the node just looped back to the same error
message. eth0 is connected and running fine on the FE. Are the FE and nodes
not talking, and why would I be getting this error message?

Mike

--
Michael Vandewege, Ph.D. Student
Graduate Research Assistant
Dept. of Biochemistry and Molecular Biology
Mississippi State University
Mississippi State, MS 39762
Email: mike.va...@gmail.com

Luca Clementi

Feb 1, 2012, 12:32:02 PM
to Discussion of Rocks Clusters
Hey Mike,
it seems that you have the Nvidia ION RAID device on your compute nodes:
http://superuser.com/questions/110395/nvidia-ion-and-dev-mapper-nvidia-issues

Maybe it's a good idea to deactivate it in the BIOS and re-install all
the compute nodes.
To reinstall all the compute nodes you can execute on the FE:
rocks set host boot compute action=install
and then restart the compute nodes.
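
The reinstall step above could be wrapped in a small script along these lines. This is only a sketch, not something from the Rocks documentation: the `run` helper, the `DRY_RUN` variable and the `TARGET` argument are hypothetical names added for illustration, and the script defaults to printing the commands rather than executing them. It assumes it is run on the FE, where the `rocks` command is available.

```shell
#!/bin/sh
# Sketch only: run on the frontend (FE). Defaults to a dry run that prints
# the commands instead of executing them; set DRY_RUN=0 to run them for real.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper (not part of Rocks): print or execute a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Host or appliance to reinstall; "compute" is the value used in this thread.
# Pass a single host such as compute-0-0 to try one node first.
TARGET="${1:-compute}"

# Flag the target so that it reinstalls the next time it boots.
run rocks set host boot "$TARGET" action=install

# Show the boot action now recorded for each host.
run rocks list host boot
```

The script does not restart anything itself. Since ssh to the nodes is refused in this case, the restart would have to be done from the console or by power-cycling each node.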

Rebooting the FE should not create any problem on a properly
installed cluster.
Before rebooting the FE I would try rebooting a compute node (just to
verify the compute node can reboot properly).

Sincerely,
Luca

mike vandewege

Feb 2, 2012, 4:04:50 PM
to Discussion of Rocks Clusters
Thank you for the help; this solved my problem. I think it would have taken
me weeks to figure this out on my own.

Mike
