[Rocks] [Rocks-Discuss] Trouble Adding Virtual Frontend


Rick Wagner

May 13, 2010, 7:41:30 PM5/13/10
to Discussion of Rocks Clusters
Hi,

I'm having some trouble getting a virtual frontend installed on my physical frontend. When I run "rocks add cluster", I see messages that make me think something is wrong at the very start (note: the 0 is because this cluster will have a VM for its frontend and physical appliances for its nodes):

[root@my-fe ~]# rocks add cluster vfe.example.edu xxx.xxx.xxx.xxx 0 vlan=xxxx
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
lost connection
lost connection
lost connection
lost connection
lost connection
created frontend VM named: frontend-0-0-0

After this, I (skeptically) tried starting the frontend, and saw an error about Xen not being able to access a disk:

[root@my-fe ~]# rocks start host vm vfe.example.edu install="yes"
lost connection
Traceback (most recent call last):
File "/opt/rocks/bin/rocks", line 264, in ?
command.runWrapper(name, args[i:])
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py", line 1774, in runWrapper
self.run(self._params, self._args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/start/host/vm/__init__.py", line 443, in run
self.bootVM(physhost, host, xmlconfig)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/start/host/vm/__init__.py", line 392, in bootVM
hipervisor.createLinux(xmlconfig, 0)
File "/usr/lib64/python2.4/site-packages/libvirt.py", line 974, in createLinux
if ret is None:raise libvirtError('virDomainCreateLinux() failed', conn=self)
libvirt.libvirtError: POST operation failed: xend_post: error from xen daemon: (xend.err "Error creating domain: Disk isn't accessible")
[root@my-fe ~]#

Since I'm not sure where the Xen roll wants to store the VM images by default, here's some output from df, showing that I only have space under /export (/state/partition1 doesn't normally exist on frontends).

[root@my-fe ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              31G   12G   19G  39% /
/dev/sda5             200G  7.6G  182G   4% /export
/dev/sda2              31G   11G   19G  37% /var
tmpfs                  12G     0   12G   0% /dev/shm
tmpfs                 5.8G     0  5.8G   0% /var/lib/ganglia/rrds
none                   12G  104K   12G   1% /var/lib/xenstored
[root@my-fe ~]#

I tried this on two similar frontends, and got the same results both times. As always, any help is appreciated.

Thanks,
Rick

--
You received this message because you are subscribed to the Google Groups "Rocks Clusters" group.
To post to this group, send email to rocks-c...@googlegroups.com.
To unsubscribe from this group, send email to rocks-cluster...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/rocks-clusters?hl=en.

Philip Papadopoulos

May 13, 2010, 8:18:16 PM5/13/10
to Discussion of Rocks Clusters
Rick,

What is your current lib path?
Is the pgi module loaded?

If so, the PGI libs are causing problems.
Unload the module, remove the host, and start again.


P
Sent from my mobile device

Philip Papadopoulos, PhD
University of California, San Diego
858-822-3628 (Ofc)
619-331-2990 (Fax)

Rick Wagner

May 13, 2010, 8:54:29 PM5/13/10
to Discussion of Rocks Clusters
Hi Phil,

I'm afraid I don't see PGI in either the loaded modules or the environment variables.

--Rick

[root@my-fe ~]# module list
Currently Loaded Modulefiles:
1) intel/11.1
[root@my-fe ~]#

[root@my-fe ~]# env | grep -i lib
LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/046/lib/intel64:/opt/intel/Compiler/11.1/046/ipp/em64t/sharedlib:/opt/intel/Compiler/11.1/046/mkl/lib/em64t:/opt/intel/Compiler/11.1/046/tbb/em64t/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/opt/intel/libtemp:/opt/torque/lib64:/opt/moab/lib
LIB=/opt/intel/Compiler/11.1/046/ipp/em64t/lib
NLSPATH=/opt/intel/Compiler/11.1/046/lib/intel64/locale/%l_%t/%N:/opt/intel/Compiler/11.1/046/ipp/em64t/lib/locale/%l_%t/%N:/opt/intel/Compiler/11.1/046/mkl/lib/em64t/locale/%l_%t/%N:/opt/intel/Compiler/11.1/046/idb/intel64/locale/%l_%t/%N
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
PYTHONPATH=/opt/scar/lib
[root@my-fe ~]#

Philip Papadopoulos

May 13, 2010, 9:08:22 PM5/13/10
to Discussion of Rocks Clusters
On Thu, May 13, 2010 at 5:54 PM, Rick Wagner <rwa...@physics.ucsd.edu> wrote:

> [quoted text snipped]

It's probably the Intel compilers. All the Xen stuff is linked against gcc's glib and other libs; with Intel leading your lib path, libvirt or other system apps may be trying to use the Intel versions.

-P
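
Phil's hypothesis can be tested without permanently changing the environment. The sketch below (an assumption on our part, not something from the thread) uses `env -u` to strip `LD_LIBRARY_PATH` for a single child process, which is the same mechanism one could use to retry the failing `rocks` commands against the system libraries:

```shell
# Simulate the polluted environment from the earlier 'env | grep -i lib' output.
export LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/046/lib/intel64

# 'env -u VAR cmd' runs cmd with VAR removed; the parent shell is untouched.
env -u LD_LIBRARY_PATH sh -c 'echo "child sees: ${LD_LIBRARY_PATH:-unset}"'
echo "parent keeps: $LD_LIBRARY_PATH"
```

If the Intel libs were the culprit, prefixing the failing command the same way (e.g. `env -u LD_LIBRARY_PATH rocks start host vm vfe.example.edu install="yes"`) should change the outcome.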

Rick Wagner

May 17, 2010, 12:59:53 PM5/17/10
to Discussion of Rocks Clusters
Hi Phil,

I tried swapping the Intel compilers for the GNU compilers, with identical results. I am waiting for the DNS entry of the new frontend to propagate, but I'm assuming that's not a problem.

--Rick

[root@my-fe ~]# module swap intel gnu
[root@my-fe ~]# env | grep -i lib
LD_LIBRARY_PATH=/opt/torque/lib64:/opt/moab/lib
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
PYTHONPATH=/opt/scar/lib
[root@my-fe ~]#

[root@my-fe ~]# rocks add cluster vfe.example.edu xxx.xxx.xxx.xxx 0 vlan=xxxx
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
close failed: [Errno 32] Broken pipe
lost connection
lost connection
lost connection
lost connection
lost connection
created frontend VM named: frontend-0-0-0
[root@my-fe ~]# rocks start host vm vfe.example.edu install="yes"
lost connection
Traceback (most recent call last):
File "/opt/rocks/bin/rocks", line 264, in ?
command.runWrapper(name, args[i:])
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py", line 1774, in runWrapper
self.run(self._params, self._args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/start/host/vm/__init__.py", line 443, in run
self.bootVM(physhost, host, xmlconfig)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/start/host/vm/__init__.py", line 392, in bootVM
hipervisor.createLinux(xmlconfig, 0)
File "/usr/lib64/python2.4/site-packages/libvirt.py", line 974, in createLinux
if ret is None:raise libvirtError('virDomainCreateLinux() failed', conn=self)
libvirt.libvirtError: POST operation failed: xend_post: error from xen daemon: (xend.err "Error creating domain: Disk isn't accessible")
[root@my-fe ~]#



Rick Wagner

May 17, 2010, 5:48:43 PM5/17/10
to Discussion of Rocks Clusters
For posterity's sake:

This problem was caused by the SSH configuration of our frontend, which restricts SSH logins by root. My assumption is that the commands used to create a VM on a container appliance go over SSH. We're hosting a single VM on our frontend, but root was not able to SSH to localhost, which made the 'add cluster' command fail.

--Rick
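
The failure mode Rick describes can be sketched as a small check. The `cfg` fragment below is a hypothetical sshd_config, not the actual one from his frontend:

```shell
# Hypothetical sshd_config content matching the failure described above.
cfg='PermitRootLogin no
PasswordAuthentication yes'

# The rocks VM commands appear to shell out over SSH as root (even to
# localhost for a frontend-hosted VM), so this directive breaks them.
if printf '%s\n' "$cfg" | grep -qi '^PermitRootLogin[[:space:]]*no'; then
    echo "root SSH login disabled - rocks VM commands will fail"
fi
```

On a live frontend, `ssh -o BatchMode=yes root@localhost true; echo $?` is a quick non-interactive probe of the same condition (exit status 0 means root SSH works).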