I'm a newbie to Rocks.
After running 'make rpm' to add a new device driver to our Rocks 5.3
installation, we are no longer able to kickstart a new node.
The error appears on the console right after this line is displayed:
Retrieving /install/sbin/public/kickstart.cgi!arch=x86_64&np=16...
The error message displayed:
Running anaconda, the Rocks system installer - please wait...
python: error while loading shared libraries: libpython2.4.so.1.0:
cannot open shared object file: No such file or directory
/usr/bin/python: error while loading shared libraries:
libpython2.4.so.1.0: cannot open shared object file: No such file or
directory
sending termination signals...done
After that, the node just reboots.
On our headnode, the python library looks fine:
# ldd /usr/bin/python
libpython2.4.so.1.0 => /usr/lib64/libpython2.4.so.1.0 (0x0000003b67c00000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003b59600000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003b59200000)
libutil.so.1 => /lib64/libutil.so.1 (0x0000003b64000000)
libm.so.6 => /lib64/libm.so.6 (0x0000003b58e00000)
libc.so.6 => /lib64/libc.so.6 (0x0000003b58a00000)
/lib64/ld-linux-x86-64.so.2 (0x0000003b58600000)
# ls -la /usr/lib64/libpython*
lrwxrwxrwx 1 root root 19 Feb 23 2010 /usr/lib64/libpython2.4.so -> libpython2.4.so.1.0
-r-xr-xr-x 1 root root 1236344 Sep 3 2009 /usr/lib64/libpython2.4.so.1.0
Do I need to run 'ldconfig' to sort out all the library links before the
build?
Please advise on how to resolve this issue.
Thanks.
Steven Lo.
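
A quick way to see whether an interpreter's shared libraries resolve is to filter the `ldd` output for entries marked "not found". The sketch below assumes a POSIX shell with awk; `missing_libs` is a hypothetical helper name, not part of Rocks. It only tests the machine it runs on (here the frontend), so a clean result says nothing about the installer environment that a PXE-booted node runs.

```shell
# Hypothetical helper: print the names of shared libraries that ldd
# reports as "not found". It reads ldd-style output on stdin, so it can
# be exercised without a real binary.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use on the frontend (path taken from this thread):
#   ldd /usr/bin/python | missing_libs
```

An empty result means every library listed by `ldd` was resolved on that host.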
Thanks,
Yu
This is the version we have been using for the last couple of years. I'm not
sure what's involved in upgrading Rocks from 5.3 to 5.4.
Eventually we will have to perform this upgrade, but we thought it would be
easier to just add the new driver to the kernel image.
> 2)what is your HW: frontend and compute nodes (server model, CPU , RAM
> ,HDD and NIC?
Frontend:
Server Model: Supermicro X8DTT motherboard
CPU: Intel(R) Xeon CPU L5520 @ 2.27GHz, dual processor with 4 cores each
RAM: 24GB
HDD: ATA Hitachi HDS72202
NIC: Intel 82576 Gigabit Network Connection
Compute Nodes:
Server Model: Supermicro X9DRT motherboard
CPU: Intel Xeon E5-2600 Sandy-Bridge
RAM: 48GB
HDD: Seagate 1TB disk
NIC: Intel I350 dual port Gigabit Ethernet LAN controller
> 3)which NIC or HDD or else that you need to add driver?
It's the Intel 'igb' NIC with the I350 chipset.
BTW, we have kickstarted more than 20 compute nodes in our cluster using this
version of Rocks over the last couple of years, and this is the first time we
have needed to make a new build to include a new driver.
After the new build, we think the driver works fine, since the compute node is
able to download the image. However, the problem occurs when it tries to
execute the Python .cgi script.
Thanks.....
Steven.....
On 2/20/2012 2:28 PM, Steven Lo wrote:
> Supermicro X9DRT motherboard
--
Hung-Sheng Tsao Ph D.
Founder& Principal
HopBit GridComputing LLC
cell: 9734950840
http://laotsao.blogspot.com/
http://laotsao.wordpress.com/
http://blogs.oracle.com/hstsao/
Thanks.
Steven.
On 02/20/2012 01:37 PM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
> there is service pack 5.3.1 did you install this service pack?
>
>
> On 2/20/2012 2:28 PM, Steven Lo wrote:
>> Supermicro X9DRT motherboard
>
-P
--
Philip Papadopoulos, PhD
University of California, San Diego
858-822-3628 (Ofc)
619-331-2990 (Fax)
This is what we did:
# cd /export
# hg clone http://fyp.rocksclusters.org/hg/rocks-5.3
# cd /export/rocks-5.3/src/roll/kernel/src/rocks-boot/enterprise/5/images/drivers
# mkdir igb
# cd igb
# cp ../e1000/modinfo .
# cp ../e1000/Makefile* .
# cp ../e1000/modules.dep .
# cp ../e1000/pcitable .
-bash-3.2$ pwd
/home/slo/drivers
-bash-3.2$ tar xzvf igb-3.3.6.tar.gz
# pwd
/export/rocks-5.3/src/roll/kernel/src/rocks-boot/enterprise/5/images/drivers/igb
# cp /home/slo/drivers/igb-3.3.6/src/*.c .
# cp /home/slo/drivers/igb-3.3.6/src/*.h .
# vi modinfo
igb
net
"Intel 82576 and I350-based Gigabit Ethernet Controller"
# cat modules.dep
# <--- empty file
# cat pcitable
0x8086 0x10c9 "igb" "82576 Gigabit Network Connection"
0x8086 0x10e6 "igb" "82576 Gigabit Network Connection"
0x8086 0x10e7 "igb" "82576 Gigabit Network Connection"
0x8086 0x10e8 "igb" "82576 Gigabit Network Connection"
0x8086 0x150a "igb" "82576NS Gigabit Ethernet Controller"
0x8086 0x1521 "igb" "I350 Gigabit Network Connection"
0x8086 0x1524 "igb" "I350 Gigabit Connection"
# vi Makefile
MODULES := igb
SOURCES := e1000_82575.c e1000_api.c e1000_mac.c e1000_manage.c \
    e1000_mbx.c e1000_nvm.c e1000_phy.c igb_ethtool.c igb_main.c igb_param.c \
    igb_procfs.c igb_sysfs.c igb_vmdq.c kcompat.c kcompat_ethtool.c
HEADERS := e1000_82575.h e1000_api.h e1000_defines.h e1000_hw.h \
    e1000_mac.h e1000_manage.h e1000_mbx.h e1000_nvm.h e1000_osdep.h \
    e1000_phy.h e1000_regs.h igb.h igb_regtest.h igb_vmdq.h kcompat.h
# cd ..
# pwd
/export/rocks-5.3/src/roll/kernel/src/rocks-boot/enterprise/5/images/drivers
# cat subdirs
#
# put a list of all the driver directories that you'd like to build.
#
# for example, to build the 'e1000' driver, uncomment the line below:
# e1000
igb
# cd /export/rocks-5.3/src/roll/kernel/src/rocks-boot
# make rpm <-----
# cp /export/rocks-5.3/src/roll/kernel/RPMS/x86_64/rocks-boot* \
    /export/rocks/install/contrib/5.3/x86_64/RPMS/
# cd /export/rocks/install
# rocks create distro
# pwd
/export/rocks/install
# rpm -Uvh --force rocks-dist/x86_64/RedHat/RPMS/rocks-boot-5*.rpm
Preparing... ########################################### [100%]
1:rocks-boot ########################################### [100%]
# cp /boot/kickstart/default/initrd.img-5.3-x86_64 /tftpboot/pxelinux/
# cp /boot/kickstart/default/vmlinuz-5.3-x86_64 /tftpboot/pxelinux/
Next, we PXE boot the new node, and that's when we get the error. Again, it
looks like the driver is working fine; otherwise, we would not be able to
retrieve the image and kickstart cgi files.
BTW, we did get the following error when we ran the build with "make
rpm" above:
python: error while loading shared libraries:
libpython2.4.so.1.0: cannot open shared object file: No such file or
directory
This is the same error we get at PXE boot.
Please let us know if additional information is needed.
Thanks.
Steven.
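
Because the driver Makefile above names every source and header explicitly, a file that was never copied into the driver directory only surfaces later as a build failure. A minimal sketch of a pre-build check follows; `missing_sources` is a hypothetical helper, and the file names in the usage comment are simply the ones listed in the Makefile above.

```shell
# Hypothetical helper: print each file name from the argument list that
# does not exist as a regular file in directory $1.
missing_sources() {
    dir=$1
    shift
    for f in "$@"; do
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Example use from inside the igb driver directory:
#   missing_sources . e1000_82575.c e1000_api.c igb_main.c kcompat.h
```

No output means all listed files are present. This check is unrelated to the libpython error itself; it only rules out one class of build problems.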
For 5.3, besides the service-pack ISO for 5.3.1,
there are also updates for 5.3.2.
The contents of the updates directory should give you a hint of what was updated:
>
> Index of /ftp-site/pub/rocks/updates/5.3
> (http://www.rocksclusters.org/ftp-site/pub/rocks/updates/5.3/)
>
> Name                                      Last modified       Size
> ------------------------------------------------------------------
> Parent Directory                                              -
> extend-client.xml                         09-Aug-2010 13:21   139
> rocks-anaconda-updates-5.3-2.x86_64.rpm   09-Aug-2010 13:16   26M
> rocks-pylib-5.3-1.noarch.rpm              09-Aug-2010 13:22   187K
> ------------------------------------------------------------------
>
On 2/20/2012 8:30 PM, Steven Lo wrote:
>
> Nope. I'm able to find the ISO image for the service pack but unable
> to find the bug fixed list. Do you know where I can find it? Just want
> to see what's being fixed in this service pack.
>
> Thanks.
>
> Steven.
>
>
> On 02/20/2012 01:37 PM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
>> there is service pack 5.3.1 did you install this service pack?
>>
>>
>> On 2/20/2012 2:28 PM, Steven Lo wrote:
>>> Supermicro X9DRT motherboard
>>
>
--
Hung-Sheng Tsao Ph D.
Founder& Principal
HopBit GridComputing LLC
cell: 9734950840
We have a problem installing the service pack:
# which rocks
/opt/rocks/bin/rocks
# pwd
/export
# ls -la *.iso
-rw-r--r-- 1 root root 28297216 Aug 13 2010 service-pack-5.3.1-1.x86_64.disk1.iso
# rocks remove roll service-pack
error - unknown roll name "service-pack" <-- first sign of trouble
# rocks set host attr localhost roll_install_on_the_fly true shadow=yes
# rocks add roll service-pack*iso
Traceback (most recent call last):
  File "/opt/rocks/bin/rocks", line 264, in ?
    command.runWrapper(name, args[i:])
  File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py", line 1774, in runWrapper
    self.run(self._params, self._args)
  File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/add/roll/__init__.py", line 705, in run
    roll_handler = RollHandler(self.arch, self.os, self.db)
  File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/add/roll/__init__.py", line 260, in __init__
    os.makedirs(self.cdrom_mount)
  File "/opt/rocks/lib/python2.4/os.py", line 159, in makedirs
    mkdir(name, mode)
OSError: [Errno 2] No such file or directory: '/mnt/cdrom'
# rocks add roll ./service-pack-5.3.1-1.x86_64.disk1.iso
Traceback (most recent call last):
File "/opt/rocks/bin/rocks", line 233, in ?
__import__(s)
ValueError: Empty module name
Do you have any idea what's going on?
Since we cannot find the instructions for 5.3, we are following
the instructions for 5.4 instead:
http://www.rocksclusters.org/roll-documentation/service-pack/5.4.2/installing-onthefly.html
Thanks.....
Steven.....
On 02/22/2012 09:49 AM, Steven Lo wrote:
>
> Hi,
>
> We have not had the chance to apply the service pack yet since it does
> require system reboot. Unfortunately, this system is also a home
> directory for user jobs, therefore coordination is required.
>
> Thanks for the test result.
>
> Steven.....
>
> On 02/22/2012 03:30 AM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
>> Hi,
>> I don't know what you changed on your Rocks 5.3 system.
>> I just tested on Rocks 5.3 with the service pack, following the instructions
>> in the base user guide and your instructions for igb:
>> 1) able to make rpm with e1000
>> 2) able to make rpm with both e1000 and igb,
>> with no Python error
>> 3) able to install a compute node
>> Regards
Sent from my iPad
If not, what is our option??
Thanks.....
Steven.....
How big is your cluster?
Sorry, I don't know how to fix your problem; I'm not sure what you did to your cluster's frontend.
2012/2/22 Hung-Sheng Tsao (laoTsao) <lao...@gmail.com>
--
Philip Papadopoulos, PhD
University of California, San Diego
858-822-3628 (Ofc)
619-331-2990 (Fax)
We managed to get service pack 5.3.1 installed (thanks to Dr.
Hung-Sheng Tsao).
However, we still see the same error when we PXE boot a compute node.
Does anybody have another idea besides rebuilding the whole cluster (which may
not be feasible at this time and may require a lot of work)?
We are still not sure why it cannot locate libpython2.4.so.1.0, which
is located
in the standard place, /usr/lib64.
Thanks.....
Steven.....
On 02/23/2012 04:59 PM, Steven Lo wrote:
>
> # mkdir -p /mnt/cdrom
>
> # ls /export/*.iso
> /export/service-pack-5.3.1-1.x86_64.disk1.iso
>
> # mount -o loop /export/service-pack-5.3.1-1.x86_64.disk1.iso /mnt/cdrom
>
> # rocks add roll
> /state/partition1/service-pack-5.3.1-1.x86_64.disk1.iso on /mnt/cdrom
> type iso9660 (rw,loop=/dev/loop0)
> Copying service-pack to Rolls.....54509 blocks
>
> # rocks list roll
> NAME VERSION ARCH ENABLED
> base: 5.3 x86_64 yes
> ganglia: 5.3 x86_64 yes
> hpc: 5.3 x86_64 yes
> kernel: 5.3 x86_64 yes
> os: 5.3 x86_64 yes
> sge: 5.3 x86_64 yes
> web-server: 5.3 x86_64 yes
> service-pack: 5.3.1 x86_64 no
>
> # rocks enable roll service-pack
>
> # cd /export/rocks/install
>
> # rocks create distro
>
> # rocks run roll service-pack | bash
> Preparing... ########################################### [100%]
> 1:rocks-pylib ########################################### [100%]
> Preparing... ########################################### [100%]
> 1:roll-service-pack-users###########################################
> [100%]
>
>
> # make rpm
>
> Rest are same as last time.
>
> Thanks.....
>
> Steven.....
>
>
>
>
>
> On 02/23/2012 02:31 PM, Hung-Sheng Tsao (laoTsao) wrote:
>> how did you install the service-pack?
>>
>> Sent from my iPad
>>
>> On Feb 23, 2012, at 16:47, Steven Lo <s...@hep.caltech.edu> wrote:
>>
>>> Hi,
>>>
>>> We are able to install the service pack using the /mnt/cdrom
>>> as suggested.
>>>
>>> [root@t3-local pxelinux]# rocks list roll
>>> NAME VERSION ARCH ENABLED
>>> base: 5.3 x86_64 yes
>>> ganglia: 5.3 x86_64 yes
>>> hpc: 5.3 x86_64 yes
>>> kernel: 5.3 x86_64 yes
>>> os: 5.3 x86_64 yes
>>> sge: 5.3 x86_64 yes
>>> web-server: 5.3 x86_64 yes
>>> service-pack: 5.3.1 x86_64 yes
>>>
>>> We did re-build using 'make rpm', copy 'rocks-boot*' to
>>> /export/rocks/install/contrib/5.3/x86_64/RPMS/, rebuild distro,
>>> install initrd.img-5.3-x86_64 and vmlinuz-5.3-x86_64, and
>>> re-kick the node.
>>>
>>> Unfortunately, we get the same error as before, complaining about the
>>> Python library.
>>>
>>> When we do the 'make rpm', we noticed that the following error
>>> still exists:
>>>
>>> Creating SELinux policy...
>>> /usr/bin/python: error while loading shared libraries:
>>> libpython2.4.so.1.0: cannot open shared object file: No such file or
>>> directory
>>>
>>> This is a bad sign as far as we can tell.
>>>
>>> Do you think 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64' would
>>> help??
>>> I doubt it.
>>>
>>> Other suggestions if you could??
>>>
>>> Thanks.....
>>>
>>> Steven.....
>>>
>>>
>>> On 02/23/2012 10:42 AM, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
>>>> This error is different.
>>>> 1) 'rocks remove roll service-pack'
>>>> generates an error because you do not have the service-pack roll.
>>>> Run 'rocks list roll'
>>>> first to check.
>>>> 2) The second error can be solved with the following steps:
>>>> a) mkdir -p /mnt/cdrom
>>>> b) mount -o loop <path>/service-pack........iso /mnt/cdrom
>>>> c) rocks add roll
>>>> -LT
>>>>
>>>>
>
Luca
We continue to debug this problem.
We have replaced the python package with 'rpm --replacepkgs' as suggested:
[root@t3-local RPMS]# rpm -Uvh --replacepkgs python-devel-2.4.3-27.el5.i386.rpm
Preparing... ########################################### [100%]
1:python-devel ########################################### [100%]
[root@t3-local RPMS]# rpm -Uvh --replacepkgs python-2.4.3-27.el5.x86_64.rpm
Preparing... ########################################### [100%]
1:python ########################################### [100%]
[root@t3-local RPMS]# rpm -Uvh --replacepkgs python-devel-2.4.3-27.el5.x86_64.rpm
Preparing... ########################################### [100%]
1:python-devel ########################################### [100%]
We have also replaced anaconda and installed anaconda-runtime:
[root@t3-local RPMS]# rpm -ivh --replacepkgs --oldpackage --replacefiles anaconda-11.1.2.195-1.x86_64.rpm
Preparing... ########################################### [100%]
1:anaconda ########################################### [100%]
[root@t3-local RPMS]# rpm -ivh anaconda-runtime-11.1.2.195-1.x86_64.rpm
Preparing... ########################################### [100%]
1:anaconda-runtime ########################################### [100%]
The reason we suspect the Anaconda build is that we see the following
error during the 'make rpm' process:
making "torrent" files for RPMS
Buildinstall cmd: ./buildinstall --product "Rocks" --prodpath "RedHat" --version "0.0.0" --release "0" /state/partition1/rocks-5.3/src/roll/kernel/BUILD/rocks-boot-5.3/enterprise/5/images/x86_64/rocks-dist/x86_64
error: open of d8:announce29:http://10.4.1.1:7625/announce13:creation failed: No such file or directory
error: open of datei1330655008e4:infod6:lengthi1961678e4:name40:anaconda-runtime-11.1.2.195-1.x86_64.rpmee failed: No such file or directory
Running buildinstall...
~/enterprise/5/images/x86_64/rocks-dist/x86_64/buildinstall.tree.3669
~/enterprise/5/images/x86_64/anaconda-runtime/usr/lib/anaconda-runtime
~/enterprise/5/images/x86_64/anaconda-runtime/usr/lib/anaconda-runtime
Are these normal error messages during the build?
We still see the following error during the build:
Creating SELinux policy...
/usr/bin/python: error while loading shared libraries:
libpython2.4.so.1.0: cannot open shared object file: No such file or
directory <-----
libsemanage.semanage_install_sandbox: genhomedircon returned error code 127.
Any advice on this issue, please?
Thanks.
Steven.
On 02/22/2012 04:00 PM, Philip Papadopoulos wrote:
> Steven,
> you can look at
> yum provides '*/libpython2.4*'
> and see which RPMs provide the file. You can attempt a yum reinstall
> of the appropriate package.
> -P
>
>
> 2012/2/22 Hung-Sheng Tsao (laoTsao) <lao...@gmail.com
> <mailto:lao...@gmail.com>>
>
> I'm not sure what happened here; maybe just reinstall.
> Back up your user data and any custom apps first.
>
> How big is your cluster?
> Sorry, I don't know how to fix your problem; I'm not sure what you did to
> your cluster's frontend.
>
>
> Sent from my iPad
>
> On Feb 22, 2012, at 17:47, Steven Lo <s...@hep.caltech.edu
> <mailto:s...@hep.caltech.edu>> wrote:
>
> >
> > Is this fixable??
> >
> > If not, what is our option??
> >
> > Thanks.....
> >
> > Steven.....
> >
> >
> > On 02/22/2012 02:29 PM, Hung-Sheng Tsao (laoTsao) wrote:
> >> imho, your frontend python is broken
> >>
> >> Sent from my iPad
> >>
> >> On Feb 22, 2012, at 17:22, Steven Lo <s...@hep.caltech.edu
> >>>> On 02/22/2012 03:30 AM, "Hung-Sheng Tsao (Lao Tsao ??) Ph.D."
> >>>>>>>> On 02/20/2012 01:37 PM, "Hung-Sheng Tsao (Lao Tsao ??)
> Ph.D." wrote:
> >>>>>>>>> there is service pack 5.3.1 did you install this service
> pack?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 2/20/2012 2:28 PM, Steven Lo wrote:
> >>>>>>>>>> Supermicro X9DRT motherboard
> >>>>>>>> -------------- next part --------------
> >>>>>>>> An HTML attachment was scrubbed...
> >>>>>>>> URL:
> >>>>>>>>
> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20120220/2d548bf4/attachment.html
> >>>>>>>>
> >>>>>>>>
> >>>>
> >>>
> >
> >
>
>
>
>
> --
> Philip Papadopoulos, PhD
> University of California, San Diego
> 858-822-3628 (Ofc)
> 619-331-2990 (Fax)
-------------- next part --------------
An HTML attachment was scrubbed...
Normally it should pick up /opt/rocks/bin/python.
Luca
The search path definitely has /usr/bin in front of /opt/rocks/bin (for
root):
PATH=/opt/gridengine/bin/lx26-amd64:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/java/latest/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/openmpi/bin/:/opt/rocks/bin:/opt/rocks/sbin:/opt/sun-ct/bin:/root/bin
But I thought that's the way it should be.
You are correct. /opt/rocks/bin/python is not dynamically linked against
libpython.
So do you have any idea why it is picking up the system-wide Python?
S.
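
To see which `python` wins for a given search path, it can help to list every candidate in lookup order. The sketch below is a plain POSIX shell function; `path_candidates` is a hypothetical name, and it assumes the directories in the path contain no glob characters. In bash, `type -a python` gives similar information for the current PATH.

```shell
# Hypothetical helper: list every executable file named $1 found in the
# colon-separated search path $2, in lookup order. The first line printed
# is the one a shell using that path would run.
path_candidates() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        [ -n "$dir" ] && [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ] \
            && printf '%s\n' "$dir/$name"
    done
    IFS=$old_ifs
}

# Example use with root's PATH from the message above:
#   path_candidates python "$PATH"
```

With the PATH quoted above, /usr/bin precedes /opt/rocks/bin, so /usr/bin/python would be listed first. Whether the rocks-boot build relies on PATH or on an explicit interpreter path is not established in this thread.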
Can someone help us confirm that /opt/rocks/bin/python should be called
in the make process as suggested by Luca??
Thanks.
S.
You have to revert these actions using the original RPM (which
you can find in the backup).
> # cp /export/rocks-5.3/src/roll/kernel/RPMS/x86_64/rocks-boot*
> /export/rocks/install/contrib/5.3/x86_64/RPMS/
>
>
> # cd /export/rocks/install
>
> # rocks create distro
>
> # pwd
> /export/rocks/install
>
> # rpm -Uvh --force rocks-dist/x86_64/RedHat/RPMS/rocks-boot-5*.rpm
> Preparing... ###########################################
> [100%]
> 1:rocks-boot ###########################################
> [100%]
>
> # cp /boot/kickstart/default/initrd.img-5.3-x86_64 /tftpboot/pxelinux/
>
> # cp /boot/kickstart/default/vmlinuz-5.3-x86_64 /tftpboot/pxelinux/
>
This should fix your problem.
Sincerely,
Luca
If this is so, this should give us a hint where the problem is coming from.
Please confirm.
Thanks.
S.
as a test in your r5.3 source tree
can you make any roll? e.g. sge roll or hpc roll?
-LT
--
Hung-Sheng Tsao Ph D.
Founder& Principal
HopBit GridComputing LLC
cell: 9734950840
Double checked the
/export/rocks-5.3/src/roll/kernel/src/rocks-boot/python.mk,
it is same as described above.
> as a test in your r5.3 source tree
> can you make any roll? e.g. sge roll or hpc roll?
Not sure how to verify that. Please provide instructions.
Thanks.
S.
Can you please verify that you have SELinux disabled with the command?
$ sestatus
And then can you paste the full output of the
make rpm
inside rocks-boot, you said it gives you some error, if I remember well.
Sincerely,
Luca
On 3/5/2012 4:40 PM, Steven Lo wrote:
>
>> as a test in your r5.3 source tree
>> can you make any roll? e.g. sge roll or hpc roll?
>
> Note sure how to verify that. Please provide instructions.
--
Hung-Sheng Tsao Ph D.
Founder& Principal
HopBit GridComputing LLC
cell: 9734950840
On 03/05/2012 03:36 PM, Luca Clementi wrote:
> Dear Steven,
> no I don't know why you have the wrong python in your installation.
>
We did re-install Python on this system but that did not help much.
> Can you please verify that you have SELinux disabled with the command?
> $ sestatus
>
# sestatus
SELinux status: enabled
SELinuxfs mount: /selinux
Current mode: permissive
Mode from config file: permissive
Policy version: 21
Policy from config file: targeted
It was disabled before, but we changed it to permissive to see if it would
make any difference. Same result.
> And then can you paste the full output of the
> make rpm
> inside rocks-boot, you said it gives you some error, if I remember well.
>
I have attached the full output of the 'make -d rpm' process.
Thanks again for your help.
Steven.....
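
For scripted checks, the `sestatus` report can be reduced to a single word. The sketch below is a hypothetical helper (`selinux_state` is not a Rocks or SELinux command) that reads `sestatus`-style output on stdin and assumes the "SELinux status:" and "Current mode:" labels shown in the output above.

```shell
# Hypothetical helper: reduce sestatus-style output on stdin to one word.
# Prints the current mode (e.g. "permissive") when one is reported,
# otherwise the overall status (e.g. "disabled").
selinux_state() {
    awk -F':[ \t]*' '
        /^SELinux status/ { status = $2 }
        /^Current mode/   { mode = $2 }
        END { print (mode != "" ? mode : status) }
    '
}

# Example use:
#   sestatus | selinux_state
```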
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rocks-5.3-new-driver-build.out
Type: application/octet-stream
Size: 11259562 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/attachments/20120306/88d98e72/Rocks-5.3-new-driver-build.out
You might also want to try this approach:
https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2011-January/050787.html
Sincerely,
Luca
We will try disabling SELinux, rebooting and recompiling.
We found this thread before but were not sure if that would work. I guess it's
worth a try. However, where can we fetch the newer versions of
vmlinuz and initrd.img? For which version of Rocks?
Thanks.....
Steven.....
I tried to install the latest Rocks distribution and ran again (like in the
old days) into the problem of the wrong Ethernet card being assigned to the
public and private network interfaces.
Apparently there is nothing new to fix that problem: even though eth1
is plugged into the public network (with a DHCP server), Rocks chooses it as
eth0 (private).
I could not find any info about it (wiki, FAQ, ...), and I can't even
search the archive. Checking archive threads one by one, I found one
equivalent question ... unanswered!
During my search in the documentation I ran into the swap interface
command. Can I do an install and then swap eth1 and eth0?
How can I resolve this problem?
BTW, why is there no search option on the mailing list archive? I seem to
remember that there was one in the past, right?
Thanks for any help
Chris
--
--------------------------------------------------------------------------
Christophe Guilbert, Ph.D.
Mission Bay Genentech Hall
UCSF, Department of Pharmaceutical Chemistry
600 16th Street, Suite # S-126-D
Genentech Hall, MC 2280 San Francisco, CA 94158-2517
Office : 415-476-0707
Office fax : 415-476-0688
Email: cgui...@picasso.ucsf.edu
http://mondale.ucsf.edu
--------------------------------------------------------------------------
In a world without walls and fences,
who needs Windows and Gates?
- Sun Microsystems
I will send you a link to download the development version of rocks 5.7.
First you should fix your issue (you can't kickstart node), replacing
the buggy rpm you produced with the original working rpm from your
backup and verify that everything works properly (you can kickstart
compute nodes).
Be careful because you can make the front-end unbootable, I would
first test the commands on a compute node.
Luca
1) Stop the network service
2) Swap the MAC address in ifcfg-eth0 and ifcfg-eth1
3) Restart the network service
If you don't stop the network service before doing this, Linux seems to hang
while shutting down the network service.
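
Step 2 above can be sketched as a small shell function. This is only an illustration of the manual workaround, not a Rocks-supported procedure: `swap_hwaddr` is a hypothetical helper, it assumes each file has exactly one `HWADDR=` line, and the `/etc/sysconfig/network-scripts` location in the usage comment is the usual one on RHEL/CentOS 5 based systems. Settings that Rocks regenerates from its own database may overwrite such manual edits.

```shell
# Hypothetical helper: swap the HWADDR= values of two ifcfg-style files.
# Each file is rewritten through a temporary copy, so every other line
# is preserved unchanged.
swap_hwaddr() {
    f0=$1
    f1=$2
    mac0=$(sed -n 's/^HWADDR=//p' "$f0")
    mac1=$(sed -n 's/^HWADDR=//p' "$f1")
    # Refuse to touch anything unless both files carry a HWADDR line.
    [ -n "$mac0" ] && [ -n "$mac1" ] || return 1
    sed "s/^HWADDR=.*/HWADDR=$mac1/" "$f0" > "$f0.new" && mv "$f0.new" "$f0"
    sed "s/^HWADDR=.*/HWADDR=$mac0/" "$f1" > "$f1.new" && mv "$f1.new" "$f1"
}

# Example use (as root, with the network service stopped first):
#   cd /etc/sysconfig/network-scripts
#   swap_hwaddr ifcfg-eth0 ifcfg-eth1
```

Trying this on copies of the two files first is a cheap way to confirm the result before touching a frontend.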
On Tue, Mar 6, 2012 at 8:01 PM, Christophe Guilbert <
cgui...@picasso.ucsf.edu> wrote:
> Hi ,
> Its been a long time that I did not post an email to this list ;-)
>
> I try to install the latest rocks distribution and I run again (like the
> old time) the problem of attributing the wrong ethernet card for the
> public and private Internet interface.
>
> Apparently , there's nothing new to fix that problem , even thought eth1
> is plug to the public network ( with a DHCP server ), Rock choose it as
> eth0 (private).
>
> I could not find any info about it (WIKI, FAQ, ....) , I can't even search
> the archive ,when checking archive threads one by one I find one
> equivalent question ... unanswered !
>
> During my search in the documentation I ran into the swap interface
> command. Can I do an install and swap eth1 and eth0 ?
> How can I resolve this problem.
>
> BTW why there's no search option on the mailing list archive , I kind of
> remember that there were one in the past , right ?
>
> Thanks for any help
>
> Chris
>
http://www.rocksclusters.org/roll-documentation/base/5.4.3/x7774.html
> I ran into the same problem. I don't know if there is a "Rocks" way of
> doing this. What I ended up doing is:
>
> 1) Stop the network service
> 2) Swap the MAC address in ifcfg-eth0 and ifcfg-eth1
> 3) Restart the network service
The next time you do a "rocks sync network frontend", it will revert back to your original set-up. Probably not what you want.
Thanks for the link for the kernel roll.
We have tried disabling SELinux, rebooting the system, and then recompiling.
Unfortunately, the result is the same error.
We will try to revert to the point where we can kickstart nodes. We will
also try the 5.7 kernel disk and see whether its kernel (with the driver
that we need) can recognize the onboard Ethernet port.
Thanks.
S.
--
Hung-Sheng Tsao Ph D.
Founder& Principal
HopBit GridComputing LLC
cell: 9734950840
Does anyone have any idea why /usr/bin/python is called instead of
/opt/rocks/bin/python?
If /opt/rocks/bin/python were called, the dynamic Python library
should not be
needed.
Thanks.....
Steven.....
We have reverted some steps to the point where we can kickstart compute
nodes.
Now, what is the recommended next step?
Oh, BTW, I'm not able to download
http://137.110.119.137/temp/kernel-5.7-0.x86_64.disk1.iso
Has it been removed?
Thanks.....
Steven.....
On 03/06/2012 06:41 PM, Luca Clementi wrote:
Hi! I am also interested in this, but isn't kernel-ml from the ELRepo
guys enough?
Thanks!
Adrian
Does anyone have other recommendations on how to overcome the 'python library'
problem? We have now found out that we also need to integrate an LSI driver
(at install time)
so that kickstart can recognize the root disk. That means 'make rpm' has
to work.
Do you think the following will work:
* get a new system and install Rocks 5.3.1 on it
* configure Rocks the same as on our existing frontend node
* run 'make rpm' with the new drivers and hope that the Python library error
  does not occur
* copy rocks-boot*.rpm to our existing frontend node
* run 'rocks create distro' and 'rpm -Uvh rocks-boot*', and then replace
  /tftpboot/{initrd.img, vmlinuz}
Thanks.
S.
Your downtime will be just a few hours at most.
Rocks demonstrated the installation of 120 nodes in two hours a few years back,
and that included racking the servers.
If your FE has two HDDs, you can install the new environment on the second HDD and preserve the old environment just in case.
-LT
Sent from my iPad