# rocks list host boot
HOST ACTION
compute-0-0: install
compute-0-1: install
compute-0-2: install
compute-0-3: install
compute-0-4: install
compute-0-5: install
compute-0-6: install
compute-0-7: install
Which I figure means that the next pxeboot, the node will reinstall.
Am I missing something here? Thanks in advance.
what is the output of:
# rocks list host profile compute-0-0 > /tmp/ks.cfg
- gb
yes, it will overwrite /tmp/ks.cfg, but it is just a temporary file
that is not used for node installation. the above command will not
harm the system.
- gb
This was your problem, see below...
> when I pxeboot any of the nodes, it does boot the kernel
> across the network, but as the process goes on, instead of just
> installing the node, it prompts me for the language and image
> location, then fails because it can't find the image.
Did you run "cd /home/export/rocks ; rocks create distro" before you PXE-booted your nodes? That creates the "image".
> I thought that it
> would have something to do with the previous installation on the
> nodes, so I checked the documentation
> (http://www.rocksclusters.org/roll-documentation/base/5.3/x1354.html)
> and when I run
>
>
> # rocks list host boot
> HOST ACTION
>
> compute-0-0: install
> compute-0-1: install
> compute-0-2: install
> compute-0-3: install
> compute-0-4: install
> compute-0-5: install
> compute-0-6: install
> compute-0-7: install
>
>
> Which I figure means that the next pxeboot, the node will reinstall.
> Am I missing something here? Thanks in advance.
Because the nodes are already in the database, you don't need to use "insert-ethers" again (on these nodes). Just boot them. Assuming they are set to PXE boot, they will install.
If you still have problems, it might be because the compute nodes' disks already have partitions on them, which are marked with files named ".rocks-release". This makes the installer refuse to delete/reformat them, in an attempt to save your data.
When the install stops with a problem, you can press Ctrl-Alt-F1, -F2, -F3, -F4 etc. to see some useful info. One of those will contain a command line, from which you can run "fdisk" and wipe out any non-user-data partitions.
Bart
On Tue, Sep 14, 2010 at 5:21 PM, Bart Brashers
[root@lpge-cluster install]# rocks list host profile compute-0-0
Traceback (most recent call last):
File "/opt/rocks/bin/rocks", line 264, in ?
command.runWrapper(name, args[i:])
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py",
line 1774, in runWrapper
self.run(self._params, self._args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/list/host/profile/__init__.py",
line 273, in run
[
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py",
line 1467, in command
o.runWrapper(name, args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py",
line 1774, in runWrapper
self.run(self._params, self._args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/list/host/xml/__init__.py",
line 189, in run
xml = self.command('list.node.xml', args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py",
line 1467, in command
o.runWrapper(name, args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/__init__.py",
line 1774, in runWrapper
self.run(self._params, self._args)
File "/opt/rocks/lib/python2.4/site-packages/rocks/commands/list/node/xml/__init__.py",
line 511, in run
handler.parseNode(node, doEval)
File "/opt/rocks/lib/python2.4/site-packages/rocks/profile.py", line
374, in parseNode
parser.feed(line)
File "/opt/rocks/lib/python2.4/site-packages/_xmlplus/sax/expatreader.py",
line 220, in feed
self._err_handler.fatalError(exc)
File "/opt/rocks/lib/python2.4/site-packages/_xmlplus/sax/handler.py",
line 38, in fatalError
raise exception
xml.sax._exceptions.SAXParseException: <unknown>:163:2: mismatched tag
Answering Bart:
/home didn't have an export folder, so I ran it in
/export/rocks/install/ instead (according to this tutorial:
http://technical.bestgrid.org/index.php/Installing_R_on_a_Rocks_Cluster#Installing_R)
[root@lpge-cluster install]# rocks create distro
Cleaning distribution
Resolving versions (base files)
including "kernel" (5.3,x86_64) roll...
including "torque" (5.3.0,x86_64) roll...
including "hpc" (5.3,x86_64) roll...
including "base" (5.3,x86_64) roll...
including "web-server" (5.3,x86_64) roll...
including "ganglia" (5.3,x86_64) roll...
including "os" (5.3,x86_64) roll...
Including critical RPMS
Resolving versions (RPMs)
including "kernel" (5.3,x86_64) roll...
including "torque" (5.3.0,x86_64) roll...
including "hpc" (5.3,x86_64) roll...
including "base" (5.3,x86_64) roll...
including "web-server" (5.3,x86_64) roll...
including "ganglia" (5.3,x86_64) roll...
including "os" (5.3,x86_64) roll...
Resolving versions (SRPMs)
including "kernel" (5.3,x86_64) roll...
including "torque" (5.3.0,x86_64) roll...
including "hpc" (5.3,x86_64) roll...
including "base" (5.3,x86_64) roll...
including "web-server" (5.3,x86_64) roll...
including "ganglia" (5.3,x86_64) roll...
including "os" (5.3,x86_64) roll...
Creating files (symbolic links - fast)
Applying stage2.img
Applying updates.img
Installing XML Kickstart profiles
installing "hpc" profiles...
installing "ganglia" profiles...
installing "web-server" profiles...
installing "base" profiles...
installing "kernel" profiles...
installing "os" profiles...
installing "torque" profiles...
installing "site" profiles...
Creating repository
making "torrent" files for RPMS
As for insert-ethers, I have the same problem even without running it.
As for the partitions, I had thought about that too, so I booted a Debian
install CD and destroyed all the partitions on the hard disk, but had the
same problem.
Using ctrl+F3 I got the following relevant output:
ROCKS: Found disk device sda
ROCKS:getCert: No Rocks disks found
ks location: https://172.16.0.1/install/sbin/kickstart.cgi?arch=x86_64&np=8
ROCKS:transfering
https://172.16.0.1//install/sbin/kickstart.cgi?arch=x86_64&np=8
.
.
.
ROCKS:httpsGetFileDesc:status 200 OK
ROCKS:urlinstStartSSLTransfer:attempt (1)
ROCKS:writeInterfacesFile
ROCKS:setting up kickstart
And then it asks for the language and image file.
I'm kind of worried about that traceback. I guess it has something to do with
the XML I edited to install the packages, although I don't believe it's
related to the problem.
Anyway, here is the only XML I have ever edited:
[root@lpge-cluster install]# cat
/export/rocks/install/site-profiles/5.3/nodes/extend-compute.xml
<?xml version="1.0" standalone="no"?>
<kickstart>
<description>
A skeleton XML node file. This file is a template and is intended
as an example of how to customize your Rocks cluster. Kickstart XML
nodes such as this describe packages and "post installation" shell
scripts for your cluster.
XML files in the site-nodes/ directory should be named either
"extend-[name].xml" or "replace-[name].xml", where [name] is
the name of an existing xml node.
If your node is prefixed with replace, its instructions will be used
instead of the official node's. If it is named extend, its directives
will be concatenated to the end of the official node.
</description>
<changelog>
</changelog>
<main>
<!-- kickstart 'main' commands go here -->
</main>
<pre>
<!-- partitioning commands go here -->
</pre>
<!-- There may be as many packages as needed here. Just make sure you only
uncomment as many package lines as you need. Any empty <package></package>
tags are going to confuse rocks and kill the installation procedure
-->
<!-- <package> insert 1st package name here and uncomment the line</package> -->
<!-- <package> insert 2nd package name here and uncomment the line</package> -->
<!-- <package> insert 3rd package name here and uncomment the line</package> -->
<package>R</package>
<package>R-devel</package>
<package>libRmath</package>
<package>libRmath-devel</package>
<post>
<!-- Insert your post installation script here. This
code will be executed on the destination node after the
packages have been installed. Typically configuration files
are built and services setup in this section. -->
<!-- WARNING: Watch out for special XML chars like ampersand,
greater/less than and quotes. A stray ampersand will cause the
kickstart file building process to fail, thus, you won't be able
to reinstall any nodes. It is recommended that after you create an
XML node file, that you run:
xmllint -noout file.xml
-->
mkdir /install/rocks-dist/scripts
<file name="/install/rocks-dist/scripts/rconfig.r">
Sys.getenv("http_proxy")
options(repos="http://cran.stat.auckland.ac.nz")
#Install Rmpi separately due to configure.args requirement
install.package("Rmpi",configure.args='--with-mpi=/opt/openmpi')
# Create a list of standard packages
packagelist <-
c("sp","maptools","lattice","spproj","spgpc","spgrass6","spgdal","gstat","splancs","DCluster","spdep","spPBS","spmaps","spspatstat","spgeoR","spRandomFields","spatstat","geoR","geoRglm","odesolve","snow","coda","akima")
for (pkg in packagelist)
{
if (!require(pkg))
{
print(paste("Attempting to install ",pkg))
install.packages(pkg)
}
}
</file>
ls -l /install/rocks-dist/scripts
http_proxy=http://<address>:<port> /usr/bin/R CMD BATCH --vanilla
/install/rocks-dist/scripts/rconfig.r /var/log/rconfig.log
<eval shell="python">
<!-- This is python code that will be executed on the
frontend node during kickstart file generation. You may contact
the database, make network queries, etc. These sections are
generally used to help build more complex configuration
files. The 'shell' attribute is optional and may point to any
language interpreter such as "bash", "perl", "ruby", etc.
By default shell="bash". -->
</eval>
</post>
</kickstart>
2010/9/14 Cláudio Forain <claudi...@gmail.com>:
2010/9/15 Cláudio Forain <claudi...@gmail.com>:
> Update:
>
> I tried to install via HTTP as the node asked. It asks for the
> updates.img file. Looking around on the node, I found it in
> https://10.0.0.74/install/rolls/kernel/5.3/x86_64/images/
> with the following contents:
>
> product.img
> index.html
> updates.img
> TRANS.TBL
> stage2.img
>
> So I pointed to that path and it seems to be installing properly. I
> will update you guys. Anyway, I believe it's not the right behavior.
> So, what's wrong?
the problem is that '<port>' is in the line in the above section:
http_proxy=http://<address>:<port> /usr/bin/R CMD BATCH --vanilla
the XML parser is trying to parse '<port>' and you want the XML parser
to treat it as a literal. to accomplish that, start your <post>
section with:
<post>
<![CDATA[
and end your </post> section with:
]]>
</post>
- gb
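To see why the unescaped '<port>' trips the parser, here is a minimal Python sketch using the standard library's SAX parser. The three documents are hypothetical, cut-down stand-ins for the poster's file (only the proxy line is taken from the thread), and parse_error is a helper name made up for this illustration; it is not part of Rocks.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import escape

# The shell line from the thread, with its literal <address> and <port> placeholders.
PROXY_LINE = "http_proxy=http://<address>:<port> /usr/bin/R CMD BATCH --vanilla"

# Hypothetical, cut-down node files: only the <post> section is kept.
BROKEN = "<kickstart>\n<post>\n" + PROXY_LINE + "\n</post>\n</kickstart>"
FIXED_CDATA = ("<kickstart>\n<post>\n<![CDATA[\n" + PROXY_LINE
               + "\n]]>\n</post>\n</kickstart>")
# Alternative fix: escape only the special characters (&lt; and &gt;).
FIXED_ESCAPED = ("<kickstart>\n<post>\n" + escape(PROXY_LINE)
                 + "\n</post>\n</kickstart>")


def parse_error(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        xml.sax.parseString(text.encode("utf-8"), ContentHandler())
        return None
    except xml.sax.SAXParseException as exc:
        return exc.getMessage()


for name, text in [("broken", BROKEN), ("cdata", FIXED_CDATA),
                   ("escaped", FIXED_ESCAPED)]:
    print(name, "->", parse_error(text) or "well-formed")
```

In the broken version the parser treats <address> and <port> as opening tags, so the following </post> does not match the innermost open element. Note that everything inside a CDATA section is plain character data, so any tag placed inside it (for example a <file> element) is no longer seen as an element by the parser; escaping just the offending line is the narrower option.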
<?xml version="1.0" standalone="no"?>
<kickstart>
<description>
A skeleton XML node file. This file is a template and is intended
as an example of how to customize your Rocks cluster. Kickstart XML
nodes such as this describe packages and "post installation" shell
scripts for your cluster.
XML files in the site-nodes/ directory should be named either
"extend-[name].xml" or "replace-[name].xml", where [name] is
the name of an existing xml node.
If your node is prefixed with replace, its instructions will be used
instead of the official node's. If it is named extend, its directives
will be concatenated to the end of the official node.
</description>
<changelog>
</changelog>
<main>
</main>
<pre>
</pre>
<package>R</package>
<package>R-devel</package>
<package>libRmath</package>
<package>libRmath-devel</package>
<post>
<![CDATA[
mkdir /install/rocks-dist/scripts
<file name="/install/rocks-dist/scripts/rconfig.r">
Sys.getenv("http_proxy")
options(repos="http://cran.stat.auckland.ac.nz")
#Install Rmpi separately due to configure.args requirement
install.package("Rmpi",configure.args='--with-mpi=/opt/openmpi')
# Create a list of standard packages
packagelist <-
c("sp","maptools","lattice","spproj","spgpc","spgrass6","spgdal","gstat","splancs","DCluster","spdep","spPBS","spmaps","spspatstat","spgeoR","spRandomFields","spatstat","geoR","geoRglm","odesolve","snow","coda","akima")
for (pkg in packagelist)
{
if (!require(pkg))
{
print(paste("Attempting to install ",pkg))
install.packages(pkg)
}
}
</file>
ls -l /install/rocks-dist/scripts
http_proxy=http://<address>:<port> /usr/bin/R CMD BATCH --vanilla
/install/rocks-dist/scripts/rconfig.r /var/log/rconfig.log
<eval shell="python">
</eval>
]]>
</post>
</kickstart>
If I run "rocks list host profile", it gives me:
[root@lpge-cluster nodes]# rocks list host profile
xml.sax._exceptions.SAXParseException: <unknown>:133:2: mismatched tag
Do you see anything wrong?
Apparently I was finally able to kickstart a node. But those XML errors
still worry me. I'm afraid that it won't run the scripts or install the
RPMs properly. Thanks for now, but give me a heads-up if you see
anything wrong in the XML.
2010/9/15 Cláudio Forain <claudi...@gmail.com>:
yes, after you make any modification to a node XML file, you need to
rebuild the distro:
# cd /export/rocks/install
# rocks create distro
- gb
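Before rebuilding the distro, it can help to check that every edited node file is well-formed, in the spirit of the "xmllint -noout file.xml" advice in the skeleton file's own comment. The following is a rough Python sketch, not a Rocks tool: find_malformed is a hypothetical helper, and it only checks each file on its own, so the line numbers it reports need not match those in a "rocks list host profile" traceback.

```python
import glob
import os
import xml.sax
from xml.sax.handler import ContentHandler


def find_malformed(nodes_dir):
    """Map each malformed *.xml file in nodes_dir to its first parse error."""
    problems = {}
    for path in sorted(glob.glob(os.path.join(nodes_dir, "*.xml"))):
        try:
            # A bare ContentHandler ignores all content; we only want errors.
            xml.sax.parse(path, ContentHandler())
        except xml.sax.SAXParseException as exc:
            problems[path] = "line %d: %s" % (exc.getLineNumber(),
                                              exc.getMessage())
    return problems
```

Assuming the site profile directory shown earlier in the thread, calling find_malformed("/export/rocks/install/site-profiles/5.3/nodes") returns an empty dict when every file parses; only then is it worth running "rocks create distro".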