Hello all,
I'm new to Rocks and ran into installation problems. Thanks to Philip, the "Access denied for user 'apache@localhost'" problem is solved. However, my original question remains unanswered. Any suggestions would be highly appreciated.
I have been trying to install Rocks 4.3 on my 6-node cluster. The frontend installs fine. However, when I boot a compute node from the DVD, it gets an IP address from the frontend (insert-ethers reports discovering a new node) and its information is added to the MySQL database: I can see a compute-0-0 entry in the database, and /etc/dhcpd.conf lists the node and its assigned IP address. However, the installation then hangs forever and the node never gets a kickstart file.
I found the following discussion in the archive (quoted at the end of this email). I followed the instructions and checked everything, and everything works fine. The last command, /home/install/kickstart.cgi --client="compute-0-0" > /dev/null, gives no error message. (kickstart.cgi was not in /home/install but in /home/install/sbin/. I copied it to /home/install/, and it returns a kickstart file with no error message. I tried both /home/install/kickstart.cgi and /home/install/sbin/kickstart.cgi.)
When I plug a monitor into the compute node, it says "Could not get file https://192.168.2.6///install/sbin/public/kickstart.cgi?arch=x86_64&np=4&project=rocks" and "Failed to connect to FTP server".
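One way to narrow this down is to request the same URL by hand from another machine on the private network and look at the HTTP status code. The following is a minimal sketch, assuming curl is installed. The function names are made up for illustration, and the path is copied from the error message above with the repeated slashes collapsed (an assumption). A request from a host other than the compute node may be answered differently from what the installer sees, so treat the result as a hint only.

```shell
#!/bin/sh
# Build the kickstart URL that the installer reported it could not fetch.
# Arguments: frontend address, architecture, number of CPUs.
# (Hypothetical helper; the path is taken from the error message.)
kickstart_url() {
    printf 'https://%s/install/sbin/public/kickstart.cgi?arch=%s&np=%s&project=rocks\n' "$1" "$2" "$3"
}

# Print only the HTTP status code for a URL.
# -k skips certificate verification (the frontend certificate is assumed
# to be self-signed), -s silences progress output, -o discards the body.
http_status() {
    curl -k -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Example (not run automatically):
#   http_status "$(kickstart_url 192.168.2.6 x86_64 4)"
```

If no status code comes back at all, the web server or the network path is the more likely suspect than kickstart.cgi itself.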
The network switch is a NETGEAR GS724T. I also tried connecting the frontend directly to the compute node with a single cable (I tried both NICs). The same thing happened.
Following is the hardware specification:
motherboard: 1xSuperMicro X6DVL-EG2 Dual Xeon MB w/Video DG-Lan Raid
included on the motherboard:
video: ATI Rage XL SVGA PCI video controller w/8MB
network: Dual (2) Intel 82541 Gigabit 10/100/1000 LAN Ports
Raid: 6300ESB (Hance Rapids) SATA Controller RAID 0/1
Serial: 2xFast UART 16550: 1 serial port and 1 header
DMA: Dual EIDE channels support up to four UDMA / IDE devices
SATA: 2 serial ATA Ports
chipset: Intel E7320 chipset
info: PCI-Express 2(x4) slots + 1x64-bit 66MHz PCI (3.3V) slot + 3x32-bit 33MHz PCI (5V) slot
Processor: Supports up to two Intel 64-bit Xeon processors with 1MB and 2MB L2 integrated advanced transfer cache, up to 3.6 GHz
Memory: four 240-pin DIMM sockets
CPU: 2xWoodcrest Xeon 5130 Dual Core 2.0 GHz 1333FSB 4MB(771)
RAM: 2x1GB DDR2/667 ECC Registered memory module
HardDrive: 2x250GB 7200RPM 8MB buffer serial ATA300 (SATA II) HDD
Sorry about the long email. I wanted to include everything that might be useful in solving the problem.
Thank you very much in advance!
Guoyan Zhao
Washington University
> [Rocks-Discuss]Compute node install problem
> greg bruno bruno at SDSC.EDU
> Thu Jul 25 01:49:46 PDT 2002
>
> > We have been trying to install Rocks on a cluster and have successfully
> > installed the frontend node. The problem comes when booting the compute
> > nodes from the CD. The compute node boots, is able to get an IP from the
> > frontend (insert-ethers reports discovering a new node), but the install
> > fails after that. It indicates that it can't find the ks.cfg file and bails
> > out of the install.
> >
> > Any suggestions would be appreciated.
>
> a compute node kickstart requires the following services to be running
> on the frontend:
>
> - dhcpd
> - httpd
> - mysqld
> - autofs
>
> since insert-ethers discovered the compute node, we know dhcpd is ok.
>
> to check if httpd and mysqld are running:
>
> # ps auwx | grep httpd
> # ps auwx | grep mysqld
>
> if either one is not running, restart them with:
>
> # /etc/rc.d/init.d/httpd restart
>
> and/or
>
> # /etc/rc.d/init.d/mysqld restart
>
> the autofs service is called 'automount'. check if it is running:
>
> # ps auwx | grep automount
>
> if it isn't, restart it:
>
> # /etc/rc.d/init.d/autofs restart
>
> finally, to test if the rocks installation infrastructure is working:
>
> # cd /home/install
> # ./kickstart.cgi --client="compute-0-0"
>
> this should return a kickstart file.
>
> and to see if there are any errors associated with kickstart.cgi:
>
> # ./kickstart.cgi --client="compute-0-0" > /dev/null
>
>
> - gb
>
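The process checks in the quoted reply can be rolled into one pass. This is a minimal sketch, assuming a Linux frontend with the procps version of ps. The function names are made up for illustration, and the process names are the ones given in the message above (dhcpd, httpd, mysqld, automount).

```shell
#!/bin/sh
# Return success if a process with exactly this command name is running.
# "ps -e -o comm=" prints only the command name of every process.
is_running() {
    ps -e -o comm= | grep -qx "$1"
}

# Print one status line per process name given as an argument.
report_services() {
    for proc in "$@"; do
        if is_running "$proc"; then
            echo "$proc: running"
        else
            echo "$proc: NOT running"
        fi
    done
}

# Process names taken from the reply above.
report_services dhcpd httpd mysqld automount
```

Anything reported as not running can then be restarted with the init scripts listed in the reply.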
login to your netgear switch and ensure that 'fast port' is enabled
for each port.
also, send us the output of:
# dbreport bug
- gb
Thanks a lot for helping me!
I'm sorry that I now have a different situation. I was very frustrated and decided to go back to my old setup, so I installed Ubuntu on the head node before receiving your email. After receiving your email I reinstalled Rocks 4.3, but now the network no longer works: from the head node I cannot ping outside IP addresses. During the installation, the frontend first got an IP address from DHCP. I changed it to a static IP at the "Ethernet Configuration for eth1" window, but the gateway and DNS fields were already filled in automatically. Does that matter? What should I do?
Your help is very much appreciated!
Guoyan Zhao
Washington University
try exchanging the eth0 and eth1 cables on your frontend -- there are
situations in which the kernel mapping of eth0 and eth1 are not what
you'd think. also, we've seen the mapping of eth0 and eth1 move based
on kernel version, so it is a 'normal' behavior of the evolution of
the linux kernel.
- gb
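To see which physical port the kernel has named eth0 and which eth1, it can help to list each interface with its MAC address and compare that with the MAC addresses of the ports themselves. This is a minimal sketch, assuming a kernel that exposes network devices under /sys/class/net. The function name is made up for illustration.

```shell
#!/bin/sh
# Print "<interface> <MAC address>" for every network device under a sysfs
# directory. The directory argument is optional and defaults to
# /sys/class/net.
list_nic_macs() {
    base=${1:-/sys/class/net}
    for dev in "$base"/*; do
        # Skip entries that have no readable address file
        [ -r "$dev/address" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/address")"
    done
}

list_nic_macs
```

Running this before and after swapping the cables shows whether the interface names moved or stayed with the hardware.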
I tried exchanging the eth0 and eth1 cables on the frontend. It still does the same thing. I figured out why the network was not working: when I turn off iptables, ssh and ping work. I did not have this problem the first time I installed the frontend. My question is why it behaved differently the first time compared with all the later installations. (I had all the disks formatted during the later installations.)
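Rather than switching iptables off entirely, the rules that can block traffic can be listed and inspected. This is a minimal sketch, assuming the iptables-save command is available and run as root. The function name is made up for illustration, and the filter only looks for DROP policies and REJECT/DROP targets, so it is a starting point rather than a full analysis of the rule set.

```shell
#!/bin/sh
# Read iptables-save output on stdin and keep only the lines that can
# block traffic: chains whose default policy is DROP, and rules that
# jump to REJECT or DROP.
list_blocking_rules() {
    grep -E -e '^:[A-Za-z0-9_-]+ DROP' -e '-j (REJECT|DROP)'
}

# Example (needs root, not run automatically):
#   iptables-save | list_blocking_rules
```

Comparing this output between a working and a non-working installation would show whether the firewall rules themselves differ.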
Our current cluster was connected by a previous employee who did not leave on good terms. The network switch does not use the default password and nobody knows what the password is. If I reset it to factory settings, I run the risk of losing the part that currently works. So I'm thinking of waiting until we finish our data processing. In the meantime, my boss is thinking of buying some new computers and maybe a full new rack of cluster nodes. I can then try on the new cluster. I have a couple of questions. Answers to these questions may save me lots of trouble later on. I would highly appreciate your input.
1) Quad-core CPU vs. faster dual-core CPU. Our current computers have dual-core CPUs: 2xWoodcrest Xeon 5130 Dual Core 2.0 GHz 1333FSB 4MB(771). Are quad-core CPUs supported by Rocks? Or should we buy faster dual-core CPUs of the same kind to get more computing power without worrying about support issues?
2) Does Rocks work with the NETGEAR GS724T network switch? Should we get the same switch?
What else should I consider if I want to use Rocks later on?
Thank you very much!
Guoyan Zhao
Washington University
-----Original Message-----
From: Greg Bruno [mailto:greg....@gmail.com]
Sent: Fri 11/30/2007 1:30 PM
To: Zhao, Guoyan
Cc: npaci-rocks...@sdsc.edu
Subject: Re: [Rocks-Discuss] compute node installation hangs at insert-ethers
can you send us the output of:
# dbreport bug
> Our current cluster was connected by a previous employee who did not leave on good terms. The network switch does not use the default password and nobody knows what the password is. If I reset it to factory settings, I run the risk of losing the part that currently works. So I'm thinking of waiting until we finish our data processing. In the meantime, my boss is thinking of buying some new computers and maybe a full new rack of cluster nodes. I can then try on the new cluster. I have a couple of questions. Answers to these questions may save me lots of trouble later on. I would highly appreciate your input.
>
> 1) Quad-core CPU vs. faster dual-core CPU. Our current computers have dual-core CPUs: 2xWoodcrest Xeon 5130 Dual Core 2.0 GHz 1333FSB 4MB(771). Are quad-core CPUs supported by Rocks? Or should we buy faster dual-core CPUs of the same kind to get more computing power without worrying about support issues?
the real question is -- does rocks support the motherboard. we have
some older quad-core machines in the lab that run well with stock
rocks v4.3 (that is, no changes required). but there have been reports
where some folks have to supply various kernel command line flags in
order to get their frontend and compute nodes installed.
> 2) Does Rocks work with the NETGEAR GS724T network switch? Should we get the same switch?
yes. we have several in the lab. just make sure to configure 'fast
link enable' on all ports.
- gb
I'm really sorry, but I have already reinstalled my old system. I have to get it ready for data processing because our sequencing data came back today. If I still run into this problem during a later installation, I'll send you the output of dbreport bug.
Could you please tell me what kind of quad-core machines you have (maker or hardware specification)? I want to go with what you have.
Many thanks!
Guoyan Zhao
Washington University
-----Original Message-----
From: Greg Bruno [mailto:greg....@gmail.com]
Sent: Mon 12/3/2007 5:01 PM
To: Zhao, Guoyan
Cc: npaci-rocks...@sdsc.edu
Subject: Re: [Rocks-Discuss] compute node installation hangs at insert-ethers
______________________________________________________
Paul Kopec
Project Manager
University of Michigan
Dept. of Human Genetics
1241 E. Catherine Street
5928 Buhl Building
Ann Arbor, MI 48109-0618
734-763-5411
pko...@umich.edu
Perhaps you will be running a standard, well-known code that others have
already benchmarked. If so, search the archives and the web for their
results, or post the details of your code/problem and see if anyone has
done some recent benchmarking. There are vendors out there who will
offer you access to test clusters to benchmark.
Bart
Guoyan Zhao
Washington University
-----Original Message-----
From: Bart Brashers [mailto:bbra...@geomatrix.com]
Sent: Tue 12/4/2007 10:46 AM
To: Zhao, Guoyan
Cc: npaci-rocks...@sdsc.edu