ever administer NFS?


Paul Johnson

Dec 10, 2009, 12:58:34 PM
to kul...@googlegroups.com
Dear Linux using buddies:

I could use your help. I'm running a cluster with Rocks! Mostly, this is familiar. Rocks is a set of wrapper scripts
and programs around CentOS. The only puzzle for me right now is NFS. I've not administered NFS before. I've gotten
some advice on the rocks list that I can't parse. Rather than make a fool of myself to that group yet again, I'm making
a fool of myself in front of you, my closest friends. :)

In case you have fiddled with NFS, you may know the answer easily. Here's the deal.

On the cluster head node, the Rocks install proceeds fine and it makes a partition /state/partition1 that is
symbolically linked to /export and that's where it keeps NFS storage. I don't know why they call the partition
/state/partition1 and then link from it to /export, but they do. The user accounts are created under /export/home on
the head node, and then when the compute nodes start and users access them, the user homes are on the NFS mounted part.
There's some autofs magic that makes the user see home as /home/username, but it is actually on the NFS share.

On this system, there is also a very large external storage device, a Dell MD3000 with 15 TB. The Dell device is not
available when Rocks Cluster is first installed (it requires some drivers to mount), but it is available once the system
is running. I want the user home storage to be on that MD3000.

So I face the challenge of putting that external storage to use. It eventually appears on the system as /dev/sdd1 (all
15 TB in one giant partition! I debated back and forth between xfs, ext3, and ext4. Blah).

The fellow in the Rocks list says I need to export the external device in the NFS config and then mount it in place of
/export/home.

I don't understand why I need to export it at all. If I just run

# mount /dev/sdd1 /mnt/dellmd3000

# ln -sf /mnt/dellmd3000 /export/home

won't that make "all that extra space" transparently available under /export/home for the client compute nodes?

The NFS config file, /etc/exports, already lists /export as available.


--
Paul E. Johnson email: paul...@ku.edu
Professor, Political Science http://pj.freefaculty.org
1541 Lilac Lane, Rm 504
University of Kansas Office: (785) 864-9086
Lawrence, Kansas 66044-3177 FAX: (785) 864-5700

Rezty Felty

Dec 10, 2009, 1:06:17 PM
to kul...@googlegroups.com
Can't you use /etc/auto_home to make it mount transparently as the users log in? I.e.:

vi /etc/auto_home:

user1         <Dell>:/&
user2         <Dell>:/&
user3         <Dell>:/&
*                <Dell>:/&

:wq

It seems to me that the method you are using above mounts it as a local filesystem, not as an NFS mount. Of course, I could be wrong; it wouldn't be the first time.

Rezty Felty
SysAdmin
Sourcecorp

9133697789 Home Re...@KC-Felty.net
9136203683 Work 91362...@txt.att.net
MSN rusty...@hotmail.com
YIM HiRez_L
AIM HiRezL
ICQ 1932818
Googletalk Re...@KC-Felty.Net






Nick Anderson

Dec 10, 2009, 2:18:23 PM
to kul...@googlegroups.com
Hey Paul,

Is this the bio-informatics cluster? Just curious as I have worked on
that cluster before but it was not rocks at the time.

First of all I am sorry for your misfortune in using ROCKS :). As a side
note you should check out Perceus and Warewulf. Greg Kurtzer and the
guys who do it are stand-up folks.

Now as to your issue.

Your external storage seems to be found as a standard block device
(since you see it at /dev/sdd1).

You should be able to just mount your storage at the place where ROCKS
wants storage (I think it looks like /state/partition1 from your email).
I don't fully remember all the black magic that ROCKS does, but this is
what I would try.

mkdir /mnt/tmpdrive
mount /dev/sdd1 /mnt/tmpdrive
rsync -a /state/partition1/ /mnt/tmpdrive/
umount /mnt/tmpdrive
mv /state/partition1 /state/partition1.bak
mkdir /state/partition1
mount /dev/sdd1 /state/partition1
(You should also add this mount to /etc/fstab to make it permanent.)
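
For the fstab step, here is a minimal sketch of what the entry could look like. The filesystem type (ext3) and the "defaults" mount options are assumptions, since the thread never settles on a filesystem, and fstab_line is a hypothetical helper that only prints the line rather than editing /etc/fstab.

```shell
#!/bin/sh
# Sketch: build an /etc/fstab entry for the external array.
# Assumptions (not from the thread): filesystem type ext3, options "defaults".
# fstab_line is a hypothetical helper; it prints the entry and changes nothing.
fstab_line() {
    # $1 = device (or UUID=... spec), $2 = mount point, $3 = filesystem type
    printf '%s\t%s\t%s\tdefaults\t0\t2\n' "$1" "$2" "$3"
}

# Review the printed line, then append it to /etc/fstab by hand as root.
fstab_line /dev/sdd1 /state/partition1 ext3
```

Device names such as /dev/sdd1 can change between boots, so the UUID reported by `blkid /dev/sdd1` is often used as the first field instead. Also, if /state/partition1 is itself a mounted partition, as the original post suggests, it would have to be unmounted before the mv step, because renaming an active mount point fails.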

If it's not clear, you should do this during a maintenance window, and I
would power down all nodes that typically mount that NFS export.

Hope this helps you. Feel free to contact me directly if you like as well.

--
Nick Anderson
http://www.cmdln.org

Paul Johnson

Dec 11, 2009, 1:24:53 AM
to kul...@googlegroups.com
Nick Anderson wrote:
> Hey Paul,
>
> Is this the bio-informatics cluster?

No, it is some hardware that was purchased for the Center for Research Methodology. We have 63 blades altogether, in 3
racks.

I don't want to administer 3 completely separate clusters; I'll knit them together once I can actually make one work.

> Just curious as I have worked on
> that cluster before but it was not rocks at the time.

The guys here pushed me to use Rocks, they said "everybody" does. There's some momentum here to create a MOAB framework
that links the clusters together (I'll believe it when it actually sits in my lap).

>
> First of all I am sorry for your misfortune in using ROCKS :). As a side
> note you should check out Perceus and Warewulf. Greg Kurtzer and the
> guys who do it are stand-up folks.

Thanks, I will look.

>
> Now as to your issue.
>
> Your external storage seems to be found as a standard block device
> (since you see it at /dev/sdd1).
>
> You should be able to just mount your storage at the place where ROCKS
> wants storage (I think it looks like /state/partition1 from your email).
> I don't fully remember all the black magic that ROCKS does, but this is
> what I would try.
>
> mkdir /mnt/tmpdrive
> mount /dev/sdd1 /mnt/tmpdrive
> rsync -a /state/partition1/ /mnt/tmpdrive/
> umount /mnt/tmpdrive
> mv /state/partition1 /state/partition1.bak
> mkdir /state/partition1
> mount /dev/sdd1 /state/partition1 (you should place this in fstab to
> make it permanent)
>

Thanks. I have been checking into this. Your approach is essentially correct.

My symbolic link proposal was wrong; I found a good discussion in a Solaris NFS document. Apparently, if you try to use
a symlink within an NFS hierarchy, then NFS exports the link itself, but not the files it points to. So the user would
see a symbolic link in his NFS-mounted home, not the contents
(http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch06_04.htm). This was unexpected to me.
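
The symlink behaviour can be seen without NFS at all. This is a minimal sketch using throwaway temp directories (every path under $demo is invented for the demo): a symlink stores only a path string, and that string is all an NFS server hands to a client, which then resolves it in its own filesystem.

```shell
#!/bin/sh
# Sketch: a symlink holds a path string, not the files behind it.
# All directories here are temporary stand-ins, not the real cluster paths.
demo=$(mktemp -d)
mkdir -p "$demo/mnt/dellmd3000" "$demo/export"
touch "$demo/mnt/dellmd3000/userfile"

# Same shape as the proposed: ln -sf /mnt/dellmd3000 /export/home
ln -s "$demo/mnt/dellmd3000" "$demo/export/home"

# readlink shows what the link actually contains: just the target path.
link_target=$(readlink "$demo/export/home")
echo "link holds: $link_target"

# Clean up the temporary tree.
rm -rf "$demo"
```

On a compute node that mounts /export over NFS, that stored path would be looked up locally, where /mnt/dellmd3000 does not exist, so the link would dangle.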

> If it's not clear, you should do this during a maintenance window, and I
> would power down all nodes that typically mount that NFS export.
>
> Hope this helps you. Feel free to contact me directly if you like as well.
>
> --
> Nick Anderson
> http://www.cmdln.org


Nick Anderson

Dec 11, 2009, 2:56:54 AM
to kul...@googlegroups.com
Paul Johnson wrote:
> The guys here pushed me to use Rocks, they said "everybody" does. There's some momentum here to create a MOAB framework
> that links the clusters together (I'll believe it when it actually sits in my lap)

Yeah, "everybody" :P. Rocks is OK for a homegrown cluster, or one that really doesn't have an admin, like if a researcher had to do it himself. But from an administrator's point of view it's a PITA. Unless they have made changes (which is very possible), things like changing an IP address on your head node will cause things to go wacky. In the past the accepted answer on the list was "reload entire cluster". Not that it's hard with Rocks, but what kind of answer is that, really?

Perceus and Warewulf are pretty sweet. You can easily do diskless nodes or hybrid nodes. And if you have researchers who need Scientific Linux, you can just use a VNFS capsule of Scientific Linux and provision a few nodes with that. It's just a lot more flexible, IMHO. Plus, I wrote the bash completion for Perceus :P. That was a few years back, so I'm certain it's changed a bit.

Good luck with your cluster admining. It can be fun and it can be hell. But just make sure you get them to send you to supercomputing next year :)

Seth Galitzer

Dec 11, 2009, 10:17:22 AM
to kul...@googlegroups.com
Paul Johnson wrote:

> The guys here pushed me to use Rocks, they said "everybody" does. There's some momentum here to create a MOAB framework
> that links the clusters together (I'll believe it when it actually sits in my lap).

Yeah, "everybody" except people who are serious cluster admins. We run
a 1K-node cluster here, all using gentoo, with a Solaris fileserver so
we can use ZFS properly. We also use Sun Grid Engine (SGE) for
scheduling. The nodes all boot from PXE and mount their root filesystem
over NFS. They all also have local disk storage for swap and tmp for
jobs that need it. We use ganglia to monitor the whole thing.

We do use NFS mounts for homedirs (and other shared filespace) on all of
our linux machines otherwise in the dept. We use the automount daemon
to handle this (amd or am-utils, depending on your distro). No matter
which lab machine or server you login to, you get the same file
structure, same environment.

I don't directly admin our cluster, we have another guy who does only
that. Here's a link to some of our documentation:
http://support.cis.ksu.edu/BeocatDocs. I'm pretty sure the systems list
is out of date, but the other docs are what we give to our users to run
their jobs.

Seth

--
Seth Galitzer

The beatings will continue until morale has improved.

Rudy, Jared

Dec 11, 2009, 10:26:04 AM
to kul...@googlegroups.com
Wow, I think I just found my new pet project for home. I just happened
to have acquired a Sunblade 2000 that I installed Solaris on. And I
happen to have a really good background with Gentoo. Not so much with
PXE though. This should be fun!

Funny thing, I would have probably used ROCKS also if I was going to
build a cluster, but I've always liked Gentoo so I'll be curious to see
how this works out.

Thanks for the link to your documentation; it will probably come in
handy.

Cheers,

Jared Rudy
UNIX Administrator
St. Francis Health Center
1700 SW 7th
Topeka, KS 66606
785-295-7942

Nick Anderson

Dec 11, 2009, 10:54:08 AM
to kul...@googlegroups.com
Rudy, Jared wrote:
> Wow, I think I just found my new pet project for home. I just happened
> to have acquired a Sunblade 2000 that I installed Solaris on. And I
> happen to have a really good background with Gentoo. Not so much with
> PXE though. This should be fun!
>
> Funny thing, I would have probably used ROCKS also if I was going to
> build a cluster, but I've always liked Gentoo so I'll be curious to see
> how this works out.
>
Anyone who's ever managed a computational cluster will tell you (or
should) that there really is nothing special about it. It's just a bunch
of computers networked together. Now, your cluster may be more or less
fancy than someone else's, and use a high-speed interconnect and run jobs
with MPI, but for all clusters the important thing to me boils down to
administration and flexibility. The management tools are what set one
cluster apart from another.

A Gentoo cluster will work just as well as a Debian cluster or a CentOS
cluster, but they all come back to management. Sure, you can write all
your own scripts for management, but I know there are many smarter people
than I out there. Plus, when considering the "bus factor", I think it's
beneficial to use a cluster "framework": independent documentation and a
community that is familiar with the way the cluster is managed make
training a new person, even in the absence of the predecessor, somewhat
easier.


Paul Johnson

Dec 13, 2009, 7:44:23 PM
to kul...@googlegroups.com
Rudy, Jared wrote:

>
> Yeah, "everybody" except people who are serious cluster admins.

Oh, please. How much religion did you get for Christmas? All "serious" admins do what you do.

pj

Christofer C. Bell

Dec 13, 2009, 9:54:50 PM
to kul...@googlegroups.com
On Sun, Dec 13, 2009 at 6:44 PM, Paul Johnson <paul...@ku.edu> wrote:
> Rudy, Jared wrote:
>
>> Yeah, "everybody" except people who are serious cluster admins.
>
> Oh, please. How much religion did you get for Christmas? All "serious" admins do what you do.

No they don't, Paul.  He said he's running it on Gentoo.  *No one* that's "serious" touches that shit.

:-)

--
Chris



Jeffrey Watts

Dec 14, 2009, 12:19:07 AM
to kul...@googlegroups.com
Haha.  Chris has got him there.  ;-)

J.

Rudy, Jared

Dec 14, 2009, 9:34:08 AM
to kul...@googlegroups.com
To clarify, that wasn't me who said that! I haven't touched clusters
since college.

Jared Rudy
UNIX Administrator
St. Francis Health Center
1700 SW 7th
Topeka, KS 66606
785-295-7942


Paul Espinosa

Dec 14, 2009, 9:43:09 AM
to kul...@googlegroups.com
On Mon, Dec 14, 2009 at 8:34 AM, Rudy, Jared <Jared...@sftks.net> wrote:
> To clarify, that wasn't me who said that!  I haven't touched clusters
> since college.
>


Probably a good thing. I hear it causes hair to grow on your palms.

gladi...@gmail.com

Dec 14, 2009, 1:06:12 PM
to kulua-l

Hrm. I have no opinions on clustering software. I think people with
opinions on clustering software are... oh... waitaminute... right...
topic. TOPIC aaaand here we go:

If you didn't get it from the above paragraph, I've not worked with
Rocks or anything clusterish at all; however, I have worked/lived/ran-
screaming-away-from some large-scale NFS environments.

You're running into a couple of things. The first is that creating a
symlink between the mount point/directory and /export is going to leave
you with just that: a symlink.

The second: when an export is registered with mountd, a handle is
generated for that node just As It Is At That Time. You can test this
by creating a directory (mkdir /test), exporting it
(exportfs 127.0.0.1:/test), creating a file
(touch /test/this.is.just.a.directory), and then mounting it
(mkdir /test.nfs; mount 127.0.0.1:/test /test.nfs). Predictably, you
will see the file that you just created in /test.nfs. Now mount
anything on /test (flash disk, cdrom, whatever). If you examine the
contents of /test.nfs, you will still see the file that was created
before mounting your Other Filesystem. To make the new filesystem
available, you would need to re-export /test and remount from the
client side. I imagine in a clustered environment that this might be a
bit annoying.
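
The re-export and remount steps for that /test experiment might look like the sketch below. The run wrapper, the DRY_RUN switch, and the refresh_export function are hypothetical helpers added for illustration; by default the script only prints the commands, since the real ones need root and a running NFS server.

```shell
#!/bin/sh
# Sketch: refresh a stale export after mounting a new filesystem on it.
# DRY_RUN=1 (the default) prints each command instead of running it.
# Set DRY_RUN=0 and run as root to execute for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_export() {
    # $1 = export spec as given to exportfs (client:/path)
    # $2 = mount source as seen by the client (server:/path)
    # $3 = client-side mount point
    run exportfs -u "$1"   # server: withdraw the stale export
    run exportfs "$1"      # server: export the directory again
    run umount "$3"        # client: drop the old mount
    run mount "$2" "$3"    # client: mount again to pick up the new filesystem
}

# In the loopback test, client and server are both 127.0.0.1,
# so the export spec and the mount source happen to be identical.
refresh_export 127.0.0.1:/test 127.0.0.1:/test /test.nfs
```

On a cluster the last two steps would have to happen on every compute node, which is one reason to do the whole change with the nodes powered down.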

What I would suggest is to incorporate the driver for your disk array
into your initrd. I've been working a lot with the new(ish)
mkinitramfs system that debian derivatives have gone to, so I don't
have any particular wisdom re: mkinitrd scripts that redhat & co still
use floating around the front of my head, but with a little
experimentation you should be able to make it (mkinitrd) do your
bidding. The documentation is a little sparse, but it's just a shell
script (famous last words)

You could also just uncompress the initrd file and work with the cpio
archive; the problem there comes when you have to deal with
package-management-driven kernel upgrades.

Good luck, mate.

-Stephen

Rudy, Jared

Dec 14, 2009, 1:27:05 PM
to kul...@googlegroups.com
Quote: "What I would suggest is to incorporate the driver for your disk
array into your initrd."

The newest linux kernels actually allow compiling in proprietary drivers
directly into the kernel. I had to do it when I compiled a kernel to
run on a Solaris Sunblade 2000. I think I have my documentation
somewhere on how to do it if you want it.

Cheers,

Jared Rudy
UNIX Administrator
St. Francis Health Center
1700 SW 7th
Topeka, KS 66606
785-295-7942


gladi...@gmail.com

Dec 14, 2009, 2:33:31 PM
to kulua-l


On Dec 14, 12:27 pm, "Rudy, Jared" <Jared.R...@sftks.net> wrote:
> [...]
> The newest linux kernels actually allow compiling in proprietary drivers
> directly into the kernel.  I had to do it when I compiled a kernel to
> run on a Solaris Sunblade 2000.  I think I have my documentation
> somewhere on how to do it if you want it.
> [...]

Indeed there is that; however, unless you put together an SRPM, you're
still stuck with the same business of having to deal with new kernels
with every update. I understand the desirability in regards to the
version stability of RHEL and derivatives, but I think driver
management on that platform will become much more streamlined when
there's a RHEL release that supports the DKMS scripts (or something
similar). With the exception of the older version of Ubuntu I'm
running at work (8.04), I haven't had to bat an eyelash when it comes
to external drivers such as those for VirtualBox or blob drivers from
video card vendors. (In 8.04, Linux headers aren't automagically
installed with new kernel versions, so you have to be sure to manually
install 'em or DKMS will hurl.)
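
A quick way to catch the missing-headers case before DKMS trips over it is sketched below. It assumes the conventional layout in which the kernel build tree is reachable at /lib/modules/<release>/build; headers_present and its optional modules-root argument are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: check that a build tree (headers) exists for a given kernel release.
# headers_present is a hypothetical helper, not part of DKMS.
headers_present() {
    # $1 = kernel release (for example the output of `uname -r`)
    # $2 = modules root, defaults to /lib/modules
    # -e follows symlinks, so a dangling "build" link counts as missing.
    [ -e "${2:-/lib/modules}/$1/build" ]
}

kver=$(uname -r)
if headers_present "$kver"; then
    echo "headers found for $kver"
else
    echo "no headers for $kver: install them before expecting DKMS to rebuild modules"
fi
```

The second argument exists only so the check can be pointed at a different modules directory, for instance when inspecting a freshly installed kernel that is not yet running.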

-S