Status of the project

Greg Kurtzer

Nov 21, 2009, 1:29:34 PM
to xc...@googlegroups.com
Hello,

What is the status of XCPU, and what are the long-term goals and
future plans for the project?

Thanks!

--
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com

ron minnich

Nov 23, 2009, 8:19:42 PM
to xc...@googlegroups.com
On Sat, Nov 21, 2009 at 10:29 AM, Greg Kurtzer <gmku...@gmail.com> wrote:
>
> Hello,
>
> What is the status of XCPU, and what are the long-term goals and
> future plans for the project?

Hi Greg.

I am afraid that ssh rules. My feeling after 5 years of xcpu and 9
years of bproc is that people really want their ssh on a cluster. It
scales well enough for the small systems (64 nodes or fewer) that make up
most of what is out there, and people don't care enough about scaling to
large systems. It gives them a familiar environment.

Note that the fastest machine on the planet, the Oak Ridge Jaguar
system, runs sshd on every node. I had xcpu running on my XT4, and
demo'd it, and it always came back to: "But it doesn't look like ssh".

I think any future job spawning system for clusters has to either be
ssh or feel enough like ssh that nobody knows the difference.

ron

Eric Van Hensbergen

Nov 23, 2009, 8:31:16 PM
to xc...@googlegroups.com
We are still using the xcpu2 incarnation of xcpu for stuff and it
seems to work well enough. We have been playing with an extended
model based on xcpu2 for our ongoing Blue Gene work. There are two
papers: the Unified Execution Model paper (LADIS 2009) and the PUSH
shell (PODC 2009). Code is evolving quickly, but hopefully we'll have
an initial public release in 1Q10.

-eric

Greg Kurtzer

Nov 24, 2009, 2:34:58 AM
to xc...@googlegroups.com
Hiya Ron,

Look and feel like SSH?? Are these the same people that don't use a
scheduler/resource manager?

I assume that this fate includes XGET...

Greg


Latchesar Ionkov

Nov 24, 2009, 10:24:33 AM
to xc...@googlegroups.com
XGET became too complex and hard to maintain. We may give it another go
with a simpler interface and fewer features.
I don't know what to say about XCPU... Obviously the majority doesn't
care about it.

Thanks,
Lucho

ron minnich

Nov 24, 2009, 11:05:04 AM
to xc...@googlegroups.com
On Mon, Nov 23, 2009 at 11:34 PM, Greg Kurtzer <gmku...@gmail.com> wrote:

> Look and feel like SSH?? Are these the same people that don't use a
> scheduler/resource manager?

They're the same people who want to run emacs on 10,000 cluster nodes.
They're our customers, sadly, and they still do things like run
editors written in Fortran.

Greg, we need to get into a different business where people are more
open to change.

Henry Ford put it best, I paraphrase: "Had I listened to my customers
I would have made better buggy whips".

>
> I assume that this fate includes XGET...

Well, I am taking another look at beoboot. Beoboot did a really good job for
what we needed and scaled very well indeed.

I'm actually moving on a bit. We're looking at what it takes to boot
10M kernels and it's pretty clear to me that nothing we've done to
date is really up to snuff.

Thanks

ron

Andrew Shewmaker

Nov 24, 2009, 11:21:55 AM
to xc...@googlegroups.com
On Mon, Nov 23, 2009 at 6:19 PM, ron minnich <rmin...@gmail.com> wrote:
> I think any future job spawning system for clusters has to either be
> ssh or feel enough like ssh that nobody knows the difference.

Another approach would be to provide both options side by side so that
people can transition at their leisure: SSH for those who want familiarity,
and something like XCPU for those who want to actually use their big
systems. Many LANL users eventually got used to our BProc systems,
but we should have made the transition much smoother for them (e.g.
run the system as lightweight BProc, but provide a full distro file
system and start up sshd on each node).

People don't expect to be able to ssh into individual cores, but they
unfortunately don't tend to think of clusters the same way, unless it
is something like an Altix.

--
Andrew Shewmaker

Eric Van Hensbergen

Nov 24, 2009, 12:12:23 PM
to xc...@googlegroups.com
Can I ask the stupid question of what behavior is different with ssh?
It seems like ssh is a subset of xcpu functionality so why couldn't
that look/feel be provided side-by-side? (disclaimer: I don't
understand why someone would want ssh functionality, but I don't see
where xcpu falls short of that)

-eric

Eric W. Biederman

Nov 24, 2009, 1:04:24 PM
to xc...@googlegroups.com
Eric Van Hensbergen <eri...@gmail.com> writes:

> Can I ask the stupid question of what behavior is different with ssh?
> It seems like ssh is a subset of xcpu functionality so why couldn't
> that look/feel be provided side-by-side? (disclaimer: I don't
> understand why someone would want ssh functionality, but I don't see
> where xcpu falls short of that)

I think the core of compatibility is rsh actually. There are tons of
programs, automation, and other things that work by assuming you can
say <prefix cmd> <command to run on other machine>.
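
To make that concrete (host names and command made up), the automation
just assumes it can prepend a remote-run prefix to any command:

  rsh node42 make -j
  ssh node42 make -j
  bpsh 42 make -j

Anything that breaks that <prefix> <command> shape breaks the scripts
and tools built on top of it.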

Eric

ron minnich

Nov 24, 2009, 1:06:40 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 10:04 AM, Eric W. Biederman
<ebie...@xmission.com> wrote:

> I think the core of compatibility is rsh actually.

True. But even mentioning something like rsh in the USG nowadays is a
bad thing to do. So you say "ssh" and it bypasses all the
cybersecurity mental filters. "Oh, ssh, that's secure".

ron

Latchesar Ionkov

Nov 24, 2009, 1:08:18 PM
to xc...@googlegroups.com
xcpu has both rsh/ssh syntax compatibility and security. There is something else missing, I guess :)

ron minnich

Nov 24, 2009, 1:17:49 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 10:08 AM, Latchesar Ionkov <lio...@lanl.gov> wrote:
>
> xcpu has both rsh/ssh syntax compatibility and security. There is something else missing, I guess :)

I hate to say it but one thing missing is ptys.

ron

Eric Van Hensbergen

Nov 24, 2009, 1:21:06 PM
to xc...@googlegroups.com

That's easy enough to emulate. What specifically? So you can run
vi/emacs over the xcpu connection?

-eric

Latchesar Ionkov

Nov 24, 2009, 1:27:19 PM
to xc...@googlegroups.com
I don't think slurm or torque support ptys. That doesn't make them less popular.

Eric W. Biederman

Nov 24, 2009, 1:28:53 PM
to xc...@googlegroups.com
Latchesar Ionkov <lio...@lanl.gov> writes:

> xcpu has both rsh/ssh syntax compatibility and security. There is something else missing, I guess :)

I know that what I missed in bproc was the ability to run basic shell
scripts. When I was looking at that, I figured I could provide that
capability with 10M-20M of basic binaries, and much less if I used
busybox.

My gut feel says you have to demonstrate a lot of benefit to convince
people to leave their creature comforts behind.

Eric

Latchesar Ionkov

Nov 24, 2009, 1:32:46 PM
to xc...@googlegroups.com
xcpu2 allows you to run even cups on each compute node, even if all you have installed on them is busybox. You just do:

xrx n[1-100] /etc/init.d/cups start

ron minnich

Nov 24, 2009, 2:03:48 PM
to xc...@googlegroups.com
OK, here is some of what we learned people seem to want on these
clusters. This is drawn from experiences with users of bproc and
conventional clusters.

They were used to
ssh node cmd

and if cmd was a shell script,
ssh node script

Now on bproc (bpsh) we learned that asking people to instead run:
./script
and change each command in the script from:
command
to:
bpsh node-list command

(in essence, to turn the script inside out) was a big hurdle for many
folks, and they did not like it, *even if it was only one line to
change*, and *even if it gave them a 1000-fold or greater performance
improvement*. I am not making this up.
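
To make that concrete (hypothetical script and command names): the ssh
version was

  # run.sh, launched as: ssh n5 ./run.sh
  my_solver --input /data/case1

and the bpsh version was

  # run.sh, launched on the front end as: ./run.sh
  bpsh 5 my_solver --input /data/case1

with a node list like 1-64 in place of the 5 when you wanted the whole
cluster. One line changed, and it was still a hard sell.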

People want to ssh in and have a full system, with command history and
all that jazz. This has other implications.

And I hate to say it, but people here at SNL who run clusters for a
living have found xcpu hard to set up and use. Performance is still
disappointing and really lags bproc by quite a bit.

Setup difficulty was also true for bproc -- it had a kernel footprint,
keeping it all working was pretty awful, and it could not handle even
minor heterogeneity; e.g., a Geode and a P4 were not usable as one
bproc cluster.

No matter what, xcpu2 has to be as easy to set up and use as ssh, and
"different even if better" translates to "harder" for most people.

Anyway, still tired from travel but hope this is not too incoherent.

ron

ron minnich

Nov 24, 2009, 2:04:59 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 10:27 AM, Latchesar Ionkov <lio...@lanl.gov> wrote:
>
> I don't think slurm or torque support ptys. That doesn't make them less popular.

Yes, but they support the type of things Eric B. is mentioning. I think
what Eric is saying is correct.

ron

Latchesar Ionkov

Nov 24, 2009, 2:07:29 PM
to xc...@googlegroups.com
If that is true, I don't see any reason to continue working on xcpu. People have ssh; they don't want anything better. So be it.

Lucho

Eric Van Hensbergen

Nov 24, 2009, 2:22:08 PM
to xc...@googlegroups.com
but xcpu2 solves this problem - it can run any manner of scripts quite handily.

-eric

ron minnich

Nov 24, 2009, 2:26:52 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 11:22 AM, Eric Van Hensbergen <eri...@gmail.com> wrote:
>
> but xcpu2 solves this problem - it can run any manner of scripts quite handily.

xcpu2 is the way forward. The remaining issues are making setup more
familiar to sysadmins, improving performance, and so on. But it's all
doable.

ron

Eric Van Hensbergen

Nov 24, 2009, 2:39:56 PM
to xc...@googlegroups.com

I think it would be useful to have some well defined requirements
here. What about it do sysadmins find difficult to configure?

My experience is there are two things that one needs to know how to do:

a) setup authentication
b) manage the machine list

ssh gets around (a) by using LDAP or some other local auth. xcpu2
could do something similar, but there are some folks that use xcpu2
locally specifically because they don't have to get involved with an
outside userid/authentication mechanism

ssh doesn't get around (b) either, but it does seem like it would be
nice to have some easier mechanism for maintaining this sort of
information. Something like zeroconf could help on a local network,
but many of the networks we deploy xcpu2 on are segmented.

I guess (b) can be worked around by using another workload management
system on top of xcpu/ssh which handles the management/monitoring of
physical resources.

Something like a distributed registry with distributed auth could help
solve some of these problems, but you'll likely need to configure at
least one node per network segment with the right information (auth
server and registry server), and then everyone else could pick it up
with zeroconf. Alternatively, you could point all nodes to a
hierarchical parent, eventually building a fully connected hierarchical
tree of nodes through which the auth/registry/monitoring information
could be distributed.

-eric

Andrew Shewmaker

Nov 24, 2009, 2:51:59 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 12:39 PM, Eric Van Hensbergen <eri...@gmail.com> wrote:

> I think it would be useful to have some well defined requirements
> here.  What about it do sysadmins find difficult to configure?
>
> My experience is there are two things that one needs to know how to do:
>
> a) setup authentication
> b) manage the machine list
>
> ssh gets around (a) by using LDAP or some other local auth.  xcpu2
> could do something similar, but there are some folks that use xcpu2
> locally specifically because they don't have to get involved with an
> outside userid/authentication mechanism

At one point I was tasked to try out Scali MPI on a BProc cluster, but
its daemons required PAM. Torque and slurm also manage authentication
using PAM modules. Is there a module so those sorts of things can look
to xcpu2 for authentication? Or can xcpu2 use PAM auth instead of its
own?

--
Andrew Shewmaker

Josh England

Nov 24, 2009, 2:58:12 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 10:21 AM, Eric Van Hensbergen <eri...@gmail.com> wrote:
>
> That's easy enough to emulate. What specifically? So you can run
> vi/emacs over the xcpu connection?


Actually, so you can run bash :/

-JE

Josh England

Nov 24, 2009, 3:05:30 PM
to xc...@googlegroups.com
You lost me at LDAP. Then you lost me again when you said zeroconf.
The distributed auth/registry I just somewhat glazed over. The whole
argument here is that ssh is simple/easy as compared to xcpu2, and
these kinds of things just make it worse. All large installations
that I know of use passwd authentication with ssh keys to go
passwordless. It takes a little less than 30 seconds to set up and it
just plain works.
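
For reference (host and user names made up), the whole setup is roughly:

  ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa   # generate a passwordless key pair
  ssh-copy-id user@n001                      # install the public key on a node
  ssh n001 hostname                          # passwordless from then on

That's it.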

-JE

Eric Van Hensbergen

Nov 24, 2009, 3:08:29 PM
to xc...@googlegroups.com

I run bash with xcpu2 without a problem.

-eric

Eric Van Hensbergen

Nov 24, 2009, 3:11:18 PM
to xc...@googlegroups.com
Those are implementation specifics that the user/admin can be largely
unaware of. It would be quite trivial to assume the same environment
as .ssh (system-level password authentication and/or key-files on a
shared file system).

The unfortunate side of that is it requires a shared distributed file
system or shared auth mechanism to be present, which means you need
something more than the drone systems we currently deploy with xcpu2,
which are much easier to manage.

However, that being said, there is no reason (that I can see) why
xcpu2 couldn't support both sorts of environments.

-eric

Andrew Shewmaker

Nov 24, 2009, 3:43:57 PM
to xc...@googlegroups.com
On Tue, Nov 24, 2009 at 1:11 PM, Eric Van Hensbergen <eri...@gmail.com> wrote:
> Those are implementation specifics that the user/admin can be largely
> unaware of.  It would be quite trivial to assume the same environment
> as .ssh (system-level password authentication and/or key-files on a
> shared file system).
>
> The unfortunate side of that is it requires shared distributed file
> system or shared auth mechanisms be present which mean you require
> something more than the drone systems we currently deploy with xcpu2
> which are much easier to manage.

We don't necessarily use a shared distributed file system for things
like system keys. Since they don't change often, we may put them into
a RAM root image and perhaps update them with a tree'd remote copy.

I want to clarify what I said before, since I combined authentication
and account authorization. In addition to something like ssh key
authentication, resource managers like torque use PAM to determine
which accounts are active on a node at a given time.

Now, I'm not particularly fond of any of the existing resource
managers, so I would be content if a scheduler (Moab in our case)
talked directly to xcpu2. We also need tight integration with MPI
implementations. Currently we have a situation where the resource
manager has to establish connections to all of the nodes in an
allocation, then MPI has to do the same sort of wireup. I understand
that it is non-trivial to get Open MPI to utilize xcpu.

--
Andrew Shewmaker

Daniel Gruner

Nov 24, 2009, 5:11:26 PM
to xc...@googlegroups.com
Well, I am probably the only one outside of "the labs" who stuck it out
and had an xcpu cluster running users' jobs for several months. I am
very sad about its demise...

To me the big missing pieces were a scheduler and MPI. Even though mvapich
was kind of working, it never really got debugged enough. And the bjs port
remained buggy too. If those two had worked properly, the cluster would
still be running xcpu.
I am not managing it anymore, so it has gone to caoslinux with torque/maui.
I still have a couple of bproc clusters running...

Now we manage a 4,000-node cluster using moab, xCAT, and
diskless/stateless nodes, but with a "real" OS image on every node.
It works, even if it is ugly. Enough said...

Daniel
