dedicated cluster


Kelvin Kakugawa

Jan 19, 2011, 5:52:48 PM
to jclou...@googlegroups.com
Hi all,

I'm adding support for deploying to a dedicated cluster.

If you're interested in participating, let's coordinate.

-Kelvin

Adrian Cole

Jan 19, 2011, 5:58:23 PM
to jclou...@googlegroups.com
Cool idea, Kelvin.

I know a few folks are interested or doing similar things. Have a
look at skeletons/standalone-compute. We just need to supply this
info from a static source (maybe Iterables deserialized from json or
something). If you want, I can make a sandbox-api for this and we can
collaborate.

From a naming perspective, I think byon is cute (bring your own
nodes), but certainly open to options.

Thoughts?
-A

> --
> You received this message because you are subscribed to the Google Groups
> "jclouds-dev" group.
> To post to this group, send email to jclou...@googlegroups.com.
> To unsubscribe from this group, send email to
> jclouds-dev...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/jclouds-dev?hl=en.
>

Kelvin Kakugawa

Jan 19, 2011, 6:02:55 PM
to jclou...@googlegroups.com
BYON -- bring your own nodes, is clever.

I'll take a look at skeletons/standalone-compute.  I already forked jclouds on github.  So, if you cut a branch w/ the sandbox-api, I'll jump in.

-Kelvin

Adrian Cole

Jan 20, 2011, 12:26:36 PM
to jclou...@googlegroups.com
cool!

so, I'll make a branch today (probably by lunch) with a penciled-in
byon provider.

it will be located at sandbox-apis/byon with the maven artifact
org.jclouds.api/byon

In basic form, it will throw unsupported exceptions on runNodes
commands, but operate on others, and require a resource to build
static lists from.
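
The basic form described above can be sketched in plain Java. This is only an illustration: StaticNodeService and its method signatures are hypothetical stand-ins, not the jclouds ComputeService API, and nodes are reduced to hostname strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a byon-style service; not the jclouds API. */
final class StaticNodeService {
    private final List<String> hostnames;

    StaticNodeService(List<String> hostnames) {
        // Defensive copy so the list stays fixed after construction.
        this.hostnames = Collections.unmodifiableList(new ArrayList<>(hostnames));
    }

    /** Read-only operations work against the static list. */
    List<String> listNodes() {
        return hostnames;
    }

    /** Creating nodes is outside the scope of a static list, so it is rejected. */
    List<String> runNodes(String group, int count) {
        throw new UnsupportedOperationException(
                "byon cannot create nodes; supply them in the static list");
    }
}
```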

If you don't mind, can you open a ticket here?

http://code.google.com/p/jclouds/issues

-Adrian
p.s. if you happen to be in the valley, come to the meetup tonight!

http://www.meetup.com/jclouds/calendar/15867124/

Kelvin Kakugawa

Jan 20, 2011, 12:32:55 PM
to jclou...@googlegroups.com
Awesome.

I cut an issue, here:

I'm in the city.  :-/  Not sure if I can make it down to PA.

-Kelvin

Adrian Cole

Jan 24, 2011, 3:44:24 PM
to jclou...@googlegroups.com
Good session with Kelvin and Greg today.

We have a sandbox-api byon provider.

Right now, it needs yaml parsing implemented in BYONComputeServiceContextModule.provideNodeList and a corresponding test in BYONParseTest.testNodesParse.

The idea is that the yaml source is specified as the jclouds endpoint (ex. byon.endpoint=file://etc/nodes.yaml).

After this works, we need to create another provider: a proxy provider.  This would take a backend pool of running nodes and reserve/release them.  This way, existing code that uses runNodes* can work.  I think some code like this is in arquillian.
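
The reserve/release idea behind the proxy provider might look roughly like the sketch below. NodePool, reserve, release and idleCount are hypothetical names chosen for illustration; nothing here is taken from jclouds or arquillian, and nodes are reduced to id strings.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical pool of already-running nodes, identified by id. */
final class NodePool {
    private final Deque<String> available = new ArrayDeque<>();
    private final Set<String> reserved = new HashSet<>();

    NodePool(Collection<String> runningNodeIds) {
        available.addAll(runningNodeIds);
    }

    /** Plays the role of runNodes: hands out an idle node instead of creating one. */
    synchronized String reserve() {
        String id = available.poll();
        if (id == null) {
            throw new IllegalStateException("no idle nodes left in the pool");
        }
        reserved.add(id);
        return id;
    }

    /** Plays the role of destroyNode: returns the node to the pool. */
    synchronized void release(String id) {
        if (!reserved.remove(id)) {
            throw new IllegalArgumentException("node was not reserved: " + id);
        }
        available.add(id);
    }

    synchronized int idleCount() {
        return available.size();
    }
}
```

A caller that previously relied on runNodes* would call reserve() to obtain a node and release() when done, so the backing nodes are never created or destroyed.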

thoughts?
-a

Adrian Cole

Jan 25, 2011, 11:24:07 PM
to jclou...@googlegroups.com
ok.  here's the current yaml format.

nodes:
    cluster-1:
        id: cluster-1
        description: xyz
        hostname: cluster-1.mydomain.com
        os_arch: x86
        os_family: rhel
        os_name: redhat
        os_version: 5.3
        group: hadoop
        tags:
            - vanilla
        username: myUser
        credential: ZmFuY3lmb290
        sudo_password: c3Vkbw==

Note that for the time being, the jclouds "tag" is associated with the yaml "group".  I'll send another post about what's going on here.

If you need help with base64, there are a couple of utilities in jclouds:

import static org.jclouds.crypto.CryptoStreams.base64;
...
// encode
base64("secret".getBytes())

// decode
new String(base64("c3Vkbw=="))
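
For readers without jclouds on the classpath, the same round trip can be done with the JDK alone. This is a swapped-in alternative, not the CryptoStreams utility shown above: it assumes Java 8 or later for java.util.Base64, and the Base64Secrets class name is made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper using the JDK's Base64 instead of jclouds CryptoStreams. */
final class Base64Secrets {
    /** Encodes a plain-text secret for use in the yaml file. */
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    /** Decodes a value such as the credential or sudo_password shown above. */
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }
}
```

With this helper, decode("c3Vkbw==") yields "sudo" and decode("ZmFuY3lmb290") yields "fancyfoot", which are the two sample values in the yaml above.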


Once you mvn install on sandbox-apis/byon, you can do the following to control your existing nodes:

contextProperties.setProperty("byon.nodes", nodeYamlBuiltAsAString);

context = new ComputeServiceContextFactory().createContext("byon", "foo", "bar", ImmutableSet.<Module> of(
               new JschSshClientModule(), new Log4JLoggingModule()), contextProperties);


or...

contextProperties.setProperty("byon.endpoint", "file://path/to/config.yaml");

context = new ComputeServiceContextFactory().createContext("byon", "foo", "bar", ImmutableSet.<Module> of(
               new JschSshClientModule(), new Log4JLoggingModule()), contextProperties);

After this is set up, listNodes() should work fine, as does running scripts.  Here's an example of how to run an ad-hoc command:

import static org.jclouds.compute.options.RunScriptOptions.Builder.wrapInInitScript;
import static org.jclouds.scriptbuilder.domain.Statements.exec;

  ....

responses = context.getComputeService().runScriptOnNodesMatching(
               Predicates.<NodeMetadata> alwaysTrue(), exec("echo fooble"), wrapInInitScript(false).runAsRoot(false));

Please kick the tires on this and/or contribute additions.  If it works alright, we can promote this to a *real* api so that it publishes to snapshot.

-A

Adrian Cole

Jan 26, 2011, 10:27:54 AM
to Noah Campbell, Adrian Cole, jclou...@googlegroups.com

Hi, Noah. The 'id' is an opaque string uniquely identifying this node from another in the system.  In ec2, it would be 'us-east-1/i-36fe1c'; in vcloud, or other restful systems, it is the href. The main requirement is that it is unique on the backend, where in this case the backend is the yaml file.

-A

On Jan 26, 2011 7:18 AM, "Noah Campbell" <no...@dtosolutions.com> wrote:
> Is cluster-1 the canonical name of the node. Also, what's the purpose of id, is that meant to be an asset tag?
>
> -Noah

Adrian Cole

Jan 26, 2011, 2:27:15 PM
to Noah Campbell, Adrian Cole, jclou...@googlegroups.com
Here are some changes per IRC:

1. having id as the key and also a part of the value is redundant.
2. a user-specified name of the node maps correctly to jclouds
3. we should have the default source as classpath://<providername>.yaml, so that we can reuse file format for other providers without having a name conflict.
   - in this case it would be named byon.yaml

nodes:
    i-36fe1c:  # opaque unique id
        name: cluster-1  # optional; user specified name
        description: xyz  # note this field is not yet in jclouds NodeMetadata
        hostname: cluster-1.mydomain.com
        os_arch: x86
        os_family: rhel
        os_name: redhat
        os_version: 5.3
        group: hadoop
        tags:  # note this list is not yet in jclouds NodeMetadata
            - vanilla
        username: myUser
        credential: ZmFuY3lmb290
        sudo_password: c3Vkbw==

wrt missing fields, this is a fyi for now.  We should add issues to add these, unless people believe they are unnecessary.

I'll give them until the end of the day to settle, and then incorporate any revisions.

Cheers,
-A

Adrian Cole

Jan 28, 2011, 8:17:20 PM
to Noah Campbell, Adrian Cole, jclou...@googlegroups.com
Hi, Team.

I've incorporated some feedback from Kelvin and Hugo. I'd like to get
this out the door for beta-9 this weekend, if there are no strong
opinions about making this different.

High-level changes:

1. switch from map -> list format as it is easier to deal with (kelvin)
2. remove constraint to base64 secrets (hugod)
3. os_name -> os_description as jclouds has more references to this
field (adrian)
4. accept credential_url where someone can specify a file containing
their login credential
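
One side effect of change 1 is worth spelling out: a map keyed by id cannot hold two entries under the same key, whereas a list can contain the same id twice. A check along the following lines could catch that when the list is loaded. It is only a sketch; NodeIds and duplicates are hypothetical names, and the thread does not say whether byon performs such a check.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical validation helper for a list-format node file. */
final class NodeIds {
    /** Returns the ids that occur more than once, in first-seen order. */
    static Set<String> duplicates(List<String> ids) {
        Set<String> seen = new HashSet<>();
        Set<String> dupes = new LinkedHashSet<>();
        for (String id : ids) {
            // Set.add returns false when the id was already present.
            if (!seen.add(id)) {
                dupes.add(id);
            }
        }
        return dupes;
    }
}
```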

Here's the doc which is taken from here:
https://github.com/jclouds/jclouds/blob/master/sandbox-apis/byon/README.txt

If you can, vote +1 on releasing byon as it is, or suggest a revision.

Cheers, and thanks guys!

-Adrian

= Bring Your Own Nodes to the jclouds ComputeService =
The bring your own node provider (byon) allows you to specify a source which jclouds will read
nodes from. Using this, you can have jclouds control your standalone machines, or even cloud
hosts that are sitting idle.

== Constraints ==
The byon provider only supports the following functions of ComputeService:
* listNodes
* listNodesDetailsMatching
* getNodeMetadata
* runScriptOnNodesMatching

== How to use the byon provider ==
The byon provider requires you supply a list of nodes using a property. Here are
the valid properties you can use:
* byon.endpoint - url to access the list, can be http://, file://, classpath://
* byon.nodes - inline defined yaml in string form.

Note:

The identity and credential fields of the ComputeServiceContextFactory are ignored.

=== Java example ===

Properties props = new Properties();

// if you built the yaml string by hand
props.setProperty("byon.nodes", stringLiteral);

// or you can specify an external reference
props.setProperty("byon.endpoint", "file://path/to/byon.yaml");

// or you can specify a file in your classpath
props.setProperty("byon.endpoint", "classpath:///byon.yaml");

context = new ComputeServiceContextFactory().createContext("byon", "foo", "bar",
         ImmutableSet.<Module> of(new JschSshClientModule()), props);
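
How the three endpoint schemes could be told apart is sketched below using only the JDK. This is not the byon implementation: EndpointOpener is a hypothetical name, and the sketch assumes the three-slash forms (classpath:///byon.yaml, file:///path/to/byon.yaml) so that the resource name ends up in the path part of the URI.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

/** Hypothetical helper that opens a node list from an endpoint string. */
final class EndpointOpener {
    static InputStream open(String endpoint) throws IOException {
        URI uri = URI.create(endpoint);
        if ("classpath".equals(uri.getScheme())) {
            // classpath:///byon.yaml -> resource name "byon.yaml"
            String path = uri.getPath() == null ? "" : uri.getPath();
            String resource = path.replaceFirst("^/+", "");
            InputStream in = EndpointOpener.class.getClassLoader().getResourceAsStream(resource);
            if (in == null) {
                throw new FileNotFoundException("not on classpath: " + resource);
            }
            return in;
        }
        // file: and http: are handled by the JDK's built-in URL handlers.
        return uri.toURL().openStream();
    }
}
```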

== File format ==
You must define your nodes in yaml, and they must be in a collection called nodes.

Here are the properties:

* id - opaque unique id
* name - optional; user specified name
* description - optional; long description of this node
  * note this is not yet in jclouds NodeMetadata
* hostname - name or ip address to contact the node on
* os_arch - ex. x86
* os_family - must conform to org.jclouds.compute.domain.OsFamily in lower-hyphen format
  ex. rhel, ubuntu, centos, debian, amzn-linux
* os_description - long description of the os ex. Ubuntu with lamp stack
* os_version - normalized to numbers when possible. ex. for centos: 5.3, ubuntu: 10.10
* group - primary group of the machine. ex. hadoop
* tags - optional; list of arbitrary tags.
  * note this list is not yet in jclouds NodeMetadata
* username - primary login user. ex. ubuntu, vcloud, toor, root
* sudo_password - optional; when a script is run with the "runAsRoot" option true, yet the
  username is not root, a sudo command is invoked. If sudo_password
  is set, the contents will be passed to sudo -S.
  Ex. echo 'foobar'| sudo -S init 5

one of:

* credential - RSA private key or password
* credential_url - location of plain-text RSA private key or password.
  ex. file:///home/me/.ssh/id_rsa
      classpath:///id_rsa
=== Example File ===

nodes:
    - id: i-sdfkjh7
      name: cluster-1
      description: accounting analytics cluster
      hostname: cluster-1.mydomain.com
      os_arch: x86
      os_family: rhel
      os_description: redhat with CDH
      os_version: 5.3
      group: hadoop
      tags:
          - vanilla
      username: myUser
      credential: |
                  -----BEGIN RSA PRIVATE KEY-----
                  MIIEowIBAAKCAQEAuzaE6azgUxwESX1rCGdJ5xpdrc1XC311bOGZBCE8NA+CpFh2
                  u01Vfv68NC4u6LFgdXSY1vQt6hiA5TNqQk0TyVfFAunbXgTekF6XqDPQUf1nq9aZ
                  lMvo4vlaLDKBkhG5HJE/pIa0iB+RMZLS0GhxsIWerEDmYdHKM25o
                  -----END RSA PRIVATE KEY-----
      sudo_password: go panthers!

Adrian Cole

Jan 28, 2011, 9:51:30 PM
to Adrian Cole, jclou...@googlegroups.com, Noah Campbell

Ok. Gonna need your DevOps hat for this one. As Hugo rightly mentions, how (mechanism) one escalates to admin privileges, or if they can at all is node-specific.  Most of the time, we're talking sudo here, but it could be rbac or something else. We also may have a policy to not escalate privileges. 

At the risk of conflation and maybe being too coarse grained, I think we could build in something to do this.

What do you think of a node property: admin_mode

Maybe we could default to: never-escalate and have another option: sudo?

I'm sure the names are bad, so please suggest otherwise.

-Adrian

Adrian Cole

Jan 29, 2011, 12:16:16 PM
to Adrian Cole, jclou...@googlegroups.com, Noah Campbell
Hi, team.

If we don't decide on what to expose wrt privilege escalation, we'll have to add it next time.  I don't want to block the release based on a property we can add later.

Here's the syntax for the concepts that were put in the code yesterday. 

Please let me know by tomorrow morning, if you think we need to change or remove anything here:

Cheers!
-Adrian


nodes:
   - id: i-sdfkjh7
     name: cluster-1
     description: accounting analytics cluster
     hostname: cluster-1.mydomain.com
     os_arch: x86
     os_family: rhel
     os_description: redhat with CDH
     os_version: 5.3
     group: hadoop
     tags:
         - vanilla
     username: myUser
     credential: |
                 -----BEGIN RSA PRIVATE KEY-----
                 MIIEowIBAAKCAQEAuzaE6azgUxwESX1rCGdJ5xpdrc1XC311bOGZBCE8NA+CpFh2
                 u01Vfv68NC4u6LFgdXSY1vQt6hiA5TNqQk0TyVfFAunbXgTekF6XqDPQUf1nq9aZ
                 lMvo4vlaLDKBkhG5HJE/pIa0iB+RMZLS0GhxsIWerEDmYdHKM25o
                 -----END RSA PRIVATE KEY-----
     sudo_password: go panthers!




Adrian Cole

Jan 29, 2011, 3:34:35 PM
to Noah Campbell, jclou...@googlegroups.com
Thanks for the feedback.

So, any thoughts on naming the feature of supporting sudo?  We can make support of sudo default to true on non-windows hosts.

maybe a property: role_escalation: (sudo|none) ?

Here's an example of our current credential mechanism, which we could use instead of embedding them in the yaml.  This works today, although I'm not sure if it handles the sudo password (I'll have to look)

      // assuming you have a blobstore somewhere (ex. file, s3), in this example, in memory:
      blobContext = new BlobStoreContextFactory().createContext("transient", "foo", "bar");
      // note the container must exist, although you can simply create it, if need be
      credentialsMap = blobContext.createInputStreamMap("credentials");


      computeContext = new ComputeServiceContextFactory().createContext("byon", "foo", "bar",
               ImmutableSet.of(new CredentialStoreModule(credentialsMap)), properties);


     // when operations occur, passwords will be read from the credentials container
     // key of node:<nodeid> will have a json value of the login credentials, ex.
                              {
                                "identity": "toor",
                                "credential" : "letmein"
                              }

Would something like this work?  If not, how should we revise it?

-A

On Sat, Jan 29, 2011 at 12:17 PM, Noah Campbell <no...@dtosolutions.com> wrote:
You should probably decouple username/credential from the node.  We're currently looking at doing this with RunDeck (not committed, just kicking the idea around).  

The other thing is we rely on properly configured sudo to make things work.

i.e. 

 %wheel        ALL=(ALL) NOPASSWD: ALL

The NOPASSWD: makes entering a password not required.

Here's a full example:

#
# ControlTier sudoers file granting a limited amount of functionality to ctier 
# to install software.
#

#
# Installation and management of software
#
Cmnd_Alias SOFTWARE = /bin/rpm, /usr/bin/yum

#
# necessary for CtlCenter to execute a ssh <hostname> sudo yum ... set it
# explicitly for the %ctier group.
#
Defaults:%ctier    !requiretty

#
# Allow the ctier user to execute the SOFTWARE utilities as root without a
# password prompt.
#
# TODO: adjust to the host_alias to be more restrictive.
#
%ctier ALL=(root) NOPASSWD: SOFTWARE

This locks down what the sudoer can do.  In this case, the ctier group can only use rpm and yum.

Food for thought.

-Noah

Andrew Phillips

Jan 29, 2011, 7:31:58 PM
to jclou...@googlegroups.com, no...@dtosolutions.com
> Ok. Gonna need your DevOps hat for this one. As Hugo rightly mentions,
> how (mechanism) one escalates to admin privileges, or if they can at all
> is node-specific. Most of the time, we're talking sudo here, but it
> could be rbac or something else. We also may have a policy to not
> escalate privileges.
>
> At the risk of conflation and maybe being too coarse grained, I think we
> could build in something to do this.
>
> What do you think of a node property: admin_mode
>
> Maybe we could default to: never-escalate and have another option: sudo?
>
> I'm sure the names are bad, so please suggest otherwise.

I see two related concerns here which we may be happy to combine, or
want to split:
1) Do we want to execute commands on this node with admin
privileges...and if so, *all* commands or just specified commands?
2) If we want to execute some or all commands as admin, how is this to
be achieved?

2) is platform specific and could be part of a "connection spec" for a
node, along with the protocol to use for command execution (e.g. SSH),
file transfer (e.g. SFTP) etc.
1) is more of a conceptual property that could fit with the "admin_mode"
property discussed. However, in line with the yes/no/how separation
being discussed, values such as "never", "bootstrap-only" (whatever that
might mean), "all" etc. would make more sense.

The connection spec for a node would have a corresponding property
indicating whether it supports admin_mode command execution.
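
To make the split between the two concerns concrete, here is one way it might be modelled. Everything in this sketch is hypothetical: the AdminMode values are the ones floated above, the bootstrapping flag is a guess at what "bootstrap-only" could mean, and connectionSupportsAdmin stands in for the proposed connection spec property.

```java
import java.util.Locale;

/** Concern 1: whether commands should run with admin privileges at all. */
enum AdminMode {
    NEVER, BOOTSTRAP_ONLY, ALL;

    /** Parses lower-hyphen text such as "bootstrap-only". */
    static AdminMode parse(String text) {
        return AdminMode.valueOf(text.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }
}

/** Combines the policy with what the node's connection can actually do (concern 2). */
final class EscalationPolicy {
    static boolean shouldEscalate(AdminMode mode, boolean connectionSupportsAdmin,
            boolean bootstrapping) {
        if (!connectionSupportsAdmin) {
            return false; // the mechanism is unavailable, whatever the policy says
        }
        switch (mode) {
            case ALL:
                return true;
            case BOOTSTRAP_ONLY:
                return bootstrapping;
            default:
                return false;
        }
    }
}
```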

ap

Adrian Cole

Jan 30, 2011, 5:29:41 AM
to jclou...@googlegroups.com, no...@dtosolutions.com
Ok.  I think we're going to need to move the larger topic to another release, as there's definitely more to this concept.

Here's an idea of what we could try in this release.

  * we have sudo_password

What if we base our execution of sudo on presence of this property, and whether or not it is blank? 

  ex.
 
  if (node.sudo_password)
      if (sudo_password blank)
            run sudo normally
      else
           echo sudo_password |sudo -S cmd
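
The rule above, written out as a Java sketch. SudoCommand, build and shellQuote are hypothetical names, not jclouds code. Two points are assumptions rather than part of the proposal: when the property is absent the command is run without sudo, and the password is single-quoted for a POSIX shell.

```java
import java.util.Optional;

/** Hypothetical command builder for the sudo_password rule. */
final class SudoCommand {
    /** Wraps value in single quotes for a POSIX shell, escaping embedded single quotes. */
    static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    /**
     * @param sudoPassword empty Optional means the property is absent;
     *                     an empty string means it is present but blank
     */
    static String build(String cmd, Optional<String> sudoPassword) {
        if (!sudoPassword.isPresent()) {
            return cmd; // assumption: no property, no escalation
        }
        if (sudoPassword.get().isEmpty()) {
            return "sudo " + cmd; // blank: rely on passwordless sudo
        }
        return "echo " + shellQuote(sudoPassword.get()) + " | sudo -S " + cmd;
    }
}
```

For instance, build("init 5", Optional.of("foobar")) produces echo 'foobar' | sudo -S init 5, and a blank password produces sudo init 5.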

Hugo,

Would rules like the above allow your stuff to work without adding an additional property?

Noah,

Would specifying a "blank" sudo_password be reasonable to you?

-A


--

Adrian Cole

Jan 30, 2011, 4:55:50 PM
to jclou...@googlegroups.com, no...@dtosolutions.com
Hi, team.

I think that rather than put in a half-baked property, it would be better to continue this discussion after the jclouds release. 

I'm going to promote byon, but take the sudo_password property completely out of the yaml format.  Note that even without sudo_password, the current jclouds codebase will do a simple passwordless sudo attempt if asked for runAsRoot and the username isn't literally "root".  So, the rundeck case should work fine, as well as anyone who sets up sudoers files as Noah mentioned.

After tomorrow, I'll summarize this sudo issue on a new thread we can work out during the next cut.

Thanks for all the feedback!
-A