compute: documenting last tested image, location, etc. ids


Adrian Cole

Jul 18, 2011, 8:16:31 PM
to jclou...@googlegroups.com
Hi, team.

There's a best practice of serializing the image, hardware, and
location ids derived from templateBuilder when using jclouds in
production scenarios. This prevents shifting sands once you are
through dev. However, even during dev images can change and add
entropy to the process of troubleshooting something. I've heard this
problem arise in many downstream systems such as Pallet and Whirr
(more recently [1]).

Our templateBuilder expressions shouldn't return a different id on
every run, and if they do, this is likely a problem that can be
revised with a more specific expression. Regardless, we should know
what was last tested so that when we release a version of jclouds,
downstream users can understand what configurations are stable
regardless of whether they choose to use template matching or explicit
ids.

An idea to achieve this is to serialize to disk the default
templateBuilder expression and the ids inside the template it
resolved to.

For example, the template parameters in gogrid:

osFamily(UBUNTU).osVersionMatches("1[10].[10][04]").imageNameMatches(".*w/ None.*")

This could be stored in an extension to ProviderMetadata or in JSON as:

providers/gogrid/src/main/resources/org/jclouds/gogrid/compute/templateBuilder.json
{
"osFamily" : "UBUNTU",
"osVersionMatches" : "1[10].[10][04]",
"os64Bit" : true
}
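
A rough sketch of how such a resource might be read back into a
templateBuilder, assuming Gson (already on the jclouds classpath); the
DefaultTemplateLoader class and the field-by-field mapping are
illustrative, not an existing API:

// Illustrative sketch: read the proposed templateBuilder.json and apply
// its fields to a TemplateBuilder. Class and field names are hypothetical.
import com.google.gson.Gson;

import java.io.Reader;

import org.jclouds.compute.domain.OsFamily;
import org.jclouds.compute.domain.TemplateBuilder;

public class DefaultTemplateLoader {
   // mirrors the json layout above; new criteria would map the same way
   static class TemplateSpec {
      String osFamily;
      String osVersionMatches;
      boolean os64Bit;
   }

   public static TemplateBuilder apply(TemplateBuilder builder, Reader json) {
      TemplateSpec spec = new Gson().fromJson(json, TemplateSpec.class);
      return builder.osFamily(OsFamily.valueOf(spec.osFamily))
                    .osVersionMatches(spec.osVersionMatches)
                    .os64Bit(spec.os64Bit);
   }
}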

Then, once the *ComputeServiceLiveTest completes, we can store the
resulting template location, image, and hardware ids, and template
options to a file.

providers/gogrid/lasttested/template.json
{
"locationId" : "1"
"imageId" : "5489",
"hardwareId" : "1"
}
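
A minimal sketch of the write side, assuming the live test has a
resolved Template in hand; the Gson usage and LastTestedTemplate shape
are illustrative:

// Illustrative sketch: after a live test resolves a Template, persist the
// ids it resolved to in the lasttested/template.json layout proposed above.
import com.google.gson.GsonBuilder;

import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

import org.jclouds.compute.domain.Template;

public class LastTestedTemplateWriter {
   static class LastTestedTemplate {
      String locationId;
      String imageId;
      String hardwareId;
   }

   public static void write(Template template, String path) throws IOException {
      LastTestedTemplate record = new LastTestedTemplate();
      record.locationId = template.getLocation().getId();
      record.imageId = template.getImage().getId();
      record.hardwareId = template.getHardware().getId();
      try (Writer out = new FileWriter(path)) {
         new GsonBuilder().setPrettyPrinting().create().toJson(record, out);
      }
   }
}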

Thoughts are that this not only exposes what configuration was tested,
but by whom and when. As we'd commit the lasttested updates, it
should have the owner id and timestamp, which will also help provide
visibility into our test process.

Thoughts?
-Adrian

[1] https://issues.apache.org/jira/browse/WHIRR-341

Andrew Phillips

Jul 19, 2011, 3:53:44 PM
to jclou...@googlegroups.com
> providers/gogrid/lasttested/template.json
> {
> "locationId" : "1"
> "imageId" : "5489",
> "hardwareId" : "1"
> }
>
> Thoughts are that this not only exposes what configuration was tested,
> but by whom and when. As we'd commit the lasttested updates, it
> should have the owner id and timestamp, which will also help provide
> visibility into our test process.
>
> Thoughts?

I like the idea of storing a "signature" of the images that were
actually used, although I would place it in "target" (or somewhere
under the derived file tree) because it's the result of an execution.
Jenkins and all the other CI platforms have nice ways to archive such
files if a permanent record is desired.

I don't feel it should be part of the *source* because, as you point out,

"templateBuilder expressions shouldn't return a different id on
every run, and if they do, this is likely a problem that can be
revised with a more specific expression"

Basically, if your requirements for tests are such that you need an
image that is more specific than those matched by your template
expression, then there's a bug in the code.
If the requirements can't (yet) be expressed by a template, we'd have
a feature request.

But if the template really does capture all the requirements you have
for your image... why shouldn't it work even if a different (matching)
image were returned every time?

That said, I'm all for predictable behaviour, so if there *are* some
sensible rules we can apply to deterministically pick one of the
matching images that would certainly be worth looking at.

However, that would probably imply testing *all* available images
(rather than picking the first that matches), which could be quite a
performance hit.

Perhaps this could be an additional arg to templateBuilder?
.pickFirstMatching() or .lexicallySortAndPickFirst() etc....?
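
For illustration, here's roughly what a lexicallySortAndPickFirst
strategy might do over the set of matching images; the method is a
sketch of the idea, not an existing templateBuilder arg:

// Illustrative: deterministically pick one image from the candidates a
// template expression matched, by sorting ids lexically and taking the first.
import java.util.Comparator;
import java.util.NoSuchElementException;
import java.util.Set;

import org.jclouds.compute.domain.Image;

public class DeterministicImageChooser {
   public static Image lexicallySortAndPickFirst(Set<? extends Image> matching) {
      return matching.stream()
            .min(Comparator.comparing(Image::getId))
            .orElseThrow(() -> new NoSuchElementException("no matching images"));
   }
}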

ap

Adrian Cole

Jul 19, 2011, 7:01:03 PM
to jclou...@googlegroups.com
Andrew, thanks very much for the comments. Inline below:

On Wed, Jul 20, 2011 at 5:53 AM, Andrew Phillips <aphi...@qrmedia.com> wrote:
>> providers/gogrid/lasttested/template.json
>> {
>>   "locationId" : "1",
>>   "imageId" : "5489",
>>   "hardwareId" : "1"
>> }
>>
>> Thoughts are that this not only exposes what configuration was tested,
>> but by whom and when.  As we'd commit the lasttested updates, it
>> should have the owner id and timestamp, which will also help provide
>> visibility into our test process.
>>
>> Thoughts?
>
> I like the idea of storing a "signature" of the images that were actually
> used, although I would place it in "target" (or somewhere under the derived
> file tree) because it's the result of an execution. Jenkins and all the
> other CI platforms have nice ways to archive such files if a permanent
> record is desired.

Well, one desire of mine is to be able to know if we are testing
something different than before. For example, when the cloud provider
or image provider updates a template, it would be nice to be able to
log a warning or something saying this is fresh. I'm open to other
places to retrieve this from; it just seemed nice to be able to run
without a dependency. Where do you think we could store test
results?.. hehe maybe a CloudBees app? :)
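
To make the warning concrete, a sketch of the check I have in mind,
assuming the lasttested/template.json layout above (class and logic are
illustrative):

// Illustrative sketch: warn when the image the templateBuilder just
// resolved differs from the one recorded on the last tested run.
import com.google.gson.Gson;

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.logging.Logger;

import org.jclouds.compute.domain.Template;

public class TemplateFreshnessCheck {
   private static final Logger logger = Logger.getLogger("jclouds.compute");

   static class LastTested {
      String locationId;
      String imageId;
      String hardwareId;
   }

   public static void warnIfChanged(Template current, String lastTestedPath)
         throws IOException {
      try (Reader in = new FileReader(lastTestedPath)) {
         LastTested last = new Gson().fromJson(in, LastTested.class);
         if (!current.getImage().getId().equals(last.imageId))
            logger.warning(String.format(
                  "image %s differs from last tested %s; this template is fresh",
                  current.getImage().getId(), last.imageId));
      }
   }
}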


>
> I don't feel it should be part of the *source* because, as you point out,
>
> "templateBuilder expressions shouldn't return a different id on
> every run, and if they do, this is likely a problem that can be
> revised with a more specific expression"
>
> Basically, if your requirements for tests are such that you need an image
> that is more specific than those matched by your template expression, then
> there's a bug in the code.
> If the requirements can't (yet) be expressed by a template, we'd have a
> feature request.

I think this will unveil new features for templateBuilder, or perhaps
give us enough momentum to get beyond past stalemates on the topic ;)
[1] I'm particularly interested in capturing the intent of the image,
which is generally a combination of it being free, small, owned by a
stable entity (generally the provider), and baseOs. When we get more
specific queries, I think the fear of image thrashing will be quelled.
Before we start, we should really be able to measure this, hence the
thought about storing the test results.

>
> But if the template really does capture all the requirements you have for
> your image...why shouldn't it work even if a different (matching) image were
> returned every time.

This is a concern I agree with, but I don't think our query is strong
enough yet (to your point above). Especially when the image cannot be
guaranteed to be the base os, or even from a specific publisher, I can
imagine trouble arising, and it does. So in a way, I think our
template expression isn't quite specific enough to give downstream
tools a high enough chance of success.

That said, I rarely have to fiddle with the default image for our
tests to pass, and I can't remember the last time I needed to do so on
aws-ec2. Maybe this success should be a warning sign. Perhaps we
need something more quirky than installing openjdk, an os user, sudo,
setting up ip tables, modifying ssh, and running jboss.

Maybe more important is that we only test one version of one operating
system. Just because our template that matches Amazon Linux works per
above, it means nothing if you choose Ubuntu or CentOS. I think we
will end up needing relevant expressions for each OS. For example, I
want CentOS published by RightScale, but if I choose Ubuntu, I want
Canonical's, but not their daily build. Again, this is more to your
point above. When we do have more precise expressions, I think tools
will have a good chance of working.
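
For illustration, rough per-OS expressions along those lines; the
description-matching patterns are guesses at what would identify each
publisher, not verified strings:

// Illustrative per-OS template expressions; the description patterns below
// are placeholders for whatever reliably identifies each publisher's images.
import org.jclouds.compute.ComputeService;
import org.jclouds.compute.domain.OsFamily;
import org.jclouds.compute.domain.Template;

public class PerOsTemplates {
   public static Template centosFromRightScale(ComputeService compute) {
      return compute.templateBuilder()
            .osFamily(OsFamily.CENTOS)
            .imageDescriptionMatches(".*[Rr]ight[Ss]cale.*") // hypothetical
            .build();
   }

   public static Template ubuntuFromCanonical(ComputeService compute) {
      return compute.templateBuilder()
            .osFamily(OsFamily.UBUNTU)
            // exclude daily builds; pattern is a placeholder
            .imageDescriptionMatches("^(?!.*daily).*[Cc]anonical.*")
            .build();
   }
}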

>
> That said, I'm all for predictable behaviour, so if there *are* some
> sensible rules we can apply to deterministically pick one of the matching
> images that would certainly be worth looking at.

+1 I've belabored this point above :D

>
> However, that would probably imply testing *all* available images (rather
> than picking the first that matches), which could be quite a performance
> hit.

Well, I think what we can do is basically query users for the images
they trust, and get better at capturing those requirements. For
example, instead of ami-2131234, which is only relevant to one zone
and a pretty opaque ask, ask the user what they are essentially
looking for. In this case, perhaps it is a Debian from a specific
owner. When we have better queries, we can match on any region.
Picking favorites has been our recent past, but I think it is
worthwhile testing on at least 2 operating systems per cloud, since
not everyone uses Ubuntu. Perhaps starting with CentOS (or RHEL'ish)
+ Ubuntu, refining templates for these, and testing the entire compute
suite based on a profile (-Dtest.compute.osfamily=CENTOS), as sketched
below. It does double the test execution time, but I think it may be
worth it.
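
A sketch of what the profile hook could look like in the test harness;
the property name is as proposed above, the surrounding code is
illustrative:

// Illustrative: let the live test pick its OS family from a system
// property, defaulting to UBUNTU when -Dtest.compute.osfamily is not set.
import org.jclouds.compute.ComputeService;
import org.jclouds.compute.domain.OsFamily;
import org.jclouds.compute.domain.Template;

public class OsFamilyProfile {
   public static Template resolve(ComputeService compute) {
      OsFamily family = OsFamily.valueOf(
            System.getProperty("test.compute.osfamily", "UBUNTU"));
      return compute.templateBuilder().osFamily(family).build();
   }
}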

>
> Perhaps this could be an additional arg to templateBuilder?
> .pickFirstMatching() or .lexicallySortAndPickFirst() etc....?

+1 I like the idea of exposing these things.
>
> ap
>
[1] http://groups.google.com/group/jclouds-dev/browse_thread/thread/71917fe9a015b3a8/b870c2a7ad870b3d?lnk=gst&q=ownerid#b870c2a7ad870b3d
