bound request failing


Ezra Kissel

Jun 9, 2014, 10:01:07 AM
to protoge...@googlegroups.com
Hi,

I have been trying to get the attached rspec working at protogeni utah
for the last week or so and have been hitting a number of roadblocks.
The background is that I want to bring up a shared vlan but also use a
custom image on the two nodes spanning the necessary LAN. I was able to
get the shared vlan working at PG once last week, the POA command
worked, and things looked promising. I now can't seem to get back to
that state no matter what I try.

I look for "available now=true" resources in the PG Utah advertisement,
find two nodes, and update my request rspec. When I try to
createsliver, the status is either failed or unknown almost immediately,
and the node resources are "notready". This also happens without my
custom image and with any additional sliver_type properties removed.
Attached is the sliverstatus for an active slice. Could someone
please take a look?
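
The "look for available now=true" step above can be scripted. The following is only a sketch, assuming a GENI RSpec v3 advertisement in which each node carries an `<available now="..."/>` child; the function name and the sample fragment (including its availability values) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# GENI RSpec v3 namespace, as seen in the manifest later in this thread.
RSPEC_NS = {"r": "http://www.geni.net/resources/rspec/3"}


def available_nodes(ad_xml, sliver_type=None):
    """Return component_ids of nodes advertised with <available now="true"/>.

    If sliver_type is given, keep only nodes that also list that sliver type.
    """
    root = ET.fromstring(ad_xml)
    found = []
    for node in root.findall("r:node", RSPEC_NS):
        avail = node.find("r:available", RSPEC_NS)
        if avail is None or avail.get("now") != "true":
            continue
        if sliver_type is not None:
            types = {s.get("name") for s in node.findall("r:sliver_type", RSPEC_NS)}
            if sliver_type not in types:
                continue
        found.append(node.get("component_id"))
    return found


# Illustrative fragment only: node names come from this thread, the
# availability and sliver types are invented for the example.
SAMPLE_AD = """
<rspec xmlns="http://www.geni.net/resources/rspec/3" type="advertisement">
  <node component_id="urn:publicid:IDN+emulab.net+node+pc218" exclusive="true">
    <sliver_type name="raw-pc"/>
    <sliver_type name="emulab-xen"/>
    <available now="true"/>
  </node>
  <node component_id="urn:publicid:IDN+emulab.net+node+pc238" exclusive="true">
    <sliver_type name="emulab-xen"/>
    <available now="false"/>
  </node>
  <node component_id="urn:publicid:IDN+emulab.net+node+pc509" exclusive="true">
    <sliver_type name="raw-pc"/>
    <available now="true"/>
  </node>
</rspec>
"""

if __name__ == "__main__":
    for urn in available_nodes(SAMPLE_AD, sliver_type="emulab-xen"):
        print(urn)
```

Note that this only reports what the advertisement says at the time it was fetched; it does not tell you whether a node is already acting as a shared VM host.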

thanks,
- ezra
shared_vlan_utah_pg.xml
idms-pg-utah-sliverstatus-emulab-net.json

Ezra Kissel

Jun 9, 2014, 12:54:30 PM
to protoge...@googlegroups.com
I just got a notice from emulab stating that the two physical nodes
(pc218, pc238) I requested were idle for a long time. Is that the
problem, maybe, that it's giving me raw PCs instead of Xen VMs?

Leigh Stoller

Jun 9, 2014, 2:53:50 PM
to protoge...@googlegroups.com
> I just got a notice from emulab stating that the two physical nodes (pc218, pc238) I requested were idle for a long time. Is that the problem, maybe, that it's giving me raw PCs instead of Xen VMs?

You got VMs ... but you fixed them to specific nodes, so instead of VMs
on a shared node, you got dedicated VMs. If you leave the request unbound,
you will get shared nodes.
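
To make "bound" versus "unbound" concrete: the only difference in the request is whether the node carries a component_id. A minimal sketch, assuming an RSpec v3 request node with an emulab-xen sliver type; the helper name and the client_id values are hypothetical.

```python
import xml.etree.ElementTree as ET

RSPEC_NS = "http://www.geni.net/resources/rspec/3"
ET.register_namespace("", RSPEC_NS)


def request_node(client_id, component_id=None):
    """Build a request <node>; passing component_id makes it a bound request."""
    node = ET.Element(
        "{%s}node" % RSPEC_NS,
        {"client_id": client_id, "exclusive": "false"},
    )
    if component_id is not None:
        # Bound: pins the VM to one specific physical node.
        node.set("component_id", component_id)
    ET.SubElement(node, "{%s}sliver_type" % RSPEC_NS, {"name": "emulab-xen"})
    return node


# Unbound: the aggregate picks the host.
unbound = request_node("vm0")
# Bound: pc218 is one of the nodes mentioned in this thread.
bound = request_node("vm1", "urn:publicid:IDN+emulab.net+node+pc218")

if __name__ == "__main__":
    print(ET.tostring(unbound, encoding="unicode"))
    print(ET.tostring(bound, encoding="unicode"))
```

A real request would wrap these nodes in an `<rspec type="request">` element along with interfaces and links.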

Leigh





Leigh Stoller

Jun 9, 2014, 2:57:55 PM
to protoge...@googlegroups.com
> I look for "available now=true" resources in the PG Utah advertisement, find two nodes, and update my request rspec. When I try to createsliver, the status is either failed or unknown almost immediately, and the node resources are "notready". This also happens without my custom image and removing any additional properties for the sliver_type. Attached is the sliverstatus for an active slice. Could someone please take a look?

Hi Ezra. This sliver looks fine right now. Are you still seeing
sliver status with a failure in it?

Leigh





Ezra Kissel

Jun 9, 2014, 2:58:26 PM
to protoge...@googlegroups.com
On 06/09/2014 02:53 PM, Leigh Stoller wrote:
>> I just got a notice from emulab stating that the two physical nodes (pc218, pc238) I requested were idle for a long time. Is that the problem, maybe, that it's giving me raw PCs instead of Xen VMs?
>
> You got VMs ... but you fixed them to specific nodes, so instead of VMs
> on a shared node, you got dedicated VMs. If you leave the request unbound,
> you will get shared nodes.
>
> Leigh
>

Oh, so you get a dedicated physical node for your VM even with
exclusive=false? Just specifying a component_id guarantees that?

btw, I was finally able to get that rspec to work after a few more tries
(and fixing the image URLs). I was getting confused because the
behavior and status returned is different for the PG and IG AMs...

Nicholas Bastin

Jun 9, 2014, 3:00:55 PM
to protoge...@googlegroups.com
On Mon, Jun 9, 2014 at 2:58 PM, Ezra Kissel <ezki...@indiana.edu> wrote:
> Oh, so you get a dedicated physical node for your VM even with exclusive=false? Just specifying a component_id guarantees that?

There are other hints in the advertisement rspec which tell you whether a given node is *already* a VM host.  If it's not, then it will get converted to one and you will have the node exclusively.

I don't recall what the tags are off the top of my head, but the code in geni-lib (https://bitbucket.org/barnstorm/geni-lib) tries to handle this for you so you could poke around in there (or just use it.. :-)).

--
Nick

Leigh Stoller

Jun 9, 2014, 3:06:59 PM
to protoge...@googlegroups.com
> Oh, so you get a dedicated physical node for your VM even with
> exclusive=false? Just specifying a component_id guarantees that?

Exclusive is a hint, it says that you are willing to tolerate a shared
node.

If you specify a specific node, and that node is a shared node host, and it
has slots available, you will get it.
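
Putting this together with Nick's note earlier in the thread, the behavior for an exclusive=false request can be summarized as a small decision table. This is only a restatement of what has been said here, not the aggregate's actual code; the function and parameter names are hypothetical, and the case of a bound shared host with no free slots is not covered in the thread.

```python
def expected_placement(component_id=None, is_shared_host=False, has_free_slots=False):
    """Summarize where a VM requested with exclusive=false is expected to land.

    component_id   -- None for an unbound request, otherwise the requested node
    is_shared_host -- whether that node is already a shared VM host
    has_free_slots -- whether that shared host still has VM slots available
    """
    if component_id is None:
        # Unbound request: the aggregate places the VM on a shared node.
        return "VM on a shared node"
    if not is_shared_host:
        # Bound to a node that is not yet a VM host: it gets converted,
        # and the requester holds the whole node.
        return "VM on a node converted to a VM host, held exclusively"
    if has_free_slots:
        # Bound to an existing shared host with room.
        return "VM on the requested shared node"
    # Bound to a full shared host: not discussed in this thread.
    return "not covered in this thread"


if __name__ == "__main__":
    print(expected_placement())
    print(expected_placement("urn:publicid:IDN+emulab.net+node+pc218"))
```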

> btw, I was finally able to get that rspec to work after a few more tries
> (and fixing the image URLs). I was getting confused because the behavior
> and status returned is different for the PG and IG AMs...

In what sense? They should be the same, so if we have something broken, it
would be good to know.

Leigh






Ezra Kissel

Jun 9, 2014, 3:09:47 PM
to protoge...@googlegroups.com
On 6/9/2014 3:00 PM, Nicholas Bastin wrote:
> On Mon, Jun 9, 2014 at 2:58 PM, Ezra Kissel <ezki...@indiana.edu
> <mailto:ezki...@indiana.edu>> wrote:
>
> Oh, so you get a dedicated physical node for your VM even with
> exclusive=false? Just specifying a component_id guarantees that?
>
>
> There are other hints in the advertisement rspec which tell you whether
> a given node is *already* a VM host. If it's not, then it will get
> converted to one and you will have the node exclusively.
>
> I don't recall what the tags are off the top of my head, but the code in
> geni-lib (https://bitbucket.org/barnstorm/geni-lib) tries to handle this
> for you so you could poke around in there (or just use it.. :-)).
>
> --

Ok - I'll check that out. If I understand your comment, though, you
could get a different result for the same rspec depending on which
component you pick. If the node is already a VM host, you get a shared
Xen VM on that host. If it's not already shared, it turns into a VM
host but it's exclusively yours? I was expecting that specifying a
component_id would always get you a shared VM on that resource, and you
would have to ask explicitly if you want a dedicated node. Either way
is fine, just curious...

Ezra Kissel

Jun 9, 2014, 3:15:35 PM
to protoge...@googlegroups.com
On 6/9/2014 3:06 PM, Leigh Stoller wrote:
>> Oh, so you get a dedicated physical node for your VM even with
>> exclusive=false? Just specifying a component_id guarantees that?
>
> Exclusive is a hint, it says that you are willing to tolerate a shared
> node.
>
> If you specify a specific node, and that node is a shared node host, and it
> has slots available, you will get it.
>
>> btw, I was finally able to get that rspec to work after a few more tries
>> (and fixing the image URLs). I was getting confused because the behavior
>> and status returned is different for the PG and IG AMs...
>
> In what sense? They should be the same, so if we have something broken, it
> would be good to know.
>

Here's what I noticed:

* ig-bbn tells me that hostname exceeds 63 characters. pg-utah does not.
* ig-bbn tells me image url metadata cannot be read, pg-utah does not.

Status for pg-utah just said nodes were "notready" and gave me no
indication as to what was wrong. Overall status would either be
"unknown" or "failed" but no indication as to why that I could find.

When I fixed things up for the ig-bbn case, then I tried the updated
rspec at pg-utah and it worked after some time. Again, nodes were
"notready" instead of "changing" like at ig-bbn, but I probably miss
intermediate state changes as I poll for sliverstatus.

I guess the key thing would be to get those helpful error notifications
everywhere.
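
For anyone polling sliverstatus the same way, here is a rough sketch of pulling the per-resource state and any error text out of the result. It assumes the GENI AM API v2 SliverStatus layout (geni_status, geni_resources, geni_error); the function name and the sample values are illustrative, not taken from the attached file.

```python
def problem_resources(status):
    """Return (urn, state, error) for every resource that is not 'ready'."""
    problems = []
    for res in status.get("geni_resources", []):
        state = res.get("geni_status", "unknown")
        if state != "ready":
            problems.append((res.get("geni_urn"), state, res.get("geni_error") or ""))
    return problems


# Made-up example shaped like an AM API v2 SliverStatus return value.
SAMPLE_STATUS = {
    "geni_urn": "urn:publicid:IDN+ch.geni.net:idms+slice+testrspec",
    "geni_status": "failed",
    "geni_resources": [
        {
            "geni_urn": "urn:publicid:IDN+emulab.net+sliver+192605",
            "geni_status": "notready",
            "geni_error": "",
        },
    ],
}

if __name__ == "__main__":
    print("overall:", SAMPLE_STATUS.get("geni_status", "unknown"))
    for urn, state, error in problem_resources(SAMPLE_STATUS):
        print(urn, state, error or "(no error text)")
```

An empty geni_error next to a "notready" resource is exactly the situation described above: the state is visible, but the reason is not.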

Leigh Stoller

Jun 9, 2014, 3:53:02 PM
to protoge...@googlegroups.com
> * ig-bbn tells me that hostname exceeds 63 characters. pg-utah does not.

The limit applies to the fully qualified domain name, and utah-pg and
ig-bbn have different domains (the bbn domain is longer, so it eats up
more of the 63 chars you have). Pick shorter names?

> * ig-bbn tells me image url metadata cannot be read, pg-utah does not.

Details please.

> Status for pg-utah just said nodes were "notready" and gave me no
> indication as to what was wrong. Overall status would either be "unknown"
> or "failed" but no indication as to why that I could find.

Again, I see your latest slice to be up and going. So when this happens,
please send us details right away so we can look.

Leigh






Ezra Kissel

Jun 9, 2014, 4:24:32 PM
to protoge...@googlegroups.com
On 6/9/2014 3:52 PM, Leigh Stoller wrote:
>> * ig-bbn tells me that hostname exceeds 63 characters. pg-utah does not.
>
> It is the fully qualified domain name, and utah-pg and ig-bbn have
> different domain (bbn is a longer domain, so eating up more of the
> 63 chars you have). Pick shorter names?
>

Ok - so both AMs do error if the final hostname is > 63 characters, but I
forgot that underscores aren't allowed in hostnames, even though DNS
itself is fine with underscores.

From the emulab log on a VM:

Resetting hostname to
ibig_100_5.idms-bu-stitch.ch-geni-net.instageni.gpolab.bbn.com ...
hostname: the specified hostname is invalid
*** FAILED!

That might not be a big deal unless something on the host is depending
on the hostname being set. One of those self-inflicted annoyances.
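
A quick pre-flight check along these lines would have caught it. This is a sketch, not what the AM does: the 63-character limit on the full name is the one quoted in this thread, and the label rule is the usual letters-digits-hyphen hostname rule. The function name is made up.

```python
import re

MAX_FQDN_LEN = 63  # limit on the full name, as quoted in this thread
# A hostname label: letters, digits and hyphens, not starting or ending
# with a hyphen. Underscores are not allowed.
LABEL_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?")


def hostname_problems(fqdn):
    """Return a list of human-readable problems with a fully qualified hostname."""
    problems = []
    if len(fqdn) > MAX_FQDN_LEN:
        problems.append("length %d exceeds %d" % (len(fqdn), MAX_FQDN_LEN))
    for label in fqdn.split("."):
        if not LABEL_RE.fullmatch(label):
            problems.append("invalid label %r" % label)
    return problems


if __name__ == "__main__":
    name = "ibig_100_5.idms-bu-stitch.ch-geni-net.instageni.gpolab.bbn.com"
    print(len(name), hostname_problems(name))
```

The name from the log is 62 characters, so it passes the length check and fails only on the underscores in the first label.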


>> * ig-bbn tells me image url metadata cannot be read, pg-utah does not.
>
> Details please.
>

$ omni.py -f portal -r idms -a ig-bbn createsliver testrspec
ig_single_ibp_server.xml
...
16:11:17 WARNING omni: Failed CreateSliver for slice testrspec at
ig-bbn. Error from Aggregate: code 2. protogeni AM code: 2: Could not
setup images:
*** image_import:
Could not read metadata from
https://www.emulab.net/image_metadata.php?uuid=4324acc3-ea98-11e3-8053-001143e453fe
(PG log url - look here for details on any failures:
https://boss.instageni.gpolab.bbn.com/spewlogfile.php3?logfile=54f599f5f2cfd057a9bb26a52f023991).
16:11:17 INFO omni:
------------------------------------------------------
16:11:17 INFO omni: Completed createsliver:
Args: createsliver testrspec ig_single_ibp_server.xml

Result Summary: Failed CreateSliver for slice testrspec at ig-bbn.
Error from Aggregate: code 2. protogeni AM code: 2: Could not setup images:
*** image_import:
Could not read metadata from
https://www.emulab.net/image_metadata.php?uuid=4324acc3-ea98-11e3-8053-001143e453fe
(PG log url - look here for details on any failures:
https://boss.instageni.gpolab.bbn.com/spewlogfile.php3?logfile=54f599f5f2cfd057a9bb26a52f023991).

16:11:17 INFO omni:
======================================================

Compare to:

$ omni.py -f portal -r idms -a pg-utah createsliver testrspec
ig_single_ibp_server.xml
...
16:06:47 INFO omni: (PG log url - look here for details on any
failures:
https://www.emulab.net/spewlogfile.php3?logfile=8dc51275d2242a2981322f7590a2af89)
16:06:47 INFO omni: Got return from CreateSliver for slice testrspec
at utah-pg:
16:06:47 INFO omni: <!-- Reserved resources for:
Slice: testrspec
at AM:
URN: urn:publicid:IDN+emulab.net+authority+cm
URL: https://www.emulab.net:12369/protogeni/xmlrpc/am/2.0
-->
16:06:47 INFO omni: <rspec
xmlns="http://www.geni.net/resources/rspec/3"
xmlns:planetlab="http://www.planet-lab.org/resources/sfa/ext/planetlab/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
type="manifest"
xsi:schemaLocation="http://www.geni.net/resources/rspec/3
http://www.geni.net/resources/rspec/3/manifest.xsd"
expires="2014-06-10T01:06:03Z">

<node client_id="short_name"
component_id="urn:publicid:IDN+emulab.net+node+pc509"
component_manager_id="urn:publicid:IDN+emulab.net+authority+cm"
sliver_id="urn:publicid:IDN+emulab.net+sliver+192605" exclusive="false">
<emulab:routable_control_ip
xmlns:emulab="http://www.protogeni.net/resources/rspec/ext/emulab/1"/>
<sliver_type name="emulab-xen">
<!-- <disk_image
url="https://www.utahddc.geniracks.net/image_metadata.php?uuid=e2b09a1f-ef42-11e3-8689-000000000000"/>
-->
<disk_image
url="https://www.emulab.net/image_metadata.php?uuid=4324acc3-ea98-11e3-8053-001143e453fe"/>
</sliver_type>
<interface client_id="ibp0:if0">
<ip address="10.10.1.100" netmask="255.255.255.0"/>
</interface>
<rs:vnode
xmlns:rs="http://www.protogeni.net/resources/rspec/ext/emulab/1"
name="pcvm509-3"/><host
name="short_name.testrspec.ch-geni-net.emulab.net"/><services><login
authentication="ssh-keys" hostname="pcvm509-3.emulab.net" port="22"
username="kissel"/><login authentication="ssh-keys"
hostname="pcvm509-3.emulab.net" port="22"
username="ezkissel"/></services></node>
</rspec>
16:06:47 INFO omni:
------------------------------------------------------
16:06:47 INFO omni: Completed createsliver:
Args: createsliver testrspec ig_single_ibp_server.xml

Result Summary: Got Reserved resources RSpec from emulab-net
16:06:47 INFO omni:
======================================================

See attached sliverstatus. It just goes into "notready" and overall
sliver status "failed". That image URL in the test rspec refers to an
image I deleted.
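
Since a stale image URL is easy to leave behind in an rspec, one way to guard against it is to list every disk_image URL before calling createsliver and check each by hand. A minimal sketch, assuming RSpec v3 namespaces; the function name and the sample rspec are illustrative (the URLs are the ones shown above).

```python
import xml.etree.ElementTree as ET

RSPEC_NS = {"r": "http://www.geni.net/resources/rspec/3"}


def disk_image_urls(rspec_xml):
    """Return the url attribute of every <disk_image> in the rspec.

    Commented-out elements are skipped, because the default parser
    discards XML comments.
    """
    root = ET.fromstring(rspec_xml)
    urls = []
    for image in root.iterfind(".//r:disk_image", RSPEC_NS):
        url = image.get("url")
        if url:
            urls.append(url)
    return urls


# Illustrative request fragment modeled on the manifest in this message.
SAMPLE_RSPEC = """
<rspec xmlns="http://www.geni.net/resources/rspec/3" type="request">
  <node client_id="short_name" exclusive="false">
    <sliver_type name="emulab-xen">
      <!-- <disk_image url="https://www.utahddc.geniracks.net/image_metadata.php?uuid=e2b09a1f-ef42-11e3-8689-000000000000"/> -->
      <disk_image url="https://www.emulab.net/image_metadata.php?uuid=4324acc3-ea98-11e3-8053-001143e453fe"/>
    </sliver_type>
  </node>
</rspec>
"""

if __name__ == "__main__":
    for url in disk_image_urls(SAMPLE_RSPEC):
        print(url)
```

This only extracts the URLs. Whether a given metadata URL still resolves to a live image has to be confirmed separately, for example by opening it in a browser.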

>> Status for pg-utah just said nodes were "notready" and gave me no
>> indication as to what was wrong. Overall status would either be "unknown"
>> or "failed" but no indication as to why that I could find.
>
testrspec-sliverstatus-emulab-net.json

Leigh Stoller

Jun 9, 2014, 5:06:26 PM
to protoge...@googlegroups.com
> Ok - so both AMs do error if the final hostname is > 63 characters but I
> forgot that underscores aren't allowed in hostnames, although DNS is fine
> with underscores.
>
> From the emulab log on a VM:
>
> Resetting hostname to
> ibig_100_5.idms-bu-stitch.ch-geni-net.instageni.gpolab.bbn.com
> ... hostname: the specified hostname is invalid
> *** FAILED!

Hmm, we should be catching that underscore earlier; I will check to see
why we are not.
Well, it doesn't exist in the database, and I remember now that this is
already on my todo list: catching this error earlier.

Leigh





