Problems with image


john.ou...@gmail.com

Jun 27, 2024, 11:55:48 AM
to cloudlab-users
The experiment below failed to start because of problems with the image:

https://www.cloudlab.us/status.php?uuid=7837ddb1-349b-11ef-9f39-e4434b2381fc

The error message in the experiment says:

*** ERROR: os_setup: *** Image file for image homa-PG0/homa6138:13 on [Node: amd271] *** (c6525-100g) does not exist

This is curious, because I can see that the image does exist (I created it yesterday, so this is the first time it has been used to start an experiment). Also, the profile doesn't actually specify "homa6138:13", it specifies just "homa6138"; to add the ":13" suffix the system must have noticed the existence of that version?

Any suggestions? Is there something I'm doing wrong?

-John-

Leigh Stoller

Jun 27, 2024, 12:36:17 PM
to 'Nurlan Nazaraliyev' via cloudlab-users

> https://www.cloudlab.us/status.php?uuid=7837ddb1-349b-11ef-9f39-e4434b2381fc
>
> The error message in the experiment says:
>
> *** ERROR: os_setup: *** Image file for image homa-PG0/homa6138:13 on [Node: amd271] *** (c6525-100g) does not exist
>
> This is curious, because I can see that the image does exist (I created it yesterday, so this is the first time it has been used to start an experiment). Also, the profile doesn't actually specify "homa6138:13", it specifies just "homa6138"; to add the ":13" suffix the system must have noticed the existence of that version?

The :13 is normal; it is the internal version number, nothing to worry about.
The actual problem is that the image creation failed and left a broken image behind.
I'm surprised you did not see a failure message or email.

I deleted the broken image descriptor, so the system will now revert to version 12.
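
To illustrate the naming, here is a minimal Python sketch (not CloudLab's actual code) of how a reference such as homa-PG0/homa6138:13 splits into project, image name, and version, and how an unversioned name might resolve to the newest remaining version. The names ImageRef, parse_image_ref, and resolve_version are hypothetical, and the "highest remaining version wins" rule is an assumption based only on the behavior described in this thread.

```python
import re
from typing import Iterable, NamedTuple, Optional


class ImageRef(NamedTuple):
    project: str
    name: str
    version: Optional[int]  # None when the reference has no ":N" suffix


# Hypothetical pattern, inferred only from "homa-PG0/homa6138:13" in the error.
_IMAGE_RE = re.compile(
    r"^(?P<project>[^/:]+)/(?P<name>[^/:]+)(?::(?P<version>\d+))?$"
)


def parse_image_ref(text: str) -> ImageRef:
    """Split 'project/name[:version]' into its parts."""
    m = _IMAGE_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognized image reference: {text!r}")
    version = m.group("version")
    return ImageRef(
        m.group("project"),
        m.group("name"),
        int(version) if version is not None else None,
    )


def resolve_version(ref: ImageRef, available: Iterable[int]) -> int:
    """Assumed rule: an explicit version must exist; otherwise use the newest."""
    versions = sorted(available)
    if not versions:
        raise LookupError(f"no versions of {ref.project}/{ref.name} exist")
    if ref.version is None:
        return versions[-1]
    if ref.version not in versions:
        raise LookupError(f"version {ref.version} does not exist")
    return ref.version
```

Under these assumptions, a profile that names just homa-PG0/homa6138 resolves to 13 while versions 12 and 13 both exist, and to 12 once version 13 has been deleted.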

Leigh

john.ou...@gmail.com

Jun 27, 2024, 12:46:16 PM
to cloudlab-users
Aha, thanks. I did see an error message when creating the image, which was this:

About to: '/usr/testbed/bin/sshtb -n -o ConnectTimeout=5 -host hp160 /usr/local/bin/imagezip -i -F 0 -s 0 /dev/ada0'
About to: '/usr/testbed/bin/sshtb -n -o ConnectTimeout=5 -host hp160 df -k /dev/ada0'
reboot (hp160): Attempting to reboot ...
Emulab scheduling tbprepare to run via systemd.
reboot (hp160): Successful!
reboot: Done. There were 0 failures.
   hp160
Waiting up to 360 seconds for nodes to come up.
*** create_image:
    Failed to boot MFS on: hp160
Restored startup cmd from virt_nodes for hp160
reboot (hp160): Attempting to reboot ...
*** node_reboot: hp160 appears dead; will power cycle.
hp160 now rebooting
reboot: Done. There were 0 failures.
   hp160
FAILED: Node setup failed ...

I (mis)interpreted this as meaning there was a problem with the node rebooting after image creation, not a problem actually creating the image. Any reason why a poisonous image is left around after image creation failures?
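
For anyone else reading such a log, here is a rough Python sketch of how one might pick out which tool reported each failure. It assumes, based only on the lines quoted in this thread, that errors are marked with "*** tool:" and that the message may continue on the next line. The function names and the sample excerpt are illustrative and not part of any CloudLab tool.

```python
import re
from typing import List, Tuple

# Assumed marker format, taken from the log lines quoted above:
#   "*** create_image:" / "*** node_reboot: ..." / "*** ERROR: os_setup: ..."
_MARKER_RE = re.compile(r"^\*\*\* (?:ERROR: )?(\w+):\s*(.*)$")

SAMPLE_LOG = """\
reboot (hp160): Successful!
Waiting up to 360 seconds for nodes to come up.
*** create_image:
    Failed to boot MFS on: hp160
reboot (hp160): Attempting to reboot ...
*** node_reboot: hp160 appears dead; will power cycle.
FAILED: Node setup failed ...
"""


def find_tool_errors(log_text: str) -> List[Tuple[str, str]]:
    """Return (tool, message) pairs for each '*** tool:' marker in the log."""
    errors = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        m = _MARKER_RE.match(line)
        if m is None:
            continue
        tool, message = m.group(1), m.group(2).strip()
        # When the marker line is bare, take the message from the next line.
        if not message and i + 1 < len(lines):
            message = lines[i + 1].strip()
        errors.append((tool, message))
    return errors


def image_creation_failed(log_text: str) -> bool:
    """True if any error in the log was reported by create_image."""
    return any(tool == "create_image" for tool, _ in find_tool_errors(log_text))


if __name__ == "__main__":
    for tool, message in find_tool_errors(SAMPLE_LOG):
        print(f"{tool}: {message}")
```

Read this way, the first error in the excerpt comes from create_image itself, before the node_reboot message, which suggests that image creation failed and not just the later reboot.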

-John-

Leigh Stoller

Jun 27, 2024, 1:00:37 PM
to cloudla...@googlegroups.com

> I (mis)interpreted this as meaning there was a problem with the node rebooting after image creation, not a problem actually creating the image. Any reason why a poisonous image is left around after image creation failures?

If you want to start up a new experiment and give it another try, Mike
might be able to diagnose the image creation problem if it happens again.
If it does fail, leave the experiment running.

Thanks
Leigh

