How to extend beyond 16 hours when reserved for more

Ertza Warraich

Jun 9, 2024, 10:56:12 PM
to cloudlab-users
Hi, 

I wanted to know how I can extend my experiment beyond the 16 hours when the reservation was already approved for 3 days. This is my experiment link: https://www.cloudlab.us/status.php?uuid=95973664-26cf-11ef-9f39-e4434b2381fc

Also, I received an email saying that my reservation had not been used and would be rescinded. However, I was already using the reservation at that time and had been using it for a few hours. I had deleted my experiment and was creating it again when I received the email.

Thank you in advance,
Ertza

Leigh Stoller

Jun 10, 2024, 8:55:21 AM
to 'Nurlan Nazaraliyev' via cloudlab-users

> I wanted to know how I can extend my experiment beyond the 16 hours when the reservation was already approved for 3 days. This is my experiment link: https://www.cloudlab.us/status.php?uuid=95973664-26cf-11ef-9f39-e4434b2381fc

Hi. Click on the Extend button on the experiment status page. Within the first
two weeks of an experiment, extensions are generally auto approved.

> Also, I received an email saying that my reservation had not been used and would be rescinded. However, I was already using the reservation at that time and had been using it for a few hours. I had deleted my experiment and was creating it again when I received the email.

The experiment history shows you started an experiment at 11:24am (Mountain time)
and terminated it five minutes later. That is the only experiment I see in your history.
That was too short to pass the automated checks that ensure a reservation is not
being wasted.

Leigh

Ertza Warraich

Jun 10, 2024, 1:24:10 PM
to cloudla...@googlegroups.com
I think there might be some issue in how I create/start the experiment, because:

1. I cannot extend it; my request is automatically denied with a message stating that the resources are reserved for other projects and cannot be assigned to the experiment.
2. I was running and using the cluster for at least 7-8 hours continuously, but if that doesn't show in the logs, then maybe the experiment is not "connected" to my reservation or something?

Both these issues seem correlated to me. Also, I was only able to get 10 nodes despite multiple tries; two random nodes would fail to boot and stay stuck in the 'changing' state. Could that be a reason my reservation is not counted as being used, since I am not using ALL of the reserved nodes?

In any case, I am still not sure why the logs do not show me using the resources for hours. For the experiment setup, I simply create a new experiment profile and use it to request the resources, without mentioning any reservation ID in the geni script. Is that correct, or is there somewhere I need to connect the two?

Thank you again. 

--
You received this message because you are subscribed to a topic in the Google Groups "cloudlab-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cloudlab-users/CxSchHN-VOk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to cloudlab-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloudlab-users/32FA4DDE-E287-4B67-ACCF-6E9D6674F9B8%40gmail.com.

Leigh Stoller

Jun 10, 2024, 2:07:58 PM
to 'Nurlan Nazaraliyev' via cloudlab-users

> 1. I cannot extend it; my request is automatically denied with a message stating that the resources are reserved for other projects and cannot be assigned to the experiment.
> 2. I was running and using the cluster for at least 7-8 hours continuously, but if that doesn't show in the logs, then maybe the experiment is not "connected" to my reservation or something?
>
> Both these issues seem correlated to me. Also, I was only able to get 10 nodes despite multiple tries; two random nodes would fail to boot and stay stuck in the 'changing' state. Could that be a reason my reservation is not counted as being used, since I am not using ALL of the reserved nodes?

Hi. Let me clarify. Your reservation was scheduled to start June 9, 10am
(Mountain). You ran a 5 minute experiment on June 9 at 10:24. Your
reservation was canceled. Your next experiment was June 10 at 10:22am.
By then you had lost your reservation and some of those nodes were
likely picked up by other users.

In other words, make sure you schedule the start of your reservation for
a time when you can actually use the nodes. The clusters are all very busy,
so the system will take back underutilized resources.

Leigh


Ertza Warraich

Jun 10, 2024, 2:37:55 PM
to cloudla...@googlegroups.com
Then again, I am saying I ran an 8 hour experiment on the reservation on June 9th, utilizing 10 nodes, starting at around 11 AM (Mountain Time) and I was actively using it.

I did all my installations and setups in the VMs and I was actively using it for 7-8 hours. I don't know why it does not show up in your logs, and that is why I asked if I am missing something in my experiment setup.

Maybe my last question wasn't clear, but I was actively using the cluster when I received the email that stated I am not using my reservation, and I also was not able to extend my reservation. That tells me there was some issue in the experiment not being "linked" to the reservation or something.

And again, all of this trying to extend it time and again and working on the experiments was happening on June 9th, starting around 11 AM and onwards (Mountain Time).

Leigh Stoller

Jun 10, 2024, 2:49:12 PM
to 'Nurlan Nazaraliyev' via cloudlab-users

> Then again, I am saying I ran an 8 hour experiment on the reservation on June 9th, utilizing 10 nodes, starting at around 11 AM (Mountain Time) and I was actively using it.
>
> I did all my installations and setups in the VMs and I was actively using it for 7-8 hours. I don't know why it does not show up in your logs, and that is why I asked if I am missing something in my experiment setup.
>
> Maybe my last question wasn't clear, but I was actively using the cluster when I received the email that stated I am not using my reservation, and I also was not able to extend my reservation. That tells me there was some issue in the experiment not being "linked" to the reservation or something.

Ah, I see. A small detail I was not aware of: the experiment failed and
you clicked on the "Ignore Failure" button. There appears to be
a bug there, in how the experiment is recorded, which also affects
the reservation utilization code.

I will make a note to track that down.

Thanks!
Leigh


Ertza Warraich

Jun 11, 2024, 12:38:07 AM
to cloudla...@googlegroups.com
Ah okay, so I should just leave the "Ignore" button alone. I thought I needed to use the Ignore button to set the state and move forward with the experiment using however many nodes did come up successfully, but I will leave it be from now on.

Thanks for the help on this.  

Leigh Stoller

Jun 11, 2024, 8:22:15 AM
to 'Nurlan Nazaraliyev' via cloudlab-users

> Ah okay, so I should just leave the "Ignore" button alone. I thought I needed to use the Ignore button to set the state and move forward with the experiment using however many nodes did come up successfully, but I will leave it be from now on.

Hi. Sorry, that is not what I meant. I just didn’t know that you had done
that, so I was looking in the wrong place. I have since changed the
My History link to include “failed” experiments so that it is more obvious.

But always good to provide as much detail as you can when reporting
a problem. :-)

Most often, a node can be brought back to life with a power cycle. But in
general, you would use the Ignore Errors button if you have a larger experiment
and you can tolerate one or two failed nodes, since restarting a large
experiment can take a long time, and there might not be other nodes available
anyway.

Regarding the canceled reservation yesterday, I have a fix for that which I am
testing; I expect to install it today.

Leigh
