vm_slots in cloud_resources.conf

miccloud

May 7, 2010, 8:24:43 AM

to cloudscheduler

I have set vm_slots: 3 in cloud_resources.conf, but whenever I submit a
job file like the one below, cloudscheduler launches a new AMI instance.

Universe = vanilla
Executable = /bin/hostname
Arguments = -f
Log = x.log
Output = x.out
Error = x.error
priority = 10
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

+VMAMI = "ami-fdee0094"

Queue

Why? Shouldn't cloudscheduler stop launching instances once the number
of machines equals vm_slots?
If I submit file.job four times, it sets up a new machine every time.

And when does cloudscheduler shut down machines that have been idle
for a long time?

Michael Paterson

May 7, 2010, 12:22:20 PM

to cloudsc...@googlegroups.com

miccloud wrote:
> I have set vm_slots: 3 in cloud_resources.conf. But if I launch a
> file.job like this cloudscheduler always launch a new ami.
>
> Universe = vanilla
> Executable = /bin/hostname
> Arguments = -f
> Log = x.log
> Output = x.out
> Error = x.error
> priority = 10
> should_transfer_files = YES
> when_to_transfer_output = ON_EXIT
>
> +VMAMI = "ami-fdee0094"
>
> Queue
>
> Why? Does cloudscheduler stop to launch ami when the machine are
> equals to vm_slots quantity?
> If I launch four time file.job it always setup a new machine.
>
>

This appears to be a bug in the EC2 code. Normally, if you specify 3 VM
slots for your cluster, it will boot a maximum of 3 VMs (at least under
Nimbus; Patrick would know more about the EC2 implementation and whether
this was intended).
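
To make the intended behaviour concrete, here is a minimal Python sketch of a vm_slots cap. It is an illustration only, not the cloud-scheduler source; the function name vms_to_boot and its parameters are made up for this example.

```python
def vms_to_boot(pending_jobs: int, running_vms: int, vm_slots: int) -> int:
    """Return how many new VMs may be booted without exceeding vm_slots.

    Hypothetical helper: all names here are assumptions for illustration.
    """
    free_slots = max(0, vm_slots - running_vms)
    return min(pending_jobs, free_slots)


# Submit a one-job file four times against a cluster with vm_slots = 3.
running = 0
for _ in range(4):
    running += vms_to_boot(pending_jobs=1, running_vms=running, vm_slots=3)

print(running)  # prints 3: the fourth submission boots nothing
```

With a cap like this, the fourth job would wait for one of the three existing VMs instead of triggering a fourth boot.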

> And when cloudscheduler will be able to shut down the machine that are
> idle for a long time?
>

Currently, cloudscheduler shuts down idle machines when the number of
jobs in the queue is less than the number of machines (or when other
users are requesting different vmtypes).
Using it as an overflow, handling VMs created when your normal condor
pool is full, will probably require some tweaking to account for that.
cloudscheduler has not been tested under that setup, as far as I know.
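
As a rough sketch of the first rule (fewer queued jobs than machines), assuming the scheduler retires only the surplus idle VMs. The VM class, the vms_to_retire name, and the "retire only the surplus" policy are assumptions for illustration; the vmtype rule is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    vm_id: str   # hypothetical identifier, e.g. an EC2 instance id
    idle: bool   # True if no job is currently running on this VM


def vms_to_retire(vms: List[VM], queued_jobs: int) -> List[VM]:
    """Pick idle VMs to shut down when there are more machines than queued jobs."""
    surplus = len(vms) - queued_jobs
    if surplus <= 0:
        return []                      # enough work for every machine
    idle = [vm for vm in vms if vm.idle]
    return idle[:surplus]              # never retire a busy VM


pool = [VM("i-aaaa0001", True), VM("i-aaaa0002", False), VM("i-aaaa0003", True)]
print([vm.vm_id for vm in vms_to_retire(pool, queued_jobs=2)])  # ['i-aaaa0001']
```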

Patrick Armstrong

May 7, 2010, 1:05:54 PM

to cloudsc...@googlegroups.com

On 7-May-10, at 9:22 AM, Michael Paterson wrote:
> This appears to be a bug with the EC2 code, Normally if you specify
> 3 vm slots in your cluster it will boot a maximum of 3 VMs. (at
> least under Nimbus, Patrick would know more about the EC2
> implementation if this was intended or not)

This was a bug, yes. It's been fixed in the dev branch on github in
this commit: http://github.com/hep-gc/cloud-scheduler/commit/98461024dd4e7304db19756e82d8486255db1a3d

>> And when cloudscheduler will be able to shut down the machine that
>> are idle for a long time?
>>
> Currently, cloudscheduler shuts down idle machines when the number
> of jobs in the queue is less than the number of machines (or when
> other users are requesting different vmtypes)

Yes. Michele, how do you know the machines aren't being shut down? Are
you still seeing them in the output of condor_status? If you are,
check the output of cloud_status -m, and make sure they match. If you
don't have your init script set up correctly, you won't see the
machine disappear from condor_status until CLASSAD_LIFETIME expires,
which iirc defaults to 15 minutes.
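
A small Python sketch of why a VM that has already been shut down can linger in condor_status: its last ClassAd stays listed until the lifetime runs out. The 900-second value matches the 15 minutes recalled above; the still_listed function is a made-up name, and the exact expiry comparison the collector uses is an assumption.

```python
from datetime import datetime, timedelta

# 15 minutes, the default recalled above (assumed unchanged in your config).
CLASSAD_LIFETIME = timedelta(seconds=900)


def still_listed(last_ad_update: datetime, now: datetime,
                 lifetime: timedelta = CLASSAD_LIFETIME) -> bool:
    """True while a machine's last ad is younger than the ClassAd lifetime."""
    return now - last_ad_update < lifetime


last_update = datetime(2010, 5, 7, 11, 0, 0)   # VM sent its final ad, then died
print(still_listed(last_update, last_update + timedelta(minutes=10)))  # True
print(still_listed(last_update, last_update + timedelta(minutes=16)))  # False
```

So a machine that appears in condor_status but not in cloud_status -m may simply be a stale ad that has not expired yet.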

> In order to function as an overflow and handling VMs created when
> your normal condor pool is full this will probably require some
> tweaking to account for that, cloudscheduler has not been tested
> under that setup that I know of.

That's right, it's a use case we haven't really considered yet.

--patrick

miccloud

May 11, 2010, 5:09:17 AM

to cloudscheduler

Hi,
The vm_slots problem is solved.

But I now have a problem when I submit a job like this:
Universe = vanilla
Executable = /usr/local/bin/om_model
Arguments= -r model_request.xml
Log = log/om_model.log.$(PROCESS)
transfer_input_files = http://site.com/image.png
should_transfer_files=YES
when_to_transfer_output = ON_EXIT
Output = output/modello.xml
Error = log/om_model.err.log.$(PROCESS)
Requirements = VMType =?= "om.vm.type"
+VMAMI = "ami-65ee070"
Queue

When I run condor_q -better-analyze, I get this output:
condor_q -better-analyze

-- Submitter: hermes.pin.unifi.it : <150.217.48.161:8080> : hermes.pin.unifi.it
---
022.000: Run analysis summary. Of 1 machines,
1 are rejected by your job's requirements
0 reject your job because of their own requirements
0 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 match but are currently offline
0 are available to run your job
No successful match recorded.
Last failed match: Tue May 11 11:01:17 2010
Reason for last match failure: no match found

WARNING: Be advised:
No resources matched request's constraints

The Requirements expression for your job is:

( target.VMType is "om.vm.type" ) && ( target.Arch == "INTEL" ) &&
( target.OpSys == "LINUX" ) && ( target.Disk >= DiskUsage ) &&
( ( ( target.Memory * 1024 ) >= ImageSize ) &&
( ( RequestMemory * 1024 ) >= ImageSize ) ) &&
( target.HasFileTransfer )

    Condition                                   Machines Matched   Suggestion
    ---------                                   ----------------   ----------
1   ( target.VMType is "om.vm.type" )           0                  MODIFY TO undefined
2   ( ( ( 1024 * target.Memory ) >= 75 ) && ( ( 1024 * ceiling(ifThenElse(JobVMMemory isnt undefined,JobVMMemory,7.324218750000000E-02)) ) >= 75 ) )
                                                0                  REMOVE
3   ( target.Arch == "INTEL" )                  1
4   ( target.OpSys == "LINUX" )                 1
5   ( target.Disk >= 75 )                       1
6   ( target.HasFileTransfer )                  1

If I remove the Requirements line, the job runs successfully.
In the worker node's condor_config.local file I have set
VMType="om.vm.type".
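
For context, the =?= operator (printed as "is" in the analysis) is true only when both sides have the same type and value, so a machine ad that does not advertise VMType at all fails the test. The Python sketch below emulates that behaviour; it is not Condor code, and UNDEFINED, meta_equals and job_requirement are names invented for the example. One possible cause, not confirmed in this thread, is that a value set in condor_config.local is normally published in the machine ClassAd only if it is also listed in STARTD_ATTRS.

```python
UNDEFINED = object()  # stand-in for the ClassAd UNDEFINED value


def meta_equals(left, right) -> bool:
    """Emulate ClassAd `=?=` / `is`: true only for identical type and value."""
    if left is UNDEFINED or right is UNDEFINED:
        return left is right           # UNDEFINED matches only UNDEFINED
    return type(left) is type(right) and left == right  # strings: case-sensitive


def job_requirement(machine_ad: dict) -> bool:
    """The VMType clause of the job's Requirements, applied to one machine ad."""
    return meta_equals(machine_ad.get("VMType", UNDEFINED), "om.vm.type")


ad_without = {"Arch": "INTEL", "OpSys": "LINUX"}    # VMType not advertised
ad_with = dict(ad_without, VMType="om.vm.type")     # VMType advertised

print(job_requirement(ad_without))  # False: the machine is rejected
print(job_requirement(ad_with))     # True
```

Looking for a VMType line in the output of condor_status -long for the worker node would show which of the two cases applies.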

If I run condor_status and cloud_status -m, I see the node that
CloudScheduler started.

This is the output of cloud_status -m:
VM ID        VM TYPE      STATUS    CLUSTER
i-d90150b2   om.vm.type   Running   aws.amazon.com

So the machines are not being shut down.

What can I do?

Thanks.

On May 7, 7:05 pm, Patrick Armstrong <patri...@uvic.ca> wrote:
> On 7-May-10, at 9:22 AM, Michael Paterson wrote:
>
> > This appears to be a bug with the EC2 code, Normally if you specify
> > 3 vm slots in your cluster it will boot a maximum of 3 VMs. (at
> > least under Nimbus, Patrick would know more about the EC2
> > implementation if this was intended or not)
>
> This was a bug, yes. It's been fixed in the dev branch on github in

> this commit:http://github.com/hep-gc/cloud-scheduler/commit/98461024dd4e7304db197...
