condor_status question about the difference between owner and claimed


Dimitris Kaseridis

Oct 31, 2009, 8:44:37 PM
to archer-us...@googlegroups.com
Hi,

I have been running some simulations in Archer while monitoring
condor_status.

Today I was running 30 jobs. My scripts are set up to run the jobs on
our local pool if possible.
condor_q for our pool alone gives:

              Total  Owner  Claimed  Unclaimed  Matched  Preempting  Backfill

 INTEL/LINUX      3      1        0          2        0           0         0
X86_64/LINUX     88     10       21         57        0           0         0

       Total     91     11       21         59        0           0         0

10 of them are listed as Owner and 21 as Claimed.
So what is the difference? I haven't done anything drastically
different between the 30 jobs apart from some parameters and different
workloads.

Thanks,
Dimitris

Nathan Blythe

Nov 1, 2009, 3:40:04 PM
to archer-us...@googlegroups.com
Dimitris,

(None of the official Archer folks have jumped in so I'll answer; I'm
sure I'll be corrected if I'm wrong!)

When you said "condor_q", did you mean "condor_status"? The output
you showed is part of the output of "condor_status", which shows the
statuses of the hosts in (in this case) your local pool.

Owner means that the host is busy doing its own thing - either the
host has local processes with CPU utilization over some threshold or
the keyboard/mouse are in use. Claimed means that some job is
executing on the host (either yours or someone else's).
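
As a rough illustration of how the state columns relate, here is a minimal Python sketch that parses the summary rows quoted in the first message and checks that the per-state counts add up to each row's Total. The function names (parse_summary, is_consistent) are made up for this example and are not part of Condor.

```python
# Column names as printed in the condor_status summary quoted earlier in the thread.
STATES = ["Owner", "Claimed", "Unclaimed", "Matched", "Preempting", "Backfill"]

# Summary rows copied from the first message.
SUMMARY = """\
INTEL/LINUX 3 1 0 2 0 0 0
X86_64/LINUX 88 10 21 57 0 0 0
Total 91 11 21 59 0 0 0
"""


def parse_summary(text):
    """Return {row label: {"Total": n, "Owner": n, ...}} for each summary row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(STATES) + 2:
            continue  # skip blank or unrelated lines
        label, total, *counts = fields
        row = {"Total": int(total)}
        row.update(zip(STATES, map(int, counts)))
        rows[label] = row
    return rows


def is_consistent(row):
    """True when the per-state counts add up to the row's Total."""
    return row["Total"] == sum(row[s] for s in STATES)


rows = parse_summary(SUMMARY)
for label, row in rows.items():
    print(label, "Owner:", row["Owner"], "Claimed:", row["Claimed"],
          "consistent:", is_consistent(row))
```

Each slot is counted in exactly one state, so a slot shown as Owner is not also counted as Claimed.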

"condor_q" will show the statuses of the jobs you have submitted.

Hope that helps,
Nathan

Dimitris Kaseridis

Nov 1, 2009, 9:44:00 PM
to archer-us...@googlegroups.com
Hi again,

Thanks, Nathan, for the clarification. I had read this on the Condor
website as well, but the interesting part is this: since I am sometimes
the only one running jobs in the whole of Archer, I noticed that while
I have 30 jobs running normally (according to condor_q),
condor_status reports 20 slots as 'claimed' (normal run) and 10 slots
as 'owner'.

Since I was the only one running anything in the whole of Archer at
that point, and condor_q reported 30 jobs running, I assumed that all
the slots I used should be 'claimed'. So my question was why 10 of them
show up as 'owner' in condor_status.



--
Dimitris

Nathan Blythe

Nov 1, 2009, 10:48:20 PM
to archer-us...@googlegroups.com
Dimitris,

Are you sure that the hosts marked "owner" are running your jobs? You
can check where your jobs are running by executing "condor_q -run".

I was under the impression that a host marked "owner" won't accept a
job until it is "unclaimed" - someone else can confirm or deny that
I'm sure.
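
One way to do that cross-check, sketched in Python under some assumptions: the host names come from the HOST(S) column of "condor_q -run", and the list of Owner slots comes from condor_status (for instance, I believe a constraint such as State == "Owner" can be passed to it). The helper names and the sample data below are made up for illustration.

```python
import re


def host_of(slot_name):
    """Strip a leading 'slotN@' (or older 'vmN@') prefix, leaving the machine name."""
    return re.sub(r"^(?:slot|vm)\d+@", "", slot_name.strip()).lower()


def jobs_on_owner_hosts(job_hosts, owner_slots):
    """Return {job id: host} for jobs whose host also appears in the Owner list.

    job_hosts   -- {job id: host string}, e.g. copied from `condor_q -run`
    owner_slots -- iterable of slot names that condor_status reports as Owner
    """
    owner_hosts = {host_of(s) for s in owner_slots}
    return {job: h for job, h in job_hosts.items() if host_of(h) in owner_hosts}


# Made-up sample data, for illustration only.
job_hosts = {"12.0": "slot1@node01.example.org", "12.1": "node02.example.org"}
owner_slots = ["slot1@node02.example.org", "slot2@node03.example.org"]

print(jobs_on_owner_hosts(job_hosts, owner_slots))
```

This matches on machine name only, so on a machine with several slots it may flag a job whose own slot is Claimed while a sibling slot is Owner.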

- Nathan

Dimitris Kaseridis

Nov 2, 2009, 2:32:08 PM
to archer-us...@googlegroups.com
Well, I checked at the time and that seemed to be the case.

I can't check now since my jobs have finished, and I am not sure I can
replicate it. If it happens again I will let the list know.
I was just curious because I thought I had understood the different
modes for the VMs.

Thanks
Dimitris

David Isaac Wolinsky

Nov 2, 2009, 2:40:55 PM
to archer-us...@googlegroups.com
Pretty much all that has been said so far is accurate. Perhaps I can
give some context on why it is happening. OWNER can be based upon CPU
utilization. Sometimes funny things happen in the VMs and cause a spike
in CPU usage. I don't know if it is physical or virtual CPU usage. This
causes the appliances to go OWNER / Idle. I wouldn't be alarmed about
it. Though if your jobs are being reset, we can look into this.

Regards,
David

rjo...@gmail.com

Nov 2, 2009, 2:44:24 PM
to Archer User's Group
Dimitris,

Generally a machine is in "Owner" state if there's a user interacting
with it (keyboard, mouse) or if its load is above a threshold (defined
in the Condor config file).

If this was happening on cluster nodes, most likely the machine was
"Owner" because of load exceeding the threshold. This could be due to
programs running on the host, outside the VM; it might also have to do
with how load is measured within the VM - not sure. We should keep an
eye on what's causing this.
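
The two conditions above can be sketched as a small Python function. This is only an illustration of the idea, not Condor's actual policy expression; the function name and the default thresholds (15 minutes of keyboard idle time, a load of 0.3) are placeholder assumptions, and the real values are whatever the pool's Condor config file defines.

```python
def wants_owner_state(keyboard_idle_s, non_condor_load,
                      start_idle_s=15 * 60, background_load=0.3):
    """Return True if the machine should be treated as in use by its owner.

    keyboard_idle_s -- seconds since the last keyboard/mouse activity
    non_condor_load -- load average not caused by Condor jobs
    start_idle_s    -- assumed idle time required before jobs may start
    background_load -- assumed load threshold above which the host counts as busy
    """
    user_active = keyboard_idle_s < start_idle_s
    host_busy = non_condor_load > background_load
    return user_active or host_busy


# A headless cluster node: nobody at the keyboard, but load above the threshold.
print(wants_owner_state(keyboard_idle_s=86400, non_condor_load=0.9))
```

On a cluster node there is no keyboard activity, so under this sketch only the load term can push the machine into Owner.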

I am not sure how we are configured right now, but I think the system
is configured such that when a machine goes to Owner mode your job is
suspended for a period of time; if the machine goes out of Owner
state, your job resumes on the same host. This might be what happened
to your jobs. If it stays on Owner beyond a time threshold, I think
Condor will stop your job and restart it on another host.
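
A minimal Python sketch of the behavior described in the paragraph above, assuming it works as I think it does. The function name and the 10-minute limit are made up for illustration; the actual limit, if any, comes from the pool's Condor configuration.

```python
def action_for_job(owner_seconds, max_suspend_s=10 * 60):
    """Decide what happens to a job given how long its host has wanted Owner state.

    owner_seconds -- seconds the host has continuously been busy with its own work
    max_suspend_s -- assumed time threshold before the job is moved elsewhere
    """
    if owner_seconds <= 0:
        return "run"      # host is not in Owner mode, job keeps running
    if owner_seconds <= max_suspend_s:
        return "suspend"  # short spike: job waits and later resumes on the same host
    return "vacate"       # too long: job is stopped and restarted on another host


for seconds in (0, 120, 3600):
    print(seconds, action_for_job(seconds))
```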

--rf

