[Rocks-Discuss] SGE: Pending Jobs (qw)


Concha, Monica

Sep 20, 2011, 11:42:16 AM
to npaci-rocks...@sdsc.edu
Dear All,
I'm new to SGE. When I submit a job, it always goes to "Pending Jobs":
[mconcha@franklin jobs]$ qstat -explain E
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
al...@compute-0-16.local BIP 0/0/8 0.00 lx26-amd64 s
---------------------------------------------------------------------------------
al...@compute-0-17.local BIP 0/0/12 0.07 lx26-amd64 s
---------------------------------------------------------------------------------
al...@compute-0-18.local BIP 0/0/24 0.10 lx26-amd64 s
---------------------------------------------------------------------------------
al...@compute-0-19.local BIP 0/0/24 -NA- lx26-amd64 aus
---------------------------------------------------------------------------------
m...@compute-0-12.local BIP 0/0/8 0.00 lx26-amd64
---------------------------------------------------------------------------------
m...@compute-0-7.local BIP 0/0/8 0.00 lx26-amd64
---------------------------------------------------------------------------------
al...@compute-0-12.local BIP 0/0/8 0.00 lx26-amd64 s
---------------------------------------------------------------------------------
al...@compute-0-7.local BIP 0/0/8 0.00 lx26-amd64 s

##################################################################
- PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
#################################################################
36601 0.56000 Sleeper mconcha qw 09/20/2011 10:14:21 1

qstat -explain c -j 36601
script_file: sleeper.sh
scheduling info: queue instance "al...@compute-0-7.local" dropped because it is temporarily not available
queue instance "al...@compute-0-12.local" dropped because it is temporarily not available
queue instance "al...@compute-0-17.local" dropped because it is temporarily not available
queue instance "al...@compute-0-19.local" dropped because it is temporarily not available
queue instance "al...@compute-0-16.local" dropped because it is temporarily not available
queue instance "al...@compute-0-18.local" dropped because it is temporarily not available
cannot run in queue "mm" because it is not contained in its hard queue list (-q)

I know everything is a mess in my cluster!
Do you think I need to reinstall SGE?
I'm using:
SGE Version: 6.2u4
CentOS 5.4

Rocks 5.3 (Rolled Tacos)


Thank you very much for your help
Monica

Monica C. Concha
Physical Science Technician
Cotton Structure and Quality Research
Southern Regional Research Center
1100 Robert E. Lee Blvd.
New Orleans, LA 70124
monica...@ars.usda.gov
phone: 504-286-4252

Rayson Ho

Sep 20, 2011, 1:06:11 PM
to Discussion of Rocks Clusters
A lot of your queue instances (nodes) are in "suspended" state:


http://wikis.sun.com/display/GridEngine/Monitoring+and+Controlling+Queues#MonitoringandControllingQueues-ClusterQueueStatus

Did you or someone manually use qmod to suspend them??

You can use qmod -usq to unsuspend them:

http://gridscheduler.sourceforge.net/htmlman/htmlman1/qmod.html


Rayson

=================================
Grid Engine / Open Grid Scheduler
http://gridscheduler.sourceforge.net
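
For reference, the suggestion above could be scripted roughly as follows. This is only a sketch: the helper name suspended_queues is made up for illustration, and it assumes qstat -f prints the six-column layout shown in the original post, with a lowercase "s" in the states column marking a suspended queue instance. It only lists the affected instances; the actual qmod -usq call is left as a commented usage line.

```shell
# suspended_queues (hypothetical helper name): read `qstat -f` style output
# on stdin and print the name of every queue instance whose states column
# contains a lowercase "s" (suspended).
# Assumed layout: queuename qtype resv/used/tot. load_avg arch states
suspended_queues() {
    # Queue instance lines with a non-empty states column have exactly
    # 6 fields; the qtype check skips the header line and pending-job rows.
    awk 'NF == 6 && $2 ~ /^[BICPN]+$/ && $6 ~ /s/ { print $1 }'
}

# Possible usage on a live cluster (assumes qstat and qmod are on PATH):
#   qstat -f | suspended_queues | while read -r qi; do
#       qmod -usq "$qi"
#   done
```

Running the unsuspend loop typically needs SGE operator or manager rights. An instance that also shows "a" or "u" (such as the one listed as "aus") will probably stay unavailable until its execution host reports in again, even after it is unsuspended.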

Vlad Manea

Sep 20, 2011, 1:51:35 PM
to Discussion of Rocks Clusters
Monica,

If your cluster is a mess, the best approach is to reinstall Rocks.
SGE worked out of the box for me.

V.
