[root@lpge-cluster ~]# showq
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
0 Active Jobs 0 of 40 Processors Active (0.00%)
0 of 5 Nodes Active (0.00%)
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
0 Idle Jobs
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
1 cforain Idle 1 00:10:00 Mon Sep 20 07:58:27
2 cforain Idle 1 00:10:00 Mon Sep 20 08:18:37
3 cforain Idle 1 00:10:00 Mon Sep 20 08:18:39
4 cforain Idle 1 00:10:00 Mon Sep 20 08:18:40
5 cforain Idle 1 00:10:00 Mon Sep 20 08:18:41
6 cforain Idle 1 00:10:00 Mon Sep 20 08:18:42
7 cforain Idle 1 00:10:00 Mon Sep 20 08:18:43
8 cforain Idle 1 00:10:00 Mon Sep 20 08:18:44
9 cforain Idle 1 00:10:00 Mon Sep 20 08:18:45
10 cforain Idle 1 00:10:00 Mon Sep 20 08:18:46
Total Jobs: 10 Active Jobs: 0 Idle Jobs: 0 Blocked Jobs: 10
How do I delete those jobs from the queue? Thanks in advance.
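For reference, here is a minimal sketch of one way to clear jobs like these, assuming a standard TORQUE install where `qdel` accepts the numeric job IDs shown by `showq`. The helper names `blocked_job_ids` and `print_qdel_commands` are hypothetical, not part of TORQUE or Maui. They pull the IDs out of the BLOCKED JOBS section of `showq` output shaped like the listing above, and they only print the `qdel` commands so they can be reviewed before anything is deleted.

```shell
#!/bin/bash
# Hypothetical helper: read showq output on stdin and print the job IDs
# listed under the "BLOCKED JOBS" header.
blocked_job_ids() {
    awk '
        /^BLOCKED JOBS/ { in_blocked = 1; next }   # start of the blocked section
        /^Total Jobs:/  { in_blocked = 0 }         # summary line ends the section
        # job rows start with a numeric ID and have several columns
        in_blocked && $1 ~ /^[0-9]+$/ && NF >= 5 { print $1 }
    '
}

# Print (do not execute) one qdel command per blocked job.
print_qdel_commands() {
    blocked_job_ids | while read -r id; do
        echo "qdel $id"
    done
}

# Usage on the cluster, as root or as the job owner:
#   showq | print_qdel_commands        # review the list
#   showq | print_qdel_commands | sh   # actually delete the jobs
```

Printing the commands first is a deliberate choice: it keeps the parsing step separate from the destructive step, so a change in the `showq` layout cannot silently delete the wrong jobs.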
2010/9/20 Cláudio Forain <claudi...@gmail.com>:
[cforain@lpge-cluster ~]$ qsub teste.sh
27.lpge-cluster.ufrj.br
[cforain@lpge-cluster ~]$ showq
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
0 Active Jobs 0 of 40 Processors Active (0.00%)
0 of 5 Nodes Active (0.00%)
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
0 Idle Jobs
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
27 cforain Idle 1 00:10:00 Mon Sep 20 10:13:48
Total Jobs: 1 Active Jobs: 0 Idle Jobs: 0 Blocked Jobs: 1
If I list my nodes, it gives me:
compute-0-0
state = down
np = 1
ntype = cluster
compute-0-1
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux compute-0-1.local 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=265494,totmem=7120428kb,availmem=6986568kb,physmem=6100312kb,ncpus=8,loadave=0.00,netload=504661144,state=free,jobs=,varattr=,rectime=1284988584
compute-0-2
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux compute-0-2.local 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=263121,totmem=7118816kb,availmem=6996500kb,physmem=6098700kb,ncpus=8,loadave=0.00,netload=487294788,state=free,jobs=,varattr=,rectime=1284988592
compute-0-3
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux compute-0-3.local 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=266371,totmem=7118848kb,availmem=7000188kb,physmem=6098732kb,ncpus=8,loadave=0.00,netload=486211808,state=free,jobs=,varattr=,rectime=1284988593
compute-0-4
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux compute-0-4.local 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=265942,totmem=7118848kb,availmem=6998184kb,physmem=6098732kb,ncpus=8,loadave=0.00,netload=486129000,state=free,jobs=,varattr=,rectime=1284988593
compute-0-5
state = free
np = 8
ntype = cluster
status = opsys=linux,uname=Linux compute-0-5.local 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=265960,totmem=7118852kb,availmem=6997224kb,physmem=6098736kb,ncpus=8,loadave=0.02,netload=485707033,state=free,jobs=,varattr=,rectime=1284988593
This is correct. Here is the trace of the job I just submitted:
[cforain@lpge-cluster ~]$ tracejob 27
/opt/torque/server_priv/accounting/20100920: Permission denied
/opt/torque/mom_logs/20100920: No such file or directory
/opt/torque/sched_logs/20100920: No such file or directory
09/20/2010 10:13:48 S enqueuing into default, state 1 hop 1
09/20/2010 10:13:48 S Job Queued at request of cfo...@lpge-cluster.ufrj.br, owner = cfo...@lpge-cluster.ufrj.br, job name = teste.sh, queue = default
Here are the scripts I tried to run:
[cforain@lpge-cluster ~]$ cat teste.sh
#!/bin/bash
#PBS -lwalltime=0:10:0
echo starting
sleep 10
echo ending
And an MPI one:
[cforain@lpge-cluster ~]$ cat teste-mpi.sh
#!/bin/bash
#PBS -lwalltime=0:10:0
#PBS -lnodes=4
echo starting openmpi job:
/opt/openmpi/bin/mpirun /opt/mpi-tests/bin/mpi-ring
echo ending
I have resolved the issue. I just had to set the default to ACTIVE with this command:
qmgr -c "set queue batch enabled=true"
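As a hedged sketch for anyone hitting the same symptom: in TORQUE a queue accepts submissions when its `enabled` attribute is true, and its jobs become eligible to run when `started` is true, so it can be worth checking both. The helper below is hypothetical (the name `queue_activation_commands` is not a TORQUE command). It only prints the `qmgr` commands for a given queue name so they can be reviewed first. `batch` is the queue name used in this thread, and yours may differ.

```shell
#!/bin/bash
# Hypothetical helper: print the qmgr commands that let a queue accept
# jobs (enabled), let those jobs be scheduled (started), and then list
# the queue's settings for verification.
queue_activation_commands() {
    queue="$1"
    echo "qmgr -c \"set queue $queue enabled=true\""
    echo "qmgr -c \"set queue $queue started=true\""
    echo "qmgr -c \"list queue $queue\""
}

# Usage, as root on the pbs_server host:
#   queue_activation_commands batch        # review the commands
#   queue_activation_commands batch | sh   # apply them
```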
Thanks for reading, and I hope this helps someone who needs it.