qsub_cmd = `echo $home/julia-release-basic --worker` | `qsub -N JULIA -terse -cwd -j y -o $sgedir -t 1:$n`

julia> addprocs_sge(2)
ERROR: assertion failed: ?
in error at error.jl:22
in assert at error.jl:43
in success at process.jl:392
in map at abstractarray.jl:1478
in success at process.jl:394
in start_sge_workers at multi.jl:1009
in addprocs_sge at multi.jl:1044

$ qstat -u "kmsquire"
job-ID  prior     name   user      state  submit/start at      queue                     slots  ja-task-ID
----------------------------------------------------------------------------------------------------------
358164  10.50000  JULIA  kmsquire  r      06/16/2013 14:01:52  al...@compute-4-60.local  1      1
358164  10.50000  JULIA  kmsquire  r      06/16/2013 14:01:52  al...@compute-4-53.local  1      2

julia> addprocs_sge(2)
ERROR: assertion failed: ?
in error at error.jl:22
in assert at error.jl:43
in success at process.jl:394
in all at reduce.jl:175
in success at process.jl:401
in start_sge_workers at multi.jl:941
in addprocs_sge at multi.jl:976

$ qstat -u "ucaktpa"
job-ID   prior    name   user     state  submit/start at      queue  slots  ja-task-ID
--------------------------------------------------------------------------------------
9696992  0.50290  JULIA  ucaktpa  qw     06/16/2013 22:16:14         1      1,2

I checked the line in multi.jl you mentioned, and was thinking that I pass several other options to qsub, e.g. in order to allocate memory or set runtime thresholds (-l h_vmem=8G,vf=8G -l h_rt=0:3:0). It may be good to pass them as extra arguments to start_sge_workers(); alternatively, we could pass a single argument, which could be a configuration file, similar to the MATLAB sample code below:

sched = findResource('scheduler', 'configuration', configuration);
pjob = createParallelJob(sched);
set(pjob, 'MaximumNumberOfWorkers', minNumWorkers);
set(pjob, 'MinimumNumberOfWorkers', maxNumWorkers);

I will try to trace the addprocs_sge() error message...
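
The suggestion above of passing extra qsub options through to the worker launch could look roughly like the sketch below. It is written in current Julia syntax rather than the 2013 syntax used in this thread, and build_qsub_cmd and its qsub_flags keyword are hypothetical names, not part of Julia base or ClusterManagers. It only builds the qsub half of the command and does not submit anything.

```julia
# Hypothetical helper (not an existing Julia or ClusterManagers function).
# Builds the qsub invocation for an array job of n workers and appends any
# extra scheduler flags supplied by the caller.
function build_qsub_cmd(n::Integer, sgedir::AbstractString;
                        qsub_flags::Vector{String}=String[])
    # Interpolating a vector into a command literal adds each element as a
    # separate argument, so no shell quoting is involved.
    return `qsub -N JULIA -terse -cwd -j y -o $sgedir -t 1:$n $qsub_flags`
end

# Example: request memory and a runtime limit, as in the options quoted above.
cmd = build_qsub_cmd(2, "/tmp/sge";
                     qsub_flags=["-l", "h_vmem=8G,vf=8G", "-l", "h_rt=0:3:0"])
println(cmd)
```

Keeping the flags as a plain vector of strings leaves the choice of resources to the caller; a configuration file, as in the MATLAB example, could be read into the same vector.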
/home/blauwens/julia/usr/bin/julia-release-basic: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.15' not found (required by /home/blauwens/julia/usr/bin/../lib/libjulia-release.so)

qsub_cmd = `echo $home/julia-release-basic --worker` |> `qsub -N JULIA -V -terse -cwd -j y -o $sgedir -t 1:$n`

sleep(0.5)

bash: module: line 1: syntax error: unexpected end of file
bash: error importing function definition for `module'
julia_worker:9009#192.168.1.226

while !fexists
    try
        fl = open(fname)
        try
            # Keep reading lines until one parses as worker connection info.
            while !fexists
                conninfo = readline(fl)
                hostname, port = parse_connection_info(conninfo)
                fexists = (hostname != "")
            end
        finally
            close(fl)
        end
    catch
        # Output file not there yet: show progress and retry.
        print(".");
        sleep(0.5)
    end
end

After these modifications, addprocs_sge() works on an HP cluster running x86_64 GNU/Linux.
Some feedback from other SGE users would be useful, and perhaps this hack can be merged into Julia base.

Ben

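
For anyone who wants to try the same idea outside multi.jl, here is a minimal standalone sketch of the polling loop above in current Julia syntax. It assumes the worker writes a line of the form julia_worker:PORT#HOST, as in the output shown in this post, and skips any other lines such as the bash warnings. parse_worker_line and wait_for_worker are hypothetical names standing in for parse_connection_info and the loop, and the timeout is an addition of mine so the loop cannot spin forever.

```julia
# Hypothetical stand-in for parse_connection_info: expects a line of the form
# "julia_worker:PORT#HOST" and returns ("", 0) when the line does not match.
function parse_worker_line(line::AbstractString)
    m = match(r"^julia_worker:(\d+)#(\S+)", strip(line))
    m === nothing && return ("", 0)
    return (String(m.captures[2]), parse(Int, m.captures[1]))
end

# Poll fname until a worker line appears or `timeout` seconds have elapsed.
# Lines that do not parse (e.g. shell warnings) are ignored.
function wait_for_worker(fname::AbstractString;
                         timeout::Real=60, interval::Real=0.5)
    deadline = time() + timeout
    while time() < deadline
        if isfile(fname)
            for line in eachline(fname)
                host, port = parse_worker_line(line)
                host != "" && return (host, port)
            end
        end
        sleep(interval)
    end
    error("no worker connection info found in $fname after $timeout seconds")
end
```

Re-reading the whole file on every pass is wasteful for large logs, but the SGE output files here only hold a few lines, so it keeps the sketch simple.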
ClusterManagers.addprocs_sge. This is already great, but now I want to be able to submit jobs to the main scheduler.
I am going to look into extracting the machine file given by the scheduler and start from there.
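
As a starting point for the machine file, here is a rough sketch in current Julia syntax. It assumes an SGE-style host file in which each line begins with a hostname followed by the number of slots granted on that host (the layout Grid Engine uses for the file named by the PE_HOSTFILE environment variable); other schedulers or site setups may use a different layout. read_hostfile is a hypothetical helper name.

```julia
# Hypothetical helper: expand an SGE-style host file into one entry per slot.
# Assumes each relevant line starts with "hostname slots"; lines that do not
# fit that shape are skipped.
function read_hostfile(path::AbstractString)
    hosts = String[]
    for line in eachline(path)
        fields = split(line)
        length(fields) < 2 && continue
        slots = tryparse(Int, fields[2])
        slots === nothing && continue
        append!(hosts, fill(String(fields[1]), slots))
    end
    return hosts
end

# Inside a running job the scheduler is assumed to export PE_HOSTFILE.
hostfile = get(ENV, "PE_HOSTFILE", "")
if !isempty(hostfile) && isfile(hostfile)
    println(read_hostfile(hostfile))
end
```

The resulting list has one hostname per slot, which is the shape a machine-list based worker launch would need.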
Concerning UCL I am leaving soon, but I think it would be nice for them to get it up and running on legion.
Thank you for your response, and I will update this thread if I am able to write up a good submission script.
very best,