Monitor state of progress


David Dreher

Jul 23, 2018, 9:03:22 AM
to gc3...@googlegroups.com
Dear GC3Pie Team,

I'm currently trying to embed GC3Pie into my workflow, but I'm not sure how to achieve what I want to do. I took part in a 'GC3Pie for programmers' tutorial, but it was a while back, so I'm sorry if I'm missing something obvious.

My use case is arbitrary parallel execution of independent MATLAB jobs.
I can write the main Application object and things like that, but my use case would require submitting the jobs from inside a Python object, keeping that object around, querying completed tasks programmatically once they're done, and keeping track of jobs that have erred or encountered other problems.
I know how to submit a SessionBasedScript, but that seems geared toward manual supervision of the submitted tasks.

In case pseudocode makes more sense:

inputFiles = ['file1.mat', 'file2.mat', ...]
sessionObj = matlabSessionScript(inputFiles)

for iF in range(len(inputFiles)):
    finishedTask = sessionObj.fetchNext(timeout)
    if finishedTask.execution.exitcode == 0:
        outputFiles = finishedTask.outputFiles()
        # do something with the output files
    else:
        # log the error and do something else
        ...

missingTasks = sessionObj.fetchProblemTasks()
# do something with all tasks that did not run in time or had other issues

Any help is greatly appreciated,
David

Riccardo Murri

Jul 23, 2018, 10:07:33 AM
to gc3...@googlegroups.com
Hi David,

> My use case is an arbitrary parallel execution of independent matlab jobs.
> I can write the main Application object and things like that, but my use
> case would require submitting the jobs inside a python object. Keep this
> object around and query completed tasks once their done programmatically.
> And keep track of jobs that have erred or encountered other problems.

I would do it like this, in (incomplete) Python code:

```
from os.path import basename
from time import sleep

from gc3libs import ANY_OUTPUT, Application, create_engine
from gc3libs.workflow import ParallelTaskCollection

# 1. create an `Engine` object to run tasks
engine = create_engine()

# 2. create applications to process all input files
input_files = ['file1', 'file2', ...]
apps = [
    Application(
        ['process', basename(filename)],
        inputs=[filename],
        outputs=ANY_OUTPUT,
        output_dir=(filename + '.out'),
        ...
    )
    for filename in input_files
]

# 3. bundle them all in a `ParallelTaskCollection`
top = ParallelTaskCollection(apps)

# 4. run the task collection through the `Engine`,
#    polling until everything is done
engine.submit(top)
while top.execution.state != 'TERMINATED':
    engine.progress()
    sleep(1)  # avoid busy-waiting

# 5. collect tasks that errored out
failed = [task for task in top.tasks if task.execution.returncode != 0]
```

Use of the `ParallelTaskCollection` is actually optional: you could
submit all the tasks in a `for`-loop, but then you would have to check
the status of each task individually. E.g., you could replace steps 3.
and 4. with:

```
for app in apps:
    engine.submit(app)

done = 0
while done < len(apps):
    engine.progress()
    done = len([app for app in apps if app.execution.state == 'TERMINATED'])
```

(In my opinion, this latter code only makes sense if you need to
break out of the loop early, e.g., when 80% of the tasks have
terminated successfully, or when a few critical ones have terminated
unsuccessfully...)
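To make that early-exit condition concrete, here is a minimal, self-contained sketch. Note that `Task`, `should_stop`, and the `critical` parameter are stand-ins invented for illustration, not part of the gc3libs API; in real code you would apply the same check to the `.execution.state` and `.execution.returncode` attributes of your submitted tasks.

```python
# Hypothetical sketch of the "break out early" condition, using
# stand-in task objects so it runs without gc3libs installed.
from collections import namedtuple

Task = namedtuple('Task', ['name', 'state', 'returncode'])

def should_stop(tasks, success_fraction=0.8, critical=()):
    """Stop once `success_fraction` of tasks succeeded,
    or as soon as any task named in `critical` has failed."""
    done_ok = [t for t in tasks
               if t.state == 'TERMINATED' and t.returncode == 0]
    failed = [t for t in tasks
              if t.state == 'TERMINATED' and t.returncode != 0]
    if any(t.name in critical for t in failed):
        return True  # a critical task failed: give up early
    return len(done_ok) >= success_fraction * len(tasks)

tasks = [
    Task('t1', 'TERMINATED', 0),
    Task('t2', 'TERMINATED', 0),
    Task('t3', 'TERMINATED', 0),
    Task('t4', 'TERMINATED', 0),
    Task('t5', 'RUNNING', None),
]
print(should_stop(tasks))  # 4 of 5 (80%) succeeded -> True
```

In the polling loop above, you would call this check after each `engine.progress()` and `break` when it returns `True`.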

Does this answer your question?

Ciao,
R