render farm submission workflow

Eric Mehl

Sep 20, 2017, 11:09:11 AM

to gaffer-dev

I'm brand new to Gaffer and a little confused about the typical workflow for submitting to a render farm. We use Thinkbox Deadline here and don't have access to Tractor for testing, so I'm going by the code in TractorDispatcher.py in particular.

It seems like there are two methods of connecting Gaffer to the farm:

1. All jobs on the render farm are Gaffer scripts that then call renderers through their API (or a System Command script) to render a frame, then save the frame, maybe with a slap comp in between or some other image processing. Looking at the farm job list would show only Gaffer jobs. This seems like what the built-in dispatcher is meant to create, for example what TractorDispatcher performs. Is that right?
2. Gaffer actually submits jobs in "native" format to the farm, so looking at the job list would show a Houdini sim, 10 Arnold jobs, 5 Nuke jobs and so on. This sounds like what the discussion over Afanasy was suggesting (https://groups.google.com/d/topic/gaffer-dev/COAOwXaICo0/discussion). This seems like it would require special nodes for the farm manager for each task, and the Dispatcher wouldn't be used much at all? So in my case a DeadlineNuke node, DeadlineVray node, etc.

It seems to me like #1 preserves as much functionality of Gaffer as possible, but are there ways to keep Gaffer's slap comp ability (for example) in method 2? I'm a little reluctant to give up the flexibility and support of submitting native jobs to the farm (like method #2) but maybe I just need to learn to leave that behind.

I'm guessing I'm missing another method or two as well. Ultimately my question is what is the recommended best practice for submitting to a render farm, given that I'm going to need to add support for Deadline in one way or another.

Andrew Kaufman

Sep 20, 2017, 1:05:38 PM

to gaffe...@googlegroups.com

You're correct thus far, and we do both (1) and (2) at Image Engine (sort of). I liked Dan's approach to the last thread, so I'm going to copy him and start with some Gaffer terminology:

TaskNode - The red nodes in your graph that represent offline processing, usually creating new files on disk.
TaskBatch - A batch of tasks to be executed (e.g. several frames of a TaskNode clumped together). Never represented visually.
Dispatcher - Also a node, though not currently visible in the graph. When dispatched, walks a graph of TaskNodes, creating another (non-visual) graph of TaskBatches.
`gaffer execute` app - a commandline application for executing a single TaskBatch, otherwise identical to the main gui (the `gaffer gui` app).
LocalDispatcher - A specific dispatcher that executes a graph of TaskBatches on the user's current machine, in serial, using `gaffer execute` commandlines on a background thread.
TractorDispatcher - A specific dispatcher that translates a graph of TaskBatches to a single Tractor Job and spools it to Tractor. Each task in Tractor will be a `gaffer execute` commandline, but this time, tasks might run in parallel (if the submitted graph allows for it).
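
To make the TaskBatch idea a bit more concrete, here is a minimal sketch in plain Python of clumping the frames of one TaskNode into batches. It is not Gaffer's actual implementation; `batch_frames` and `batch_size` are hypothetical names used only for illustration.

```python
def batch_frames(frames, batch_size):
    """Group a list of frame numbers into consecutive batches.

    Hypothetical helper: it only illustrates the idea of several frames
    of one TaskNode being clumped together into a single batch.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]


# Ten frames in batches of four: the last batch is simply shorter.
batches = batch_frames(list(range(1, 11)), 4)
```

Each resulting batch would then correspond to one `gaffer execute` invocation on the farm.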

I'm not familiar with how Deadline handles task-to-task dependencies, which is often where you run into trouble when implementing a Gaffer Dispatcher for new farm software. Since Gaffer's TaskBatch graph is a true DAG, many farm managers aren't well suited to represent the submission exactly (e.g. both Qube and Deadline take a list-based approach to tasks, while Tractor takes a tree-based approach). As it turns out, Tractor can accept DAG submissions just fine; it just draws them a bit oddly in the UI. When we used Qube at IE, we were able to make DAG submissions work via a fairly complex intermediate layer. I imagine you could do something similar for Deadline if needed.
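
One simple way to think about mapping a DAG onto a list-based farm manager is to flatten it into an ordered job list in which every job refers to its dependencies by position. The sketch below is plain Python and is not the intermediate layer used at IE; the `upstream` mapping, `flatten_dag` and the dictionary keys are all hypothetical.

```python
def flatten_dag(upstream):
    """Flatten a DAG into a list where dependencies always come first.

    `upstream` maps each batch name to the names it depends on.
    Returns a list of dicts; `depends_on` holds indices into that list.
    Assumes the input really is acyclic (no cycle detection here).
    """
    order = []
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        for dependency in upstream.get(name, ()):
            visit(dependency)
        # Appended only after all of its dependencies have been appended.
        order.append(name)

    for name in upstream:
        visit(name)

    index = {name: i for i, name in enumerate(order)}
    return [
        {"name": name, "depends_on": sorted(index[d] for d in upstream.get(name, ()))}
        for name in order
    ]


# A small diamond: two writers share one render, and a final task needs both.
example = {
    "quicktime": ["writeA", "writeB"],
    "writeA": ["render"],
    "writeB": ["render"],
    "render": ["publish"],
    "publish": [],
}
jobs = flatten_dag(example)
```

A list like this could then be submitted job by job, with each job's dependencies translated into whatever job identifiers the farm manager hands back.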

But if we ignore DAG submissions for a moment, and say you just have a simpler setup:

PythonCommand -> AppleseedRender -> ImageWriter -> SystemCommand

and say your PythonCommand is updating some asset database for about-to-be-created rendered images, and your SystemCommand is launching nuke (or ffmpeg) to generate a quicktime from the about-to-be-created slap comp images.

Then if you wanted to write a DeadlineDispatcher that can handle that setup, the most natural thing to implement (given my limited knowledge of Deadline) would result in 4 jobs in Deadline, with dependencies between the jobs (or between the frames if that's possible). All 4 jobs would be `gaffer execute` commandlines, but the last one would be launching Gaffer just to call os.system with your nuke commandline. Presumably you can add some metadata/label to make it look like a Nuke job in Deadline if you prefer, but it's easier to just let it call gaffer under the hood.
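
As a rough sketch of that four-job chain, the plain Python below builds one job description per node, each depending on the previous one. The dictionary keys are hypothetical and are not Deadline's job format. The `-script`, `-nodes` and `-frames` flags reflect my understanding of the `gaffer execute` commandline and are an assumption here, so check them against your Gaffer version.

```python
def build_chain_jobs(script, node_names, frames):
    """Build one job description per node in a linear chain.

    Every job runs `gaffer execute` for a single node and depends on the
    job before it. The dict layout is illustrative only.
    """
    jobs = []
    for index, node in enumerate(node_names):
        jobs.append({
            "name": node,
            # Assumed flags; verify against your Gaffer install.
            "command": [
                "gaffer", "execute",
                "-script", script,
                "-nodes", node,
                "-frames", frames,
            ],
            "depends_on": [index - 1] if index > 0 else [],
        })
    return jobs


# Placeholder script path and frame range.
chain = build_chain_jobs(
    "/path/to/shot.gfr",
    ["PythonCommand", "AppleseedRender", "ImageWriter", "SystemCommand"],
    "1-10",
)
```

A real dispatcher would translate each entry into an actual farm submission and replace the list indices with the job IDs returned by the farm.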

Alternatively, you could implement your DeadlineDispatcher such that the `gaffer execute` commandline is only used for the nodes that require it (e.g. AppleseedRender and ImageWriter), and the others are swapped out at submission time to be replaced by whatever commandline you'd like. We do that in certain cases at IE, though we're trending away from it over time.
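
A minimal sketch of that swap, again in plain Python with hypothetical names (`command_for`, `overrides`): nodes listed in the override table get a native commandline, and everything else falls back to `gaffer execute`. The `gaffer execute` flags are an assumption, and the nuke commandline and paths are placeholders.

```python
def command_for(node, script, frames, overrides=None):
    """Return the commandline a farm job should run for `node`.

    `overrides` maps node names to replacement commandlines. Nodes that
    are not listed fall back to a `gaffer execute` commandline.
    """
    overrides = overrides or {}
    if node in overrides:
        # Copy so callers cannot mutate the override table by accident.
        return list(overrides[node])
    # Assumed flags; verify against your Gaffer install.
    return ["gaffer", "execute", "-script", script, "-nodes", node, "-frames", frames]


# Placeholder override: run the quicktime step as a native Nuke job.
native = {"SystemCommand": ["nuke", "-x", "/path/to/quicktime.nk"]}

render_cmd = command_for("AppleseedRender", "/path/to/shot.gfr", "1-10", native)
movie_cmd = command_for("SystemCommand", "/path/to/shot.gfr", "1-10", native)
```

The trade-off is that the swapped jobs no longer go through Gaffer at all, so anything the node would have done at execution time has to be baked into the replacement commandline at submission time.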

It's also worth mentioning that your DeadlineDispatcher can inject some control plugs onto every TaskNode. We do this at IE, for example, to expose memory limits, machine groups, etc., so the artist can define what resources each TaskNode will require when it gets to the farm.

Cheers,

Andrew

--

Andrew Kaufman - R&D Lead

Image Engine

studio: +1 (604) 874-5634 | and...@image-engine.com | www.image-engine.com

15 West 5th Avenue, Vancouver, BC, V5Y 1H4, Canada

Eric Mehl

Sep 20, 2017, 3:14:36 PM

to gaffer-dev

That's super helpful, thanks Andrew!

That gives me a great starting point for tinkering. Deadline has a number of dependency modes, including frame dependencies (with a mysterious start and end offset that I've never used and that isn't described well in the documentation) and scripted dependencies (a Python script that returns true when the dependency is satisfied), which I'm guessing could be used to adapt to Gaffer's DAG. For now I'll likely stick to your example and add more functionality as needed.
