making gaffer dispatcher work with a different farm manager


Hradec

Feb 28, 2017, 5:23:27 PM
to gaffer-dev
We use CGRU Afanasy as render farm manager, which is totally free and open source. 


We have been using it for about a year now, and it has proven to be really solid and stable for us, with a nice web based UI, python API, etc. 

So, I suppose GafferTractor/GafferTractorUI would be the "template" modules in this case, to use as a starting point for writing a new GafferAfanasy/GafferAfanasyUI module, correct? 

Is there anything else? 

cheers...
-H

Andrew Kaufman

Feb 28, 2017, 5:32:27 PM
to gaffe...@googlegroups.com
Yep, that's about it. We don't actually use GafferTractor in production at Image Engine (yet), but it should be a working example of a custom Dispatcher for a render farm.

Are you considering open-sourcing your GafferAfanasy implementation by any chance? If so, that could be a cool example of a plugin to use in a "How to write a Gaffer plugin" tutorial in the docs.

Cheers,
Andrew

PS. We have our own internal FarmDispatcher here at IE, which doesn't talk directly to a particular farm, but instead to an intermediate layer, of which you may be familiar with the origins ;)





--
Andrew Kaufman - R&D Lead
Image Engine
studio: +1 (604) 874-5634 | and...@image-engine.com | www.image-engine.com



15 West 5th Avenue, Vancouver, BC, V5Y 1H4, Canada


Hradec

Mar 2, 2017, 1:47:22 PM
to gaffer-dev
> Yep, that's about it. We don't actually use GafferTractor in production at Image Engine (yet), but it should be a working example of a custom Dispatcher for a render farm.

Awesome!

> Are you considering open-sourcing your GafferAfanasy implementation by any chance? If so, that could be a cool example of a plugin to use in a "How to write a Gaffer plugin" tutorial in the docs.

Yep! Definitely!

Especially since Afanasy is also open source (and unbelievably easy to get up and running), it could be a great option for people who want to use Gaffer in a farm environment, maybe to render Appleseed on the farm?! ;)

> PS. We have our own internal FarmDispatcher here at IE, which doesn't talk directly to a particular farm, but instead to an intermediate layer, of which you may be familiar with the origins ;)

"Hoooo my"...lol... indeed! :)

cheers mate!
-H
 

John Haddon

Mar 2, 2017, 2:02:53 PM
to gaffe...@googlegroups.com
On 2 March 2017 at 18:47, Hradec <hra...@gmail.com> wrote:
 
>> Are you considering open-sourcing your GafferAfanasy implementation by any chance? If so, that could be a cool example of a plugin to use in a "How to write a Gaffer plugin" tutorial in the docs.
>
> Yep! Definitely!
>
> Especially since Afanasy is also open source (and unbelievably easy to get up and running), it could be a great option for people who want to use Gaffer in a farm environment, maybe to render Appleseed on the farm?! ;)

Nice one!

One thing that may or may not be a problem is that Gaffer's job structure is a pretty arbitrary DAG, where frames of one task can even depend on different frames of other tasks. Is this something that Afanasy can represent? I took a totally superficial look over Afanasy's job description schema and it seemed to assume a simpler sequence based job definition? Or maybe you can chain multiple simple jobs together to build a DAG?

Cheers...
John

Hradec

Mar 9, 2017, 2:54:15 PM
to gaffer-dev

> One thing that may or may not be a problem is that Gaffer's job structure is a pretty arbitrary DAG, where frames of one task can even depend on different frames of other tasks. Is this something that Afanasy can represent? I took a totally superficial look over Afanasy's job description schema and it seemed to assume a simpler sequence based job definition? Or maybe you can chain multiple simple jobs together to build a DAG?

A Job in afanasy is made of blocks, and each block can have multiple tasks (frames). 

You can set dependency masks from job to job, block to block, block->(frame to frame) and block->frames to block->frames! 

Actually, afanasy does come with plugins for Nuke and Houdini. It's basically a node that you can plug writeOut nodes into to send them to the farm, in a similar way to what the dispatch nodes in Gaffer do. (as far as I understood them :)

For example, a block/block dependency would be like this:

JOB +---- block01(RIB Generation)--+--frame01 rib
    |                              |--frame02 rib
    |                              `--frame03 rib
    |
    `---- block02(prman render)----+--frame01 prman rib (waits for block01 to finish)
                                   |--frame02 prman rib (waits for block01 to finish)
                                   `--frame03 prman rib (waits for block01 to finish)

so the actual render would only start after block01 finishes generating all ribs!

As for a block->frame/block->frame dependency:

JOB +---- block01(RIB Generation)--+--frame01 rib
    |                              |--frame02 rib
    |                              `--frame03 rib
    |
    `---- block02(prman render)----+--frame01 prman rib (wait for block01/frame01 to finish)
                                   |--frame02 prman rib (wait for block01/frame02 to finish)
                                   `--frame03 prman rib (wait for block01/frame03 to finish)

In this case, block02->frame01 will start rendering immediately after block01->frame01 finishes generating the rib! 
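
To make the difference between the two dependency styles concrete, here is a minimal sketch in plain Python. It does not use the Afanasy API at all; the function name and the RIB finish times are made up purely for illustration. It only computes the earliest moment each render frame could start under a block-level dependency versus a frame-level one.

```python
def render_start_times(rib_done, per_frame):
    """Return {frame: earliest time the render task may start}.

    rib_done  -- {frame: time at which the RIB for that frame is finished}
    per_frame -- True for a frame-to-frame dependency,
                 False for a block-to-block dependency
    """
    if per_frame:
        # each render frame only waits for its own RIB
        return dict(rib_done)
    # every render frame waits for the slowest RIB of the whole block
    barrier = max(rib_done.values())
    return {frame: barrier for frame in rib_done}


# hypothetical finish times (in seconds) for the RIB generation block
rib_done = {1: 10, 2: 20, 3: 30}

block_level = render_start_times(rib_done, per_frame=False)
frame_level = render_start_times(rib_done, per_frame=True)

print("block to block:", block_level)
print("frame to frame:", frame_level)
```

With a block-level dependency every render frame waits for the last RIB, while with a frame-level dependency frame01 can start as soon as its own RIB is written.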

Multiple dependencies are also possible... for example, with a mix of block->(frame/frame) and block->frame / block->frame dependencies, and an initial cloth simulation block:

JOB +---- block01(Cloth Sim)-------+--frame01 to frame02 cloth sim alembic
    |                              |--frame03 to frame04 cloth sim alembic (wait for "block01/frame01 to frame02" to finish)
    |                              |--frame05 to frame06 cloth sim alembic (wait for "block01/frame03 to frame04" to finish)
    |                              `--frame07 to frame08 cloth sim alembic (wait for "block01/frame05 to frame06" to finish)
    |
    |---- block02(RIB Generation)--+--frame01 rib (wait for "block01/frame01 to frame02" to finish)
    |                              |--frame02 rib (wait for "block01/frame01 to frame02" to finish)
    |                              |--frame03 rib (wait for "block01/frame03 to frame04" to finish)
    |                              |--frame04 rib (wait for "block01/frame03 to frame04" to finish)
    |                              |--frame05 rib (wait for "block01/frame05 to frame06" to finish)
    |                              |--frame06 rib (wait for "block01/frame05 to frame06" to finish)
    |                              |--frame07 rib (wait for "block01/frame07 to frame08" to finish)
    |                              `--frame08 rib (wait for "block01/frame07 to frame08" to finish)
    |
    |---- block03(prman render)----+--frame01 prman rib (wait for block02/frame01 AND "block01/frame01 to frame02" to finish)
    |                              |--frame02 prman rib (wait for block02/frame02 AND "block01/frame01 to frame02" to finish)
    |                              |--frame03 prman rib (wait for block02/frame03 AND "block01/frame03 to frame04" to finish)
    |                              |--frame04 prman rib (wait for block02/frame04 AND "block01/frame03 to frame04" to finish)
    |                              |--frame05 prman rib (wait for block02/frame05 AND "block01/frame05 to frame06" to finish)
    |                              |--frame06 prman rib (wait for block02/frame06 AND "block01/frame05 to frame06" to finish)
    |                              |--frame07 prman rib (wait for block02/frame07 AND "block01/frame07 to frame08" to finish)
    |                              `--frame08 prman rib (wait for block02/frame08 AND "block01/frame07 to frame08" to finish)
    |
    `---- block04(create dailies)--+--frame01 playblast block01 cloth sim alembic (wait for block01 to finish completely)
                                   `--frame02 ffmpeg [all block03 frames] (wait for block03 to finish completely)
 
So, although a job is only a 2 level hierarchy, thanks to the dependency masks, one can create arbitrary interdependencies, pretty much like an arbitrary DAG.
I haven't done a complex example like this last one, though. In theory (based on the api docs) it's doable, but I'm not sure if it's fully working. 
The first 2 examples work perfectly. 
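
As a sanity check that the last example really is a valid DAG, the sketch below rebuilds it as a plain dependency dictionary and asks Python's standard `graphlib` module (Python 3.9+) for an execution order. This is only a model of the diagram, not Afanasy code: the task names such as `sim/01-02` are invented labels, and nothing here says how Afanasy itself schedules tasks.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_graph():
    """Return {task: set of tasks it waits for} for the cloth sim example."""
    deps = {}
    # block01: the cloth sim runs in chunks of two frames,
    # and each chunk waits for the previous chunk
    chunks = [(1, 2), (3, 4), (5, 6), (7, 8)]
    previous_sim = None
    for first, last in chunks:
        sim = f"sim/{first:02d}-{last:02d}"
        deps[sim] = {previous_sim} if previous_sim else set()
        previous_sim = sim
        for frame in (first, last):
            rib = f"rib/{frame:02d}"
            render = f"render/{frame:02d}"
            # block02: each RIB waits for the sim chunk covering its frame
            deps[rib] = {sim}
            # block03: each render waits for its RIB AND the sim chunk
            deps[render] = {rib, sim}
    # block04: the dailies tasks wait for whole blocks to finish
    deps["dailies/playblast"] = {t for t in deps if t.startswith("sim/")}
    deps["dailies/ffmpeg"] = {t for t in deps if t.startswith("render/")}
    return deps


graph = build_graph()
# static_order() raises graphlib.CycleError if the dependencies are cyclic
order = list(TopologicalSorter(graph).static_order())
print(order)
```

If the dependencies contained a cycle, `static_order()` would raise `CycleError`; here it returns one of the valid orders in which the 22 tasks could run.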

The UI representation of the interdependencies is color/text based, essentially informing you that a block or frame is waiting on something... there's not much detail on what is waiting on what, unfortunately.
This is a screenshot of a simple job with 2 blocks, where the 2nd block is waiting for the first one to finish completely:
Inline image 1
So this is the job UI, showing 2 blocks and the frames for each. Block02 is purple with a WDP because it's waiting for a dependency! (in this case, block01)

We have been using afanasy for the past couple of years, and it's working pretty well so far. (we switched over from Qube, and no regrets!)
The best part for me is the WebUI, so everyone can check renders remotely from anywhere. (the screenshot is actually from chrome)
It even comes with a WebGL image sequence player, so we can check the actual rendered frames online too!

The team behind the project is Russian, and very strict about accepting changes to the code, which is a good thing in my opinion! 

cheers...
-H


Alex Fuller

Mar 4, 2018, 9:43:03 PM
to gaffer-dev
Hi Roberto,

Did you progress at all with GafferAfanasy ?

Cheers....

Hradec

Mar 7, 2018, 1:35:12 PM
to gaffer-dev


On Sun, Mar 4, 2018 at 6:43 PM, Alex Fuller <bobe...@gmail.com> wrote:
> Hi Roberto,
>
> Did you progress at all with GafferAfanasy ?
>
> Cheers....


not really... loads of other things to do and never got to it... sorry! :(

I'm still working on adapting Gaffer to our studio (or, better said, adapting our studio to Gaffer), so until I manage to make them use Gaffer as the main pipeline tool, connecting CGRU Afanasy to it is lower priority. 

BUT, it's still on my to-do list... there's hope! lol

If it helps, I have developed some Python code to interact with the Afanasy JSON backend in our PipeVFX pipeline. It's not on GitHub yet, but if you're interested, I can publish it. 

There's a range of higher level functions, for example:
  1. creating jobs (although this function is very much oriented to PipeVFX, one can "clean it up" easily)
  2. list jobs (with all properties)
  3. restart jobs/tasks in a job
  4. change parameters in a job
  5. get parameters from a job
  6. delete jobs
  7. list render nodes (with all properties)
  8. delete render nodes
  9. change parameters of render nodes
  10. get parameters of render nodes
  11. list how many frames left to render on a job
  12. get the idle time of a render node

It's not fully complete, since I have been writing them on a "need basis", but I'm trying to be careful to create basic interaction functions for afanasy first, and then use those functions in the higher level ones. (apart from creating jobs, which uses afanasy's own python module)

It's very simple to interact with afanasy, since EVERYTHING you can do on the web gui, is listed on the net panel, which you can find at the bottom of the page. (just like mel code in maya)

The net panel lists the JSON used when you click on something, so you can just grab as a template to use in your own code. You don't even need documentation! 

for example:

to open the net panel, click here
then here:


that should open the net panel. 
now select a render node, and double click "capacity" to change it. 


after that, look at the net panel, and you'll find the JSON used to change the capacity. (be fast, since the net panel lists EVERYTHING, so it scrolls up pretty fast past your "capacity change"! lol - you can use the browser "find in page" function if you miss it. Just look for "action" or something else in the json example I show below! ) 


Then you can use that as a reference to change any parameters on a render node.
In Python, I did it like this: 

import af
import afnetwork

class afanasy(baseFarmJobClass):
    # baseFarmJobClass is not shown in this snippet

    def _init(self):
        ''' import afanasy python module into our class '''
        import af
        import afnetwork
        self.af = af
        self.afnetwork = afnetwork

    def _runJSON(self, json):
        ''' the base function used to talk to afanasy via JSON requests '''
        ret = self.afnetwork.sendServer( self.af.json.dumps(json), False )
        if ret[0]:
            return ret[1]
        return []

    def _list(self, filter='', mode="full"):
        ''' list all information from jobs. Use filter to select jobs based on a sub-string of the job name '''
        json = {"get":{"type":"jobs", "mode":mode}}
        ret = self._runJSON(json)
        if not ret:
            # the request failed, so there is no 'jobs' key to read
            return []
        return [ x for x in ret['jobs'] if filter in x['name'] ]

    def _renderNodes(self, name='', state=''):
        ''' returns a list with all information for the matching render nodes '''
        json = {"get":{"type":"renders"}}
        ret = self._runJSON(json)
        if not ret:
            return []
        # keep only the nodes whose name and state contain the given sub-strings
        return [ r for r in ret['renders'] if name in r['name'] and state in r['state'] ]


# change the capacity of every render node whose name contains "newfarm-005"
farm = afanasy()
for rendernode in farm._renderNodes( "newfarm-005" ):
    farm._runJSON( {"action":{
                "user_name":"coord",
                "host_name":"pc",
                "type":"renders",
                "ids":[rendernode['id']],
                "params":{
                    "capacity": "500"
                }
            }
    } )
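
The "action" request used in the loop above can also be built by a small standalone helper, which makes it easy to inspect the JSON before anything is sent to the server. This is only a sketch: the helper name `make_render_action` and the render node ids are hypothetical, and the dictionary layout is copied from the net panel example in this post rather than from any official schema.

```python
import json


def make_render_action(render_ids, params, user_name="coord", host_name="pc"):
    """Build the "action" request that changes parameters on render nodes.

    The layout mirrors the JSON shown in the post; the default user and
    host names are simply the ones used in that example.
    """
    return {
        "action": {
            "user_name": user_name,
            "host_name": host_name,
            "type": "renders",
            "ids": list(render_ids),
            "params": dict(params),
        }
    }


# hypothetical render node ids, just to show the resulting payload
request = make_render_action([12, 13], {"capacity": "500"})
payload = json.dumps(request)
print(payload)
```

The resulting string is what would be handed to something like the `_runJSON` method above, which does the actual talking to the server.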


If you end up working on gafferAfanasy, let me know! I would be very interested in it too! And if you need any more help with afanasy, I'll do my best to help you! just drop me a message... 

some more afanasy  info:
=====================

We've been using afanasy for about 2 years now, without ANY regrets. 
It's really surprising how good, reliable and flexible it is for an open source renderfarm manager. Basically, EVERYTHING that I wanted/needed to do with it, I was able to! 
From my experience, it's the FIRST open source renderfarm manager project that surpasses lots of commercial products on the market! (being an old-school alfred fan, today I really prefer afanasy for its structure, flexibility and simplicity!)
 
Recently (about 4 months ago) I integrated our PipeVFX with Google Cloud Computing, and now we're able to allocate up to 100 VMs (64/96 cores each, 64 GB RAM) into our renderfarm, seamlessly.
Afanasy afrender runs on the google VMs, and they show up in the studio just like any other local farm machine.
Afanasy is able to distribute the renders and manage everything perfectly... we tested with up to 122 nodes (100 google machines and 22 local machines) without issues! 
to be fair, version 2.1 of afanasy used to crash with more than 100 nodes, but the latest 2.2 handles more than 100 perfectly, without slowdowns compared to just 22 local nodes. 

btw, this integration with GCC is done with 100% open source software, and it is able to "boot" the google VMs with our custom arch linux distro, and all the upload/download of project data is done transparently by a custom cache fuse filesystem on top of an sshfs fuse filesystem. We just submit a job to the farm, afanasy starts the command line on the google machines, and all the needed files get transferred over as the machines load up the scenes. Even apps like maya get uploaded and cached locally by the filesystem transparently. Licenses are all tunneled to the google farm machines from our local license servers by an automatic ssh systemd service. 

Our internet connection down in Brazil is not the best, just 100 Mbit up and 200 Mbit down, but even at that low speed, it has proven to be enough to work with GCC. Of course, when we have 200 GB of alembic cache, that takes about 4-5 hours to upload over a 100 Mbit connection, but with 100 nodes of 64 cores each to render, it's still 50x faster than rendering on our local 22-machine farm. 
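
The upload estimate is easy to check with a few lines of Python. This is a back-of-the-envelope sketch that assumes the link is fully saturated and ignores protocol overhead; the function name is made up for the example.

```python
def upload_hours(size_gb, link_mbit):
    """Hours needed to push size_gb gigabytes through a link_mbit Mbit/s link.

    Assumes decimal units (1 GB = 10**9 bytes, 1 Mbit = 10**6 bits),
    a fully saturated link and no protocol overhead.
    """
    bits = size_gb * 10**9 * 8
    seconds = bits / (link_mbit * 10**6)
    return seconds / 3600


hours = upload_hours(200, 100)
print(f"200 GB over 100 Mbit/s: about {hours:.1f} hours")
```

That comes out at roughly 4.4 hours, which matches the 4-5 hours quoted above once some real-world overhead is added.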

If someone is interested, I'll be publishing this GCC<>pipeVFX<>afanasy integration into pipeVFX github at some point, including the cache fuse filesystem I've developed. 

As you can see, that's basically why I haven't had much time to work on afanasy->gaffer... LOL...

by the way, not sure if you guys know, but I found out yesterday that a guy from Poland released a new node based GUI for afanasy back in March 2017. It does look pretty cool: 

and also, you can find cgru afanasy docker containers on Docker Hub now! makes it very simple and easy to set up a test bed!! 

cheers... 
-H


Alex Fuller

Mar 14, 2018, 3:00:12 PM
to gaffer-dev
Hey Roberto,

Thank you for the detailed reply and sorry for getting back to you so late. What you're doing really sounds quite incredible, especially the fuse/sshfs filesystem stuff that submits to the cloud. For me I just wanted a nice way of throwing jobs from my windows desktop to a linux workstation with many cores so I could render while working in parallel (and maybe use the cloud in future, using said workstation as a testbed) and afanasy seemed to be the best supported one out there. 

That node web UI looks amazing, as the default UI is quite terrible to look at... :)


On Tuesday, 28 February 2017 14:23:27 UTC-8, Roberto Hradec wrote: