Brewery runner tool

Stefan Urbanek

Mar 14, 2012, 4:40:31 PM
to datab...@googlegroups.com
Hi,

I've started (re)writing the Brewery runner tool 'brewery'. Currently only one command, 'run', is implemented. It takes a single argument: a .json file with a stream network description. Example:

{
    "label": "Basic Data Audit",
    "description": "Provides basic data statistics, such as completeness",
    
    "nodes" : {
        "src": {
            "type": "csv_source"
        },
        "audit": {
            "type": "audit"
        },
        "target": {
            "type": "csv_target",
            "resource": "output.csv"
        }
    },
    "connections": [
        ["src", "audit"],
        ["audit", "target"]
    ]
}


The "resource" value can be a filename or a URL (URLs work for source nodes only at the moment).

Currently only two top-level JSON keys are used: "nodes" and "connections". "nodes" is a dictionary mapping node name -> node info, and "connections" is a list of [source, target] pairs of node names.
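
To illustrate the structure, here is a minimal Python sketch that checks such a description for consistency. It is not the runner's actual loader: `check_stream` is a hypothetical helper name, and it only assumes the two keys described above plus a "type" entry in each node.

```python
import json


def check_stream(desc):
    """Return a list of problems found in a stream description dict.

    Hypothetical helper for illustration; not part of Brewery.
    """
    problems = []
    nodes = desc.get("nodes", {})

    # Every node info dictionary is expected to name a node type.
    for name, info in nodes.items():
        if "type" not in info:
            problems.append("node %r has no type" % name)

    # Every connection is a [source, target] pair of known node names.
    for connection in desc.get("connections", []):
        if len(connection) != 2:
            problems.append("bad connection %r" % (connection,))
            continue
        for end in connection:
            if end not in nodes:
                problems.append("connection %r refers to unknown node %r"
                                % (connection, end))
    return problems


example = json.loads("""
{
    "nodes": {
        "src": {"type": "csv_source"},
        "audit": {"type": "audit"},
        "target": {"type": "csv_target", "resource": "output.csv"}
    },
    "connections": [["src", "audit"], ["audit", "target"]]
}
""")

print(check_stream(example))
```

An empty list means every connection points at a node defined under "nodes".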

You can run it as:

    brewery run example.json

or directly with a URL:


More to come:

* allow command-line configuration of some node parameters
* allow configuration parameters to be taken from another JSON file, for example:

    brewery run -p step1.json stream.json
    brewery run -p step2.json stream.json
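
One way the parameter file could be applied, as a rough sketch only. The parameter file format is not decided yet; this assumes a dictionary of node name -> values to override, and `apply_parameters` is a hypothetical name, not an existing Brewery function.

```python
import copy


def apply_parameters(stream, params):
    """Return a copy of the stream description with per-node overrides.

    Assumed (hypothetical) parameter format: {node name: {key: value}}.
    """
    result = copy.deepcopy(stream)  # leave the original description untouched
    for node_name, overrides in params.items():
        if node_name not in result["nodes"]:
            raise KeyError("unknown node: %s" % node_name)
        result["nodes"][node_name].update(overrides)
    return result


stream = {
    "nodes": {
        "src": {"type": "csv_source"},
        "target": {"type": "csv_target", "resource": "output.csv"},
    },
    "connections": [["src", "target"]],
}

# What a step1.json could contain under the assumed format.
step1 = {"src": {"resource": "step1.csv"}}

merged = apply_parameters(stream, step1)
print(merged["nodes"]["src"])
```

The same stream.json could then be reused with different parameter files without editing the stream itself.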

What do you think?

Regards,

Stefan Urbanek
data analyst and data brewmaster

Twitter: @Stiivi
