Brewery runner tool

Stefan Urbanek

Mar 14, 2012, 4:40:31 PM

I've started (re)writing the Brewery runner tool 'brewery'. Currently only one command, 'run', is implemented. It takes a single argument: a .json file with a stream network description. Example:

    {
        "label": "Basic Data Audit",
        "description": "Provides basic data statistics, such as completeness",
        "nodes": {
            "src": {
                "type": "csv_source"
            },
            "audit": {
                "type": "audit"
            },
            "target": {
                "type": "csv_target",
                "resource": "output.csv"
            }
        },
        "connections": [
            ["src", "audit"],
            ["audit", "target"]
        ]
    }

The "resource" value can be a filename or a URL (URLs work for sources only at the moment).

Currently only two top-level JSON keys are used: "nodes" and "connections". "nodes" is a dictionary mapping each node name to its node info, and "connections" is a list of [source, target] pairs of node names.
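To make the structure concrete, here is a rough Python sketch of how such a description could be read and checked. It is not the actual brewery code: `load_stream` and `run_order` are hypothetical helper names, and only the standard library is used. It verifies that every connection refers to a defined node and then derives an order in which each node comes after its sources.

```python
import json


def load_stream(text):
    """Parse a stream description and check that connections use known nodes."""
    desc = json.loads(text)
    nodes = desc.get("nodes", {})
    connections = [tuple(pair) for pair in desc.get("connections", [])]
    for source, target in connections:
        for name in (source, target):
            if name not in nodes:
                raise ValueError(f"connection refers to unknown node: {name!r}")
    return nodes, connections


def run_order(nodes, connections):
    """Order node names so each node follows all of its sources (Kahn's algorithm)."""
    # Number of incoming connections not yet satisfied, per node.
    pending = {name: 0 for name in nodes}
    for _, target in connections:
        pending[target] += 1
    ready = sorted(name for name, count in pending.items() if count == 0)
    order = []
    while ready:
        name = ready.pop(0)
        order.append(name)
        for source, target in connections:
            if source == name:
                pending[target] -= 1
                if pending[target] == 0:
                    ready.append(target)
    if len(order) != len(nodes):
        raise ValueError("connections contain a cycle")
    return order


# Same shape as the example in this post (node options kept minimal).
EXAMPLE = """
{
    "nodes": {
        "src": {"type": "csv_source"},
        "audit": {"type": "audit"},
        "target": {"type": "csv_target", "resource": "output.csv"}
    },
    "connections": [["src", "audit"], ["audit", "target"]]
}
"""

nodes, connections = load_stream(EXAMPLE)
print(run_order(nodes, connections))
```

Keeping "connections" as plain name pairs means the node dictionary stays flat, and the wiring can be validated before any node is instantiated.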

You can run it as:

    brewery run example.json

or point it directly at a URL.

More to come:

* allow command-line configuration of some node parameters
* allow configuration parameters to be taken from another json, example:

    brewery run -p step1.json stream.json
    brewery run -p step2.json stream.json
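The `-p` option is not implemented yet, so its file format is still open. As one possible reading, the sketch below assumes a parameter file is a dictionary of node name -> option overrides that gets merged into the stream description before running. The function `apply_parameters`, the file contents, and the "data.csv" and "step1.csv" names are made up for illustration.

```python
import copy
import json


def apply_parameters(stream, parameters):
    """Return a copy of the stream description with per-node overrides applied."""
    result = copy.deepcopy(stream)  # leave the original description untouched
    for node_name, overrides in parameters.items():
        if node_name not in result["nodes"]:
            raise KeyError(f"parameters refer to unknown node: {node_name!r}")
        result["nodes"][node_name].update(overrides)
    return result


# Hypothetical stream.json content.
stream = json.loads("""
{
    "nodes": {
        "src": {"type": "csv_source", "resource": "data.csv"},
        "target": {"type": "csv_target", "resource": "output.csv"}
    },
    "connections": [["src", "target"]]
}
""")

# Hypothetical step1.json content: only the options that differ for this step.
step1 = json.loads('{"target": {"resource": "step1.csv"}}')

configured = apply_parameters(stream, step1)
print(configured["nodes"]["target"]["resource"])
```

With this layout the same stream.json can be reused for several steps, and each parameter file only lists what changes.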

What do you think?


Stefan Urbanek
data analyst and data brewmaster

Twitter: @Stiivi