Ann: Brewery 0.8 Released


Stefan Urbanek


Apr 4, 2012, 11:10:33 AM

to datab...@googlegroups.com

Hello,

I'm glad to announce a new release of Brewery, a stream-based data auditing and analysis framework for Python.

There are quite a few updates, to mention the notable ones:

* new ``brewery`` [runner](http://packages.python.org/brewery/tools.html#brewery) with commands `run` and `graph`

* new nodes: *pretty printer* node (for your terminal pleasure), *generator function* node

* many CSV updates and fixes

Added several simple how-to examples [1], such as:

aggregation of remote CSV, basic audit of a CSV, how to use a generator function.

[1] https://github.com/Stiivi/brewery/tree/master/examples

Note that there are a couple of changes that break compatibility; however, affected code can be updated very easily. I apologize for the inconvenience, but until 1.0 such changes might happen more frequently. On the other hand, I will try to make them as painless as possible. Feedback and questions are welcome. I'll help you.

Full listing of news, changes and fixes is below.

NEWS

* Changed license to MIT

* Created new brewery runner commands: 'run' and 'graph':

    * 'brewery run stream.json' will execute the stream

    * 'brewery graph stream.json' will generate graphviz data

* Nodes: Added pretty printer node - textual output as a formatted table

* Nodes: Added source node for a generator function

* Nodes: added analytical type to derive field node

* Preliminary implementation of data probes (just a concept, the API is not fully decided yet)

* CSV: added empty_as_null option to read empty strings as Null values

* Nodes can be configured with node.configure(dictionary, protected). If 'protected' is True, then protected attributes (specified in node info) cannot be set with this method.

* added node identifier to the node reference doc

* added create_logger

* added experimental retype feature (works for CSV only at the moment)

* Mongo Backend - better handling of record iteration
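To illustrate the `node.configure(dictionary, protected)` behaviour described in the list above, here is a minimal sketch. The `SketchNode` class, the shape of its `node_info` dictionary and the choice of `ValueError` are all assumptions made for illustration; this is not Brewery's actual implementation.

```python
class SketchNode:
    """Hypothetical stand-in for a node; not Brewery's actual class."""

    # Assumed shape of node info: attribute descriptions, some of them
    # flagged as protected.
    node_info = {
        "attributes": [
            {"name": "resource"},
            {"name": "fields", "protected": True},
        ]
    }

    def __init__(self):
        self.resource = None
        self.fields = None

    def configure(self, config, protected=False):
        """Set attributes from a dictionary.

        When `protected` is True, attributes flagged as protected in
        node_info are refused instead of being set.
        """
        guarded = {
            attr["name"]
            for attr in self.node_info["attributes"]
            if attr.get("protected")
        }
        for key, value in config.items():
            if protected and key in guarded:
                raise ValueError("attribute '%s' is protected" % key)
            setattr(self, key, value)


node = SketchNode()
node.configure({"resource": "data.csv"}, protected=True)  # not a protected attribute
node.configure({"fields": ["a", "b"]})                    # protection switched off
```

With `protected=True`, a call such as `node.configure({"fields": []}, protected=True)` raises an error in this sketch, which is handy when the dictionary comes from an untrusted source such as a stream description file.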

CHANGES

* CSV: resource is now an explicitly named argument in CSV*Node

* CSV: convert fields according to field storage type (instead of all-strings)

* Removed fields getter/setter (now implementation is totally up to stream subclass)

* AggregateNode: renamed ``aggregates`` to ``measures``, added ``measures`` as a public node attribute

* moved errors to brewery.common

* removed ``field_name()``, now str(field) should be used

* use named logger 'brewery' instead of the global one

* better debug-log labels for nodes (node type identifier + python object ID)

**WARNING:** Compatibility break:

* deprecated ``__node_info__``; use plain ``node_info`` instead

* ``Stream.update()`` now takes nodes and connections as two separate arguments
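The CSV change listed above (values converted according to field storage type instead of being left as strings) can be pictured with the following sketch. It uses only the standard `csv` module; the `CONVERTERS` mapping, the storage type names and the `read_typed` helper are assumptions for illustration and do not reflect how Brewery implements the conversion.

```python
import csv
import io

# Assumed mapping from storage type names to Python converters.
CONVERTERS = {"string": str, "integer": int, "float": float}


def read_typed(text, fields):
    """Yield rows as dicts, converting each value by its field's storage type.

    `fields` is a list of (name, storage_type) pairs in column order.
    """
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        yield {
            name: CONVERTERS[storage](value)
            for (name, storage), value in zip(fields, row)
        }


data = "name,count,price\napple,3,0.5\n"
fields = [("name", "string"), ("count", "integer"), ("price", "float")]
rows = list(read_typed(data, fields))
```

Code that previously relied on every CSV value being a string (for example, calling string methods on numeric columns) would need a small adjustment after this kind of change.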

FIXES

* added SQLSourceNode, added option to keep fields instead of dropping them in FieldMap and FieldMapNode (patch by laurentvasseur @ bitbucket)

* better traceback handling on node failure (now actually the traceback is displayed)

* return list of field names as string representation of FieldList

* CSV: fixed output of zero numeric value in CSV (was empty string)
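The zero-value fix in the list above is a good reminder of a common pitfall. The sketch below shows one typical way such a bug arises (a truthiness test that treats `0` like a missing value) and a safer alternative. The announcement does not state the actual cause in Brewery, so the helper functions here are hypothetical.

```python
import csv
import io


def format_value_buggy(value):
    # Truthiness test: 0 and 0.0 are falsy, so they turn into "" by mistake.
    return value if value else ""


def format_value(value):
    # Only None (Null) maps to an empty string; zero is preserved.
    return "" if value is None else value


def write_row(row, formatter):
    """Write one row to an in-memory CSV and return the resulting text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([formatter(value) for value in row])
    return out.getvalue()


buggy = write_row(["a", 0, None], format_value_buggy)
fixed = write_row(["a", 0, None], format_value)
```

In this sketch `buggy` loses the zero while `fixed` keeps it and still writes Null as an empty field.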

LINKS

* github **sources**: https://github.com/Stiivi/brewery

* **Documentation**: http://packages.python.org/brewery/

* **Mailing List**: http://groups.google.com/group/databrewery/

* Submit **issues** here: https://github.com/Stiivi/brewery/issues

* IRC channel: #databrewery on irc.freenode.net

If you have any questions, comments, or requests, do not hesitate to ask.

Stefan Urbanek

data analyst and data brewmaster

Twitter: @Stiivi

Home: http://stiivi.com

Brewery: http://databrewery.org

Github: https://github.com/Stiivi
