Ann: Brewery 0.8 Released

15 views
Skip to first unread message

Stefan Urbanek

unread,
Apr 4, 2012, 11:10:33 AM4/4/12
to datab...@googlegroups.com
Hello,

I'm glad to announce new release of Brewery – stream based data auditing and analysis framework for Python.

There are quite a few updates, to mention the notable ones:

* new ``brewery`` [runner](http://packages.python.org/brewery/tools.html#brewery) with commands `run` and `graph`
* new nodes: *pretty printer* node (for your terminal pleasure), *generator function* node
* many CSV updates and fixes

Added several simple how-to examples [1], such as:
aggregation of remote CSV, basic audit of a CSV, how to use a generator
function.


Note that there are couple changes that break compatibility, however they can
be updated very easily. I apologize for the inconvenience, but until 1.0 the
changes might happen more frequently. On the other hand, I will try to make
them as painless as possible. Feedback and questions are welcome. I'll help you.

Full listing of news, changes and fixes is below.

NEWS

* Changed license to MIT
* Created new brewery runner commands: 'run' and 'graph':
    * 'brewery run stream.json' will execute the stream
    * 'brewery graph stream.json' will generate graphviz data
* Nodes: Added pretty printer node - textual output as a formatted table
* Nodes: Added source node for a generator function
* Nodes: added analytical type to derive field node
* Preliminary implementation of data probes (just concept, API not decided yet
  for 100%)
* CSV: added empty_as_null option to read empty strings as Null values
* Nodes can be configured with node.configure(dictionary, protected). If 
  'protected' is True, then protected attributes (specified in node info) can 
  not be set with this method.

* added node identifier to the node reference doc
* added create_logger

* added experimental retype feature (works for CSV only at the moment)
* Mongo Backend - better handling of record iteration

CHANGES

* CSV: resource is now explicitly named argument in CSV*Node
* CSV: convert fields according to field storage type (instead of all-strings)
* Removed fields getter/setter (now implementation is totally up to stream
  subclass)
* AggregateNode: rename ``aggregates`` to ``measures``, added ``measures`` as
  public node attribute
* moved errors to brewery.common
* removed ``field_name()``, now str(field) should be used
* use named blogger 'brewery' instead of the global one
* better debug-log labels for nodes (node type identifier + python object ID)

**WARNING:** Compatibility break:

* depreciate ``__node_info__`` and use plain ``node_info`` instead
* ``Stream.update()`` now takes nodes and connections as two separate arguments

FIXES

* added SQLSourceNode, added option to keep ifelds instead of dropping them in 
  FieldMap and FieldMapNode (patch by laurentvasseur @ bitbucket)
* better traceback handling on node failure (now actually the traceback is
  displayed)
* return list of field names as string representation of FieldList
* CSV: fixed output of zero numeric value in CSV (was empty string)

LINKS

* github  **sources**: https://github.com/Stiivi/brewery
* IRC channel: #databrewery on irc.freenode.net

If you have any questions, comments, requests, do not hesitate to ask.

Stefan Urbanek
data analyst and data brewmaster

Twitter: @Stiivi



Reply all
Reply to author
Forward
0 new messages