Meeting notes: 2009-12-14 -- Binary, Serialize, Linux kernel modules, args, random, Motivik

Igal Koshevoy

Dec 15, 2009, 2:22:48 PM

to pdxfunc

We had a great meeting with many excellent talks and a turnout of nearly 30 people. Thanks to all who participated!

FORMATS

Most of the night's talks were 10-20 minutes long and this worked very well. While we're still interested in long talks, these short talks are nice because they offer lots of content on many topics and give a chance for more presenters to participate.

VENUES

We again had some last-minute logistics issues, including nearly losing access to the venue (which Reid Beels saved us from), not having a projector (which Bart Massey saved us from), and apparently running out of chairs. Given these issues, I intend to explore alternative venues and would be grateful if you could send me personal email with leads on organizations that'd be interested in hosting user group meetings for us and possibly others -- I've had offers from Galois and Reductive Labs for the long term and PSU for the short term.

PRESENTATIONS

We had the following presentations:

"Data.Binary" by Don Stewart
"cereal: Data.Serialize" by Trevor Elliott
"Linux Kernel Modules in Haskell" by Thomas DuBuisson
"Command line argument parsing" by Bart Massey
"Control.Monad.Random" by Julian Blake Kongslie
"System.Random.Mersenne" by Don Stewart
"Mersenne.Random.Pure64" by Don Stewart
"Motivik, music signal processing" by Jeremy Voorhis

My detailed notes on the presentations -- please post any additions or corrections:

1. "Data.Binary" by Don Stewart

Don presented the Data.Binary serialization system that he co-wrote and maintains at http://hackage.haskell.org/package/binary
Wanted a fast mechanism to convert Haskell data to bytes and back, such as for marshalling data across the network, to disk, distributed systems, etc.
Had to choose which features to support and accept their associated design constraints, e.g., lazy, fast, flexible, etc.
"The Bits Between The Lambdas" by Wallace and Runciman describes a Haskell API for treating storage media as arbitrary-length lazy streams. Full text available for ACM members at: http://portal.acm.org/citation.cfm?id=286872
The Wallace and Runciman paper provided the basis of Haskell's NewBinary package that's been used for many years. However, it had issues: it was strict, slow (100x slower than the Binary package), complicated (used different disk and other bindings), and impure.
The Binary Strike Team met in Oxford in 2007 to create a new system:

Simple interface.
Fast, it's capable of 1G/second streaming.
Lazy, can encode/decode in constant space.
Functions to put/get atomic types
Compositional
Widely-used, is one of the top 5 most-used Haskell libraries

Complaints made about the Binary library:

Lazy operations result in asynchronous exceptions, which can be hard to deal with.
Some people confuse the Put/Get and Binary class, since these operate at different levels.
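
As a rough illustration of the put/get functions and the Binary class mentioned above, here is a minimal sketch. The `Packet` type and the `roundTrip` helper are hypothetical names made up for this example; `put`, `get`, `encode` and `decode` are from Data.Binary.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Word (Word16)

-- Hypothetical record, used only for illustration.
data Packet = Packet { packetId :: Word16, payload :: String }
  deriving (Eq, Show)

-- The instance is compositional: it reuses the instances of the fields.
instance Binary Packet where
  put (Packet i p) = put i >> put p
  get = Packet <$> get <*> get

-- Serialize to a lazy ByteString and read the value back.
roundTrip :: Packet -> Packet
roundTrip = decode . encode
```

The Binary class covers whole values via `encode`/`decode`, while `put`/`get` describe the byte layout one field at a time, which may be where the confusion between the two levels comes from.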

2. "cereal: Data.Serialize" by Trevor Elliott

Trevor presented the Data.Serialize library ("cereal" package on hackage) that he maintains at http://hackage.haskell.org/package/cereal
The Binary library's lazy nature results in asynchronous exceptions that are difficult to deal with. Instead, the Serialize library adds a Get monad that's an Exception and State monad to provide continuations for success and failure, which is clearer.
The Binary library's lazy nature results in overhead maintaining the stream's state and doing housekeeping on the constant space buffers. Instead, the Serialize library processes a strict byte stream, which makes it faster and simpler.
The Serialize library preserves compatibility with many of Binary's functions.
Added "label" function to annotate data to make it possible to track parsing errors.
Added "isolate" function that reads a fixed number of bytes from a strict bytestring and, if there's an issue, returns a failure and label to identify the problem. Also adds useful combinators, such as "getListOf", "getTwoOf", etc.
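
To make the strict, failure-as-a-value idea concrete, here is a toy sketch that needs only the bytestring package. It is not cereal's implementation or API: the names `Get`, `runGet`, `getWord8`, `label` and `isolate` are borrowed for readability and re-defined here in simplified form, and `header` is a hypothetical parser.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Toy strict parser: the state is the remaining input and a failure is a
-- plain Left value. An illustration only, NOT cereal's real Get monad.
newtype Get a = Get { unGet :: B.ByteString -> Either String (a, B.ByteString) }

instance Functor Get where
  fmap f (Get g) = Get $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> Right (f a, s')

instance Applicative Get where
  pure a = Get $ \s -> Right (a, s)
  Get f <*> Get g = Get $ \s -> case f s of
    Left e        -> Left e
    Right (h, s') -> case g s' of
      Left e         -> Left e
      Right (a, s'') -> Right (h a, s'')

instance Monad Get where
  Get g >>= k = Get $ \s -> case g s of
    Left e        -> Left e
    Right (a, s') -> unGet (k a) s'

runGet :: Get a -> B.ByteString -> Either String a
runGet g = fmap fst . unGet g

getWord8 :: Get Word8
getWord8 = Get $ \s -> case B.uncons s of
  Nothing      -> Left "too few bytes"
  Just (w, s') -> Right (w, s')

-- Prefix any failure raised inside the action with a label.
label :: String -> Get a -> Get a
label l (Get g) = Get $ \s -> case g s of
  Left e -> Left (l ++ ": " ++ e)
  r      -> r

-- Run the action on exactly n bytes; it must consume all of them.
isolate :: Int -> Get a -> Get a
isolate n (Get g) = Get $ \s ->
  let (chunk, rest) = B.splitAt n s
  in if B.length chunk < n
       then Left "isolate: too few bytes"
       else case g chunk of
              Left e -> Left e
              Right (a, leftover)
                | B.null leftover -> Right (a, rest)
                | otherwise       -> Left "isolate: not all bytes consumed"

-- Hypothetical two-byte header parser.
header :: Get (Word8, Word8)
header = label "header" (isolate 2 ((,) <$> getWord8 <*> getWord8))
```

With this sketch, `runGet header` on a one-byte input yields a `Left` whose message starts with the "header" label, so the caller handles the failure as an ordinary value instead of catching an exception.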

3. "Linux Kernel Modules in Haskell" by Thomas DuBuisson

Thomas' talk featured very detailed, standalone slides that you can get from http://www.galois.com/~lerkok/techSeminarSlides/KernelModulesInHaskell.pdf
Also see his project site at http://haskell.org/haskellwiki/Kernel_Modules

4. "Command line argument parsing" by Bart Massey

Bart presented a number of libraries for parsing command line arguments in Haskell.
System.Console.GetOpt

Details at http://www.haskell.org/haddock/libraries/System.Console.GetOpt.html

It's a Haskell port of the C "getopt" library.
Bart felt it was very awkward and ugly.
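
For readers who haven't used it, a minimal sketch of the GetOpt style follows. The `Flag` type, the option names and the `parseArgs` helper are hypothetical; `Option`, `NoArg`, `ReqArg`, `getOpt`, `Permute` and `usageInfo` are from System.Console.GetOpt.

```haskell
import System.Console.GetOpt

-- Hypothetical flags for an imaginary program.
data Flag = Verbose | Output FilePath
  deriving (Eq, Show)

options :: [OptDescr Flag]
options =
  [ Option ['v'] ["verbose"] (NoArg Verbose)        "chatty output"
  , Option ['o'] ["output"]  (ReqArg Output "FILE") "write output to FILE"
  ]

-- Returns the parsed flags and the remaining non-option arguments,
-- or the error messages followed by a usage summary.
parseArgs :: [String] -> Either String ([Flag], [String])
parseArgs argv = case getOpt Permute options argv of
  (flags, rest, []) -> Right (flags, rest)
  (_, _, errs)      -> Left (concat errs ++ usageInfo "Usage: prog [OPTION...] files" options)
```

Every option lands in one sum type that the program then has to pick apart by hand, which is plausibly part of the awkwardness Bart described.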

System.SimpleArgs

Details at http://hackage.haskell.org/package/simpleargs
True to its name, it's very simple and lacks many features, but it may be good for simple tasks.
Julian described it as terrifying.

parseargs

Details at http://hackage.haskell.org/package/parseargs
Bart wrote it and seems to like it. :)
The Haskell code that one writes to describe the options seemed much clearer and more descriptive, although more verbose, than the alternatives.
Uses functions like `getArgInt` that can fail if they can't parse their argument, which can result in very confusing errors.

Console.cmdargs

Details at http://hackage.haskell.org/package/cmdargs
Recently published library with potential.
Uses combinators to describe how to parse the arguments and options. While this approach is powerful, it's also rather awkward to write and difficult to read.

What's the next parser?

Bart discussed an unwritten parser that takes a different approach with the potential to improve things: the user writes XML in a standardized documentation format and a parser is generated from it, so the parser and documentation stay in sync.
Igal showed a somewhat similar approach in Perl's Getopt::Declare parser. You write a documentation string that follows some conventions and instantiate a parser with it to describe the rules. Although its documentation string format isn't as structured as Bart's XML, it's very clear and easier for most humans to reason about. Details: http://www.perlmonks.org/?node_id=56048

5. "Control.Monad.Random" by Julian Blake Kongslie

Julian presented a library that makes it easier to generate random data in Haskell. Example at http://hackage.haskell.org/packages/archive/MonadRandom/0.1.4/doc/html/Control-Monad-Random.html#1
It lets you carry around a generator that you can easily get values from.
Offers a useful but badly-named `fromList` function that assigns statistical weights to the values it can return.
However, it's inefficient if you need to pull many values out of it.
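
The weighting idea can be sketched without the library. This is only an illustration, not MonadRandom's code: `pickWeighted` is a hypothetical helper, and it assumes the caller has already drawn a point uniformly from the range [0, total weight).

```haskell
-- Hypothetical helper: choose a value from weighted alternatives, given a
-- point that was drawn uniformly from [0, sum of all weights).
pickWeighted :: [(a, Rational)] -> Rational -> Maybe a
pickWeighted [] _ = Nothing            -- point was outside the total weight
pickWeighted ((x, w) : rest) p
  | p < w     = Just x                 -- point falls inside this slice
  | otherwise = pickWeighted rest (p - w)
```

For example, with the weights `[("a", 1), ("b", 3)]`, points below 1 select "a" and points from 1 up to (but not including) 4 select "b", so "b" comes up three times as often.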

6. "System.Random.Mersenne" by Don Stewart

Don presented a fast library for generating high quality pseudorandom numbers; it's 100x faster than the older approach. Don maintains the package at http://hackage.haskell.org/package/mersenne-random
The implementation uses very fast SIMD code written in C that exposes a single global state, which Haskell accesses as a single generator through the FFI.
It's annoying that you can only have one generator, but see Mersenne.Random.Pure64 below for an alternative.

7. "Mersenne.Random.Pure64" by Don Stewart

Don presented another fast library for generating high quality pseudorandom numbers. Don maintains the package at http://hackage.haskell.org/package/mersenne-random-pure64
This implementation uses fast underlying SIMD code like "System.Random.Mersenne", but uses it differently. Rather than having a single global generator accessed via FFI, the Pure64 library lets you have as many Haskell generators as you want. It allows this in a clever way by having Haskell allocate memory, giving the memory address to C to fill with random data, and returning the results.

8. "Motivik, music signal processing" by Jeremy Voorhis

Jeremy presented his project for doing signal processing, which is available at http://github.com/jvoorhis/Motivik
"Motivik is a Ruby domain specific language for computer music. [..] Motivik is a functional, compiled DSL, inspired by Elliott et. al. Its design is comparable to Pan's but specialized for audio signals. Motivik also differs from Pan by producing JIT-compiled code via LLVM rather than going via C."
Link to Conal Elliott's functional-reactive Pan: http://conal.net/papers/bridges2001/
The Ruby DSL provides an elegant way to interact with and manipulate sound data, treating it like mathematical formulas.
The library uses LLVM to generate fast, compiled code, including replacements for many Ruby features, like "sin".
Jeremy demonstrated the library by generating and altering various sounds.