Meeting notes: 2009-12-14 -- Binary, Serialize, Linux kernel modules, args, random, Motivik

Igal Koshevoy

Dec 15, 2009, 2:22:48 PM
to pdxfunc
We had a great meeting with many excellent talks and a turnout of nearly 30 people. Thanks to all who participated!


FORMATS

Most of the night's talks were 10-20 minutes long and this worked very well. While we're still interested in long talks, these short talks are nice because they offer lots of content on many topics and give a chance for more presenters to participate.


VENUES

We've had some logistics issues again that came up at the last minute, including nearly losing access to the venue (which Reid Beels saved us from), not having a projector (which Bart Massey saved us from), and apparently running out of chairs. Given these issues, I intend to explore alternative venues and would be grateful if you could send me personal email with leads on organizations that'd be interested in hosting user group meetings for us and possibly others. So far I've had offers from Galois and Reductive Labs for the long term and PSU for the short term.


PRESENTATIONS

We had the following presentations:
  • "Data.Binary" by Don Stewart
  • "cereal: Data.Serialize" by Trevor Elliot
  • "Linux Kernel Modules in Haskell" by Thomas DuBuisson
  • "Command line argument parsing" by Bart Massey
  • "Control.Monad.Random" by Julian Blake Kongslie
  • "System.Random.Mersenne" by Don Stewart
  • "Mersenne.Random.Pure64" by Don Stewart
  • "Motivik, music signal processing" by Jeremy Voorhis

My detailed notes on the presentations -- please post any additions or corrections:

1. "Data.Binary" by Don Stewart
  • Don presented the Data.Binary serialization system that he co-wrote and maintains at http://hackage.haskell.org/package/binary
  • Wanted a fast mechanism to convert Haskell data to bytes and back, such as for marshalling data across the network, to disk, between distributed systems, etc.
  • Had to choose which features to support and accept their associated design constraints, e.g., lazy, fast, flexible, etc.
  • "The Bits Between The Lambdas" by Wallace and Runciman describes a Haskell API for treating storage media as arbitrary-length lazy streams. Full text available for ACM members at: http://portal.acm.org/citation.cfm?id=286872
  • The Wallace and Runciman paper provided the basis of Haskell's NewBinary package that's been used for many years. However, it had issues: it was strict, slow (100x slower than the Binary package), complicated (used different disk and other bindings), and impure.
  • The Binary Strike Team met in Oxford in 2007 to create a new system:
    • Simple interface.
    • Fast, it's capable of 1G/second streaming.
    • Lazy, can encode/decode in constant space.
    • Functions to put/get atomic types
    • Compositional
    • Widely-used, is one of the top 5 most-used Haskell libraries
  • Complaints made about this Binary library:
    • Lazy operations result in asynchronous exceptions, which can be hard to deal with.
    • Some people confuse the Put/Get monads with the Binary class, since these operate at different levels.
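
To make the Put/Get versus Binary distinction a bit more concrete, here is a minimal sketch that was not part of Don's talk. The Shape type and its tag bytes are made up for illustration; the instance uses the low-level putWord8/getWord8 primitives, while encode/decode work at the Binary class level.

```haskell
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord8)
import Data.Binary.Put (putWord8)
import qualified Data.ByteString.Lazy as BL

-- Hypothetical example type; not from the talk.
data Shape
  = Circle Double
  | Rect Double Double
  deriving (Eq, Show)

instance Binary Shape where
  -- Write a one-byte tag, then the fields, reusing the existing
  -- Binary instance for Double (this is the compositional part).
  put (Circle r) = putWord8 0 >> put r
  put (Rect w h) = putWord8 1 >> put w >> put h

  -- Read the tag back and pick the matching constructor.
  get = do
    tag <- getWord8
    case tag of
      0 -> Circle <$> get
      1 -> Rect <$> get <*> get
      _ -> fail ("unknown Shape tag: " ++ show tag)

main :: IO ()
main = do
  let shapes = [Circle 1.5, Rect 2 3]
      bytes = encode shapes -- a lazy ByteString
  print (BL.length bytes)
  print (decode bytes == shapes) -- round trip through bytes
```

Note that decode has no way to report a parse failure in its return type, which is one face of the exception complaint above.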

2. "cereal: Data.Serialize" by Trevor Elliot
  • Trevor presented the Data.Serialize library ("cereal" package on hackage) that he maintains at http://hackage.haskell.org/package/cereal
  • The Binary library's lazy nature results in asynchronous exceptions that are difficult to deal with. Instead, the Serialize library adds a Get monad that's an Exception and State monad to provide continuations for success and failure, which is clearer.
  • The Binary library's lazy nature results in overhead maintaining the stream's state and doing housekeeping on the constant space buffers. Instead, the Serialize library processes a strict byte stream, which makes it faster and simpler.
  • The Serialize library preserves compatibility with many of Binary's functions.
  • Added "label" function to annotate data to make it possible to track parsing errors.
  • Added "isolate" function to read a specific number of bytes from a strict bytestream; if there's an issue, it returns a failure and a label to identify the problem.
  • Added useful combinators, such as "getListOf", "getTwoOf", etc.
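
As a rough illustration of "label", "isolate" and "getTwoOf" (my own sketch, not code from Trevor's talk), the following parses a made-up record format from a strict ByteString. The record layout and the label names are assumptions chosen for the example.

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get
  (Get, getTwoOf, getWord16be, getWord8, isolate, label, runGet)
import Data.Word (Word16, Word8)

-- Hypothetical record format, made up for illustration:
-- a 2-byte big-endian id followed by exactly 2 bytes of payload.
getRecord :: Get (Word16, (Word8, Word8))
getRecord = label "record" $ do
  ident <- label "id" getWord16be
  -- isolate 2 restricts the inner parser to a 2-byte block; the
  -- inner parser must consume all of it, otherwise parsing fails.
  body <- label "payload" (isolate 2 (getTwoOf getWord8 getWord8))
  return (ident, body)

main :: IO ()
main = do
  -- Well-formed input: id 42, payload bytes 1 and 2.
  print (runGet getRecord (B.pack [0x00, 0x2A, 0x01, 0x02]))
  -- Truncated input: runGet returns a Left value carrying an error
  -- message rather than throwing an exception.
  print (runGet getRecord (B.pack [0x00, 0x2A, 0x01]))
```

Because runGet returns an Either, the failure case is an ordinary value that the caller has to handle, which is the contrast with Binary described above.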

3. "Linux Kernel Modules in Haskell" by Thomas DuBuisson

4. "Command line argument parsing" by Bart Massey
  • Bart presented a number of libraries for parsing command line arguments in Haskell.
  • System.Console.GetOpt
    • It's a Haskell port of the C "getopt" library.
    • Bart felt it was very awkward and ugly.
  • System.SimpleArgs
  • parseargs
    • Details at http://hackage.haskell.org/package/parseargs
    • Bart wrote it and seems to like it. :)
    • The Haskell code that one writes to describe the options seemed much clearer and more descriptive, although more verbose, than the alternatives.
    • Uses functions like `getArgInt` that can fail if they can't parse their arguments, which can result in very confusing errors.
  • System.Console.CmdArgs (cmdargs)
    • Details at http://hackage.haskell.org/package/cmdargs
    • Recently published library with potential.
    • Uses combinators to describe how to parse the arguments and options. While this approach is powerful, it's also rather awkward to write and difficult to read.
  • What's the next parser?
    • Bart discussed an unwritten parser that takes a different approach with potential to improve things: the user writes XML in a standardized documentation format and a parser is generated from it, so the parser and documentation stay in sync.
    • Igal showed a somewhat similar approach in Perl's Getopt::Declare parser. You write a documentation string that follows some conventions and instantiate a parser with it to describe the rules. Although its documentation string format isn't as structured as Bart's XML, it's very clear and easier for most humans to reason about. Details: http://www.perlmonks.org/?node_id=56048
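
For readers who haven't used it, here is a small sketch of the System.Console.GetOpt style mentioned above, so the comparison has something concrete behind it. This is my own example, not code Bart showed; the Flag type, the option names and the usage string are made up.

```haskell
import System.Console.GetOpt
  ( ArgDescr (NoArg, ReqArg)
  , ArgOrder (Permute)
  , OptDescr (Option)
  , getOpt
  , usageInfo
  )

-- Hypothetical flags for a made-up tool.
data Flag
  = Verbose
  | Output FilePath
  deriving (Eq, Show)

-- Each option lists its short names, long names, argument
-- handling, and help text.
options :: [OptDescr Flag]
options =
  [ Option "v" ["verbose"] (NoArg Verbose) "print extra detail"
  , Option "o" ["output"] (ReqArg Output "FILE") "write output to FILE"
  ]

-- Returns either an error/usage message or the parsed flags plus
-- the remaining non-option arguments.
parseArgs :: [String] -> Either String ([Flag], [String])
parseArgs argv =
  case getOpt Permute options argv of
    (flags, rest, []) -> Right (flags, rest)
    (_, _, errs) ->
      Left (concat errs ++ usageInfo "Usage: tool [OPTION...] FILE..." options)

main :: IO ()
main = do
  print (parseArgs ["-v", "--output=out.txt", "in.txt"])
  print (parseArgs ["--bogus"])
```

A real program would pass the result of System.Environment.getArgs instead of a literal list. The help text lives next to each option here, but nothing forces the longer documentation to stay in sync, which is the gap the documentation-driven approaches above try to close.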

5. "Control.Monad.Random" by Julian Blake Kongslie

6. "System.Random.Mersenne" by Don Stewart
  • Don presented a fast library for generating high quality pseudorandom numbers. It's 100x faster than the older approach. Don maintains the package at http://hackage.haskell.org/package/mersenne-random
  • The implementation uses very fast SIMD code written in C that exposes a single global state, which Haskell accesses as a single generator through the FFI.
  • It's annoying that you can only have one generator, but see Mersenne.Random.Pure64 below for an alternative.

7. "Mersenne.Random.Pure64" by Don Stewart
  • Don presented another fast library for generating high quality pseudorandom numbers. Don maintains the package at http://hackage.haskell.org/package/mersenne-random-pure64
  • This implementation uses fast underlying SIMD code like "System.Random.Mersenne", but uses it differently. Rather than having a single global generator accessed via FFI, the Pure64 library lets you have as many Haskell generators as you want. It allows this in a clever way by having Haskell allocate memory, giving the memory address to C to fill with random data, and returning the results.
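
A minimal sketch of what "as many generators as you want" looks like in practice, using the System.Random.Mersenne.Pure64 module from the mersenne-random-pure64 package. This is my own example rather than something from Don's talk; the seeds and the takeInts helper are made up.

```haskell
import Data.List (unfoldr)
import System.Random.Mersenne.Pure64 (PureMT, pureMT, randomInt)

-- Hypothetical helper: take n Ints from a generator, threading the
-- updated generator state through each step.
takeInts :: Int -> PureMT -> [Int]
takeInts n = take n . unfoldr (Just . randomInt)

main :: IO ()
main = do
  -- Several independent generators; the seeds are arbitrary.
  let g1 = pureMT 42
      g2 = pureMT 42
      g3 = pureMT 7
  -- Generators are plain values, so the same seed reproduces the
  -- same sequence.
  print (takeInts 3 g1 == takeInts 3 g2)
  print (takeInts 3 g3)
```

Since each generator is an ordinary immutable value, it can be stored, copied, or handed to different parts of a program without any shared global state.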

8. "Motivik, music signal processing" by Jeremy Voorhis
  • Jeremy presented his project for doing signal processing, which is available at http://github.com/jvoorhis/Motivik
  • "Motivik is a Ruby domain specific language for computer music. [..] Motivik is a functional, compiled DSL, inspired by Elliott et. al. Its design is comparable to Pan's but specialized for audio signals. Motivik also differs from Pan by producing JIT-compiled code via LLVM rather than going via C." Link to Conal Elliott's functional-reactive PAN: http://conal.net/papers/bridges2001/
  • The Ruby DSL provides an elegant way to interact with and manipulate sound data, treating it like mathematical formulas.
  • The library uses LLVM to generate fast, compiled code, including replacements for many Ruby features like "sin".
  • Jeremy demonstrated the library by generating and altering various sounds.
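
To give a feel for "treating sound like mathematical formulas", here is a tiny sketch in Haskell of the general idea of signals as functions of time. It is emphatically not Motivik's API (Motivik is a Ruby DSL that compiles via LLVM); every name below is made up for illustration.

```haskell
-- A signal is modeled as a function from time (seconds) to amplitude.
type Signal = Double -> Double

-- A sine wave at the given frequency in Hz.
sine :: Double -> Signal
sine freq t = sin (2 * pi * freq * t)

-- Scale a signal's amplitude by a constant factor.
gain :: Double -> Signal -> Signal
gain k s t = k * s t

-- Add two signals pointwise.
mix :: Signal -> Signal -> Signal
mix a b t = a t + b t

-- Sample a signal at a fixed rate (samples per second) for n samples.
render :: Double -> Int -> Signal -> [Double]
render rate n s = [s (fromIntegral i / rate) | i <- [0 .. n - 1]]

main :: IO ()
main = do
  -- Two sine tones mixed together and attenuated by half.
  let chord = gain 0.5 (mix (sine 440) (sine 660))
  mapM_ print (render 44100 5 chord)
```

This version evaluates the formula sample by sample, whereas Motivik, per the quote above, compiles its signal expressions to native code.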

-igal