[ANN] Transducers for Go

830 views
Skip to first unread message

Sam Boyer

unread,
Dec 5, 2014, 6:59:29 PM12/5/14
to golan...@googlegroups.com
A few months ago, Clojure introduced "transducers" - a composable way to build reusable algorithmic transformations - and baked it into the foundations of their latest release. Think iteration, pipelines, generics (eek!). I thought it'd be fun to to an implementation in Go.

Took a little while to get it all together, but I've gotten a fully working proof of concept together: https://github.com/sdboyer/go-transducers . (The README there is designed to frame a conversation about this, so no point in duplicating it here). Honestly, I'm not sure how I feel about it - it's not idiomatic Go, but writing it also didn't feel the way it usually does when I stray from the idiomatic path. 

I'd love feedback!

Sam Boyer

unread,
Dec 5, 2014, 7:20:58 PM12/5/14
to golan...@googlegroups.com
though i suppose a little example eye candy won't hurt:




func main() {
// To make it work, we need four things. (definitions are in the README's glossary):
// 1) an input stream
input := Range(4) // ValueStream containing [0 1 2 3]
// 2) a stack of Transducers
transducers := []Transducer{Map(Inc), Filter(Even)} // increment then filter odds
// 3) a reducer to put at the bottom of the transducer stack
reducer := Append() // very simple reducer - just appends values into a []interface{}
// 4) a processor that puts it all together
result := Transduce(input, reducer, transducers...)

fmt.Println(result) // [2 4]


// Or, we can use the Go processor, which does the work in a separate goroutine
// and returns results through a channel.

// Make an input chan, and stream each value from Range(4) into it
in_chan := make(chan interface{}, 0)
go StreamIntoChan(Range(4), in_chan)

// Go provides its own bottom reducer (that's where it sends values out through
// the return channel). So we don't provide one - just the input channel. Note
// that we can safely reuse the transducer stack we declared earlier.
out_chan := Go(in_chan, 0, transducers...)

result2 := make([]interface{}, 0) // zero out the slice
for v := range out_chan {
result2 = append(result2, v)
}

fmt.Println(result) // [2 4]

David Anderson

unread,
Dec 5, 2014, 7:27:34 PM12/5/14
to Sam Boyer, golang-nuts
I must be missing something - isn't this a very indirect way of setting up goroutines and channels to do what you want? Doing this with transducers reads a bit like node.js asynchronous spaghetti, rather than nice blocking straight-line code. Am I missing the point of transducers?

- Dave

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jason Playne

unread,
Dec 5, 2014, 7:43:53 PM12/5/14
to David Anderson, Sam Boyer, golang-nuts
Hi Sam,

Firstly, nice work on scratching your programming itch!

While the transducer code looks simpler/cleaner - I feel that using channels for your stream processing is a little more cleaner/clearer for the intent of the program. In this case less code is not more understandable.

Sam Boyer

unread,
Dec 5, 2014, 7:47:33 PM12/5/14
to golan...@googlegroups.com, samuel....@gmail.com
yeah, it's a very different way of putting things together - much, much more functional in style. it took me a couple weeks to even wrap my head around it.

now, my own take on what makes for bad spaghetti in node/js is that you see nested series of callbacks get used for more or less everything; this is a much more narrowly defined domain. but functional languages tend to be...well, functional. so, lots of passing functions to functions. and Go's got plenty of support for that sorta thing.

lemme give a different example: this is what it looks like to define an absurd/fun transform that was published (https://gist.github.com/richhickey/b5aefa622180681e1c81) as an example in clojure early on:

xform := []Transducer{
Map(Inc),
Filter(Even),
Dedupe(),
Mapcat(Range),
Chunk(3),
ChunkBy(func(value interface{}) interface{} {
return sum(value.(ValueStream)) > 7
}),
Mapcat(Flatten),
RandomSample(1.0),
TakeNth(1),
Keep(func(v interface{}) interface{} {
if v.(int)%2 != 0 {
return v.(int) * v.(int)
} else {
return nil
}
}),
KeepIndexed(func(i int, v interface{}) interface{} {
if i%2 == 0 {
return i * v.(int)
} else {
return nil
}
}),
Replace(map[interface{}]interface{}{2: "two", 6: "six", 18: "eighteen"}),
Take(11),
TakeWhile(func(v interface{}) bool {
return v != 300
}),
Drop(1),
DropWhile(IsString),
Remove(IsString),
}

so, one advantage is that you can *see* the whole flow of the transformation. it might take a little while to understand what each step does, but you can still see the entire transform flow right at the top.

additionally, there's the advantage here of composition. i've been chatting with a friend who's thinking about building a streams-oriented web framework in Go. transducers, being composable, present the opportunity to compose together the functionality of a bunch of different logical segments - let's say, from a third-party plugin ecosystem - in a comprehensible, orthogonal, and extensible way.

but yeah...very different than the way Go usually looks.

atd...@gmail.com

unread,
Dec 5, 2014, 7:51:20 PM12/5/14
to golan...@googlegroups.com
At first glance, it looks a bit esoteric to me but is it a processing pipeline ?

In which case I would ask.. how do you handle errors ?

Just so that I can get a clearer picture of what a transducer is. Might need one for my spaceship. :) (ok I'm out ->)

Sam Boyer

unread,
Dec 5, 2014, 7:53:50 PM12/5/14
to golan...@googlegroups.com, da...@natulte.net, samuel....@gmail.com
hi Jason - thanks!

sure, i hear that. this does work with channels, though :) input can be a channel, or a slice, or...

the original goal in clojure was to provide a way of operating on collections, in whatever form they take - an array-ish thing, or a stream of values coming through a channel.

the way i tend to think about the relationship between transducers and channels is that you just attach the transform to the channel. so it goes from "messaging primitive" to "message + transform primitive-ish"

Sam Boyer

unread,
Dec 5, 2014, 7:56:44 PM12/5/14
to golan...@googlegroups.com
yep, it's a system for composing processing pipelines.

don't handle errors at all right now :) not *within* the pipeline, anyway. i thought about doing it a bit, but i was going for a pretty strict 1:1 parity with clojure as a proof of concept, so i didn't do it. i do have a sense of how we might, though, and it could fit pretty nicely with the existing implementation.

Sam Boyer

unread,
Dec 5, 2014, 8:05:16 PM12/5/14
to golan...@googlegroups.com
if it helps, i actually grokked the basic idea behind what a transducer is (a function that transforms a reducing function) by reading through this breakdown in javascript:



On Friday, December 5, 2014 7:51:20 PM UTC-5, atd...@gmail.com wrote:

Jason Playne

unread,
Dec 6, 2014, 3:04:16 AM12/6/14
to Sam Boyer, golang-nuts
After seeing the example you posted - I can definitely see the benefit of being able to see the whole code flow/transformation at once. 

My thought here though is that you can write these transformations cleaner/clearer using normal patterns and deal with errors at to boot.

My question is - What benefits do you feel this design pattern can bring (other than being another way to solve a problem) over well structured idiomatic code?



Sam Boyer

unread,
Dec 8, 2014, 1:57:48 AM12/8/14
to golan...@googlegroups.com, samuel....@gmail.com
yeah, error handling is probably something i should have done more with before announcing. but there's no reason THAT Go idiom can't be accommodated, at least within the core of the transducing system - it's just another return value on Reducer.Step() and Reducer.Complete(). i think that'll fit pretty nicely.

there's no question that for simple transformations/operations, it's simpler to just range over a structure and do the work with traditional idiomatic Go. i can imagine ways that these approaches could be equally fluid if they were syntax instead of expressions, but done in userland, they're too awkward for the simple cases.

so, understanding that this is solely in reference to larger/more complex cases - to your main question about added benefits:
  1. transducers force you to break your logic down into discrete, composable segments. i think the only serious criticism to be made of composition is that it takes deliberate, conscious effort, whereas it's much easier to splat everything together. splatting is mostly fine for simple cases...but i already said this is probably not appropriate for simple cases :)
  2. this sort of composition could be particularly useful if, as i think i mentioned before, someone's building a system where other, unknown segments of code will be plugged in later. letting things add to a transducer stack could be a good way of distributing very narrow, specific control through (pluggable/contrib/third-party) application components.
  3. i updated the example a bit at https://github.com/sdboyer/transducers-go#pudding---proof to emphasize one of the really important things about transducers: you create the transformation stack once, and it can then be reused by any processor (eager or lazy, concurrent or not, etc.). this is because transducers have strong constraints and expectations - lack of side effects, stateful only in specific ways - and strong constraints yield strong guarantees.
  4. if you need to need to write an engine powering a query language or certain kinds of DSLs, it's potentially useful to have a generative, composable transformation system onto which you can map the instructions from the higher-level form. transducers could probably fit that bill.
at a certain point, it becomes a sum of the different benefits together - it's not that you can't accomplish these tasks through conventional Go code, but...well, look at http://blog.golang.org/pipelines . creating pipelines does require a fair bit of knowledge and expertise, at least some boilerplate - and that's to create a single-purpose, marginally-reusable pipeline. if other things are acceptable about them (type un-safety, performance), transducers might be able to provide a common foundation.

would folks find it helpful if i worked up some apples-to-apples comparisons of slightly more real-world cases? i'd be happy to do a couple of those (or guide someone else interested in tinkering). suggestions for things to build analogues to would be welcome - all that's currently springing to mind is https://github.com/bmizerany/perks/tree/master/quantile :)

Jason Playne

unread,
Dec 8, 2014, 3:33:47 AM12/8/14
to Sam Boyer, golan...@googlegroups.com
I would actually be very keen for some more examples!

As for ideas, I am currently writing a MODE-S (aircraft) decoder. Each aircraft transmits a bunch of MODE-S packets while it is operation. Depending on the type/age of the MODE-S transmitter you get differing amounts of information from it. 

Each packet may be of a different type (DF-0 -> DF31)
Each package may contain a given set of information like ICAO code/Altitude/ Lat+Long part 1/ Lat+Long part 2/ etc etc etc etc

long story short - lots of differing types of information in lots of different sorts of packets.

I need to funnel these down and reconstruct which plane sent what, keep the state of any plane currently transmitting data and make a decision on where/when to send that complete plane data upstream

Do you think that these transducers would be a good fit for this data flow?

My current implementation has a fan out->fan in approach and a lot of "let's get this working" code so it is ripe for refactoring.

I welcome your thoughts!


atd...@gmail.com

unread,
Dec 8, 2014, 4:43:03 PM12/8/14
to golan...@googlegroups.com, samuel....@gmail.com
Me too, I'm curious :)

Sam Boyer

unread,
Dec 8, 2014, 5:11:38 PM12/8/14
to golan...@googlegroups.com, samuel....@gmail.com
sure, i can work something up. might take me a day, pretty busy atm, but i'll put it up when i've got it.

Jason Playne

unread,
Dec 8, 2014, 7:09:19 PM12/8/14
to Sam Boyer, golang-nuts
I am looking forward to it!

Sam Boyer

unread,
Dec 11, 2014, 4:08:57 AM12/11/14
to golan...@googlegroups.com, samuel....@gmail.com
did a little googling. there's a lot to this space, it seems :) i've got no experience dealing with this kind of raw signal processing, so i'm probably gonna make some poor guesses here - hopefully it won't undermine the utility of the example. 

if i'm correctly understanding what you described and what i've found, it seems like there are several distinct parts to tackling this. here's some basics i've figured out:
  • a MODE S packet is guaranteed to be in one of the 25 downlink formats (DFs).
  • the message packet is and is further guaranteed to be either 56 or 112 bits long. both message lengths have 8-bit preambles, though it seems that information is not useful except to the PPM signal receiver itself, so i'm gonna treat it as discarded by the time it makes it to your software.
  • the front of the message (after the preamble) is always a 5-bit value indicating the DF type.
  • at the tail end of all messages is the ICAO address, a 24-bit value issued to each transponder at the time of its registration with its country of origin's national registry. basically, this is our unique identifier. it's also overlaid with parity information...somehow (https://www.ll.mit.edu/mission/aviation/publications/publication-files/atc-reports/Gertz_1984_ATC-117_WW-15318.pdf). i don't really get how that works - like i said, i don't usually work this close to raw signals :)

i've only been able to find one thing about varying MODE-S transmitter/transponder capabilities, and it suggests that not all transponders support all DF types; there's a leveling classification for transponders based on what they support. i think that would ultimately pertain to a part of this software which involves sending a signal out to something else to request more data from the aircraft, and that's kinda out of the scope of discussion here, so i'm gonna mostly skip it.

i don't have criteria for "complete plane data," of course, but clearly it entails that it takes multiple, (presumably) different types of DF messages from an aircraft to assemble the complete data. what's not clear is whether the sending along of this complete plane data should result in a termination of data processing for that aircraft, or if it's more the sort of thing where you continuously periodically send out completed state. because it's simpler for example purposes, i designed it around the former model - a pipeline is created for an aircraft when the message comes in, and it tears down when the aircraft state is sent upstream.

with these constraints in mind, it seems like there are five steps in the application:
  1. basic validation/parity checking 
  2. aircraft identifier extraction (basically, reading the ICAO address out the last 24 bits)
  3. DF type determination and conversion from raw data into a well-typed data
  4. reconciliation of new message into existing plane state, or creation of new plane state if does not currently exist
  5. evaluation of plane state 'completeness', then send along to 'upstream' if so


not brief, but there's a lot of meat there, and it's commented every step of the way. i think this does give a reasonably good sense of what transducers can do.

if you're just scanning, make sure to catch the "HEY THIS IS WHERE IT GETS COOL" part. fwiw, when i was writing the first half (the entry transformation), it all felt like more of a hindrance than a help. the latter half (the per-aircraft transduction pipelines), seems to fit as well as could be hoped. the easy and immediate composition afforded by the transducers approach - and the way that composition aligns with what i'm decently-ish sure to be *just* where this use case requires both structure and flexibility - is pretty nifty.

Jason Playne

unread,
Dec 11, 2014, 5:28:54 AM12/11/14
to golang-nuts
Oh wow - this is gonna take some time to digest!

Thanks man!

*starts digging in*

Jason Playne

unread,
Dec 11, 2014, 6:55:03 AM12/11/14
to golang-nuts
Hi Sam,

That definitely looks like a very interesting way to tackle the problem, and to answer some of your ponderings: 

* The ICAO address is xor'd with the CRC...
* The altitude is encoded in Gilham code format and depending on format can be either 12 or 13 bits long :)
* Lat and Lon are encoded in 2 different DF17 packets (Extended Squitters)

There is quite a lot of work to do to get all the information out!

I think I might have a crack at using the transducer method in my app and see what I can come up with

Thank you again for your awesome example :)

Sam Boyer

unread,
Dec 11, 2014, 9:41:57 AM12/11/14
to golan...@googlegroups.com
sure, happy to do it. i fear it grew overlong, to the point where it maybe wasn't so helpful for others interested in understanding this, but hey, another fun little learning experience for me either way.

it did push me further in the direction of believing that there is a niche into which transducers could fit nicely with go. but still, i'd still love more feedback.

maybe i could direct folks' attention to ValueStream? https://github.com/sdboyer/transducers-go/blob/master/iter.go#L21

it's foundational to the library, but the implementation feels wasteful and awkward.

atd...@gmail.com

unread,
Dec 11, 2014, 12:27:41 PM12/11/14
to golan...@googlegroups.com
I'm impressed that you took the time to apply the concept to a completely new topic.

But that makes two things I do not understand now as far as I am concerned :D

I will be really honest : it really "looks" complicated to me.

Now, one way I would have seen transducers in a simple way is just as a matrix in which elements are functions. So basically piling slices of functions.

But you can tell me if it's more than that.

Also are there any very very simple use cases ? DSP ? Is it just some kind of formalism ? Thank you for the link btw.

Sam Boyer

unread,
Dec 11, 2014, 1:39:38 PM12/11/14
to golan...@googlegroups.com


On Thursday, December 11, 2014 12:27:41 PM UTC-5, atd...@gmail.com wrote:
I'm impressed that you took the time to apply the concept to a completely new topic.

thanks!
 

But that makes two things I do not understand now as far as I am concerned :D

I will be really honest : it really "looks" complicated to me.

me too.
 

Now, one way I would have seen transducers in a simple way is just as a matrix in which elements are functions. So basically piling slices of functions.

But you can tell me if it's more than that.

yes! that's one of the building blocks. or, very nearly that. if []Transducer were just a slice of functions, then there'd need to be some kind of loop that ranges over that slice, calling each function in turn. that loop would be responsible for connecting the slice of functions end to end. instead, Transducers connect to each other directly, basically via function composition (http://en.wikipedia.org/wiki/Function_composition).
 

Also are there any very very simple use cases ? DSP ? Is it just some kind of formalism ? Thank you for the link btw.

yeah, i'm trying to figure out a case that is a) simple b) illustrative and c) realistic/practical. i'm still coming up empty, unfortunately. maybe i'll ask some clojure folks to hop in here, see if they can't help with such a thing...though from my googling around, it does seem like most of the discussion there is still fairly abstract, too.

here's one article, though, that talks about building a twitter processing pipeline out of transducers: http://matthiasnehlsen.com/blog/2014/10/06/Building-Systems-in-Clojure-2/ .

i know nothing about DSP, unfortunately, so that probably won't be of help.

i think it's a bit more than a formalism (though my definition for that is loose). if it is, then it's probably a bunch of layers of formalisms. but the foundation is this observation about operating on collections: all the operations we perform on collections (e.g. map, filter, reduce) can be expressed as reductions. almost everything else in this whole system follows from that observation.

even if that makes sense as a general statement, it's hard to just "see". so, here's the early code i wrote when i was first wrapping my head around this:


in that early code, map is implemented four ways:
  1. literally, with DirectMap(). that function takes the Mapper predicate, and a collection, and returns the final value.
  2. directly through reduction, with MapThruReduce(). that function also takes a Mapper predicate, and a collection, and returns the final value.
  3. indirectly through reduction, with MapFunc(). that function takes a Mapper predicate, but no collection. instead, it returns a Reducer - a func that can then later be applied to a collection to perform that mapping operation. it's then the responsibility of a processor to apply that Reducer to collections.
  4. indirectly through transduction, with Map(). this takes a Mapper predicate, and no collection. it returns a func that takes a Reducer and returns a Reducer. and that's the essence of a transducer: it transforms a reducing function. and again, it's the responsibility of a processor to apply the transform to a collection.
there's a strongly logical progression into meta-ness through the four of these, but if you can latch on, you've got the core of transducers. there's also a great writeup in javascript (http://phuu.net/2014/08/31/csp-and-transducers.html) about this progression. that was my primary guide for grokking it.

maybe i should try to do something simple overtop of net/http...that would be more adjacent to peoples' experience and interest, i think?

cheers
s
Reply all
Reply to author
Forward
0 new messages