How to drive a network of Go "generators" (e.g. looping over multiple chans)?

Samuel Lampa

Jul 17, 2015, 11:06:28 PM
to golan...@googlegroups.com
I'm experimenting with a dataflow-inspired syntax for stitching "generator processes" together. By "generator process" I mean the "generator pattern" demonstrated by Rob Pike in [1], which uses the range operator to loop over incoming channels, but encapsulated in structs with potentially multiple input and output channels stored as struct fields.

I have a working code example for this, here:
http://play.golang.org/p/e3Bun_q643 (gist: https://gist.github.com/samuell/4d9625dbc3623fed771d#file-dataflow_syntax_example-go-L16-L32 )

So, the actual connection code works excellently, that is, this part (in the main() function):

// Init processes
hs := NewHiSayer()
ss := NewStringSplitter()
lc := NewLowerCaser()
uc := NewUpperCaser()
 
// Network definition
ss.In = hs.Out
lc.In = ss.OutLeft
uc.In = ss.OutRight

The problematic part is the one just below this, which is supposed to drive the whole network of "generator processes" by consuming the most downstream "ends" of the network:

// Drive the processing
for i := 0; i < 10; i++ {
	l := <-lc.Out
	r := <-uc.Out
	println(l, r)
}

I got this to work in this static example, but in the general case I would not know that I should loop over and read from the final output channels exactly 10 times. So, ideally I would want to use something like the range operator on the channels. But I have run into two problems, for which I'm looking for solutions:

  1. The range operator does not seem to be able to drive the network in either case.
  2. I don't see how to range-loop over two channels at the same time. Is this even possible?

Regarding 1, for a concrete example, changing the for loop above into the following (thus only consuming the first component in the network):

for item := range hs.Out {
	println(item)
}

... produces no output at all (see the playground code for this: http://play.golang.org/p/hnxRLz8RHG )

So, my question is: how do I best go about this (e.g. consuming multiple channels without knowing how many items they will "yield")? ... Or is there some completely different solution to driving such a network of generators?


Many thanks in advance
Best
// Samuel

jonathan...@gmail.com

Jul 18, 2015, 12:03:29 AM
to golan...@googlegroups.com
I only see one close and more than one channel.
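
A minimal sketch of what this remark points at, assuming a splitter shaped like the StringSplitter in the question (the field names In, OutLeft and OutRight are from the post; the body of Run, the buffer size and the splitAll helper are assumptions made for illustration). Every output channel that a downstream consumer ranges over has to be closed by the process that writes to it:

```go
package main

import "fmt"

// StringSplitter has one input and two outputs, as in the question.
// The implementation below is an assumption, not the original code.
type StringSplitter struct {
	In       chan string
	OutLeft  chan string
	OutRight chan string
}

func NewStringSplitter() *StringSplitter {
	return &StringSplitter{
		OutLeft:  make(chan string, 16), // buffer size chosen arbitrarily
		OutRight: make(chan string, 16),
	}
}

// Run splits every incoming string in half and closes BOTH outputs once
// the input is exhausted, so that range loops downstream can terminate.
func (p *StringSplitter) Run() {
	defer close(p.OutLeft)
	defer close(p.OutRight)
	for s := range p.In {
		m := len(s) / 2
		p.OutLeft <- s[:m]
		p.OutRight <- s[m:]
	}
}

// splitAll is a hypothetical helper: it feeds words through a splitter
// and collects the halves pairwise.
func splitAll(words []string) (left, right []string) {
	in := make(chan string, len(words))
	for _, w := range words {
		in <- w
	}
	close(in)

	ss := NewStringSplitter()
	ss.In = in
	go ss.Run()

	// Reading in pairs keeps the two outputs in step with each other.
	for l := range ss.OutLeft {
		left = append(left, l)
		right = append(right, <-ss.OutRight)
	}
	return left, right
}

func main() {
	left, right := splitAll([]string{"abcd", "xy"})
	fmt.Println(left, right)
}
```

If only one of the two outputs were closed, a consumer ranging over the other one would block forever after the last item.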

Samuel Lampa

Jul 18, 2015, 7:49:47 AM
to golan...@googlegroups.com, jonathan...@gmail.com
On Saturday, July 18, 2015 at 6:03:29 AM UTC+2, jonathan... wrote:
I only see one close and more than one channel.

Ouch, that's very true, thanks for spotting that!! But I still don't get any output from ranging over a single channel, after fixing this:
// Samuel

Samuel Lampa

Jul 18, 2015, 8:19:47 AM
to golan...@googlegroups.com, jonathan...@gmail.com
And a new version again, fixing data races pointed out by Yann Hodique on Google+ [1]:
http://play.golang.org/p/EYmgU13847

Still the same problem with looping over a single channel:
http://play.golang.org/p/xQFUIr1MCx

Michael Jones

Jul 18, 2015, 9:50:48 AM
to Samuel Lampa, golan...@googlegroups.com, jonathan...@gmail.com
HiSayer.Init() closes its output channel immediately. That is a problem.
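
As a rough illustration (not the code from the playground link), here is a sketch of a producer that closes its output only after the last send. The HiSayer name and its Out field are from the thread; the Run(n) signature, the message text and the collect helper are assumptions. The close has to come after all sends, since sending on an already closed channel panics:

```go
package main

import "fmt"

// HiSayer is the first process in the network; only the Out field is
// taken from the thread, the rest is an assumed implementation.
type HiSayer struct {
	Out chan string
}

func NewHiSayer() *HiSayer {
	return &HiSayer{Out: make(chan string, 16)}
}

// Run sends n greetings and then closes Out. The deferred close runs
// when Run returns, that is, after the last send has completed.
func (p *HiSayer) Run(n int) {
	defer close(p.Out)
	for i := 1; i <= n; i++ {
		p.Out <- fmt.Sprintf("Hi number %d", i)
	}
}

// collect is a hypothetical helper that ranges over a channel until it
// is closed and returns everything it received.
func collect(ch <-chan string) []string {
	var items []string
	for item := range ch {
		items = append(items, item)
	}
	return items
}

func main() {
	hs := NewHiSayer()
	go hs.Run(3)
	for _, item := range collect(hs.Out) {
		fmt.Println(item)
	}
}
```

With the close in place, the range loop ends on its own once all items have been received, so the consumer does not need to know the item count in advance.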

— 
Michael Jones, CEO  •  mic...@wearality.com  •  +1 650 656-6989 
Wearality Corporation  •  289 S. San Antonio Road  •  Los Altos, CA 94022


Egon

Jul 18, 2015, 9:54:12 AM
to golan...@googlegroups.com, jonathan...@gmail.com
Use clearer naming then the problem should become obvious


PS, if you are experimenting with FBP you might be better off declaring a supporting framework for it:

+ Egon

Samuel Lampa

Jul 18, 2015, 9:56:07 AM
to golan...@googlegroups.com, jonathan...@gmail.com, samuel...@gmail.com
On Saturday, July 18, 2015 at 3:50:48 PM UTC+2, Michael Jones wrote:
HiSayer.Init() closes its output channel immediately. That is a problem.

Ah, right ... because the buffer allows it to empty its content without blocking until further reads - indeed!
(Confirmed that with BUFSIZE = 0 I could range-loop over hs.Out !)

Thanks!

// Samuel

Samuel Lampa

Jul 18, 2015, 10:16:19 AM
to golan...@googlegroups.com, jonathan...@gmail.com
On Saturday, July 18, 2015 at 3:54:12 PM UTC+2, Egon wrote:

Use clearer naming then the problem should become obvious


Ah, yes, I had forgotten the defer!!
 
PS, if you are experimenting with FBP you might be better off declaring a supporting framework for it:


Nice, both of these examples look really, really interesting! I have to go now, but I will get back and study this in more detail tonight.


Cheers
// Samuel
 

Samuel Lampa

Jul 18, 2015, 7:05:01 PM
to golan...@googlegroups.com, jonathan...@gmail.com
On Saturday, July 18, 2015 at 3:54:12 PM UTC+2, Egon wrote:
PS, if you are experimenting with FBP you might be better off declaring a supporting framework for it:

So, in short, it is exciting that a framework can be written with this little code while allowing such a nice and terse syntax!

I guess you know about GoFlow [1] though? I have been playing around a bit with that (e.g. creating a toy component or two [2]).

What I have been trying with the examples listed above though, is to see how far I can go completely without a framework, by just following a determined pattern, and relying on the in-built concurrency primitives in Go.

Your mini-framework looks really nice though! (You wrote that in under 1 hour?! :) ) It will be good learning material for me :)

One thing I wonder though, that I wasn't able to figure out from the code yet (newbie as I am), is whether the communications are happening across normal Go channels, or whether a reflection call is needed on receiving from the in-ports? (I'll study the code more, but thought it would be faster to get some pointers :) ).

(The reason I wonder is that this is, IIRC, how it works in GoFlow, and it also seems to be what causes a performance degradation compared to pure Go channels. That is a bit unfortunate, since pure channels, especially with adequately sized buffers, actually have decent performance, comparable to Unix pipe buffers according to my benchmarks. Keeping close to that "gold standard" would make it much easier for me to propose using Go's channels rather than Unix pipes for stitching together stream-processing bioinformatics components.)

Best
// Samuel

Samuel Lampa

Jul 18, 2015, 8:16:45 PM
to golan...@googlegroups.com
Ok, I think I finally found a better way to write the driver loop now as well (ugh, my Go skills have become so rusty):
// Drive the processing
for {
	l, okl := <-lc.Out
	r, okr := <-uc.Out
	if !okl && !okr {
		break
	}
	println(l, r)
}
If you know a better way still, let me know!
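
One commonly used alternative, sketched here under assumptions (the drain and produce helpers are hypothetical names, not from the thread), is a select loop that sets a channel variable to nil once that channel is closed. Receiving from a nil channel blocks forever, so select stops picking that case, and the loop ends when both channels are done. Unlike the paired reads above, this does not keep left and right items aligned; it only drains both streams regardless of their lengths:

```go
package main

import "fmt"

// drain reads from both channels until BOTH are closed. A closed channel
// is replaced by nil so that select no longer considers it.
func drain(left, right <-chan string) (ls, rs []string) {
	for left != nil || right != nil {
		select {
		case v, ok := <-left:
			if !ok {
				left = nil
				continue
			}
			ls = append(ls, v)
		case v, ok := <-right:
			if !ok {
				right = nil
				continue
			}
			rs = append(rs, v)
		}
	}
	return ls, rs
}

// produce is a hypothetical generator: it sends the given items on a
// fresh channel and closes it afterwards.
func produce(items ...string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, it := range items {
			out <- it
		}
	}()
	return out
}

func main() {
	// The two streams deliberately have different lengths.
	ls, rs := drain(produce("a", "b", "c"), produce("X"))
	fmt.Println(ls, rs)
}
```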

Samuel Lampa

Jul 18, 2015, 8:17:23 PM
to golan...@googlegroups.com
Oh, and I think I forgot to mention that I've blogged about this too:

Samuel Lampa

Jul 18, 2015, 9:55:59 PM
to golan...@googlegroups.com, jonathan...@gmail.com
On Saturday, July 18, 2015 at 3:54:12 PM UTC+2, Egon wrote:

Hey, after pondering a bit, I've come to the conclusion that I *really* like your suggestion to have the go statements inside the main loop instead! (And I like most of the naming changes as well :) )

It sure might look a bit better, but the best part, I realized, is that just leaving out the "go" statement for the last process in a chain removes the need for a final "driver loop" altogether!! It might seem like a simple thing to you veterans, but I was banging my head against this a lot before!

Thus, with the addition of a zipper and a printer (more suited as the terminating process), I can now write:

// Init processes
hisay := NewHiSayer()
split := NewStringSplitter()
lower := NewLowerCaser()
upper := NewUpperCaser()
zippr := NewZipper()
prntr := NewPrinter()
 
// Network definition *** This is where to look! ***
split.In = hisay.Out
lower.In = split.OutLeft
upper.In = split.OutRight
zippr.In1 = lower.Out
zippr.In2 = upper.Out
prntr.In = zippr.Out
 
// Set up processes for running (spawn go-routines)
go hisay.Run()
go split.Run()
go lower.Run()
go upper.Run()
go zippr.Run()
prntr.Run() // So this is what drives the chain!

Heey, this is Niiice! Thank you thank you Egon for this suggestion! :)
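
For readers without access to the playground links, here is a self-contained sketch of how such a network could look in full. The process names, port names and the wiring in main are taken from the snippet above; all Run bodies, the buffer size and the generated messages are assumptions, not the code from the blog post:

```go
package main

import (
	"fmt"
	"strings"
)

const bufSize = 16 // assumed buffer size

type HiSayer struct{ Out chan string }
type StringSplitter struct{ In, OutLeft, OutRight chan string }
type LowerCaser struct{ In, Out chan string }
type UpperCaser struct{ In, Out chan string }
type Zipper struct{ In1, In2, Out chan string }
type Printer struct{ In chan string }

func NewHiSayer() *HiSayer { return &HiSayer{Out: make(chan string, bufSize)} }
func NewStringSplitter() *StringSplitter {
	return &StringSplitter{OutLeft: make(chan string, bufSize), OutRight: make(chan string, bufSize)}
}
func NewLowerCaser() *LowerCaser { return &LowerCaser{Out: make(chan string, bufSize)} }
func NewUpperCaser() *UpperCaser { return &UpperCaser{Out: make(chan string, bufSize)} }
func NewZipper() *Zipper { return &Zipper{Out: make(chan string, bufSize)} }
func NewPrinter() *Printer { return &Printer{} }

// Every process closes its own outputs when its input is exhausted.
func (p *HiSayer) Run() {
	defer close(p.Out)
	for i := 1; i <= 3; i++ {
		p.Out <- fmt.Sprintf("Hello World %d", i)
	}
}

func (p *StringSplitter) Run() {
	defer close(p.OutLeft)
	defer close(p.OutRight)
	for s := range p.In {
		m := len(s) / 2
		p.OutLeft <- s[:m]
		p.OutRight <- s[m:]
	}
}

func (p *LowerCaser) Run() {
	defer close(p.Out)
	for s := range p.In {
		p.Out <- strings.ToLower(s)
	}
}

func (p *UpperCaser) Run() {
	defer close(p.Out)
	for s := range p.In {
		p.Out <- strings.ToUpper(s)
	}
}

// Run pairs one item from each input and stops as soon as either closes.
func (p *Zipper) Run() {
	defer close(p.Out)
	for {
		a, ok1 := <-p.In1
		b, ok2 := <-p.In2
		if !ok1 || !ok2 {
			return
		}
		p.Out <- a + b
	}
}

func (p *Printer) Run() {
	for s := range p.In {
		fmt.Println(s)
	}
}

func main() {
	hisay, split := NewHiSayer(), NewStringSplitter()
	lower, upper := NewLowerCaser(), NewUpperCaser()
	zippr, prntr := NewZipper(), NewPrinter()

	// Network definition
	split.In = hisay.Out
	lower.In = split.OutLeft
	upper.In = split.OutRight
	zippr.In1 = lower.Out
	zippr.In2 = upper.Out
	prntr.In = zippr.Out

	go hisay.Run()
	go split.Run()
	go lower.Run()
	go upper.Run()
	go zippr.Run()
	prntr.Run() // runs in the main goroutine and drives the chain
}
```

Because prntr.Run() blocks until its input is closed, main only returns after the close has propagated through the whole chain.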


Updating the blog post in a second too:
  http://bionics.it/posts/how-i-would-like-to-write-golang

Egon

Jul 19, 2015, 1:02:57 AM
to golan...@googlegroups.com, jonathan...@gmail.com


On Sunday, 19 July 2015 02:05:01 UTC+3, Samuel Lampa wrote:
On Saturday, July 18, 2015 at 3:54:12 PM UTC+2, Egon wrote:
PS, if you are experimenting with FBP you might be better off declaring a supporting framework for it:

So, in short, it is exciting that a framework can be written with this little code while allowing such a nice and terse syntax!

I guess you know about GoFlow [1] though? I have been playing around a bit with that (e.g. creating a toy component or two [2]).

Few years ago, yes, I looked at it.
 

What I have been trying with the examples listed above though, is to see how far I can go completely without a framework, by just following a determined pattern, and relying on the in-built concurrency primitives in Go.

Your mini-framework looks really nice though! (You wrote that in under 1 hour?!

Yup :)

:) ) It will be good learning material for me :)

One thing I wonder though, that I wasn't able to figure out from the code yet (newbie as I am), is whether the communications are happening across normal Go channels, or whether a reflection call is needed on receiving from the in-ports? (I'll study the code more, but thought it would be faster to get some pointers :) ).


Currently it relies heavily on reflection in the driving loop:

For performance, instead of wiring things together at runtime you can generate the code. I.e. generate the same code you have in the "frameworkless" approach, it shouldn't be too difficult.

You can avoid the channels by writing the nodes as a function/method:

func Split(v string) (left, right string) {
    m := len(v)/2
    return v[:m], v[m:]
}

Then use some declaration, such as:
Node{Func: Split, Ports: {"In": 1, "Left": -1, "Right": -2}}

Here you have to use numbers to pick out the values, since reflect doesn't provide a way to get the names of the in and out params. Once you have declared such functions, you can wire them together. But this will still have the func call overhead, so it might make sense to work on batches instead of single strings, i.e. have a node that converts 100 strings and outputs []string, so that all the nodes work on a batch of strings instead of a single string.
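
To make the numbered-ports idea a bit more concrete, here is one possible sketch. The Node type, its Ports field as a map[string]int and the Call method are all assumptions made for illustration (the declaration above does not spell out the types); only Split is taken from the post. Positive numbers are read as 1-based input positions, negative ones as 1-based output positions:

```go
package main

import (
	"fmt"
	"reflect"
)

func Split(v string) (left, right string) {
	m := len(v) / 2
	return v[:m], v[m:]
}

// Node is a hypothetical wrapper around a plain function. Ports maps a
// port name to a position: n > 0 is the n-th parameter, n < 0 is the
// (-n)-th return value.
type Node struct {
	Func  interface{}
	Ports map[string]int
}

// Call invokes the wrapped function through reflection, taking inputs by
// port name and returning outputs by port name. It panics if an input
// port has no value or a value of the wrong type.
func (n Node) Call(inputs map[string]interface{}) map[string]interface{} {
	fn := reflect.ValueOf(n.Func)
	args := make([]reflect.Value, fn.Type().NumIn())
	for name, pos := range n.Ports {
		if pos > 0 {
			args[pos-1] = reflect.ValueOf(inputs[name])
		}
	}
	results := fn.Call(args)
	outputs := make(map[string]interface{})
	for name, pos := range n.Ports {
		if pos < 0 {
			outputs[name] = results[-pos-1].Interface()
		}
	}
	return outputs
}

func main() {
	node := Node{Func: Split, Ports: map[string]int{"In": 1, "Left": -1, "Right": -2}}
	out := node.Call(map[string]interface{}{"In": "abcdef"})
	fmt.Println(out["Left"], out["Right"])
}
```

Every Call here goes through reflect, which is the per-item overhead that batching is meant to amortize.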

But if performance is a real consideration, I would combine channels at the top level (without a framework) and try to create heavy loops inside the nodes. Each node slows things down.
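
A small sketch of the batching idea, with hypothetical helper names (batch and upperBatches are not from the thread): items are grouped into slices so that each channel operation carries many strings, and the per-item work happens in a tight loop inside the node:

```go
package main

import (
	"fmt"
	"strings"
)

// batch groups incoming strings into slices of at most size elements.
// size is assumed to be positive.
func batch(in <-chan string, size int) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		buf := make([]string, 0, size)
		for s := range in {
			buf = append(buf, s)
			if len(buf) == size {
				out <- buf
				buf = make([]string, 0, size)
			}
		}
		if len(buf) > 0 {
			out <- buf // flush the final, possibly shorter batch
		}
	}()
	return out
}

// upperBatches is a node with a "heavy loop": one channel receive and
// one channel send per batch rather than per string.
func upperBatches(in <-chan []string) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		for b := range in {
			res := make([]string, len(b))
			for i, s := range b {
				res[i] = strings.ToUpper(s)
			}
			out <- res
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	go func() {
		defer close(in)
		for _, s := range []string{"a", "b", "c", "d", "e"} {
			in <- s
		}
	}()
	for b := range upperBatches(batch(in, 2)) {
		fmt.Println(b)
	}
}
```

Whether this pays off depends on how expensive the per-item work is compared to a channel operation, so it is worth benchmarking with realistic batch sizes.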

I'm not sure why particularly you are using FBP and what problems it solves for you, I find the "regular Go code" nicer to work with.

+ Egon

Samuel Lampa

Jul 21, 2015, 8:41:45 AM
to golan...@googlegroups.com, jonathan...@gmail.com
On Sunday, July 19, 2015 at 7:02:57 AM UTC+2, Egon wrote:
For performance, instead of wiring things together at runtime you can generate the code. I.e. generate the same code you have in the "frameworkless" approach, it shouldn't be too difficult.

That is a very interesting approach, that I'll ponder!
 
I'm not sure why particularly you are using FBP and what problems it solves for you, I find the "regular Go code" nicer to work with.

It is mostly that we aim to create a repository of packaged "workflow components", for common tasks in our specific problem area (machine learning in pharmaceutical bioinformatics), that we can easily stitch together into new workflows.

Having them defined as structs with the channels as fields, seems to give both a lot of flexibility in how to wire things (even changing dynamically), with a readable and declarative syntax.

The pre-defined struct fields also work well with auto-completion, so that we don't need to look up the implementation in order to do the wiring (well, function signatures do too, but we feel the struct field approach makes for even more readable and easier-to-work-with code).

It is still a bit of an experiment though, so we will evaluate the approach after we've tried it a bit, and see whether we like it or not.

Again, thanks a lot for all the great feedback and hints! I have really learned a lot from it.

Best
// Samuel