Right way to fan out workloads

David Belle-Isle

Sep 8, 2021, 12:02:34 PM
to golang-nuts

Hi,

I've been facing this question for a while and never managed to find the "right" answer. Hopefully this forum will be able to enlighten me a little bit.

Given a very simple pattern: Consume some data, transform it, store it (ETL). The storing part is slow(er) and needs to be fanned out.

The question: What's the best (correct? idiomatic?) way to implement that in Go?

A) From "main", buffer the incoming data and launch a goroutine and pass it the data to store. Similar to how you could implement a web server handling an incoming connection in Go.

OR

B) From "main", create N channels and spin up N goroutines to send down data to workers. Round-robin writes to the N channels.
B1) Do you buffer the data in "main" or after the channel, in the goroutine?

I understand that (A) can spin out of control and launch too many goroutines, and that (B) can run into a bottleneck. Each of these problems can be easily addressed. I'm more interested in hearing what you think is the "right" way to solve this problem.

Thanks

David

burak serdar

Sep 8, 2021, 12:06:32 PM
to David Belle-Isle, golang-nuts
On Wed, Sep 8, 2021 at 10:02 AM David Belle-Isle <dbell...@gmail.com> wrote:

B) From "main", create N channels and spin up N goroutines to send down data to workers. Round-robin writes to the N channels.

How about creating N goroutines with one channel? All goroutines receive from that channel. The main goroutine writes to the channel as it receives data, so any available goroutine picks it up.
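A minimal sketch of that idea, with N workers sharing one channel. The names `store` and `fanOut` are hypothetical, and `store` only stands in for the slow write step; treat this as an illustration under those assumptions rather than a finished design.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the slow "store it" step.
func store(item int) int {
	return item * 2
}

// fanOut starts n workers that all receive from one shared channel,
// so whichever worker is idle picks up the next item.
func fanOut(items []int, n int) []int {
	work := make(chan int) // unbuffered: the producer blocks until a worker is free
	var (
		mu      sync.Mutex
		results []int
		wg      sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range work { // loop ends once work is closed and drained
				r := store(item)
				mu.Lock()
				results = append(results, r)
				mu.Unlock()
			}
		}()
	}
	for _, item := range items {
		work <- item
	}
	close(work) // tell the workers no more data is coming
	wg.Wait()   // wait for in-flight items to finish
	return results
}

func main() {
	out := fanOut([]int{1, 2, 3, 4, 5}, 3)
	fmt.Println(len(out), "items stored")
}
```

Because the channel is unbuffered, "main" slows down naturally when all workers are busy, which bounds both memory and the number of goroutines. Results arrive in no particular order.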

 


Matt KØDVB

Sep 8, 2021, 12:10:03 PM
to David Belle-Isle, golang-nuts
I’d like to point you to a video I made where I work through different ways to divide up work, both the “work pool” approach and just creating lots of goroutines:

In it I explain how to manage/limit concurrency when the work is I/O bound (which is your case).
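For readers without the video at hand, one common way to limit concurrency for I/O-bound work is a buffered channel used as a counting semaphore around the "one goroutine per item" approach. This is a rough sketch and not necessarily the approach shown in the video; `storeAll`, `limit`, and the `store` callback are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// storeAll launches one goroutine per item but allows at most limit
// of them to run at the same time. It returns how many items were stored.
func storeAll(items []int, limit int, store func(int)) int64 {
	sem := make(chan struct{}, limit) // capacity = max goroutines in flight
	var wg sync.WaitGroup
	var stored int64
	for _, item := range items {
		sem <- struct{}{} // blocks here while limit goroutines are running
		wg.Add(1)
		go func(it int) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this goroutine ends
			store(it)
			atomic.AddInt64(&stored, 1)
		}(item)
	}
	wg.Wait()
	return stored
}

func main() {
	// The callback here is a placeholder for the real I/O-bound write.
	n := storeAll([]int{1, 2, 3, 4}, 2, func(int) {})
	fmt.Println(n, "items stored")
}
```

Acquiring the slot before starting the goroutine keeps the total number of goroutines bounded, not just the number doing work at any moment.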

Matt