Best batch processing example in Go


Peng Yu

Jan 23, 2018, 9:03:25 AM
to golang-nuts
Hi,

I'd like to implement batch processing in Go. There should be one goroutine for reading the input and several worker goroutines for processing the input in chunks. The workers may not finish processing the chunks in the same order as they appear in the input, but the results should be output in the same order as the chunks in the input.

I see a number of resources that are relevant to this goal, but I am not sure which solution is best in terms of how simple the code is and how little performance overhead it adds. Does anyone have advice on the best solution to this problem? Thanks.

P.S., here is one webpage, but it does not allow multiple workers.

--
Regards,
Peng

matthe...@gmail.com

Jan 23, 2018, 7:13:53 PM
to golang-nuts
I don’t have an example but a concurrent solution seems straightforward to reason about. Here’s a description of where I’d start:

Perhaps define a type for the single producer goroutine to send on a buffered chan?

type Chunk struct {
    Index int
    Data  []byte
}

Then start any number of worker goroutines (maybe a count equal to https://golang.org/pkg/runtime/#NumCPU) that read from the buffered chan Chunk and do the processing. Then have a single consumer goroutine that reads processed Chunks from these workers via an unbuffered channel. This consumer puts the output back in order by Index and returns the data once complete. Alternatively, the workers could write their output into a slice or array where each index is written by only one goroutine.
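The pipeline described above might be sketched roughly as follows. This is only one possible reading of it, not tested against a real workload. The helper name processInOrder, the process callback, and the pending map used for reordering are assumptions made for illustration, and the calling goroutine plays the role of the consumer instead of a separate goroutine.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
)

// Chunk pairs a piece of input with its position in the input stream.
type Chunk struct {
	Index int
	Data  []byte
}

// processInOrder is a hypothetical helper: it fans the inputs out to
// `workers` goroutines and returns the processed data in input order.
func processInOrder(inputs [][]byte, workers int, process func([]byte) []byte) [][]byte {
	in := make(chan Chunk, workers) // buffered: producer -> workers
	out := make(chan Chunk)         // unbuffered: workers -> consumer

	// Producer: tag each chunk with its index, then close the channel.
	go func() {
		defer close(in)
		for i, data := range inputs {
			in <- Chunk{Index: i, Data: data}
		}
	}()

	// Workers: process chunks in whatever order they are received.
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range in {
				out <- Chunk{Index: c.Index, Data: process(c.Data)}
			}
		}()
	}

	// Close out once every worker has finished, so the loop below ends.
	go func() {
		wg.Wait()
		close(out)
	}()

	// Consumer: park early arrivals in a map and emit each result as
	// soon as the next expected index is available.
	results := make([][]byte, 0, len(inputs))
	pending := make(map[int][]byte)
	next := 0
	for c := range out {
		pending[c.Index] = c.Data
		for {
			data, ok := pending[next]
			if !ok {
				break
			}
			results = append(results, data)
			delete(pending, next)
			next++
		}
	}
	return results
}

func main() {
	inputs := [][]byte{[]byte("alpha"), []byte("beta"), []byte("gamma"), []byte("delta")}
	for _, r := range processInOrder(inputs, runtime.NumCPU(), bytes.ToUpper) {
		fmt.Println(string(r))
	}
}
```

One thing to watch with this design: if a single chunk is much slower than the rest, the pending map can grow while the consumer waits for it, so a real implementation may want to bound how far ahead the workers can run.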

The best solution depends on your application. For a network server with multiple clients a serial approach may be simpler than a concurrent one and have similar performance. How will your batch processing be used?

Matt