Race condition between two goroutines using a []byte channel; seems like they are using the same memory for the buffer... huh???

Mike

Jul 29, 2018, 10:31:57 PM
to golang-nuts
Hello,

I am trying to figure out an issue when streaming bytes over a channel between two goroutines: one reads a file, the other writes a file. I've encountered a race condition, but it doesn't make sense to me, because it seems that the buffer the reading goroutine creates is being accessed by the writing goroutine. The buffer's scope shouldn't permit that... I suspect that I am doing something wrong with the buffer, since it is a slice and is sent over the channel, but I am missing something... I'd really appreciate any suggestions or nudges in the right direction.

The race info and the small app follow this note...

Thank you for your time and interest!

Mike





/**************************************************** MAIN ************************************************************/

// Function main is the entry point for the application and is responsible for configuring its environment.
func main() {

    // Variable wg is main's WaitGroup, which detects when all of the goroutines that were launched have completed.
    var wg sync.WaitGroup

    // Start our timer...
    start := time.Now()

    // Create the channel that will be used by the file reading and file writing goroutines.
    ch := make(chan []byte, 4096)

    // File Reading Goroutine
    wg.Add(1)
    go func(outch chan<- []byte) {
        defer wg.Done()

        // Open the file at kFromPath for reading...
        file, ero := os.Open(kFromPath)
        if ero != nil {
            log.Fatal(ero)
        }
        // Automatically close the file when exiting this function.
        defer file.Close()

        // Cumulative counters
        nBytes := uint64(0)
        nChunks := uint64(0)

        // The buffer for data that is read from the file.
        buf := make([]byte, kMaxBufferSize)

        // Loop through the file reading chunks of data, which are sent over the channel.
        for {
            n, err := file.Read(buf[:cap(buf)])
            // Was there an error other than reaching the end of the file?
            if n == 0 && err != io.EOF {
                log.Fatal(err)
            }

            // Did we read any data from the file?
            if n > 0 {
                // Update the cumulative counters.
                nChunks++
                nBytes += uint64(n)

                // Send the data over the channel.
                outch <- buf[:n]
            }

            // Stop once the end of the file has been reached.
            if err == io.EOF {
                break
            }
        }

        // Signal to the receiving goroutine that there is no more data.
        close(outch)

        // When there is no more data to process, display the sender's status.
        fmt.Printf("Sent:\t\tnBytes: %d, nChunks: %d\n", nBytes, nChunks)

    }(ch)

    // File Writing Goroutine.
    wg.Add(1)
    go func(inch <-chan []byte) {
        defer wg.Done()

        // Cumulative counters
        nBytes := uint64(0)
        nChunks := uint64(0)

        // Create the output file.
        file, erc := os.Create(kToPath)
        if erc != nil {
            log.Fatal(erc)
        }
        // Automatically close the file when exiting this function.
        defer file.Close()

        // While there is data in the channel, receive it and write it to the output file.
        for buf := range inch {

            // Determine the length of the chunk of data that is available.
            bytesRead := len(buf)

            // Write the chunk to the output file.
            file.Write(buf[:bytesRead])

            // Update the cumulative counters...
            nBytes += uint64(bytesRead)
            nChunks++
        }

        // When there is no more data to process, display the receiver's status.
        fmt.Printf("Recv:\t\tnBytes: %d, nChunks: %d\n", nBytes, nChunks)

    }(ch)

    // Wait here until all goroutines have completed their work.
    wg.Wait()

    // Show the duration.
    fmt.Printf("Elapsed: %s\n", time.Since(start))
}



Dan Kortschak

Jul 29, 2018, 10:48:14 PM
to Mike, golang-nuts
You are writing over your input buffer in the read goroutine.

Think about what this does (this is essentially what you are doing with
the channel communication removed):

for {
    r.Read(buf)
}

Each Read call writes bytes into buf, and you are not blocking between
reads, so each time you are mutating the bytes under the writer
goroutine's feet.
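
To see the aliasing in isolation, here is a tiny standalone example (not
from the thread): sending a slice over a channel copies only the slice
header (pointer, length, capacity), never the backing array, so the
receiver observes any later writes to that array.

package main

import "fmt"

// Standalone demo, not part of the original program.
func main() {
    buf := make([]byte, 4)
    ch := make(chan []byte, 1)

    copy(buf, "AAAA")
    ch <- buf // only the slice header is sent; the backing array is shared

    copy(buf, "BBBB") // the "next read" overwrites the same backing array

    fmt.Println(string(<-ch)) // prints "BBBB", not "AAAA"
}

Running it prints BBBB, which is exactly what happens to the file data when
the next Read lands before the writer has consumed the previous chunk.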


Mike

Jul 30, 2018, 3:40:29 AM
to golang-nuts
Hmmmm...  I assumed that the buffer was being copied when it was being sent to the channel, and that blocking would happen on the channel, as needed, since it is buffered...  So, it seems that one solution would be to move the allocation of the buffer inside the top of the for loop, so every trip around it will result in a new buffer allocation.  I wonder what the effect would be on garbage collection and performance, though...  Another option, which is similar to the first one, would be to create a new buffer before sending data to the channel.  Copy the buffer into the new buffer, and send the new buffer to the channel.  Same concerns about GC and performance.  Am I working at too low a level?  I wonder whether I should be using a different approach...
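
A sketch of that first option (the function and parameter names are
illustrative, not from the app):

package reader

import "io"

// sendChunks reads from r and sends each chunk on out in a freshly allocated
// buffer, so later reads can never scribble on data the receiver still holds.
// (sendChunks, r, out and bufSize are illustrative names, not from the post.)
func sendChunks(r io.Reader, out chan<- []byte, bufSize int) error {
    defer close(out) // tell the receiver there is no more data
    for {
        buf := make([]byte, bufSize) // new backing array every iteration
        n, err := r.Read(buf)
        if n > 0 {
            out <- buf[:n]
        }
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}

Every chunk now gets its own allocation, which is where the GC and
performance question comes in.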

Thanks for the help and hint... It is late here, so I will let your comments percolate overnight... :P

Mike

Jan Mercl

Jul 30, 2018, 3:53:08 AM
to Mike, golang-nuts
On Mon, Jul 30, 2018 at 9:40 AM 'Mike' via golang-nuts <golan...@googlegroups.com> wrote:

> Hmmmm... I assumed that the buffer was being copied when it was being sent to the channel, ...

Assume only what the specs say ;-)

It may be appropriate in some situations to use an array (possibly in a struct with an additional len field). That's effectively a slice that has its backing array copied on assignment as well as on send and receive operations.
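
For example, a toy sketch of that (chunk, data, n and chunkSize are made-up
names): the array travels inside the struct value, so the sender can reuse
its local variable immediately.

package main

import "fmt"

const chunkSize = 4096 // illustrative size

// chunk carries a fixed-size array plus a length; the whole array is copied
// whenever a chunk value is assigned or sent. (chunk, data, n are made-up names.)
type chunk struct {
    data [chunkSize]byte
    n    int
}

func main() {
    ch := make(chan chunk, 1)

    var c chunk
    c.n = copy(c.data[:], "hello")
    ch <- c // the whole array is copied as part of the struct value

    c.data[0] = 'X' // mutating the local copy does not affect what was sent

    got := <-ch
    fmt.Println(string(got.data[:got.n])) // prints "hello"
}

The trade-off is that every send copies the whole chunkSize-byte array.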

--

-j

Henrik Johansson

Jul 30, 2018, 3:54:16 AM
to Mike, golang-nuts
Oh, I really wish that there were slices that were persistent after slicing, so you could safely slice and return without bothering with the underlying storage.
I agree that the overhead might not always be acceptable, but the simplicity of slicing and the safety of persistence would be very good.


jake...@gmail.com

Jul 31, 2018, 5:05:08 PM
to golang-nuts
Yes, creating a new buffer for each read is the simplest solution. I think a good workflow is to try that first - simple is good. Do some tests based on real expected and peak workloads. It may be good enough. If you find that memory usage, allocation or GC are bottlenecks, then (and only then) investigate alternatives. The simplest would be to use a sync.Pool to reuse the buffers.
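
A rough sketch of the sync.Pool variant (all names and the buffer size are
illustrative, and error handling is minimal); the key point is that a buffer
only goes back into the pool once the writer is finished with it:

package pipeline

import (
    "io"
    "sync"
)

const bufSize = 64 * 1024 // illustrative chunk size

// bufPool hands out reusable read buffers.
var bufPool = sync.Pool{
    New: func() interface{} { return make([]byte, bufSize) },
}

// send reads r in chunks, taking a buffer from the pool for every chunk.
// (send, receive and bufSize are illustrative names, not from the thread.)
func send(r io.Reader, out chan<- []byte) error {
    defer close(out)
    for {
        buf := bufPool.Get().([]byte)
        n, err := r.Read(buf)
        if n > 0 {
            out <- buf[:n]
        } else {
            bufPool.Put(buf) // nothing was sent; recycle immediately
        }
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}

// receive writes each chunk to w and only then returns the buffer to the pool,
// so the reader can never overwrite data the writer is still using.
func receive(w io.Writer, in <-chan []byte) error {
    for buf := range in {
        if _, err := w.Write(buf); err != nil {
            return err
        }
        bufPool.Put(buf[:cap(buf)])
    }
    return nil
}

One property of sync.Pool worth noting: it may drop idle buffers across GC
cycles, so it adapts to the workload rather than pinning memory.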

Mike

Jul 31, 2018, 10:24:24 PM
to golang-nuts

So... Here are the results comparing a FIFO pipe, channels, and sync.Pool...


sync.Pool kicked ass, especially on our 3.9GB file size...  Thanks for the suggestion!

Marvin Stenger

Aug 1, 2018, 5:01:17 AM
to golang-nuts
You could also try to use a preallocated array of buffers: https://play.golang.org/p/naVtJxKDBOx
You might have to play a bit with the kMaxBuffers constant.
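
(This is not the code behind the playground link, just one shape a
preallocated-buffer scheme can take; kMaxBuffers and kMaxBufferSize echo the
names used in the thread, while their values and everything else here are
made up. Free buffers circulate on a second channel, so the reader blocks
whenever all of them are in flight.)

package pipeline

import "io"

const (
    kMaxBuffers    = 16        // illustrative value; tune as suggested
    kMaxBufferSize = 64 * 1024 // illustrative value
)

// newFreeList preallocates kMaxBuffers buffers on a channel that acts as a free list.
func newFreeList() chan []byte {
    free := make(chan []byte, kMaxBuffers)
    for i := 0; i < kMaxBuffers; i++ {
        free <- make([]byte, kMaxBufferSize)
    }
    return free
}

// produce takes a free buffer for each chunk; it blocks when every buffer is in flight.
func produce(r io.Reader, out chan<- []byte, free chan []byte) error {
    defer close(out)
    for {
        buf := <-free
        n, err := r.Read(buf[:cap(buf)])
        if n > 0 {
            out <- buf[:n]
        } else {
            free <- buf // nothing was sent; return the buffer straight away
        }
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}

// consume writes each chunk and then returns its buffer to the free list.
func consume(w io.Writer, in <-chan []byte, free chan []byte) error {
    for buf := range in {
        if _, err := w.Write(buf); err != nil {
            return err
        }
        free <- buf[:cap(buf)] // done with the chunk; recycle its buffer
    }
    return nil
}

Compared to sync.Pool, this caps memory at kMaxBuffers * kMaxBufferSize and
provides natural back-pressure, at the cost of picking the constant by hand.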