Goroutine management


Rafael Justo

May 15, 2014, 12:22:36 PM
to golan...@googlegroups.com
Hi,

I have a program that, while processing data (which involves network I/O), can generate more data to be processed. I would like to use goroutines to speed this up, but creating a new goroutine (tracked with sync.WaitGroup) for each new piece of data isn't a good solution: too many goroutines can cause the error "too many open files". I thought of making a pool of goroutines with buffered channels and a round-robin strategy to distribute the data, but then I would never know when the program is finished (all goroutines are waiting for more data and there is no data left to process). Is there a pattern to solve this kind of problem?

Thanks in advance!
Rafael

Sameer Ajmani

May 15, 2014, 1:56:17 PM
to Rafael Justo, golang-nuts

Creating a new goroutine per file seems fine.  To avoid opening too many files at once, limit the number of open files using a semaphore (see Effective Go for how to do this using a buffered channel).


Nate Finch

May 15, 2014, 3:54:11 PM
to golan...@googlegroups.com
You just need one controlling goroutine to grab the data and shove it onto a channel, and close the channel when it's done.  All the goroutines will read from the channel, and stop reading when the channel closes. 

func main() {
  var wg sync.WaitGroup // a value, not a nil pointer
  ch := make(chan data)
  for x := 0; x < max_goroutines; x++ {
    wg.Add(1) // one count per worker
    go handleData(&wg, ch)
  }

  // shove data into ch
  // until there's no more data

  close(ch) // lets the workers' range loops finish
  wg.Wait()
}

func handleData(wg *sync.WaitGroup, ch <-chan data) {
  defer wg.Done() // runs when the goroutine returns
  for val := range ch {
    // work with data here
  }
}

The range over the channel exits the loop once the channel is closed and drained; each goroutine then calls wg.Done() and returns. When every worker has done so, wg.Wait() returns in the main goroutine, which tells it that all the data has been handled, and the program exits gracefully.

Dmitry Vyukov

May 16, 2014, 2:40:15 AM
to Nate Finch, golang-nuts
And if you have more than 1 request (each resulting in a set of file
operations), then you still can use WaitGroup to track request
completion, even if goroutines are persistent:

var wg sync.WaitGroup
wg.Add(...)
for ... {
  workchan <- &WorkItem{file: f, wg: &wg}
}
wg.Wait()

// workers
for w := range workchan {
  process(w)
  w.wg.Done()
}

I mean that you don't necessarily need to terminate goroutines to use
WaitGroup. It can track completion of any abstract task.

Rafael Justo

May 16, 2014, 9:14:41 AM
to Sameer Ajmani, golang-nuts
Thanks Sameer! I will use this approach. There's also a good article about it at:
http://burke.libbey.me/conserving-file-descriptors-in-go/

Best regards,
Rafael