How to be safe in Go?

Nick Pavlica

June 23, 2016, 20:27:21
To: golang-nuts
All,
  While learning Go from the book "The Go Programming Language" (published by Addison-Wesley), I was working through an example and was taken aback by how easy it is to cause a data race.  The book states that the server runs each incoming request in a separate goroutine so that it can handle multiple requests simultaneously.  It goes on to explain that if count++ is executed at the same time by different goroutines, I could have a race, and that I need to guard the variable with a mutex, locking and unlocking around each access.  All of this makes sense and is pretty standard.  My confusion/concern is that there is no apparent mechanism that tells me that some code is calling my code from multiple goroutines.  Yes, you could assume that a web server would do this, but as more and more libraries are added, how would you know without reading all the code in every library?  To see if the race detector would find this potential bug for me, I ran the code with the mutex calls commented out and the -race flag on, and didn't get a warning.  So how do I catch these mistakes before there is a problem?  It feels a little like walking on eggshells, waiting for your code to blow up.  Every programmer I know has had a bad day and can easily make these types of mistakes.  So how do you protect against them in Go?  Your guidance is greatly appreciated.

Thanks!
--Nick

------------------------------------------------------------------------------------------------------------------------------------
go run -race wserver2.go
------------------------------------------------------------------------------------------------------------------------------------
Web server example code:
------------------------------------------------------------------------------------------------------------------------------------
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"
)

var mu sync.Mutex // intended to guard count; the Lock/Unlock calls are commented out below
var count int

func main() {
	http.HandleFunc("/", handler) // Each request calls the handler
	http.HandleFunc("/count", counter)
	log.Fatal(http.ListenAndServe("localhost:8000", nil))
}

// The handler echoes the Path component of the requested URL
func handler(w http.ResponseWriter, r *http.Request) {
	// mu.Lock()
	count++ // unsynchronized write: a data race when requests overlap
	// mu.Unlock()
	// Fprintf (not Fprint) is needed for the format verbs to be applied.
	fmt.Fprintf(w, "URL.Path = %q\n", r.URL.Path)
}

// Counter echoes the number of calls so far
func counter(w http.ResponseWriter, r *http.Request) {
	// mu.Lock()
	fmt.Fprintf(w, "Count %d\n", count) // unsynchronized read of count
	// mu.Unlock()
}

Caleb Spare

June 23, 2016, 20:44:22
To: Nick Pavlica, golang-nuts
> To see if the
> race detector would find this potential bug for me, I ran the code with the
> mutex(s) commented out with the -race flag on and didn't get a warning.

Did you make some concurrent requests? The race detector only tells
you about races that happen, so you need to exercise the concurrent
code paths in some way (possibly in a test or by selectively turning
on -race in a production-like environment).
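
One way to exercise the concurrent path without a browser is to call the handler from many goroutines at once. The following is only a minimal sketch under assumptions: `hammer` is a hypothetical helper name, and the handler is the one from the original post with the mutex calls enabled. Run under `go run -race`, this version should stay quiet; with the Lock/Unlock calls commented out again, the race detector would be expected to report the unsynchronized `count++`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

var (
	mu    sync.Mutex // guards count
	count int
)

// handler mirrors the handler from the original post, with the mutex enabled.
func handler(w http.ResponseWriter, r *http.Request) {
	mu.Lock()
	count++
	mu.Unlock()
	fmt.Fprintf(w, "URL.Path = %q\n", r.URL.Path)
}

// hammer (hypothetical helper) calls h from n goroutines at once and waits
// for all of them to finish, so the concurrent code path is actually taken.
func hammer(h http.HandlerFunc, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			req := httptest.NewRequest("GET", "/", nil)
			h(httptest.NewRecorder(), req)
		}()
	}
	wg.Wait()
}

func main() {
	hammer(handler, 100)
	mu.Lock()
	fmt.Println("count =", count)
	mu.Unlock()
}
```

The same loop could live in a `_test.go` file and run with `go test -race`, which keeps the check in the normal test cycle.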

A couple of general thoughts:

(1) The primary way to know whether a library you're using will call
your code concurrently from multiple goroutines is via documentation.
The net/http documentation, for instance, explains that Serve creates
a goroutine for each connection. Any other library you use should
clearly explain this if it's the case.

(2) net/http and other server packages are a slightly unusual case --
there are many libraries that exist, but most don't call your code
concurrently. (Even packages that invoke provided callbacks are
themselves a minority.) If I use a package that, say, interfaces with
a database or implements a graph algorithm, it would be quite strange
if it used concurrently-invoked callbacks.

Nick Pavlica

June 24, 2016, 14:27:43
To: golang-nuts, lin...@gmail.com


On Thursday, June 23, 2016 at 6:44:22 PM UTC-6, Caleb Spare wrote:
> > To see if the
> > race detector would find this potential bug for me, I ran the code with the
> > mutex(s) commented out with the -race flag on and didn't get a warning.
>
> Did you make some concurrent requests? The race detector only tells
> you about races that happen, so you need to exercise the concurrent
> code paths in some way (possibly in a test or by selectively turning
> on -race in a production-like environment).

  I just accessed the endpoint from multiple browsers, but no formal tests.
 

> A couple of general thoughts:
>
> (1) The primary way to know whether a library you're using will call
> your code concurrently from multiple goroutines is via documentation.
> The net/http documentation, for instance, explains that Serve creates
> a goroutine for each connection. Any other library you use should
> clearly explain this if it's the case.

  That makes sense, but it seems a little fragile.  For example, if a library wasn't originally concurrent and is later changed, it would be easy to get into a bad situation.  It makes me want to just wrap every variable in a Mutex to be safe.  I'm guessing that the pattern is to keep Go programs as small as possible so you can track all the corner cases effectively in your own mental model.


> (2) net/http and other server packages are a slightly unusual case --
> there are many libraries that exist, but most don't call your code
> concurrently. (Even packages that invoke provided callbacks are
> themselves a minority.) If I use a package that, say, interfaces with
> a database or implements a graph algorithm, it would be quite strange
> if it used concurrently-invoked callbacks.

As I ponder this, I wonder if some tooling could be added to the compiler, or provided as an outside utility, that would scan all the libraries in use and notify you which libraries/functions call your code concurrently.

For example:
--------------------------------------------------------------------------------------------------------
go run --chk_concurrent  mylib.go

Notice: "net/http" calls concurrent operations on the Handle function.
--------------------------------------------------------------------------------------------------------

Thanks again for the feedback!
--Nick

adon...@google.com

June 24, 2016, 15:17:18
To: golang-nuts
On Thursday, 23 June 2016 20:27:21 UTC-4, Nick Pavlica wrote:
> While learning Go from the book "The Go Programming Language", by Addison-Wesley I was working through an example and was taken aback by how easy it is to cause a data race.

I'm glad you appreciated this risk, but I hope it did not scare you away from concurrent programming in Go.  It takes some care and discipline to avoid data races, but a couple of simple rules and practices can greatly reduce the risk.  (1) Avoid mutating variables where possible.  Variables whose value is set once and then never updated are inherently concurrency-safe, as our functional programming friends have been saying for years.  (2) Encapsulate variables.  By hiding variables so that all the functions that access them are under your control, you make it easier to ensure that they are not accessed concurrently.  For example, you can check that every access occurs while a mutex lock is held.  Sometimes you can confine the variable to a single goroutine, avoiding concurrent access entirely.  You'll find that Chapter 9 of the book is devoted to the topic of data races and how to avoid them.
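
As an illustration of rule (2), here is a minimal sketch, not taken from the book: the type name `Counter` and its methods are assumptions made for this example. Because the fields are unexported and every access goes through a method that holds the lock, it is possible to verify by reading a few lines that the variable is never touched without the mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter (hypothetical type) encapsulates an integer so that all access
// goes through methods that hold mu.
type Counter struct {
	mu sync.Mutex // guards n
	n  int
}

// Inc adds one to the counter while holding the lock.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count while holding the lock.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

func main() {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	fmt.Println(c.Value()) // prints 1000
}
```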


Konstantin Khomoutov

June 24, 2016, 15:28:32
To: Nick Pavlica, golang-nuts
On Fri, 24 Jun 2016 11:27:43 -0700 (PDT)
Nick Pavlica <lin...@gmail.com> wrote:

[...]
> > > mutex(s) commented out with the -race flag on and didn't get a
> > > warning.
> >
> > Did you make some concurrent requests? The race detector only tells
> > you about races that happen, so you need to exercise the
> > concurrent code paths in some way (possibly in a test or by
> > selectively turning on -race in a production-like environment).
>
> I just accessed the endpoint from multiple browsers, but no formal
> tests.

Consider using wrk [1] to stress-test your endpoint.

1. https://github.com/wg/wrk

Justin Israel

June 24, 2016, 17:47:13
To: Nick Pavlica, golang-nuts


On Sat, 25 Jun 2016, 6:27 AM Nick Pavlica <lin...@gmail.com> wrote:


> That makes sense, but seems to be a little fragile.  For example, if the lib wasn't originally concurrent, then is changed, it would be easy to get into a bad situation.  It makes me want to just wrap every variable in a Mutex to be safe.  I'm guessing that the pattern is to keep Go programs as small as possible so you can track all the corner cases effectively in your own mental model.

I feel like it shouldn't necessarily matter whether the lib calls your handlers concurrently or not. The focus should be on the fact that you want to share global state within the body of that function. Shouldn't the answer be that you should avoid mutating global state without proper synchronization? I don't feel like you should need to run a tool to find out whether something is calling your code concurrently. Rather, just use best practices to write safer code in the first place.




oju...@gmail.com

June 24, 2016, 20:21:04
To: golang-nuts, lin...@gmail.com

On Friday, June 24, 2016 at 3:27:43 PM UTC-3, Nick Pavlica wrote:
> It makes me want to just wrap every variable in a Mutex to be safe.
 
A sure path to craziness.
 
> I'm guessing that the pattern is to keep Go programs as small as possible so you can track all the corner cases effectively in your own mental model.
 
You don't need to keep Go programs small. Make small components inside your Go program. Learn Go idioms.
Limit the visibility of your entities and make use of channels to coordinate parts, avoiding simultaneous access.
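
To make the channel idea concrete, here is a minimal sketch in which the count lives inside one goroutine and everything else talks to it over channels. The names `chanCounter` and `newChanCounter` are made up for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// chanCounter (hypothetical type) confines its count to a single goroutine;
// callers communicate with that goroutine over unbuffered channels.
type chanCounter struct {
	inc  chan struct{}
	read chan int
}

// newChanCounter starts the owning goroutine and returns a handle to it.
func newChanCounter() *chanCounter {
	c := &chanCounter{inc: make(chan struct{}), read: make(chan int)}
	go func() {
		n := 0 // only this goroutine ever reads or writes n
		for {
			select {
			case <-c.inc:
				n++
			case c.read <- n:
			}
		}
	}()
	return c
}

// Inc asks the owning goroutine to add one; it returns once the request
// has been received.
func (c *chanCounter) Inc() { c.inc <- struct{}{} }

// Value asks the owning goroutine for the current count.
func (c *chanCounter) Value() int { return <-c.read }

func main() {
	c := newChanCounter()
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	fmt.Println(c.Value()) // prints 1000
}
```

No mutex is needed because no variable is shared. The sketch never stops the owning goroutine; a longer-lived program would probably add a done channel for that.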

 

Henry

June 24, 2016, 22:12:11
To: golang-nuts
Libraries should be more like black boxes. The user shouldn't need to know about their implementation, be it concurrent or not. It is the libraries' responsibility to ensure any data passed to them is safe.

as....@gmail.com

June 24, 2016, 23:17:53
To: golang-nuts
I mostly agree, but there are times when abstraction leads to complex code and a more reasonable approach involves documented constraints. The map type is not safe for concurrent use either, but it enjoys widespread use. If it were abstracted to handle all possible use cases, it would be unnecessarily slow for most of the operations it's used for.

There's an illusion of safety in languages that advertise all-encompassing abstractions. In my experience the problem becomes the language or library itself, as the underpinnings acquire code bloat to cover all use cases. If this happens, the user must now choose between unboxing the black box or writing their own (better/stronger/faster) language/package.

Knowing 5-10% of a package's implementation details seems like a reasonable tradeoff for a simpler implementation.

Henry

June 24, 2016, 23:52:40
To: golang-nuts
The implementation of map is not a valid argument for "unboxing the black boxes".

The question to ask is who does the concurrency. In the case of map, if the user is the one accessing the map concurrently, then it is the user's responsibility to ensure the map is accessed in a concurrency-safe way. If the map internally implements some concurrent processing, then the map has the responsibility to ensure data integrity during that processing. The map's implementation should be invisible to the user.
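
For the first case, where the user is the one introducing the concurrency, a minimal sketch could look like the following. `SafeMap`, `NewSafeMap`, `Add`, and `Get` are hypothetical names chosen for this example; the built-in map is simply wrapped with a sync.RWMutex.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeMap (hypothetical type) wraps a built-in map with a lock because the
// caller, not the map, is the one using it from several goroutines.
type SafeMap struct {
	mu sync.RWMutex // guards m
	m  map[string]int
}

// NewSafeMap returns an empty SafeMap ready for use.
func NewSafeMap() *SafeMap {
	return &SafeMap{m: make(map[string]int)}
}

// Add adds delta to the value stored under key, holding the write lock.
func (s *SafeMap) Add(key string, delta int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] += delta
}

// Get returns the value stored under key, holding only the read lock.
func (s *SafeMap) Get(key string) (int, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func main() {
	sm := NewSafeMap()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sm.Add("hits", 1)
		}()
	}
	wg.Wait()
	v, _ := sm.Get("hits")
	fmt.Println(v) // prints 100
}
```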

as....@gmail.com

June 25, 2016, 00:49:27
To: golang-nuts
I wasn't referring to concurrency within package scope, but to data crossing package boundaries (the earlier post says libraries are responsible for ensuring that data passed to them is safe). The map type reveals an implementation detail by documenting that it isn't safe for concurrent use.

Tim Hawkins

June 25, 2016, 00:55:12
To: Henry, golang-nuts

That is a lofty goal, and not always achievable. Many libraries are not self-contained and often wrap other non-Go libraries or device integration libraries that don't support that model. What people do inside a library should not be a language feature; otherwise the language becomes overly opinionated.
