Go vs. memory sanitizer

2,562 views
Skip to first unread message

Ian Lance Taylor

unread,
Oct 7, 2015, 1:42:20 PM10/7/15
to golan...@googlegroups.com
The memory sanitizer (msan) is a feature of clang and GCC that inserts
automatic detection of memory errors in C, in particular programs that
make choices based on reads of uninitialized memory. This does not
apply to Go, of course, where all memory is initialized. However,
people can use cgo to pass memory back and forth between C and Go, and
people using cgo may want to use msan to detect errors in their C code
(in fact, it's an awfully good idea).

Since msan tracks the current state of all memory, if C creates some
uninitialized memory, passes that memory to Go, Go fills it in and
returns to C, and C then looks at that memory, msan will not know that
Go initialized the memory and will report an error. Here is a test
case, that fails today with an msan error:
http://play.golang.org/p/HoMF9Iq1o9 (I think you need to be running
at least clang 3.6).

I looked into fixing this, and the problem is surprisingly tractable,
because Go's thread sanitizer already instruments all the necessary
code. This test case is fixed by https://golang.org/cl/15494 .

I'd like to see what people think of really implementing this
approach. This would mean adding a -msan option to the go tool, the
compiler, and the linker, that roughly parallels the existing -race
option. There would be some new code in the runtime, and a new very
small package runtime/msan. Compiling with -msan would mean that Go
code would automatically call the appropriate msan library functions
for every memory read and write. This would only be useful when using
cgo (or SWIG) and compiling with -fsanitize=memory.

This is in some ways a fairly specialized use case, but on the other
hand, when debugging a memory crash in a Go program that calls C code,
it would certainly be nice to be able to use msan to make sure that
the C code is moderately well behaved. This would obviously be more
code to maintain in the toolchain, but it's fairly limited since
almost all of it would be right next to existing race code.

Opinions?

Ian

David Crawshaw

unread,
Oct 7, 2015, 2:12:12 PM10/7/15
to Ian Lance Taylor, golan...@googlegroups.com
The memory sanitizer is very nice. Do you have any plans or thoughts
on -fsanitize=address, and maybe even -fsanitize=thread? Could we end
up with -cgosanitize 'args list'?
> --
> You received this message because you are subscribed to the Google Groups "golang-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

David Chase

unread,
Oct 7, 2015, 2:21:50 PM10/7/15
to Ian Lance Taylor, golang-dev

This sounds approximately necessary.  We get too many "bugs" reported against the runtime that turn out to be a C problem, and anything that helps filter these out will be good.  Will msan also be able to track "the Go GC just released all this memory"?

Ian Lance Taylor

unread,
Oct 7, 2015, 2:37:05 PM10/7/15
to David Crawshaw, golan...@googlegroups.com
On Wed, Oct 7, 2015 at 11:12 AM, David Crawshaw <craw...@golang.org> wrote:
> The memory sanitizer is very nice. Do you have any plans or thoughts
> on -fsanitize=address, and maybe even -fsanitize=thread? Could we end
> up with -cgosanitize 'args list'?

I think that -fsanitize=address (ASan) is less interesting here. It
detects a number of memory errors that can occur in C code, such as
use-after-free, out-of-bounds access, and memory leaks. In general,
these can not occur in Go code, and, more importantly, there is
nothing Go code can do to cause a false positive in C code.
Integrating ASan with Go would mean instrumenting Go code so that it
detects cases where C passes it an invalid pointer, causing the Go
code to have a use-after-free or out-of-bounds error. This is useful,
but less useful than MSan.

In any case, I believe that we can instrument ASan along exactly the
same lines as MSan. So, yes, if -msan looks like a good idea, we
should probably think in terms of -asan as well. The instrumentation
would be essentially identical, though the runtime support would be
slightly different.

We could use -cgosanitize instead, but my understanding (which could
be wrong) is that you can't use multiple C sanitizers simultaneously.
You have to pick one.

I think it would be nice to integrate -race with -fsanitize=thread,
and I don't see any fundamental objections, but I will let Dmitry
think about that.

Ian

Ian Lance Taylor

unread,
Oct 7, 2015, 2:39:00 PM10/7/15
to David Chase, golang-dev
On Wed, Oct 7, 2015 at 11:21 AM, David Chase <drc...@google.com> wrote:
> Will msan also be able to track "the Go GC
> just released all this memory"?

That had not occurred to me, but, yes, I think so. One problem is
that if we release memory, and then the Go GC allocates it again, then
I think that MSan will think that all is well.

Ian

Peter Collingbourne

unread,
Oct 7, 2015, 3:34:23 PM10/7/15
to Ian Lance Taylor, golan...@googlegroups.com, Evgenii Stepanov
Bear in mind that the LLVM implementation of MSan tracks uninitializedness through the program's values at a bit-level granularity and only reports an error if an uninitialized bit is observable. This is needed in order to correctly handle things like uninitialized bit fields and, perhaps more relevant for Go, uninitialized structure padding bytes. This makes the instrumentation (at least for C and C++) rather more involved than for TSan. I'm having trouble coming up with a case where this would matter for Go (eugenis may have some ideas though), so you'd probably be fine instrumenting only memory reads and writes.

Peter


Ian

Evgenii Stepanov

unread,
Oct 7, 2015, 6:39:50 PM10/7/15
to Peter Collingbourne, Ian Lance Taylor, golan...@googlegroups.com
This looks like a great idea and a surprisingly simple implementation.
One question: what happens when an object is moved by GC? Is it
visible to the tool via read/write callbacks? I've heard Go can
move/reallocate stacks - that must be handled as "copy" and not
read/write to avoid tripping on uninitialized parts.

If that is handled correctly, I don't see any problems with the
approach. If struct padding is left uninitialized in C, the Go side is
not supposed to read it, and vice versa.

All read checks in your patch are limited to the first byte of the
object, that's ok for the race detector, but should be
extended/replaced with an entire object check for msan.

AddressSanitizer in Go is also an interesting topic, mainly for "the
Go GC just released all this memory" case. It's not going to be as
simple as MSan, because for more-or-less reliable detection we'll need
a quarantine in the GC to delay memory reuse.

Multiple sanitizers sometimes can be used together, but Go only needs
special support for asan, msan and tsan, and by coincidence they are
the 3 sanitizers that are exclusive, so -msan and -asan flags would do
the trick. It may be easier for the users to mirror clang names for
sanitizer flags (-fsanitize=) through - just something to consider.

Ian Lance Taylor

unread,
Oct 7, 2015, 6:47:13 PM10/7/15
to Evgenii Stepanov, Peter Collingbourne, golan...@googlegroups.com
On Wed, Oct 7, 2015 at 3:29 PM, Evgenii Stepanov <eug...@google.com> wrote:
> This looks like a great idea and a surprisingly simple implementation.
> One question: what happens when an object is moved by GC? Is it
> visible to the tool via read/write callbacks? I've heard Go can
> move/reallocate stacks - that must be handled as "copy" and not
> read/write to avoid tripping on uninitialized parts.

We currently do not have a moving GC. The stacks do move, but it's
impossible to share a stack address between Go and C--all addresses
passed to C are required to be on the heap, because otherwise they
might (confusingly) move if the C code calls back into Go. So this is
not an issue at the moment. When and if we do implement a moving GC,
then it will become an issue, but that is at least a year away, if it
ever happens at all.

Is there an MSan interface to note a memory copy? I don't see one in
msan_interface.h.


> All read checks in your patch are limited to the first byte of the
> object, that's ok for the race detector, but should be
> extended/replaced with an entire object check for msan.

Yes.


> AddressSanitizer in Go is also an interesting topic, mainly for "the
> Go GC just released all this memory" case. It's not going to be as
> simple as MSan, because for more-or-less reliable detection we'll need
> a quarantine in the GC to delay memory reuse.

Yes. I''m not sure how that would work but I expect we can sort it out.


> Multiple sanitizers sometimes can be used together, but Go only needs
> special support for asan, msan and tsan, and by coincidence they are
> the 3 sanitizers that are exclusive, so -msan and -asan flags would do
> the trick. It may be easier for the users to mirror clang names for
> sanitizer flags (-fsanitize=) through - just something to consider.

True. The go tool doesn't currently have any -f flags, though. I'm
not sure what is best.

Thanks for the info.

Ian

Evgenii Stepanov

unread,
Oct 7, 2015, 7:14:13 PM10/7/15
to Ian Lance Taylor, Peter Collingbourne, golan...@googlegroups.com
On Wed, Oct 7, 2015 at 3:47 PM, Ian Lance Taylor <ia...@golang.org> wrote:
> On Wed, Oct 7, 2015 at 3:29 PM, Evgenii Stepanov <eug...@google.com> wrote:
>> This looks like a great idea and a surprisingly simple implementation.
>> One question: what happens when an object is moved by GC? Is it
>> visible to the tool via read/write callbacks? I've heard Go can
>> move/reallocate stacks - that must be handled as "copy" and not
>> read/write to avoid tripping on uninitialized parts.
>
> We currently do not have a moving GC. The stacks do move, but it's
> impossible to share a stack address between Go and C--all addresses
> passed to C are required to be on the heap, because otherwise they
> might (confusingly) move if the C code calls back into Go. So this is
> not an issue at the moment. When and if we do implement a moving GC,
> then it will become an issue, but that is at least a year away, if it
> ever happens at all.
>
> Is there an MSan interface to note a memory copy? I don't see one in
> msan_interface.h.

There is __msan_memcpy (with the same signature as memcpy), but it
both copies the memory and updates MSan state to reflect that. I'll
add something to just update the MSan state.

Evgenii Stepanov

unread,
Oct 12, 2015, 7:25:54 PM10/12/15
to Ian Lance Taylor, Peter Collingbourne, golan...@googlegroups.com
FTR, I've added a memory copy interface to MSan here:
http://reviews.llvm.org/rL250124

Ian Lance Taylor

unread,
Oct 19, 2015, 10:50:52 PM10/19/15
to golan...@googlegroups.com
I've updated the CL for memory sanitizer support,
https://golang.org/cl/15494, to something that I think can reasonably
be submitted. Tomorrow I will still start breaking it down into
smaller CLs for review.

Ian
Reply all
Reply to author
Forward
0 new messages