Random panic in production with Sprintf

501 views
Skip to first unread message

XXX ZZZ

unread,
May 2, 2019, 1:30:51 PM5/2/19
to golang-nuts
Hello,

We are having a random panic on our go application that is happening once every million requests or so, and so far we haven't been able to reproduce it nor to even grasp what's going on.

Basically our code goes like:

type Subid_info struct{
    Affiliate_subid     string
    Second_subid        string
    Second_subid_8        string
    S2                    string
    Internal_subid        string
    Internal_subid_9    string
    Internal_subid_12     string
    Result                 string
}

func (r *Subid_info) Prepare_subid_logic(){
    r.Second_subid_8=fmt.Sprintf("1%07v", r.Second_subid) ----> panic happens here.
}

And the trace we get is:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x466b6e]

goroutine 17091 [running]:
unicode/utf8.RuneCountInString(0x0, 0x5, 0xc048c275a8)
    /usr/local/go/src/unicode/utf8/utf8.go:411 +0x2e
fmt.(*fmt).padString(0xc023c17780, 0x0, 0x5)
    /usr/local/go/src/fmt/format.go:113 +0x134
fmt.(*fmt).fmtS(0xc023c17780, 0x0, 0x5)
    /usr/local/go/src/fmt/format.go:347 +0x61
fmt.(*pp).fmtString(0xc023c17740, 0x0, 0x5, 0xc000000076)
    /usr/local/go/src/fmt/print.go:448 +0x132
fmt.(*pp).printArg(0xc023c17740, 0x9978e0, 0xc016a68a30, 0x76)
    /usr/local/go/src/fmt/print.go:684 +0x880
fmt.(*pp).doPrintf(0xc023c17740, 0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1)
    /usr/local/go/src/fmt/print.go:1112 +0x3ff
fmt.Sprintf(0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1, 0x80, 0xa36200)
    /usr/local/go/src/fmt/print.go:214 +0x66
code/sharedobjects/sources.(*Subid_info).Prepare_subid_logic(0xc019292f80, 0x2)

Given that we can't reproduce it, what's the logical way to debug this and find out what's happening?

Thanks!

Ian Lance Taylor

unread,
May 2, 2019, 1:54:42 PM5/2/19
to XXX ZZZ, golang-nuts
The first thing to try is running your program under the race detector.

Ian

Burak Serdar

unread,
May 2, 2019, 2:13:01 PM5/2/19
to XXX ZZZ, golang-nuts
Right here in fmtString, the function gets a 0x0, 0x5 arg, which is I
believe a string of length 5 with a nil slice. So it looks like
somehow r.Second_subid has nil buffer here. When a string is used as
an interface{}, afaik, the interface keeps the value, not the pointer
to the string. So I can't see how this is possible. But I wonder if
copying the value before sprintf could fix it:

x:=r.Second_subid
r.Second_subid_8=fmt.Sprintf("1%07v", x)




> /usr/local/go/src/fmt/print.go:448 +0x132
> fmt.(*pp).printArg(0xc023c17740, 0x9978e0, 0xc016a68a30, 0x76)
> /usr/local/go/src/fmt/print.go:684 +0x880
> fmt.(*pp).doPrintf(0xc023c17740, 0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1)
> /usr/local/go/src/fmt/print.go:1112 +0x3ff
> fmt.Sprintf(0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1, 0x80, 0xa36200)
> /usr/local/go/src/fmt/print.go:214 +0x66
> code/sharedobjects/sources.(*Subid_info).Prepare_subid_logic(0xc019292f80, 0x2)
>
> Given that we can't reproduce it, what's the logical way to debug this and find out what's happening?
>
> Thanks!
>
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

XXX ZZZ

unread,
May 2, 2019, 2:17:09 PM5/2/19
to golang-nuts
I did but nothing detected.

However there aren't any goroutined involved (except for the http request), other than that, this variable isn't shared among routines.

Marcin Romaszewicz

unread,
May 2, 2019, 2:19:02 PM5/2/19
to Burak Serdar, XXX ZZZ, golang-nuts
If that's the actual problem, you'd just be masking it, and producing an invalid "x". Look here:


func (r *Subid_info) Prepare_subid_logic(){
    r.Second_subid_8=fmt.Sprintf("1%07v", r.Second_subid) ----> panic happens here.
}


r.Second_subid is in an invalid state which normal Go code could not create. This means that some other goroutine might be in the middle of changing its value at the same time, and you have a race condition, so Ian Lance Taylor's suggestion to run using the race detector is probably the best bet.

-- Marcin

Burak Serdar

unread,
May 2, 2019, 2:21:23 PM5/2/19
to XXX ZZZ, golang-nuts
^^^^^^
This is a dumb idea, sorry. If you had a race, you'd still have a race.

XXX ZZZ

unread,
May 2, 2019, 2:41:33 PM5/2/19
to golang-nuts
I'm testing race conditions again as we speak, however this object is created WITHIN the goroutine (the http request), there is no way, afaik, that is being used from another routine.
> To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Anthony Martin

unread,
May 2, 2019, 5:50:24 PM5/2/19
to XXX ZZZ, golang-nuts
What version of Go are you using?

XXX ZZZ <emarti...@gmail.com> once said:
> fmt.(*pp).fmtString(0xc023c17740, 0x0, 0x5, 0xc000000076)
> /usr/local/go/src/fmt/print.go:448 +0x132
> fmt.(*pp).printArg(0xc023c17740, 0x9978e0, 0xc016a68a30, 0x76)
> /usr/local/go/src/fmt/print.go:684 +0x880
> fmt.(*pp).doPrintf(0xc023c17740, 0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1)
> /usr/local/go/src/fmt/print.go:1112 +0x3ff
> fmt.Sprintf(0xa6e22f, 0x5, 0xc048c27818, 0x1, 0x1, 0x80, 0xa36200)
> /usr/local/go/src/fmt/print.go:214 +0x66

This shows signs of memory corruption. The last argument passed to
fmtString (0xc000000076) should be the same as the last argument
passed to printArg (0x76 or 'v') but it has some high bits set. Also,
the pointer to the format string data changes from 0xa6e22f (which is
probably in the .rodata section of the binary) to 0x0.

Something is amiss.

Anthony

Ian Lance Taylor

unread,
May 2, 2019, 5:50:53 PM5/2/19
to Marcin Romaszewicz, Burak Serdar, XXX ZZZ, golang-nuts
On Thu, May 2, 2019 at 11:18 AM Marcin Romaszewicz <mar...@gmail.com> wrote:
>
> If that's the actual problem, you'd just be masking it, and producing an invalid "x". Look here:
>
> func (r *Subid_info) Prepare_subid_logic(){
> r.Second_subid_8=fmt.Sprintf("1%07v", r.Second_subid) ----> panic happens here.
> }
>
> r.Second_subid is in an invalid state which normal Go code could not create. This means that some other goroutine might be in the middle of changing its value at the same time, and you have a race condition, so Ian Lance Taylor's suggestion to run using the race detector is probably the best bet.

I agree. Look very closely at how r.Second_subid is set. The
backtrace shows that it is a value of type string that is invalid.

Ian

XXX ZZZ

unread,
May 2, 2019, 5:53:31 PM5/2/19
to golang-nuts
using go version go1.12.4 linux/amd64

Ian Lance Taylor

unread,
May 2, 2019, 5:56:51 PM5/2/19
to Anthony Martin, XXX ZZZ, golang-nuts
The change from 0x76 to 0xc000000076 does not necessarily indicate a
problem. The stack backtrace does not know the types. The value here
is a rune, which is 32 bits. The compiler will only set the low order
32 bits on the stack, leaving the high order 32 bits unset. So the
0xc000000000 could just be garbage left on the stack.

I don't *think* the format string is changing. I think the 0 is from
the string being printed, not the format string. They both happen to
be length 5.

Ian

Anthony Martin

unread,
May 2, 2019, 6:09:30 PM5/2/19
to Ian Lance Taylor, XXX ZZZ, golang-nuts
Ian Lance Taylor <ia...@golang.org> once said:
> I don't *think* the format string is changing. I think the 0 is from
> the string being printed, not the format string. They both happen to
> be length 5.

Misled by the pair of fives. Mea culpa.

Anthony

Michael Jones

unread,
May 2, 2019, 7:09:46 PM5/2/19
to Anthony Martin, Ian Lance Taylor, XXX ZZZ, golang-nuts
Is any of this string data touched/from C via CGO?

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Michael T. Jones
michae...@gmail.com

Burak Serdar

unread,
May 2, 2019, 7:44:33 PM5/2/19
to Ian Lance Taylor, Anthony Martin, XXX ZZZ, golang-nuts
There's something that doesn't make sense here. The 0 is from the
string being printed, it is not the format string. But how can that
be?

Even if there is a race, the string cannot have a 0 for the slice, can
it? So the other option is when Sprintf is called, the string being
printed is already corrupt. Can there be an overflow somewhere that is
somehow undetected? Any unsafe use in the program?


>
> Ian

XXX ZZZ

unread,
May 2, 2019, 8:02:41 PM5/2/19
to golang-nuts
No use of C via CGO at all.

Afaik, there isn't any unsafe use of the string, we are basically reading it from a get parameter (fasthttp server) on an http request and then adding it into this structure, most of the times is just a 5 char string. Out of several millions requests, this panic happens.

I failed to find any kind of race using go race detector, I'm currently doing some more debugging, hopefuly I should have more info/tests soon.
> To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Burak Serdar

unread,
May 2, 2019, 8:16:55 PM5/2/19
to XXX ZZZ, golang-nuts
On Thu, May 2, 2019 at 6:02 PM XXX ZZZ <emarti...@gmail.com> wrote:
>
> No use of C via CGO at all.
>
> Afaik, there isn't any unsafe use of the string, we are basically reading it from a get parameter (fasthttp server) on an http request and then adding it into this structure, most of the times is just a 5 char string. Out of several millions requests, this panic happens.

Does this "fasthttp" have any unsafe pointers?
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

Tyler Compton

unread,
May 2, 2019, 8:34:39 PM5/2/19
to Burak Serdar, XXX ZZZ, golang-nuts
I took a quick look and yes, it uses unsafe to convert between byte slices and strings. I don't know enough to say that it's the problem but here's an example:

Robert Engels

unread,
May 2, 2019, 8:36:28 PM5/2/19
to Burak Serdar, XXX ZZZ, golang-nuts
Whenever I see fast* I think someone took shortcuts to make something “faster” without fully implementing the spec (or external constraints, like safe data access)

Burak Serdar

unread,
May 3, 2019, 12:23:58 AM5/3/19
to Tyler Compton, XXX ZZZ, golang-nuts
On Thu, May 2, 2019 at 6:34 PM Tyler Compton <xav...@gmail.com> wrote:
>
> I took a quick look and yes, it uses unsafe to convert between byte slices and strings. I don't know enough to say that it's the problem but here's an example:
>
> https://github.com/valyala/fasthttp/blob/645361952477dfc16938fb2993065130ed7c02b9/bytesconv.go#L380

You could experiment by using that corrupt string with something
trivial, like calling strings.ToLower(r.Second_subid) before Sprintf
and see if that panics inside ToLower. And then you can add more
ToLower calls backtracking the stack to pinpoint where that string
gets corrupt.

Or, you can just use the "slow" http, see if that works, and not look back.

tsi...@adinsert.tech

unread,
May 3, 2019, 9:23:17 AM5/3/19
to golang-nuts
Are you using the fasthttp timeout handler?

If its the case you could have a race condition once the timeout is triggered.
Tiago
Reply all
Reply to author
Forward
0 new messages