glibc-2.19


Vincent Batts

Apr 14, 2014, 2:06:57 PM
to golang-nuts
Have folks had any issues with glibc-2.19 and golang?
I am seeing new issues that occur only in Fedora rawhide (not RHEL 6, F19, or F20), which uses glibc-2.19.90-10.fc21.

First issue, SIGABRT during the ../misc/cgo/test that sets gid
http://koji.fedoraproject.org/koji/getfile?taskID=6728204&name=build.log&offset=-4000
(and corresponding src.rpm http://kojipkgs.fedoraproject.org//work/tasks/8196/6728196/golang-1.2.1-4.fc21.src.rpm)

Next issue, also in ../misc/cgo/test on i686,
unexpected GOT reloc for non-dynamic symbol _cgoexp_
http://koji.fedoraproject.org/koji/getfile?taskID=6728923&name=build.log&offset=-4000
(and corresponding src.rpm http://kojipkgs.fedoraproject.org//work/tasks/8913/6728913/golang-1.2.1-5.fc21.src.rpm)


Both issues reproduce consistently when building inside the mock chroot, both for local builds and on the koji build server.

Ideas?

Ian Lance Taylor

Apr 14, 2014, 2:38:31 PM
to Vincent Batts, golang-nuts
On Mon, Apr 14, 2014 at 11:06 AM, Vincent Batts <vba...@gmail.com> wrote:
> Have folks had any issues with glibc-2.19 and golang?
> I am seeing new issues that occur only in Fedora rawhide (not RHEL 6,
> F19, or F20), which uses glibc-2.19.90-10.fc21.
>
> First issue, SIGABRT during the ../misc/cgo/test that sets gid
> http://koji.fedoraproject.org/koji/getfile?taskID=6728204&name=build.log&offset=-4000
> (and corresponding src.rpm
> http://kojipkgs.fedoraproject.org//work/tasks/8196/6728196/golang-1.2.1-4.fc21.src.rpm)

Hmmm, I hope they haven't changed setgid handling again. Can you try
running the program under gdb to see if gdb can identify where the
SIGABRT is coming from? The stack trace in the log suggests that it is
coming from C code.


> Next issue, also in ../misc/cgo/test on i686,
>
> unexpected GOT reloc for non-dynamic symbol _cgoexp_
>
> http://koji.fedoraproject.org/koji/getfile?taskID=6728923&name=build.log&offset=-4000
> (and corresponding src.rpm
> http://kojipkgs.fedoraproject.org//work/tasks/8913/6728913/golang-1.2.1-5.fc21.src.rpm)

That makes no sense at all. That error should only occur on Darwin.
I have no explanation.

Ian

Vincent Batts

Apr 14, 2014, 3:55:00 PM
to Ian Lance Taylor, golang-nuts
On Mon, Apr 14, 2014 at 2:38 PM, Ian Lance Taylor <ia...@golang.org> wrote:
> On Mon, Apr 14, 2014 at 11:06 AM, Vincent Batts <vba...@gmail.com> wrote:
>> Have folks had any issues with glibc-2.19 and golang?
>> I am seeing new issues that occur only in Fedora rawhide (not RHEL 6,
>> F19, or F20), which uses glibc-2.19.90-10.fc21.
>>
>> First issue, SIGABRT during the ../misc/cgo/test that sets gid
>> http://koji.fedoraproject.org/koji/getfile?taskID=6728204&name=build.log&offset=-4000
>> (and corresponding src.rpm
>> http://kojipkgs.fedoraproject.org//work/tasks/8196/6728196/golang-1.2.1-4.fc21.src.rpm)
>
> Hmmm, I hope they haven't changed setgid handling again. Can you try
> running the program under gdb to see if gdb can identify where the
> SIGABRT is coming from? The stack trace in the log suggests that it is
> coming from C code.

I'll look to trace this down further.

>> Next issue, also in ../misc/cgo/test on i686,
>>
>> unexpected GOT reloc for non-dynamic symbol _cgoexp_
>>
>> http://koji.fedoraproject.org/koji/getfile?taskID=6728923&name=build.log&offset=-4000
>> (and corresponding src.rpm
>> http://kojipkgs.fedoraproject.org//work/tasks/8913/6728913/golang-1.2.1-5.fc21.src.rpm)
>
> That makes no sense at all. That error should only occur on Darwin.
> I have no explanation.

That's what it looked like to me as well. That's why I wanted to crowd-source for ideas. :-\

Ian Lance Taylor

Apr 25, 2014, 2:27:43 PM
to Vincent Batts, golang-nuts
On Mon, Apr 14, 2014 at 11:06 AM, Vincent Batts <vba...@gmail.com> wrote:
>
> Have folks had any issues with glibc-2.19 and golang?
> I am seeing new issues that occur only in Fedora rawhide (not RHEL 6,
> F19, or F20), which uses glibc-2.19.90-10.fc21.
>
> First issue, SIGABRT during the ../misc/cgo/test that sets gid
> http://koji.fedoraproject.org/koji/getfile?taskID=6728204&name=build.log&offset=-4000
> (and corresponding src.rpm
> http://kojipkgs.fedoraproject.org//work/tasks/8196/6728196/golang-1.2.1-4.fc21.src.rpm)

Vincent sent a build log for this off-line with a gdb backtrace. The
backtrace at the point of failure is below.

The SIGSETXID signal handler is calling abort. This abort is almost
certainly coming from this diff:

https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=13f7fe35ae2b0ea55dc4b9628763aafdc8bdc30c

The way setgid works is that it sets up __xidcmd and then sends a
signal to every thread telling it to run the syscall stored in
__xidcmd. Running this syscall is failing for some reason.
Previously the result of the syscall was not checked, and now it is,
so it may have always been failing.

I tried running the test program under strace -f on my eglibc 2.15
system. I do in fact see that the system call fails:

[pid 19733] setgid(0 <unfinished ...>
[pid 19733] <... setgid resumed> ) = -1 EPERM (Operation not permitted)
[pid 19732] setgid(0 <unfinished ...>
[pid 19732] <... setgid resumed> ) = -1 EPERM (Operation not permitted)
[pid 19731] setgid(0 <unfinished ...>
[pid 19731] <... setgid resumed> ) = -1 EPERM (Operation not permitted)
[pid 19730] setgid(0 <unfinished ...>
[pid 19730] <... setgid resumed> ) = -1 EPERM (Operation not permitted)
[pid 19729] setgid(0) = -1 EPERM (Operation not permitted)
[pid 19728] setgid(0 <unfinished ...>
[pid 19728] <... setgid resumed> ) = -1 EPERM (Operation not permitted)

And, you know what? That's exactly correct. The test is running
setgid(0). It's supposed to fail. The point of the test is not to
see whether setgid succeeds; it's to see that it doesn't hang.

This is a bug introduced by the above patch to glibc. The effect of
the change is that any invalid call to setgid is going to abort the
program. That is not right. This change to glibc should be reverted.

You can almost certainly recreate the bug yourself by running a
multi-threaded program in C and calling setgid(0). I don't have an
easy way to do that since I'm not running the new glibc. Let me know
if you have trouble creating a C test case.
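
A minimal sketch of the kind of C test case described above, not taken from the thread. The names park, setgid_with_threads and MAX_WORKERS are made up for illustration. It keeps a few threads alive, calls setgid() from the calling thread, and reports the result. It has not been run against the affected rawhide glibc build.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_WORKERS 16  /* arbitrary cap for this sketch */

static pthread_mutex_t park_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t park_cond = PTHREAD_COND_INITIALIZER;
static bool finished;

/* Each worker parks until released, so that it is alive while
 * setgid() is applied to every thread in the process. */
static void *park(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&park_lock);
    while (!finished)
        pthread_cond_wait(&park_cond, &park_lock);
    pthread_mutex_unlock(&park_lock);
    return NULL;
}

/* Start nthreads parked workers, call setgid(gid), then release and
 * join the workers. Returns 0 if setgid() succeeded, the errno value
 * if it failed, or -1 if the workers could not be started. */
int setgid_with_threads(int nthreads, gid_t gid)
{
    pthread_t workers[MAX_WORKERS];
    int started = 0;
    int result = -1;

    if (nthreads < 0 || nthreads > MAX_WORKERS)
        return -1;

    pthread_mutex_lock(&park_lock);
    finished = false;
    pthread_mutex_unlock(&park_lock);

    while (started < nthreads &&
           pthread_create(&workers[started], NULL, park, NULL) == 0)
        started++;

    /* Only attempt the call if every worker is running. */
    if (started == nthreads)
        result = (setgid(gid) == 0) ? 0 : errno;

    /* Release the workers and wait for them to exit. */
    pthread_mutex_lock(&park_lock);
    finished = true;
    pthread_cond_broadcast(&park_cond);
    pthread_mutex_unlock(&park_lock);

    for (int i = 0; i < started; i++) {
        int rc = pthread_join(workers[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    return result;
}
```

Called as setgid_with_threads(4, 0) from an unprivileged process, this would be expected to return EPERM on a glibc without the problem; if the analysis above is right, the affected build would abort inside the call instead. Older toolchains need -pthread when compiling.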

Ian

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffeffff700 (LWP 15230)]
0x00007ffff731ebf7 in raise () from /lib64/libc.so.6
#0 0x00007ffff731ebf7 in raise () from /lib64/libc.so.6
#1 0x00007ffff732085a in abort () from /lib64/libc.so.6
#2 0x00007ffff76b4cb6 in sighandler_setxid () from /lib64/libpthread.so.0
#3 <signal handler called>
#4 runtime.futex () at /usr/lib/golang/src/pkg/runtime/sys_linux_amd64.s:268
#5 0x0000000000417587 in runtime.futexsleep (
    addr=<error reading variable: Attempt to dereference a generic pointer.>,
    val=<error reading variable: Attempt to dereference a generic pointer.>,
    ns=<error reading variable: Attempt to dereference a generic pointer.>)
    at /usr/lib/golang/src/pkg/runtime/os_linux.c:49
#6 0x0000000000410056 in runtime.notesleep (
    n=<error reading variable: Attempt to dereference a generic pointer.>)
    at /usr/lib/golang/src/pkg/runtime/lock_futex.c:134
#7 0x000000000041b6c1 in stopm ()
    at /usr/lib/golang/src/pkg/runtime/proc.c:932
#8 0x000000000041bc91 in startlockedm (
    gp=<error reading variable: Attempt to dereference a generic pointer.>)
    at /usr/lib/golang/src/pkg/runtime/proc.c:1078
#9 0x000000000041c57b in schedule ()
    at /usr/lib/golang/src/pkg/runtime/proc.c:1327
#10 0x000000000041b006 in runtime.mstart ()
    at /usr/lib/golang/src/pkg/runtime/proc.c:606
#11 0x0000000000405673 in crosscall_amd64 ()
    at /builddir/build/BUILD/go/src/pkg/runtime/cgo/gcc_amd64.S:35
#12 0x00007fffeffff9c0 in ?? ()
#13 0x00007fffeffff700 in ?? ()
#14 0x0000000000001000 in ?? ()
#15 0x00007ffff72e7c00 in ?? ()
#16 0x0000000000000000 in ?? ()
A debugging session is active.
Inferior 1 [process 15222] will be killed.
Quit anyway? (y or n) [answered Y; input not from terminal]

+ ./test.test
SIGABRT: abort
PC=0x7f1edd315bf7

goroutine 1 [chan receive]:
testing.RunTests(0x5ffca8, 0x9bbd20, 0x29, 0x29, 0x1)
    /usr/lib/golang/src/pkg/testing/testing.go:472 +0x8d5
testing.Main(0x5ffca8, 0x9bbd20, 0x29, 0x29, 0x9b5790, ...)
    /usr/lib/golang/src/pkg/testing/testing.go:403 +0x84
main.main()
    _/builddir/build/BUILD/go/misc/cgo/test/_test/_testmain.go:129 +0x9c

goroutine 3 [syscall]:
os/signal.loop()
    /usr/lib/golang/src/pkg/os/signal/signal_unix.go:21 +0x1e
created by os/signal.init·1
    /usr/lib/golang/src/pkg/os/signal/signal_unix.go:27 +0x31

goroutine 4 [syscall]:
runtime.goexit()
    /usr/lib/golang/src/pkg/runtime/proc.c:1394

goroutine 5 [syscall]:
_/builddir/build/BUILD/go/misc/cgo/test._Cfunc_usleep(0x7f1e00002710, 0x424239)
    _/builddir/build/BUILD/go/misc/cgo/test/_test/_cgo_defun.c:778 +0x31
created by _/builddir/build/BUILD/go/misc/cgo/test.lockOSThreadCallback
    /usr/lib/golang/misc/cgo/test/issue3775.go:35 +0x36

goroutine 6 [select]:
_/builddir/build/BUILD/go/misc/cgo/test.testSetgid(0xc21006d000)
    /usr/lib/golang/misc/cgo/test/setgid_linux.go:27 +0x1ce
_/builddir/build/BUILD/go/misc/cgo/test.TestSetgid(0xc21006d000)
    /usr/lib/golang/misc/cgo/test/cgo_linux_test.go:9 +0x27
testing.tRunner(0xc21006d000, 0x9bbd20)
    /usr/lib/golang/src/pkg/testing/testing.go:391 +0x8b
created by testing.RunTests
    /usr/lib/golang/src/pkg/testing/testing.go:471 +0x8b2

goroutine 7 [syscall]:
_/builddir/build/BUILD/go/misc/cgo/test._Cfunc_setgid(0x0, 0x0)
    _/builddir/build/BUILD/go/misc/cgo/test/_test/_cgo_defun.c:643 +0x31
_/builddir/build/BUILD/go/misc/cgo/test.func·014()
    /usr/lib/golang/misc/cgo/test/setgid_linux.go:24 +0x2e
created by _/builddir/build/BUILD/go/misc/cgo/test.testSetgid
    /usr/lib/golang/misc/cgo/test/setgid_linux.go:26 +0x7f

rax 0x0
rbx 0x9ca180
rcx 0xffffffffffffffff
rdx 0x6
rdi 0x3b7f
rsi 0x3b84
rbp 0x0
rsp 0x7f1eda9da838
r8 0x0
r9 0x0
r10 0x8
r11 0x202
r12 0x7f1edd2dec00
r13 0x0
r14 0x7f1eda9db700
r15 0x7f1eda9db9c0
rip 0x7f1edd315bf7
rflags 0x202
cs 0x33
fs 0x0
gs 0x0

Florian Weimer

Apr 25, 2014, 2:47:31 PM
to golan...@googlegroups.com
* Ian Lance Taylor:

> This is a bug introduced by the above patch to glibc. The effect of
> the change is that any invalid call to setgid is going to abort the
> program. That is not right. This change to glibc should be reverted.

The abort is totally intentional. The reason for it is that at this
point, it is impossible to complete successfully (because some thread
could not switch privileges), and it is also impossible to return an
error (some threads have likely changed their privileges and cannot
turn back).

There is a bug somewhere, but the glibc change only exposed it, it is
not itself the cause.

> You can almost certainly recreate the bug yourself by running a
> multi-threaded program in C and calling setgid(0). I don't have an
> easy way to do that since I'm not running the new glibc. Let me know
> if you have trouble creating a C test case.

The patch you referenced contains a C test case which shows one way to
trigger the abort: call some privilege-changing system call directly,
bypassing the glibc wrapper.

It is still possible to do this, but then you must not call any of
glibc's set*id functions, ever.
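
For readers unfamiliar with the distinction, here is a small sketch of what "bypassing the glibc wrapper" can look like. It is not the test case from the glibc patch; raw_setgid and raw_setgid_self_check are hypothetical names. On Linux a raw setgid system call changes only the calling thread's credentials, whereas glibc's setgid() also applies the change to every other thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* 32-bit targets such as i686 expose the full-width call as
 * SYS_setgid32; 64-bit targets only have SYS_setgid. */
#ifdef SYS_setgid32
#define RAW_SETGID_NR SYS_setgid32
#else
#define RAW_SETGID_NR SYS_setgid
#endif

/* Invoke the kernel's setgid directly, without glibc's setgid()
 * wrapper. Only the calling thread is affected.
 * Returns 0 on success or the errno value on failure. */
int raw_setgid(gid_t gid)
{
    return (syscall(RAW_SETGID_NR, gid) == 0) ? 0 : errno;
}

/* Setting the GID to the current real GID is permitted even without
 * privileges, so this should succeed and leave credentials unchanged. */
void raw_setgid_self_check(void)
{
    assert(raw_setgid(getgid()) == 0);
}
```

A program that uses raw_setgid() to give one thread a different GID has threads with differing credentials, which is the situation the abort was meant to catch.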

Ian Lance Taylor

Apr 25, 2014, 5:06:33 PM
to Florian Weimer, golang-nuts
On Fri, Apr 25, 2014 at 11:47 AM, Florian Weimer <f...@deneb.enyo.de> wrote:
> * Ian Lance Taylor:
>
>> This is a bug introduced by the above patch to glibc. The effect of
>> the change is that any invalid call to setgid is going to abort the
>> program. That is not right. This change to glibc should be reverted.
>
> The abort is totally intentional. The reason for it is that at this
> point, it is impossible to complete successfully (because some thread
> could not switch privileges), and it is also impossible to return an
> error (some threads have likely changed their privileges and cannot
> turn back).
>
> There is a bug somewhere, but the glibc change only exposed it, it is
> not itself the cause.

Please try to write the C test case I described.

As far as I can see, if you call setgid(0) from a non-setuid
multi-threaded program, that program will abort. That is wrong.
setgid(0) should return an error in that case, not abort.

If setgid(0) in a multi-threaded program does return an error
correctly, and the program does not abort, let me know. At present I
do not understand how that could happen.


>> You can almost certainly recreate the bug yourself by running a
>> multi-threaded program in C and calling setgid(0). I don't have an
>> easy way to do that since I'm not running the new glibc. Let me know
>> if you have trouble creating a C test case.
>
> The patch you referenced contains a C test case which shows one way to
> trigger the abort: call some privilege-changing system call directly,
> bypassing the glibc wrapper.
>
> It is still possible to do this, but then you must not call any of
> glibc's set*id functions, ever.

The Go test case does not do that.

Ian

Ian Lance Taylor

Apr 25, 2014, 11:46:45 PM
to Florian Weimer, golang-nuts
On Fri, Apr 25, 2014 at 11:47 AM, Florian Weimer <f...@deneb.enyo.de> wrote:
> * Ian Lance Taylor:
>
>> This is a bug introduced by the above patch to glibc. The effect of
>> the change is that any invalid call to setgid is going to abort the
>> program. That is not right. This change to glibc should be reverted.
>
> The abort is totally intentional. The reason for it is that at this
> point, it is impossible to complete successfully (because some thread
> could not switch privileges), and it is also impossible to return an
> error (some threads have likely changed their privileges and cannot
> turn back).
>
> There is a bug somewhere, but the glibc change only exposed it, it is
> not itself the cause.

Here is how to see that the current glibc is wrong: what if you make a
setgid call that you do not have permission to run? Then the call in
sighandler_setxid is sure to fail. There is no inconsistency in the
program: no thread will have switched privileges, since every
sighandler_setxid will fail. There is no need to abort, and in fact
aborting is incorrect.
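
The argument above can be sketched as a small decision function. This only illustrates the reasoning and is not glibc's code; the enum and function names are invented. Given each thread's syscall result (0 for success, nonzero for failure), the process is only in an inconsistent state when some threads succeeded and others failed.

```c
#include <assert.h>
#include <stddef.h>

enum setxid_outcome {
    SETXID_OK,            /* every thread switched: report success */
    SETXID_ERROR,         /* no thread switched: safe to return an error */
    SETXID_INCONSISTENT   /* mixed results: threads now disagree */
};

/* Classify the per-thread results of a process-wide set*id attempt.
 * results[i] is 0 if thread i's syscall succeeded, nonzero otherwise. */
enum setxid_outcome classify_setxid(const int *results, size_t n)
{
    size_t failures = 0;

    assert(results != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (results[i] != 0)
            failures++;

    if (failures == 0)
        return SETXID_OK;
    if (failures == n)
        return SETXID_ERROR;
    return SETXID_INCONSISTENT;
}
```

In the strace output earlier in the thread every thread got EPERM, which corresponds to SETXID_ERROR here: nothing changed, so returning the error is safe.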

Ian

Florian Weimer

Apr 26, 2014, 3:38:16 AM
to Ian Lance Taylor, golang-nuts
* Ian Lance Taylor:

> Here is how to see that the current glibc is wrong: what if you make a
> setgid call that you do not have permission to run? Then the call in
> sighandler_setxid is sure to fail. There is no inconsistency in the
> program: no thread will have switched privileges, since every
> sighandler_setxid will fail. There is no need to abort, and in fact
> aborting is incorrect.

Ah, I see now. Sorry about that, I'll fix it.

Vincent Batts

May 6, 2014, 12:13:27 PM
to Ian Lance Taylor, golang-nuts
On Mon, Apr 14, 2014 at 2:38 PM, Ian Lance Taylor <ia...@golang.org> wrote:

The reproduction for this takes a bit of setup, but I just put together https://gist.github.com/vbatts/d4519a7063367de032b7. If you run `sh run.sh`, you'll see the error and end up with a shell in the ./misc/cgo/test directory.


Vincent Batts

May 6, 2014, 3:12:58 PM
to Ian Lance Taylor, golang-nuts
Sorry, I should add: the call to `go test -ldflags '-linkmode=internal'` is from ./src/run.bash:108 of go1.2.1.

