racket 6.3 build failures on armel

David Bremner

Dec 10, 2015, 5:00:53 PM
to racke...@googlegroups.com, debia...@lists.debian.org

I'm stuck figuring out some build-failures on armel.

On the autobuilders, I get (twice, on two different autobuilders):

,----
| Copying /«PKGBUILDDIR»/collects/racket/private/kernstruct.rkt to /«PKGBUILDDIR»/build/racket/gc2/xform-collects/racket/private/kernstruct.rkt
| Copying /«PKGBUILDDIR»/collects/racket/private/norm-arity.rkt to /«PKGBUILDDIR»/build/racket/gc2/xform-collects/racket/private/norm-arity.rkt
| Copying /«PKGBUILDDIR»/collects/racket/private/top-int.rkt to internal error in JIT;
| ending address 0xb35bf650 not in [0xb35bf548,0xb35bf648] (0)
| internal error: JIT buffer overflow
`----
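
As a quick check of the numbers in that message: the ending address lies 8 bytes beyond the upper end of the reported range, and the range itself spans 0x100 = 256 bytes. A minimal C sketch of that arithmetic follows. The helper name is hypothetical, and it assumes 32-bit addresses as on armel; it says nothing about how the Racket JIT computes the range.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes by which addr lies outside the inclusive range
   [lo, hi]; 0 if addr is inside the range. Hypothetical helper,
   used only to check the figures in the log message. */
uint32_t bytes_outside(uint32_t lo, uint32_t hi, uint32_t addr)
{
    assert(lo <= hi);
    if (addr > hi)
        return addr - hi;
    if (addr < lo)
        return lo - addr;
    return 0;
}
```

Calling `bytes_outside(0xb35bf548u, 0xb35bf648u, 0xb35bf650u)` gives 8, so the overshoot is small compared with the 256-byte range.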

There are also a few warnings; I'm not sure whether there is some
kind of architecture detection failure.

https://buildd.debian.org/status/fetch.php?pkg=racket&arch=armel&ver=6.3-1&stamp=1449355720

On the porterbox (abel.debian.org), I can't duplicate that failure, but
I get occasional (less than 1 build in 10) segfaults in racketcgc:

,----
| env XFORM_USE_PRECOMP=xsrc/precomp.h ../racketcgc -cqu /home/bremner/racket-6.3/src/racket/gc2/xform.rkt --setup . --depends \
| --cpp "gcc -E -I./.. -I/home/bremner/racket-6.3/src/racket/gc2/../include -D_FORTIFY_SOURCE=2 -DUSE_SENORA_GC -D_LARGEFILE\
| _SOURCE -D_FILE_OFFSET_BITS=64 " --keep-lines -o xsrc/dynext.c /home/bremner/racket-6.3/src/racket/gc2/../src/dynext.c
| Segmentation fault
| Makefile:216: recipe for target 'xsrc/compile.c' failed
| make[5]: *** [xsrc/compile.c] Error 139
`----

Finally, once (out of about 25 builds), racketcgc was still running 9.5h
later, and ignoring SIGTERM. It was apparently processing char.c.

Matthew Flatt

Dec 10, 2015, 5:20:43 PM
to David Bremner, racke...@googlegroups.com, debia...@lists.debian.org
Is it possible to get the output of

gcc -E -dM - < /dev/null

on that machine?

The JIT is sensitive to a number of preprocessor definitions, and I
might be able to provoke the buffer overflow by using the same values.

Segfaults or freezes could be the same problem, so the porterbox values
might also be useful.

Thanks!

David Bremner

Dec 10, 2015, 8:02:30 PM
to Matthew Flatt, racke...@googlegroups.com, debia...@lists.debian.org
Matthew Flatt <mfl...@cs.utah.edu> writes:

> Is it possible to get the output of
>
> gcc -E -dM - < /dev/null
>
> on that machine?
>
> The JIT is sensitive to a number of preprocessor definitions, and I
> might be able to provoke the buffer overflow by using the same values.
>
> Segfaults or freezes could be the same problem, so the porterbox values
> might also be useful.

The porterbox is easy, and attached. The autobuilder is a bit more work,
but I can do it if the porterbox values don't suggest anything to try
(it would somehow be more efficient to try an experimental patch and get
the preprocessor values at the same time)

Cheers,

d

preproc.txt

David Bremner

Dec 11, 2015, 10:27:15 AM
to Matthew Flatt, racke...@googlegroups.com, debia...@lists.debian.org
David Bremner <da...@tethera.net> writes:

> The porterbox is easy, and attached. The autobuilder is a bit more work,
> but I can do it if the porterbox values don't suggest anything to try
> (it would somehow be more efficient to try an experimental patch and get
> the preprocessor values at the same time)

A helpful sysadmin (hi pabs!) ran the command on the autobuilder as well, and got
_almost_ the same output.

It most likely is significant that the gcc version is different. I'll
try upgrading the porterbox chroot and see if it duplicates the
autobuilder failure.

--- abel.txt 2015-12-11 11:22:53.794883202 -0400
+++ 343324 2015-12-11 11:19:12.565432819 -0400
@@ -165,7 +165,7 @@
 #define __SFRACT_MIN__ (-0.5HR-0.5HR)
 #define __UTQ_FBIT__ 128
 #define __FLT_MANT_DIG__ 24
-#define __VERSION__ "5.2.1 20151125"
+#define __VERSION__ "5.3.1 20151207"
 #define __UINT64_C(c) c ## ULL
 #define __ULLFRACT_FBIT__ 64
 #define __FRACT_EPSILON__ 0x1P-15R
@@ -315,7 +315,7 @@
 #define __INTMAX_TYPE__ long long int
 #define __DEC128_MAX_EXP__ 6145
 #define __ATOMIC_CONSUME 1
-#define __GNUC_MINOR__ 2
+#define __GNUC_MINOR__ 3
 #define __UINTMAX_MAX__ 0xffffffffffffffffULL
 #define __DEC32_MANT_DIG__ 7
 #define __HA_FBIT__ 7

David Bremner

Dec 11, 2015, 1:50:57 PM
to Matthew Flatt, racke...@googlegroups.com, debia...@lists.debian.org
David Bremner <da...@tethera.net> writes:

> A helpful sysadmin (hi pabs!) ran the command on the autobuilder as well, and got
> _almost_ the same output.
>
> It most likely is significant that the gcc version is different. I'll
> try upgrading the porterbox chroot and see if it duplicates the
> autobuilder failure.
>

I managed this earlier than expected, and indeed I can confirm that the JIT
buffer overflow in racketcgc is repeatable on the porterbox (even the
address that triggers the error is the same) in my experiments.

So in principle I could run racketcgc under gdb, if it would help.

Sorry for making this more complicated than necessary; I missed the
variation in gcc version. In some sense it's a blessing that the crash
is more deterministic under gcc 5.3.

Matthew Flatt

Dec 11, 2015, 1:59:55 PM
to David Bremner, racke...@googlegroups.com, debia...@lists.debian.org
At Fri, 11 Dec 2015 14:50:52 -0400, David Bremner wrote:
> David Bremner <da...@tethera.net> writes:
>
> > A helpful sysadmin (hi pabs!) ran the command on the autobuilder as well,
> and got
> > _almost_ the same output.
> >
> > It most likely is significant that the gcc version is different. I'll
> > try upgrading the porterbox chroot and see if it duplicates the
> > autobuilder failure.
> >
>
> I managed this earlier than expected, and indeed I can confirm the jit
> buffer overflow in racketcgc is repeatable on the porterbox (even the
> address that triggers the error is the same) in my experiments.
>
> So in principle I could run racketcgc under gdb, if it would help.

That's great news, since I haven't made any progress in my attempts.

Can you try changing, in "jit.h" around line 1324

#define PAST_LIMIT() ....
#define CHECK_LIMIT() ....
#if 1

to

#define PAST_LIMIT() ....
#define CHECK_LIMIT() ....
#if 0

?

Hopefully, the crash will then provide more useful information.

David Bremner

Dec 11, 2015, 2:51:49 PM
to Matthew Flatt, racke...@googlegroups.com, debia...@lists.debian.org
Matthew Flatt <mfl...@cs.utah.edu> writes:

>
> #define PAST_LIMIT() ....
> #define CHECK_LIMIT() ....
> #if 0
>
> ?
>
> Hopefully, the crash will then provide more useful information.

With that change I get

Copying /home/bremner/racket-6.3/collects/racket/private/kw-file.rkt to /home/bremner/racket-6.3/build/racket/gc2/xform-collects/racket/private/kw-file.rkt
Copying way past /home/bremner/racket-6.3/src/racket/src/jit.c 2588
Aborted

David Bremner

Dec 12, 2015, 2:41:45 PM
to Matthew Flatt, racke...@googlegroups.com
Matthew Flatt <mfl...@cs.utah.edu> writes:

> Meanwhile, commit c6b8ba7c4a is another shot (still mostly in the dark)
> at this problem. It might work to try it as a patch:
>
> https://github.com/racket/racket/commit/c6b8ba7c4a40e9a0933df2661332167d55c8bf80.patch

This yields the same abort as before (at jit.c:2588). Expanding the
buffer to 200 seemed to work. Should I just patch the Debian package, or
do you want to adjust the ifdef in jit.h?

cheers,

d

Matthew Flatt

Dec 12, 2015, 3:56:03 PM
to David Bremner, racke...@googlegroups.com
I'm glad we've found something that has an effect, but I'm puzzled by
the need for a larger amount of JIT-buffer padding. The fact that the
problem shows up after upgrading the compiler makes me worry that the
JIT is doing something undefined at the C level.

Does compiling with `-fno-strict-aliasing` have any effect?
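
For context on that flag: `-fno-strict-aliasing` tells GCC not to assume that pointers to different types refer to different objects. The sketch below is a generic illustration of the kind of access that is undefined under the strict-aliasing rule, next to a well-defined alternative. The function names are hypothetical and nothing here is taken from the Racket JIT; whether the JIT contains anything similar is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Undefined behavior: reads a float object through a uint32_t lvalue.
   With strict aliasing enabled (GCC's default at -O2), the compiler
   may assume the two types never alias and reorder or drop accesses.
   Shown for contrast only; do not call this in real code. */
uint32_t bits_via_cast(const float *f)
{
    return *(const uint32_t *)f;
}

/* Well-defined: copy the object representation with memcpy. */
uint32_t bits_via_memcpy(const float *f)
{
    uint32_t u;
    assert(sizeof u == sizeof *f);  /* assumes a 32-bit float */
    memcpy(&u, f, sizeof u);
    return u;
}
```

Code of the first kind can appear to work with one compiler release and misbehave with the next, which is one reason a compiler upgrade is a common trigger for this class of bug.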

David Bremner

Dec 12, 2015, 7:45:16 PM
to Matthew Flatt, racke...@googlegroups.com
Matthew Flatt <mfl...@cs.utah.edu> writes:


> I'm glad we've found something that has an effect, but I'm puzzled by
> the need for a larger amount of JIT-buffer padding. The fact that the
> problem shows up after upgrading the compiler makes me worry that the
> JIT is doing something undefined at the C level.
>
> Does compiling with `-fno-strict-aliasing` have any effect?

That also makes the build complete.

d

Matthew Flatt

Dec 15, 2015, 2:20:42 PM
to David Bremner, racke...@googlegroups.com
I haven't figured out why that makes the build complete. Increasing the
buffer padding to 200 is sensible, though, and since that solves the
crashing problem, I'm going to set the issue aside for now.

Thanks!
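
As general background on what buffer padding means in a code emitter, here is a minimal C sketch under loudly stated assumptions. The struct, the function names, and the tiny sizes are all hypothetical; this is not the Racket JIT's `PAST_LIMIT()`/`CHECK_LIMIT()` code, and it does not explain why the overflow appeared with gcc 5.3. The idea it shows: the emitter checks a soft limit only between instruction sequences, so the padding past that limit has to be at least as large as the longest sequence written between two checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAD_SIZE ((size_t)8)  /* hypothetical slack past the soft limit */

typedef struct {
    unsigned char *start;  /* first byte of the buffer */
    size_t pos;            /* next write offset */
    size_t size;           /* total capacity, padding included */
} emit_buf;

/* Offset after which the caller should stop and retry with a
   larger buffer. */
static size_t soft_limit(const emit_buf *b)
{
    assert(b->size >= PAD_SIZE);
    return b->size - PAD_SIZE;
}

/* Nonzero once the write position has crossed the soft limit. */
int past_limit(const emit_buf *b)
{
    return b->pos > soft_limit(b);
}

/* Append n bytes. Returns 1 on success, or 0 (writing nothing) if the
   bytes would run past the real end of the buffer, which is the case
   that sufficient padding is meant to rule out. */
int emit_bytes(emit_buf *b, const void *src, size_t n)
{
    assert(b->pos <= b->size);
    if (n > b->size - b->pos)
        return 0;
    memcpy(b->start + b->pos, src, n);
    b->pos += n;
    return 1;
}
```

With a 32-byte buffer and 8 bytes of padding, two 12-byte sequences leave `pos` at 24, which is not yet past the soft limit of 24, so a third 12-byte sequence is attempted and would overrun the buffer. Raising the padding to at least 12 makes the check fire after the second sequence instead.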
