[LLVMdev] Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows


Ramkumar Ramachandra

May 19, 2015, 10:26:15 AM
to LLVMdev, Dale Martin
Hi,

We have been seeing sporadic crashes since we migrated to MCJIT on Win64. The same tests pass without issues on Mac64 and Linux64. The failure is this assertion in RuntimeDyldELF.cpp:

  RealOffset <= INT32_MAX && RealOffset >= INT32_MIN

I haven't managed to catch the failure in Visual Studio to debug it. Any tips on how to make progress?

Oh, and we're on LLVM 3.5.

Thanks.

Ram

Reid Kleckner

May 19, 2015, 12:10:18 PM
to Ramkumar Ramachandra, Dale Martin, LLVMdev
That sounds like a PC-relative relocation failure. Usually this happens when the relocation target is more than 2 GB away from the source. Try using the large code model or tweaking the memory manager.

It turns out it's surprisingly hard to portably allocate some memory and then allocate some more within a 2 GB offset of the first allocation in a 64-bit process. For various reasons that I don't understand, reserving 2 GB of address space upfront and allocating from that is not workable for some MCJIT clients.
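
To make the failing condition concrete, here is a minimal sketch of the range check behind that assertion. It is not the RuntimeDyldELF code: the helper names (`realOffset`, `fitsInRel32`, `encodeRel32`) are hypothetical, and it assumes the usual x86-64 ELF rule that a 32-bit PC-relative relocation stores target + addend - place in a signed 32-bit field.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: displacement for a PC-relative relocation,
// computed as target + addend - place. Unsigned arithmetic is used so
// that wraparound is well defined before converting back to signed.
inline int64_t realOffset(uint64_t target, int64_t addend, uint64_t place) {
  return static_cast<int64_t>(target + static_cast<uint64_t>(addend) - place);
}

// True when the displacement fits in a signed 32-bit field, which is
// the same condition as the assertion quoted in this thread.
inline bool fitsInRel32(int64_t offset) {
  return offset <= std::numeric_limits<int32_t>::max() &&
         offset >= std::numeric_limits<int32_t>::min();
}

// Hypothetical encoder: asserts the range, then truncates to 32 bits.
inline int32_t encodeRel32(uint64_t target, int64_t addend, uint64_t place) {
  int64_t offset = realOffset(target, addend, place);
  assert(fitsInRel32(offset) && "target is out of range for a rel32 fixup");
  return static_cast<int32_t>(offset);
}
```

With these helpers, a call site and a target that land in allocations close together pass the check, while two allocations that the OS happens to place more than 2 GB apart fail it. That dependence on where each allocation lands would be consistent with the failures being sporadic.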

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


Ramkumar Ramachandra

May 22, 2015, 5:14:10 PM
to Reid Kleckner, Dale Martin, Peng....@mathworks.com, LLVMdev
So it appears that we get about half as many crashes with the large code model; the rest crash in the same way. Either the large code model still takes the crashing code path and the number of crashes only went down by chance, or somewhere in the flow the large code model is not being mapped to ELF::R_X86_64_PC64. I'm digging into this issue further, but any hints along the way would be appreciated.

Thanks.

Ram

Keno Fischer

May 22, 2015, 7:16:57 PM
to Ramkumar Ramachandra, Peng....@mathworks.com, LLVMdev, Dale Martin
This might be related to GOT relocations. I rewrote that part of RuntimeDyldELF because I was seeing this issue. Have you tried trunk?

Lang Hames

May 22, 2015, 7:58:44 PM
to Keno Fischer, Dale Martin, Peng....@mathworks.com, LLVMdev

On Fri, May 22, 2015 at 4:14 PM, Keno Fischer <kfis...@college.harvard.edu> wrote:
This might be related to GOT relocations. I rewrote that part of RuntimeDyldELF because I was seeing this issue. Have you tried trunk?

I didn't notice that you were running 3.5 the first time I read this. Keno's diagnosis is very likely to be correct. You should try trunk if you're able to.

- Lang.

Dale Martin

May 23, 2015, 10:28:13 AM
to Lang Hames, Keno Fischer, Peng Cheng, LLVMdev

This sounds pretty serious and it won't be easy for us to upgrade, particularly not to trunk. Are there plans to take bug fixes like this into LLVM 3.5.x point releases? (Do I remember right that 3.5.x is supposed to have some kind of long term support? Where is that process documented?)


Thanks,

  Dale




Eric Christopher

May 23, 2015, 5:15:27 PM
to Dale Martin, Lang Hames, Keno Fischer, Tom Stellard, Peng Cheng, LLVMdev
Hi Dale,

I don't think that Keno's rewrite is applicable for a bug fix release. We have, in the last year, moved to having some dot releases for our older releases, but these are definitely bug fix only and low risk as we don't want to break anything new. 

The release documentation is located here:


for future reference. There's no official long term support strategy past the information on that page. Previously we released every 6 months without dot releases at all, so this is a fairly new trial for us. Backporting of patches is at the discretion of the author, the code owner, and the release manager.

Keno: perfectly happy to entertain a backport of your patch if you want to do such a thing, but IIRC it was a bit more than a simple bug fix.

-eric

Keno Fischer

May 23, 2015, 5:30:53 PM
to Eric Christopher, Dale Martin, Peng Cheng, LLVMdev
The commits in question are r234839 and you'll probably also want r236341. I don't think these are the kinds of commits that should generally be backported. It's not really a small self-contained commit. If you're willing, you can probably carry these patches yourself (we will be doing so on top of 3.6 until 3.7 is released), but do note that in my experience using MCJIT with the large code model does not quite work yet (it's on my todo list to work out exactly why and fix it). Also, I believe the memory allocation scheme for MCJIT was rewritten slightly between 3.5 and trunk, so there may be additional problems I don't know about.

Dale Martin

May 23, 2015, 6:14:27 PM
to Keno Fischer, Peng Cheng, LLVMdev
That implies that 3.6 is not really usable on Windows, doesn't it, since the legacy JIT was removed? (At least if you might need the large code model.)


Keno Fischer

May 23, 2015, 6:20:26 PM
to Dale Martin, Peng Cheng, LLVMdev
Correct, though this is certainly not the only issue preventing LLVM 3.6 from being usable on Windows. I think we got a few of them backported to 3.6.1, but there are a few more still remaining.

Nicholas Chapman

May 24, 2015, 6:40:48 PM
to llv...@cs.uiuc.edu
There was a crash on Windows when more than 4K is allocated on the stack at once, due to the chkstk call offset being too large. Maybe you are seeing that.
It's fixed in trunk, but not in any stable release.

Nick C.