I plan to kill several birds (as many as 3) with one stone with regard to the split-lock warnings.
When calling from immobile code to immobile code through an immobile fdefn, we can emit it as "call [rip+n]" where rip+n is an fdefn-raw-addr slot.
This call format removes the address range restriction, as long as the immobile code space and the immobile fdefns are within +/- 2GB of each other (the reach of a signed 32-bit displacement). So I'll separate immobile fixed objects into a space of fdefns and a space of symbols, ensuring that fdefns remain near enough to pages of code.
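To make the +/- 2GB condition concrete, here is a minimal sketch in Python (not SBCL code) that computes the displacement for "call [rip+n]" and rejects a slot that is out of reach. The addresses and the function names are made up for illustration; the only facts assumed are that the instruction is encoded as FF 15 followed by a signed 32-bit displacement, and that the displacement is measured from the end of the 6-byte instruction.

```python
import struct

# "call [rip+disp32]" is encoded as FF 15 <disp32>, 6 bytes in total.
CALL_RIPREL_LEN = 6


def riprel_disp(call_addr, slot_addr):
    """Return the disp32 for a call at call_addr through the slot at
    slot_addr, or None if the slot is outside the signed 32-bit range.
    The displacement is relative to the address of the next instruction."""
    disp = slot_addr - (call_addr + CALL_RIPREL_LEN)
    if -(1 << 31) <= disp < (1 << 31):
        return disp
    return None


def encode_call_through_slot(call_addr, slot_addr):
    """Encode the 6-byte instruction, or raise if the slot is unreachable."""
    disp = riprel_disp(call_addr, slot_addr)
    if disp is None:
        raise ValueError("fdefn slot is not within +/- 2GB of the call site")
    return b"\xff\x15" + struct.pack("<i", disp)


if __name__ == "__main__":
    # Hypothetical addresses: code at 0x50000000, fdefn slot 1MB above it.
    print(encode_call_through_slot(0x50000000, 0x50100000).hex())
```

Note that the check constrains only the distance from the call site to the fdefn slot. The function the slot points to can be anywhere, which is the point of the indirection.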
This proposal basically re-introduces the level of indirection conferred by fdefns, while still not requiring a load of RAX. It would also eliminate the so-called "static linking", which may not have done much anyway. In fact I would guess that if there is any slowdown, we can recover it with a different optimization similar in spirit to the unboxed call convention. For example, we can avoid loading the arg-count-passing registers for known-fixed-arg receivers.
And of course then the split-lock problem goes away because we'll never overwrite an instruction to change an fdefn's function.
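A rough illustration of why the data slot is safer than the instruction operand, as a Python sketch rather than anything from the runtime. It assumes a 64-byte cache line (typical for x86-64, but an assumption here), and the helper name and addresses are hypothetical. A split lock arises when a locked access spans two cache lines: a 4-byte rel32 operand can land anywhere in the instruction stream, while a naturally aligned 8-byte slot cannot straddle a line.

```python
CACHE_LINE = 64  # assumed cache line size in bytes


def straddles_cache_line(addr, size, line=CACHE_LINE):
    """True if the byte range [addr, addr+size) touches two cache lines."""
    return addr // line != (addr + size - 1) // line


if __name__ == "__main__":
    # Hypothetical "call rel32" whose E8 opcode sits at 0x103D: the 4-byte
    # operand occupies 0x103E..0x1041 and crosses the boundary at 0x1040.
    print(straddles_cache_line(0x103D + 1, 4))

    # An 8-byte slot on an 8-byte boundary never crosses a 64-byte line.
    print(any(straddles_cache_line(a, 8) for a in range(0, 4096, 8)))
```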
For elfinated core files, I'll perhaps still change the call instruction from the "call [rip+n]" form to the "call rel32" form, but that's a separate question. As it happens, some of our cloud machines in fact disallow writable code in '.text' segments, which is where lisp code goes after conversion to ELF.
Also, as a totally unrelated issue but drawing on a similar principle, it may be possible to relax the requirement for immobile symbols to be mapped below 4GB, because their addresses could be computed with "LEA Rn, [rip+n]". But this tends to cost 1 extra instruction. Symbols really want to be imm32 operands, which would require that the in-register representation of a symbol be its displacement from the base of a hypothetical "symbol space". This is equivalent to a JVM's compressed pointer representation. We can do the same thing with compact-instance-header layouts: keep them as 32-bit pointers but allow the layouts to be anywhere. It adds one instruction to lots of things that use layouts (to compute an actual address whence to load the inherited layouts) but it doesn't add any instruction for comparison of layouts for EQness.
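The displacement-from-base idea can be sketched as follows. This is Python used as a model of the arithmetic only; the base address, the 4GB space size, and the function names are all invented for illustration and do not correspond to anything in the runtime.

```python
# Hypothetical base of a "symbol space" mapped well above 4GB.
SYMBOL_SPACE_BASE = 0x7F0000000000
SYMBOL_SPACE_SIZE = 1 << 32  # offsets must fit in 32 bits


def compress(addr):
    """Full address -> 32-bit displacement from the space base."""
    offset = addr - SYMBOL_SPACE_BASE
    if not 0 <= offset < SYMBOL_SPACE_SIZE:
        raise ValueError("address is outside the symbol space")
    return offset


def decompress(offset):
    """32-bit displacement -> full address (the one extra add)."""
    return SYMBOL_SPACE_BASE + offset


def eq(offset_a, offset_b):
    """EQ test directly on the compressed form, with no decompression."""
    return offset_a == offset_b


if __name__ == "__main__":
    sym = SYMBOL_SPACE_BASE + 0x1230  # made-up symbol address
    c = compress(sym)
    print(hex(c), hex(decompress(c)), eq(c, compress(sym)))
```

Because all objects in the space share one base, two compressed values are equal exactly when the full addresses are equal. That is why the EQ comparison needs no extra instruction, while anything that dereferences the object pays for the add.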