In general, I am supportive of this direction. It seems like an entirely reasonable solution. I do have some comments below, but they're mostly of the "how do we generalize this?" variety.
First, let's touch on the attribute.
My first concern is naming; I think the use of "statepoint" here is problematic as this doesn't relate to lowering strategy needed (e.g. statepoints), but the conceptual support (e.g. a safepoint). This could be resolved by simply tweaking to require-safepoint.
But that brings us to a broader point. We've chosen to build in
the fact intrinsics don't require safepoints. If all we want is
for some intrinsics *to* require safepoints, why isn't this simply
a tweak to the existing code? callsGCLeafFunction already has a
small list of intrinsics which can have safepoints.
I think you can completely remove the need for this attribute by
a) adding the atomic memcpy variants to the exclude list in
callsGCLeafFunction, and b) using the existing "gc-leaf-function"
on most calls the frontend generates.
Second, let's discuss the signature for the runtime function.
I think you should use a signature for the runtime call which
takes base pointers and offsets, not base pointers and derived
pointers. Why? Because passing derived pointers in registers for
arguments presumes that the runtime knows how to map a record in
the stackmap to where a callee might have shuffled the argument
to. Some runtimes may support this, others may not. Given using
the offset scheme is just as simple to implement, being
considerate and minimizing the runtime support required seems
worthwhile.
On x86, the cost of a subtract (to produce the offset in the
worst case), and an LEA (to produce the derived pointer again
inside the runtime routine) is pretty minimal. Particular since
the former is likely to be optimized away and the later folded
into the addressing mode.
Finally, it's also worth noting that some (but not all) GCs can
convert from an interior derived pointer to the base of the
containing object. With the memcpy family we know that either the
pointers are all interior derived, or the length must be zero.
This is not true for all GCs and thus we don't want to rely on it.
Philip
_______________________________________________ LLVM Developers mailing list llvm...@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On Sep 28, 2020, at 10:56 AM, Philip Reames <list...@philipreames.com> wrote:
In general, I am supportive of this direction. It seems like an entirely reasonable solution. I do have some comments below, but they're mostly of the "how do we generalize this?" variety.
First, let's touch on the attribute.
My first concern is naming; I think the use of "statepoint" here is problematic as this doesn't relate to lowering strategy needed (e.g. statepoints), but the conceptual support (e.g. a safepoint). This could be resolved by simply tweaking to require-safepoint.
But that brings us to a broader point. We've chosen to build in the fact intrinsics don't require safepoints. If all we want is for some intrinsics *to* require safepoints, why isn't this simply a tweak to the existing code? callsGCLeafFunction already has a small list of intrinsics which can have safepoints.
I think you can completely remove the need for this attribute by a) adding the atomic memcpy variants to the exclude list in callsGCLeafFunction, and b) using the existing "gc-leaf-function" on most calls the frontend generates.
Second, let's discuss the signature for the runtime function.
I think you should use a signature for the runtime call which takes base pointers and offsets, not base pointers and derived pointers. Why? Because passing derived pointers in registers for arguments presumes that the runtime knows how to map a record in the stackmap to where a callee might have shuffled the argument to. Some runtimes may support this, others may not. Given using the offset scheme is just as simple to implement, being considerate and minimizing the runtime support required seems worthwhile.
On x86, the cost of a subtract (to produce the offset in the worst case), and an LEA (to produce the derived pointer again inside the runtime routine) is pretty minimal. Particular since the former is likely to be optimized away and the later folded into the addressing mode.
Finally, it's also worth noting that some (but not all) GCs can convert from an interior derived pointer to the base of the containing object. With the memcpy family we know that either the pointers are all interior derived, or the length must be zero. This is not true for all GCs and thus we don't want to rely on it.
Thanks for the feedback.
I think both of the suggestions are very reasonable. I’ll incorporate them.
Given there were no objections for two weeks, I’m going to go ahead with posting individual patches for review.
One small question inline:
On Sep 28, 2020, at 10:56 AM, Philip Reames <list...@philipreames.com> wrote:
In general, I am supportive of this direction. It seems like an entirely reasonable solution. I do have some comments below, but they're mostly of the "how do we generalize this?" variety.
First, let's touch on the attribute.
My first concern is naming; I think the use of "statepoint" here is problematic as this doesn't relate to lowering strategy needed (e.g. statepoints), but the conceptual support (e.g. a safepoint). This could be resolved by simply tweaking to require-safepoint.
But that brings us to a broader point. We've chosen to build in the fact intrinsics don't require safepoints. If all we want is for some intrinsics *to* require safepoints, why isn't this simply a tweak to the existing code? callsGCLeafFunction already has a small list of intrinsics which can have safepoints.
I think you can completely remove the need for this attribute by a) adding the atomic memcpy variants to the exclude list in callsGCLeafFunction, and b) using the existing "gc-leaf-function" on most calls the frontend generates.
Second, let's discuss the signature for the runtime function.
I think you should use a signature for the runtime call which takes base pointers and offsets, not base pointers and derived pointers. Why? Because passing derived pointers in registers for arguments presumes that the runtime knows how to map a record in the stackmap to where a callee might have shuffled the argument to. Some runtimes may support this, others may not. Given using the offset scheme is just as simple to implement, being considerate and minimizing the runtime support required seems worthwhile.
On x86, the cost of a subtract (to produce the offset in the worst case), and an LEA (to produce the derived pointer again inside the runtime routine) is pretty minimal. Particular since the former is likely to be optimized away and the later folded into the addressing mode.
Finally, it's also worth noting that some (but not all) GCs can convert from an interior derived pointer to the base of the containing object. With the memcpy family we know that either the pointers are all interior derived, or the length must be zero. This is not true for all GCs and thus we don't want to rely on it.
Do you think it makes sense to control this aspect of lowering (derived pointers vs base+offset in memcpy args) using GCStrategy?