| Commit-Queue | +1 |
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Can you expand a bit on what this does, how it works, etc? Are you actually deoptimizing the optimized code, or do you only want to print what the interpreter values would have been if you had deoptimized it?
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Can you expand a bit on what this does, how it works, etc? Are you actually deoptimizing the optimized code, or do you only want to print what the interpreter values would have been if you had deoptimized it?
Added the explanation to the CL description. Also modified code a bit since your comment.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Danylo MocherniukCan you expand a bit on what this does, how it works, etc? Are you actually deoptimizing the optimized code, or do you only want to print what the interpreter values would have been if you had deoptimized it?
Added the explanation to the CL description. Also modified code a bit since your comment.
I don't feel good about this approach -- it's hooking pretty intrusively into the deoptimisation, and effectively doing this weird kind of half-deopt, writing it to the stack, and then undoing it by smashing over the deopted stack with the old optimized stack? I don't think this is even GC safe, if `MaterializeHeapObjects` triggers a GC then the saved stack in dumpling will be invalid (since objects could have moved).
I don't think you can, or should, get around using a separate Dumpling Deoptimizer-like class, which has its own TranslatedState initialisation and TranslattedFrame iteration. Then, you can do your own register state copy into that DumplingDeoptimizer (much like you're already copying registers into dumpling), and you can then simply drop that frame afterward, instead of trying to save state, deoptimize, print, and then restore state. This would be more consistent with what e.g. the debugger does -- it will cost a bit of duplicate logic but result in a much clearer separation of concerns.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Danylo MocherniukCan you expand a bit on what this does, how it works, etc? Are you actually deoptimizing the optimized code, or do you only want to print what the interpreter values would have been if you had deoptimized it?
Leszek SwirskiAdded the explanation to the CL description. Also modified code a bit since your comment.
I don't feel good about this approach -- it's hooking pretty intrusively into the deoptimisation, and effectively doing this weird kind of half-deopt, writing it to the stack, and then undoing it by smashing over the deopted stack with the old optimized stack? I don't think this is even GC safe, if `MaterializeHeapObjects` triggers a GC then the saved stack in dumpling will be invalid (since objects could have moved).
I don't think you can, or should, get around using a separate Dumpling Deoptimizer-like class, which has its own TranslatedState initialisation and TranslattedFrame iteration. Then, you can do your own register state copy into that DumplingDeoptimizer (much like you're already copying registers into dumpling), and you can then simply drop that frame afterward, instead of trying to save state, deoptimize, print, and then restore state. This would be more consistent with what e.g. the debugger does -- it will cost a bit of duplicate logic but result in a much clearer separation of concerns.
Ok, thanks for suggesting the alternative approach with DumplingDeoptimizer, I'll try it.
You've got the idea right and I agree it is not GC safe, original code had some hacks to disable GC during this, now I know why they needed it.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Danylo MocherniukCan you expand a bit on what this does, how it works, etc? Are you actually deoptimizing the optimized code, or do you only want to print what the interpreter values would have been if you had deoptimized it?
Leszek SwirskiAdded the explanation to the CL description. Also modified code a bit since your comment.
Danylo MocherniukI don't feel good about this approach -- it's hooking pretty intrusively into the deoptimisation, and effectively doing this weird kind of half-deopt, writing it to the stack, and then undoing it by smashing over the deopted stack with the old optimized stack? I don't think this is even GC safe, if `MaterializeHeapObjects` triggers a GC then the saved stack in dumpling will be invalid (since objects could have moved).
I don't think you can, or should, get around using a separate Dumpling Deoptimizer-like class, which has its own TranslatedState initialisation and TranslattedFrame iteration. Then, you can do your own register state copy into that DumplingDeoptimizer (much like you're already copying registers into dumpling), and you can then simply drop that frame afterward, instead of trying to save state, deoptimize, print, and then restore state. This would be more consistent with what e.g. the debugger does -- it will cost a bit of duplicate logic but result in a much clearer separation of concerns.
Ok, thanks for suggesting the alternative approach with DumplingDeoptimizer, I'll try it.
You've got the idea right and I agree it is not GC safe, original code had some hacks to disable GC during this, now I know why they needed it.
I uploaded the prototype of the new implementation, it is very dirty at the moment, but if we agree that this is a direction, I can clean it up.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Michael, see comment about GC safety of pushing maybe-tagged registers -- we might need to allow some sort of conservative scanning of these stack slots to support this.
bool is_dumping_mode = false) {when is this true? Is this left over from the previous approach?
static constexpr int kCallDumpFrameSize = 5;nit: this isn't a frame size, it's an instruction size
static constexpr int kXmmRegsSize = kSimd128Size * XMMRegister::kNumRegisters;nit: constexpr doesn't need to be static
const RegisterConfiguration* config = RegisterConfiguration::Default();
for (int i = 0; i < config->num_allocatable_simd128_registers(); ++i) {
int code = config->GetAllocatableSimd128Code(i);
XMMRegister xmm_reg = XMMRegister::from_code(code);
int offset = code * kSimd128Size;
__ movdqu(Operand(rsp, offset), xmm_reg);
}
static constexpr int kNumberOfRegisters = Register::kNumRegisters;
for (int i = 0; i < kNumberOfRegisters; i++) {
__ pushq(Register::from_code(i));
}
hm, I think this might not be GC safe either now that I think about it -- I believe that the GC assumes that everything between the end of the fixed-size frame and the next frame is tagged function arguments, so untagged register values here will likely segfault in the GC. Marking the stack as non-iterable doesn't fix this, this is a flag for the profiler not the GC.
I'm not sure this is actually solvable with the metadata available to us, the stack maps don't include information about registers. Maybe for now we have to live without GC-ing operations like value materialization, and/or have to put the GC into a conservative scanning mode during this operation.
cc Michael
for (int offset = 0; offset < frame_size; offset += kSystemPointerSize) {
intptr_t value = desc->GetFrameSlot(offset);
std::memcpy(stack_pointer + offset, &value, kSystemPointerSize);
}why copy into the virtual stack instead of using the FrameDescription's addresses directly?
int bytecode_offset = trans_frame->bytecode_offset().ToInt();you could feed the bytecode offset into the DumplingJSFrame too
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Leszek and Michael, please note that this is a prototype yet. Very dirty and not ready for proper review, just a rough idea of what I want to do.
const RegisterConfiguration* config = RegisterConfiguration::Default();
for (int i = 0; i < config->num_allocatable_simd128_registers(); ++i) {
int code = config->GetAllocatableSimd128Code(i);
XMMRegister xmm_reg = XMMRegister::from_code(code);
int offset = code * kSimd128Size;
__ movdqu(Operand(rsp, offset), xmm_reg);
}
static constexpr int kNumberOfRegisters = Register::kNumRegisters;
for (int i = 0; i < kNumberOfRegisters; i++) {
__ pushq(Register::from_code(i));
}
hm, I think this might not be GC safe either now that I think about it -- I believe that the GC assumes that everything between the end of the fixed-size frame and the next frame is tagged function arguments, so untagged register values here will likely segfault in the GC. Marking the stack as non-iterable doesn't fix this, this is a flag for the profiler not the GC.
I'm not sure this is actually solvable with the metadata available to us, the stack maps don't include information about registers. Maybe for now we have to live without GC-ing operations like value materialization, and/or have to put the GC into a conservative scanning mode during this operation.
cc Michael
Yes, I was wondering the same thing.
I'll try to see what I can do to skip GC here. In original code they control GC based on variable for this purpose during HeapObjectMaterialization: https://screenshot.googleplex.com/3Zcw7Zo7hBPoh7D.
const RegisterConfiguration* config = RegisterConfiguration::Default();
for (int i = 0; i < config->num_allocatable_simd128_registers(); ++i) {
int code = config->GetAllocatableSimd128Code(i);
XMMRegister xmm_reg = XMMRegister::from_code(code);
int offset = code * kSimd128Size;
__ movdqu(Operand(rsp, offset), xmm_reg);
}
static constexpr int kNumberOfRegisters = Register::kNumRegisters;
for (int i = 0; i < kNumberOfRegisters; i++) {
__ pushq(Register::from_code(i));
}
Danylo Mocherniukhm, I think this might not be GC safe either now that I think about it -- I believe that the GC assumes that everything between the end of the fixed-size frame and the next frame is tagged function arguments, so untagged register values here will likely segfault in the GC. Marking the stack as non-iterable doesn't fix this, this is a flag for the profiler not the GC.
I'm not sure this is actually solvable with the metadata available to us, the stack maps don't include information about registers. Maybe for now we have to live without GC-ing operations like value materialization, and/or have to put the GC into a conservative scanning mode during this operation.
cc Michael
Yes, I was wondering the same thing.
I'll try to see what I can do to skip GC here. In original code they control GC based on variable for this purpose during HeapObjectMaterialization: https://screenshot.googleplex.com/3Zcw7Zo7hBPoh7D.
Ok, speaking to Michael, it seems reasonable that we could introduce a new constructor for the conservative pinning scope (https://source.chromium.org/chromium/chromium/src/+/main:v8/src/heap/heap.cc;l=7989;drc=b87c7d07f63e19e4d34afd15c0df5cd496b15d37;bpv=1;bpt=1) which starts conservative scanning from the start of the pushed registers -- then GCs shouldn't be a problem.