If a given liveness arg of STATEPOINT is at a fixed frame index
(e.g. a function argument passed on stack), prefer to use this
fixed location even the address is also in a register. If we use
the register it will generate a spill, which is not necessary
since the fixed frame index can be directly recorded in the stack
map.
Details
Diff Detail
Event Timeline
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
1383 | is this just being conservative against a corner case? |
Thanks for the review!
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | ||
---|---|---|
1383 | INT_MAX is a sentinel value that getArgumentFrameIndex returns meaning "no". Other uses of getArgumentFrameIndex have similar checks. |
lib/CodeGen/SelectionDAG/StatepointLowering.cpp | ||
---|---|---|
527 | Can we fold this into getValue/getValueImpl? IIUC, this is a reasonable expectation for getValue for these inputs and it'd be nicer |
test/CodeGen/X86/statepoint-stack-usage.ll | ||
---|---|---|
143 | Can you expand the check statements here? And check the actual debug output for the stackmap? I can't judge whether the encoding is correct from the test. |
lib/CodeGen/SelectionDAG/StatepointLowering.cpp | ||
---|---|---|
527 | I tried this, but it turns out a number of tests failing, mostly on 32-bit architectures handling 64-bit values. I think I'll do it in the current way here for this patch, as a targeted change. And I can look into the 32-bit architecture issue and do a cleanup later. Thanks. | |
test/CodeGen/X86/statepoint-stack-usage.ll | ||
143 | Done. Added CHECK for stack map output, and moved to statepoint-stackmap-format.ll where it emits and checks the stack map output. |
lib/CodeGen/SelectionDAG/StatepointLowering.cpp | ||
---|---|---|
529 | Hm, I'm really uncomfortable with this bit of code. For two reasons:
Is it possible to let the virtual register be lowered and then peak back through to the source? Basically, I'm looking to be convinced this is the right approach. |
lib/CodeGen/SelectionDAG/StatepointLowering.cpp | ||
---|---|---|
529 | You're right that allocas already work. I'll remove that case. I'm happy to try other approach, either trying harder in doing this in getValue, or some other way. |
In addition to the detailed implementation comment below, I thought of a possible semantic problem. Per the ABI, who "owns" the memory for the arguments? Does either the callee or caller assume it's immutable? If so, then your optimization needs to be restricted to a non-relocating collector since otherwise the collector might be updating a memory location assumed by AA to be immutable and bad things could happen...
I'm 97% sure we do assume arguments are immutable within a function...
lib/CodeGen/SelectionDAG/StatepointLowering.cpp | ||
---|---|---|
529 | I have two alternatives I'd like you to try. Both start with removing the code from the generic getNonRegisterValue since I'm unsure about the implications of that. Option 1 - Right here, just check to see if the Value V is an argument of the right type and inline the logic from getNonRegisterValue into the only user. Option 2 - Inside lowerIncomingStatepointValue, add a special case which matches the DAG pattern Load(FrameIndex) and uses the optimized lowering. There are examples of things like this in the argument handling in SelectionDAG, the rough structure will look something like: if (N.getNode()) // Check if frame index is available. if (LoadSDNode *LNode = dyn_cast<LoadSDNode>(N.getNode())) if (FrameIndexSDNode *FINode = dyn_cast<FrameIndexSDNode>(LNode->getBasePtr().getNode())) Op = MachineOperand::CreateFI(FINode->getIndex()); |
I think the argument, as an SSA Value, should be immutable. But here, argument at fixed frame index is, in the IR level, a pointer argument with byval attribute. We don't modify the pointer. A function could modify the content of the memory that argument points to, just like a regular pointer argument. The only difference is that we know the pointer points to a fixed stack location.
If the GC were to modify the content of the pointed-to memory, it would do it with or without this change. So I don't think this change affects that.
I tried Option 1. It works, thanks!
I also got another approach, which handles arguments at fixed stack locations more general in getValue. See https://reviews.llvm.org/D55072. I have to tweak some tests slightly. Let me know which one do you think is better. Thanks!
LGTM in the current form. The alternate framing might also be good, but we can come back and undo this later if that works out.
is this just being conservative against a corner case?