When see a terminator like tail call, useOrDefCSROrFI should return false Immediately.
When checking if a register is a CSR, useOrDefCSROrFI should use getCurrentCSRs.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 36997 Build 36996: arc lint + arc unit
Event Timeline
Could you, please, give some code examples of the issues that you see?
Also, the patch needs regression tests, which fail without the patch applied (these can server as examples too).
llvm/lib/CodeGen/ShrinkWrap.cpp | ||
---|---|---|
275 | I don't believe this is correct. A terminator instruction can well use and/or def a CSR, e.g. an indirect branch, a table jump, a hw loop instruction, ... | |
293 | Why is that needed? The set of actual CSRs ought to be always a subset of the set of possible CSRs. |
llvm/lib/CodeGen/ShrinkWrap.cpp | ||
---|---|---|
275 | Oh, that's a mistake. How about MI.isReturn()? I just want to handle sibling calls. | |
293 | I am actually making LLVM a code generator for V8 turbofan. V8 needs to save a specific set of registers at the entry of a function depending on the type of this function. For example, a JS function needs to save r0, r1,r7 along with fp, lr. And a wasm function needs to spill r3 at sp - 16. At this context, I need to add additional CSRs at frame lowering's assignCalleeSavedSpillSlots. So getCurrentCSRs maybe a super set of the set of possible CSRs. |
llvm/lib/CodeGen/ShrinkWrap.cpp | ||
---|---|---|
293 | Have you considered adding a custom calling convention in the backend, instead ? That may also remove the need for specially handing sibling calls in your case. |
llvm/lib/CodeGen/ShrinkWrap.cpp | ||
---|---|---|
293 | I do add new calling convections: def CSR_V8SBCC : CalleeSavedRegs<(add LR, R11, R10, R9, R8, R7, R6, R5, (sequence "D%u", 31, 0))>; The problem is the CSRs added in assignCalleeSavedSpillSlots function may still get clobbered in the function. That is when a function returns, it will not restore any of these CSRs. And a call will not list them in the regmask, so they still considered be clobbered. But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble. Here's some background information: |
Thanks for upstreaming your changes.
I am not sure I understand what you are trying to fix, similar to @chill's request, can you please provide an example (ideally, as a test case) of what this fixes? Using MIR with llc -run-pass=shrink-wrap might help you creating a more precise test case for shrink-wrapping.
Thank you. Just see this message. I am on vacation, so I will send an example 2 weeks later.
You need to clone https://github.com/linzj/llvm-toy, then checkout the branch named arm-tf-6_9_427_23. Then apply the patch llvm-patch-by-far.patch to llvm 8.0. So that you can compile the test case as follow. Because of a new calling convention v8cc.
the test case:
; ModuleID = 'main' source_filename = "main" target triple = "armv7-unknown-unknown-v8" %TaggedStruct = type opaque define v8cc %TaggedStruct addrspace(1)* @ArrayPrototypePop(i32, %TaggedStruct addrspace(1)**, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, i8*, %TaggedStruct addrspace(1)**, i8**) #0 gc "coreclr" { entry: %ptr0 = bitcast %TaggedStruct addrspace(1)** %1 to i8* %ptr1 = getelementptr i8, i8* %ptr0 , i32 -1147 store i8 0, i8* %ptr1 %cmp0 = icmp eq i32 %0, 0 br i1 %cmp0, label %exit, label %call_function exit: ret %TaggedStruct addrspace(1)* null call_function: %call_result = call v8cc %TaggedStruct addrspace(1)* @bar(%TaggedStruct addrspace(1)** %1) ret %TaggedStruct addrspace(1)* %call_result } declare %TaggedStruct addrspace(1)* @bar(%TaggedStruct addrspace(1)**) attributes #0 = { "js-function-call" "no-jump-tables"="true" "target-features"="+armv7-a,+dsp,+neon,+vfp3,-crypto,-d16,-fp-armv8,-fp-only-sp,-fp16,-thumb-mode,-vfp4" }
Without the patch this review mentioned, the entry of the function uses the r1, r0, wihout saving the csrs.
You can try to reproduce the behavior of the calling convention through a MIR test: https://llvm.org/docs/MIRLangRef.html, using $REG = IMPLICIT_DEF to clobber a CSR.
Not that simple. Your solution just handles the call instructions. The the prologue is inserted by the prologue inserter. I can't simply add a prologue to a function's MIR script, right?
What I was suggesting is to write a MIR test that starts the compilation from the ShrinkWrap pass and goes through PrologEpilogInserter (or all the way to the end), then check the output.
To do that, you can generate the MIR test with llc -stop-before shrink-wrap with your patch applied, and use $REG = IMPLICIT_DEF to strip out everything related to your downstream code. Then using llc -start-before shrink-wrap -stop-after prologepilog should allow you to check if the prologue/epilogue has been inserted at the right place.
Another solution would be to use the debug output of the ShrinkWrap pass (llc -run-pass shrink-wrap -debug-only=shrink-wrap), but that will only be tested with asserts enabled.
I may be getting all of this wrong, so let me know if that doesn't make sense for your use case.
V8 needs to save a specific set of registers at the entry of a function depending on the type of this function. For example, a JS function needs to save r0, r1,r7 along with fp, lr. And a wasm function needs to spill r3 at sp - 16.
So, these registers essentially hold argument values?
But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble.
I think ShrinkWrap pass is the wrong place to solve this problem. Could you, please, look at the how the registers, which correspond to vararg arguments of a variadic function are saved to
designated slots on the stack (check all mentions of ArgRegsSaveSize in ARMFrameLowering.cpp)? Perhaps you could do something similar?
Yes. They are argument values.
But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble.
I think ShrinkWrap pass is the wrong place to solve this problem. Could you, please, look at the how the registers, which correspond to vararg arguments of a variadic function are saved to
designated slots on the stack (check all mentions of ArgRegsSaveSize in ARMFrameLowering.cpp)? Perhaps you could do something similar?
I check that place, but my solution also works well in current situation.
I don't believe this is correct. A terminator instruction can well use and/or def a CSR, e.g. an indirect branch, a table jump, a hw loop instruction, ...