This is an archive of the discontinued LLVM Phabricator instance.

Make ShrinkWrap more consistent.
Needs RevisionPublic

Authored by linzj on Aug 19 2019, 6:48 PM.

Details

Summary

When see a terminator like tail call, useOrDefCSROrFI should return false Immediately.
When checking if a register is a CSR, useOrDefCSROrFI should use getCurrentCSRs.

Event Timeline

linzj created this revision.Aug 19 2019, 6:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2019, 6:48 PM
chill requested changes to this revision.Aug 20 2019, 2:41 AM

Could you, please, give some code examples of the issues that you see?

Also, the patch needs regression tests, which fail without the patch applied (these can server as examples too).

llvm/lib/CodeGen/ShrinkWrap.cpp
275

I don't believe this is correct. A terminator instruction can well use and/or def a CSR, e.g. an indirect branch, a table jump, a hw loop instruction, ...

293

Why is that needed? The set of actual CSRs ought to be always a subset of the set of possible CSRs.

This revision now requires changes to proceed.Aug 20 2019, 2:41 AM
linzj marked 2 inline comments as done.Aug 20 2019, 6:08 PM
linzj added inline comments.
llvm/lib/CodeGen/ShrinkWrap.cpp
275

Oh, that's a mistake. How about MI.isReturn()? I just want to handle sibling calls.
They are essentially the same as returns.

293

I am actually making LLVM a code generator for V8 turbofan. V8 needs to save a specific set of registers at the entry of a function depending on the type of this function. For example, a JS function needs to save r0, r1,r7 along with fp, lr. And a wasm function needs to spill r3 at sp - 16.

At this context, I need to add additional CSRs at frame lowering's assignCalleeSavedSpillSlots. So getCurrentCSRs maybe a super set of the set of possible CSRs.

chill added inline comments.
llvm/lib/CodeGen/ShrinkWrap.cpp
293

Have you considered adding a custom calling convention in the backend, instead ?
https://llvm.org/docs/WritingAnLLVMBackend.html#calling-conventions

That may also remove the need for specially handing sibling calls in your case.

linzj marked an inline comment as done.Aug 21 2019, 12:53 AM
linzj added inline comments.
llvm/lib/CodeGen/ShrinkWrap.cpp
293

I do add new calling convections:
def CSR_V8CC : CalleeSavedRegs<(add LR, R11)>;

def CSR_V8SBCC : CalleeSavedRegs<(add LR, R11, R10, R9, R8, R7, R6, R5,

(sequence "D%u", 31, 0))>;

The problem is the CSRs added in assignCalleeSavedSpillSlots function may still get clobbered in the function. That is when a function returns, it will not restore any of these CSRs. And a call will not list them in the regmask, so they still considered be clobbered.

But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble.

Here's some background information:
I am a developer from UC Browser, and LLVM for V8 turbofan has released to users for months without troubles. I am trying to make the LLVM patch back to the upstream.

Thanks for upstreaming your changes.

I am not sure I understand what you are trying to fix, similar to @chill's request, can you please provide an example (ideally, as a test case) of what this fixes? Using MIR with llc -run-pass=shrink-wrap might help you creating a more precise test case for shrink-wrapping.

linzj added a comment.Sep 6 2019, 5:46 AM

Thanks for upstreaming your changes.

I am not sure I understand what you are trying to fix, similar to @chill's request, can you please provide an example (ideally, as a test case) of what this fixes? Using MIR with llc -run-pass=shrink-wrap might help you creating a more precise test case for shrink-wrapping.

Thank you. Just see this message. I am on vacation, so I will send an example 2 weeks later.

linzj added a comment.Sep 15 2019, 6:27 PM

You need to clone https://github.com/linzj/llvm-toy, then checkout the branch named arm-tf-6_9_427_23. Then apply the patch llvm-patch-by-far.patch to llvm 8.0. So that you can compile the test case as follow. Because of a new calling convention v8cc.
the test case:

; ModuleID = 'main'
source_filename = "main"
target triple = "armv7-unknown-unknown-v8"

%TaggedStruct = type opaque

define v8cc %TaggedStruct addrspace(1)* @ArrayPrototypePop(i32, %TaggedStruct addrspace(1)**, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, %TaggedStruct addrspace(1)*, i8*, %TaggedStruct addrspace(1)**, i8**) #0 gc "coreclr" {
entry:
  %ptr0 = bitcast %TaggedStruct addrspace(1)** %1 to i8*
  %ptr1 = getelementptr i8, i8* %ptr0 , i32 -1147
  store i8 0, i8* %ptr1
  %cmp0 = icmp eq i32 %0, 0
  br i1 %cmp0, label %exit, label %call_function
exit:
  ret %TaggedStruct addrspace(1)* null
call_function:
  %call_result = call v8cc %TaggedStruct addrspace(1)* @bar(%TaggedStruct addrspace(1)** %1)
  ret %TaggedStruct addrspace(1)* %call_result
}

declare %TaggedStruct addrspace(1)* @bar(%TaggedStruct addrspace(1)**)
attributes #0 = { "js-function-call" "no-jump-tables"="true" "target-features"="+armv7-a,+dsp,+neon,+vfp3,-crypto,-d16,-fp-armv8,-fp-only-sp,-fp16,-thumb-mode,-vfp4" }

Without the patch this review mentioned, the entry of the function uses the r1, r0, wihout saving the csrs.

You can try to reproduce the behavior of the calling convention through a MIR test: https://llvm.org/docs/MIRLangRef.html, using $REG = IMPLICIT_DEF to clobber a CSR.

You can try to reproduce the behavior of the calling convention through a MIR test: https://llvm.org/docs/MIRLangRef.html, using $REG = IMPLICIT_DEF to clobber a CSR.

Not that simple. Your solution just handles the call instructions. The the prologue is inserted by the prologue inserter. I can't simply add a prologue to a function's MIR script, right?

You can try to reproduce the behavior of the calling convention through a MIR test: https://llvm.org/docs/MIRLangRef.html, using $REG = IMPLICIT_DEF to clobber a CSR.

Not that simple. Your solution just handles the call instructions. The the prologue is inserted by the prologue inserter. I can't simply add a prologue to a function's MIR script, right?

What I was suggesting is to write a MIR test that starts the compilation from the ShrinkWrap pass and goes through PrologEpilogInserter (or all the way to the end), then check the output.

To do that, you can generate the MIR test with llc -stop-before shrink-wrap with your patch applied, and use $REG = IMPLICIT_DEF to strip out everything related to your downstream code. Then using llc -start-before shrink-wrap -stop-after prologepilog should allow you to check if the prologue/epilogue has been inserted at the right place.
Another solution would be to use the debug output of the ShrinkWrap pass (llc -run-pass shrink-wrap -debug-only=shrink-wrap), but that will only be tested with asserts enabled.

I may be getting all of this wrong, so let me know if that doesn't make sense for your use case.

linzj added a comment.Sep 18 2019, 5:55 PM

You can try to reproduce the behavior of the calling convention through a MIR test: https://llvm.org/docs/MIRLangRef.html, using $REG = IMPLICIT_DEF to clobber a CSR.

Not that simple. Your solution just handles the call instructions. The the prologue is inserted by the prologue inserter. I can't simply add a prologue to a function's MIR script, right?

What I was suggesting is to write a MIR test that starts the compilation from the ShrinkWrap pass and goes through PrologEpilogInserter (or all the way to the end), then check the output.

To do that, you can generate the MIR test with llc -stop-before shrink-wrap with your patch applied, and use $REG = IMPLICIT_DEF to strip out everything related to your downstream code. Then using llc -start-before shrink-wrap -stop-after prologepilog should allow you to check if the prologue/epilogue has been inserted at the right place.
Another solution would be to use the debug output of the ShrinkWrap pass (llc -run-pass shrink-wrap -debug-only=shrink-wrap), but that will only be tested with asserts enabled.

I may be getting all of this wrong, so let me know if that doesn't make sense for your use case.

Thank you for your suggestion. Let me give it a try to see what I get.

chill added a comment.Nov 10 2019, 2:46 AM

V8 needs to save a specific set of registers at the entry of a function depending on the type of this function. For example, a JS function needs to save r0, r1,r7 along with fp, lr. And a wasm function needs to spill r3 at sp - 16.

So, these registers essentially hold argument values?

But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble.

I think ShrinkWrap pass is the wrong place to solve this problem. Could you, please, look at the how the registers, which correspond to vararg arguments of a variadic function are saved to
designated slots on the stack (check all mentions of ArgRegsSaveSize in ARMFrameLowering.cpp)? Perhaps you could do something similar?

linzj added a comment.Nov 11 2019, 8:28 PM

V8 needs to save a specific set of registers at the entry of a function depending on the type of this function. For example, a JS function needs to save r0, r1,r7 along with fp, lr. And a wasm function needs to spill r3 at sp - 16.

So, these registers essentially hold argument values?

Yes. They are argument values.

But they are CSRs in the ShrinkWrap context. For example, If a JS function use/def r1 in the a basic block but ShrinkWrap move the save block to one of its successor, that will be a trouble.

I think ShrinkWrap pass is the wrong place to solve this problem. Could you, please, look at the how the registers, which correspond to vararg arguments of a variadic function are saved to
designated slots on the stack (check all mentions of ArgRegsSaveSize in ARMFrameLowering.cpp)? Perhaps you could do something similar?

I check that place, but my solution also works well in current situation.