This is an archive of the discontinued LLVM Phabricator instance.

[X86] Support of no_caller_saved_registers attribute (LLVM part) - restart
ClosedPublic

Authored by oren_ben_simhon on Apr 10 2017, 1:50 AM.

Details

Summary

This patch implements the LLVM part for no_caller_saved_registers attribute as appear in interrupt and exception handler proposal.

The review is a update for an old review:
Support of no_caller_saved_registers attribute (LLVM part).

The main difference from the old review is that in this patch we use the dynamic CSR mechanism in order to remove returned /passed arguments from the function regmask/CSR list.

Diff Detail

Repository
rL LLVM

Event Timeline

A kind reminder for reviewing this patch :-) Thanks in advance.

Updated the test

aaboud edited edge metadata.Apr 25 2017, 1:58 AM

Thanks Oren for resuming the no_caller_saved_registers patch.
Few comments below.

lib/Target/X86/X86FastISel.cpp
3189 ↗(On Diff #95760)

Why do we need to check the caller function?
Is there a good reason why we would disable fast lowering in such case?

lib/Target/X86/X86ISelLowering.cpp
3810 ↗(On Diff #95760)

Can you add a comment explaining this line?
For example,
// NoCallerSavedRegisters has same BASE preserved registers as X86_INTR calling convention.

3833 ↗(On Diff #95760)

No need for this variable, see comment below.
Just fold this check into the "if-condition".

3909 ↗(On Diff #95760)

No need to pass the flag "RemovePassedReturnedRegs"
If "RemovePassedReturnedRegs=false" then "RegMask=nullptr", so it is enough to check that "RegMask!=nullptr" as we do already.

Please remove "RemovePassedReturnedRegs" from this function.

lib/Target/X86/X86RegisterInfo.cpp
276 ↗(On Diff #95760)

Add a comment here too.

oren_ben_simhon marked 5 inline comments as done.Apr 26 2017, 4:15 AM

Implemented comments posted until 04/25 (Thanks Amjad)

aaboud added inline comments.Apr 27 2017, 5:00 AM
lib/Target/X86/X86FastISel.cpp
3185 ↗(On Diff #96708)

You are not using this variable anymore, right?

lib/Target/X86/X86ISelLowering.cpp
2260 ↗(On Diff #96708)

You are using this check 3 times in this function.
It worth moving this to the top of the function, and create a bool variable with well defined name.
E.g.,
bool ShouldDisableCalleeSavedRegister = CallConv == CallingConv::X86_RegCall || MF.getFunction()->hasFnAttribute("no_caller_saved_registers");

lib/Target/X86/X86RegisterInfo.cpp
277 ↗(On Diff #96708)

[Style]: It is up to you, but I prefer that you align the comment lines to 80 chars, i.e., you can still move words from 2nd line to 1st one.

test/CodeGen/X86/x86-no_caller_saved_registers.ll
1 ↗(On Diff #96708)

I would add one or two examples regarding disabling preserved registers, similar to the regcall calling convention.
These examples dose not need to have x86_intrcc calling convention, but you should use the no_caller_saved_registers attribute with other calling convention, make sure to have functions with parameters to emphasize the difference.

oren_ben_simhon marked 4 inline comments as done.Apr 27 2017, 8:14 AM

Implemented comments posted until 04/27 (Thanks Amjad)

aaboud added inline comments.Apr 27 2017, 8:35 AM
lib/Target/X86/X86ISelLowering.cpp
2242 ↗(On Diff #96922)

[Style]: Please run clang-format.

test/CodeGen/X86/x86-no_caller_saved_registers-preserve.ll
9 ↗(On Diff #96922)

Why do we need the "local_unnamed_addr" attribute in this test?

21 ↗(On Diff #96922)

Please do not use code that you might not control its outcome.
Instead try the inline assembly that allows you to force register usage.

E.g., see other LIT tests you have in the patch:
call void asm sideeffect "", "~{rax},~{rbx},~{rbp},~{r11},~{xmm0}"()

31 ↗(On Diff #96922)

I would prefer second function that have the "no_caller_saved_registers" but that has no parameters passed in registers, and then you can show that it pushes all used registers.

test/CodeGen/X86/x86-no_caller_saved_registers.ll
1 ↗(On Diff #96922)

Why did you remove the -O0 command line?
It is needed to check the behavior of FastLowerCall (X86FastISel).

MatzeB edited edge metadata.Apr 27 2017, 9:46 AM

Why isn't this "just" a new calling convention at the LLVM level? What are the benefits of having this as an attribute on top of the calling convention?

Why isn't this "just" a new calling convention at the LLVM level? What are the benefits of having this as an attribute on top of the calling convention?

Calling convention determines not only what the callee and the caller function saved registers are, but also how we move parameters to the called function.

"no_caller_saved_registers" attribute can be attached to a function with any calling convention, thus it is not a calling convention by itself.
However, this attribute overrides the calling convention callee/caller saved register list.

It is well explained in the Clang part documentation.

Why isn't this "just" a new calling convention at the LLVM level? What are the benefits of having this as an attribute on top of the calling convention?

Calling convention determines not only what the callee and the caller function saved registers are, but also how we move parameters to the called function.

"no_caller_saved_registers" attribute can be attached to a function with any calling convention, thus it is not a calling convention by itself.
However, this attribute overrides the calling convention callee/caller saved register list.

It is well explained in the Clang part documentation.

Ok, may still be worth linking/referencing this in the llvm documentation.

But I am still curious: What motivated the introduction of this attribute, do you have an example where it is used/would be used if it existed?

Ok, may still be worth linking/referencing this in the llvm documentation.

But I am still curious: What motivated the introduction of this attribute, do you have an example where it is used/would be used if it existed?

interrupt handler calling convention force the interrupt handler function to save all modified registers, as there is no caller that would do that, right?

Now, assume a case where the interrupt handler function "foo()" calls a helper function "bar()", then it needs to save all the registers preserved by callers according to the calling convention of "bar()", even if "bar()" is not going to modify any of them.
However, if we add "no_caller_saved_registers" attribute to "bar()", then we move the responsibility of saving the modified registers to "bar()" which is the function that knows what registers are going to be modified and will save only them.

I hope I explained the example well enough.

Ok, may still be worth linking/referencing this in the llvm documentation.

But I am still curious: What motivated the introduction of this attribute, do you have an example where it is used/would be used if it existed?

interrupt handler calling convention force the interrupt handler function to save all modified registers, as there is no caller that would do that, right?

Now, assume a case where the interrupt handler function "foo()" calls a helper function "bar()", then it needs to save all the registers preserved by callers according to the calling convention of "bar()", even if "bar()" is not going to modify any of them.
However, if we add "no_caller_saved_registers" attribute to "bar()", then we move the responsibility of saving the modified registers to "bar()" which is the function that knows what registers are going to be modified and will save only them.

I hope I explained the example well enough.

So this is the old discussion about choosing the right balance between caller and callee saved registers. Sounds to me like whoever uses this should have rather spend the time making sure the compiler can inline his calls and mark all his methods static so the compiler can optimize the calling convention automatically.

Makes me wonder a bit where this should start and end. To play devils advocate: Why stop here and not allow specifying the number or exact list of caller saved registers, whether flags are preserved, what registers to use for arguments, etc. IMO that just increases compiler complexity and hence potential bugs for very little gain.

Anyway seems like others find this useful and are willing to implement and maintain this, so I won't stand in the way.

oren_ben_simhon marked 4 inline comments as done.EditedApr 27 2017, 2:13 PM
This comment has been deleted.
test/CodeGen/X86/x86-no_caller_saved_registers-preserve.ll
31 ↗(On Diff #96922)

We don't need a function without parameters in order to show that the function pushes all used registers.

This is already tested in the function bar1 and bar0 (in the other test file).
In bar0 you can see that it pushes xmm0 because it uses it.
In bar1 you can see that it pushes rdx and xmm1 because it uses it.

So there is no additional value in a function without arguments.

Implemented comments posted until 04/28 (Thanks Amjad and Matthias)

aaboud added inline comments.Apr 30 2017, 7:27 AM
test/CodeGen/X86/x86-no_caller_saved_registers-preserve.ll
13 ↗(On Diff #97000)

I am afraid that the order is important here.
Assume that the code would look like this:

pushq %rdx
pushq %rdi
pushq %rsi
movaps %xmm1
...

Then this test will still pass, right?

20 ↗(On Diff #97000)

I assume that the return value will be passed by %eax, so please check that %rax is not saved either.

31 ↗(On Diff #96922)

I do not see "bar1" in the other file, and indeed it worth having test that shows that callee do preserve GPR, not only XMM resgister.

What I want to have here is a test that shows we do preserve same registers as in "bar1" above, once they are not used to pass parameters. Otherwise, how can we be sure that the implementation does not always skip these registers you used above?

define x86_64_sysvcc void @bar2() #1 {
; CHECK-LABEL: bar1
; CHECK:     pushq %rdx
; CHECK:     pushq %rax
; CHECK:     pushq  %rdi
; CHECK:     pushq  %rsi
; CHECK:     movaps  %xmm1
; CHECK:     movaps  %xmm0
; CHECK:     movaps {{.*}}, %xmm0
; CHECK:     movaps {{.*}}, %xmm1
; CHECK:     popq %rsi
; CHECK:     popq %rdi
; CHECK:     popq %rax
; CHECK:     popq %rdx
; CHECK:     retq
  call void asm sideeffect "", "~{rdx},~{rax},~{xmm1},~{rdi},~{rsi},~{xmm0}"()
ret void
}

Either way, I do not see the added value of "foo1", why are we testing a function that has no "no_caller_saved_registers" attribute here? it must be checked somewhere else, right?

test/CodeGen/X86/x86-no_caller_saved_registers.ll
24 ↗(On Diff #97000)

move this to line 27 and add CHECK-LABEL: for @foo0.

oren_ben_simhon marked 3 inline comments as done.May 1 2017, 12:03 AM
oren_ben_simhon added inline comments.
test/CodeGen/X86/x86-no_caller_saved_registers-preserve.ll
13 ↗(On Diff #97000)

That's correct. I will use update_llc_test_checks.py for this function instead.

31 ↗(On Diff #96922)

I do not see "bar1" in the other file

  • bar1 is in this file.
  • bar0 is the the other file.

It was probably misunderstood, In all three functions (bar0, bar1 and foo1) we check that parameters that are passed as arguments are not preserved.

For example, in foo1, we see that xmm0 is preserved. Why is it preserved (you ask)?
because it is passed to bar1, which has no_caller_saved_registers, and as such the caller needs to preserve the passed arguments.
This is why foo1 doesn't need to have the attribute no_caller_saved_registers. But still it tests the attribute of the callee.

Either way, I do not see the added value of "foo1", why are we testing a function that has no "no_caller_saved_registers" attribute here? it must be checked somewhere else, right?

Every time you test a calling convention, you need to test what happens in the caller of that CC function (foo0/foo1) and inside the CC function itself (bar1/bar0).

What I want to have here is a test that shows we do preserve same registers as in "bar1" above, once they are not used to pass parameters.

Again, this is already tested, please see my previous comment:

This is already tested in the function bar1 and bar0 (in the other test file).
In bar0 you can see that it pushes xmm0 because it uses it.
In bar1 you can see that it pushes rdx and xmm1 because it uses it.

oren_ben_simhon marked an inline comment as done.

Implemented comments posted until 04/30 (Thanks Amjad)

Updated the tests.

aaboud accepted this revision.May 3 2017, 5:28 AM

Thanks Oren,
Code looks much better now.

This revision is now accepted and ready to land.May 3 2017, 5:28 AM
This revision was automatically updated to reflect the committed changes.