This is an archive of the discontinued LLVM Phabricator instance.

[X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs
ClosedPublic

Authored by AntonBikineev on Jan 4 2023, 4:44 PM.

Details

Summary

Currently both calling conventions preserve registers that are used to store
a return value. This causes the returned value to be lost:

define i32 @bar() {
  %1 = call preserve_mostcc i32 @foo()
   ret i32 %1
}

define preserve_mostcc i32 @foo() {
  ret i32 2
  ; preserve_mostcc will restore %rax, whatever it was before the call.
}

This contradicts the current documentation (preserve_allcc "behaves
identical to the C calling conventions on how arguments and return
values are passed") and also breaks [[clang::preserve_most]].

This change makes CSRs be preserved iff they are not used to store a
return value (e.g. %rax for scalars, {%rax:%rdx} for __int128, %xmm0
for double). For void functions no additional registers are
preserved, i.e. the behaviour is backward compatible with existing
code.

Juergen, can you please take a look, since you added the CCs originally?
Phoebe, I added you as a reviewer as well, since you touched the X86 CC code recently :)

Thanks!

Diff Detail

Event Timeline

AntonBikineev created this revision.Jan 4 2023, 4:44 PM
AntonBikineev requested review of this revision.Jan 4 2023, 4:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2023, 4:44 PM
AntonBikineev edited reviewers, added: lebedev.ri; removed: pengfei.Jan 9 2023, 3:33 AM

Roman, could you please take a look at it (or suggest a reviewer)?

(The context: we want to use the attribute for cold calls in Chromium/V8. Early experiments are quite promising (e.g. showed to improve the hot path by up to 2x)).

pengfei added a subscriber: pengfei.Jan 9 2023, 6:01 AM

I took a look but fogot to comment. TBH, I know little about preserve_mostcc. I checked the existing test cases all of which declare void return with it. So is it not legal to return value with the CC? If we do not preserve rax, we probably have compatibility issue with pre-built code since it assumes rax is saved by callee.

So is it not legal to return value with the CC?

According to the documentation, it is legal. In the current implementation it isn't though.

If we do not preserve rax, we probably have compatibility issue with pre-built code since it assumes rax is saved by callee.

That's true - the change may break the existing precompiled code.

As I see it, we should either

  • have consistency between the implementation and the docs and early diagnose attribute application on non-void functions, or
  • allow %rax to be clobbered by the callee.
lebedev.ri added a subscriber: efriedma.

This seems to be missing the code change itself.

This seems to be missing the code change itself.

(the code change is essentially in X86CallingConv.td)

Having a register be both preserved and used to store a return value is obviously a contradiction. However, it's consistent to say that a register is preserved if and only if it is not used to return a value. (Not sure off the top of my head what changes are required to make that work correctly.)

The C calling convention uses RAX/RDX/XMM0/XMM1 to return values, and LLVM additionally allows using RCX/XMM2/XMM3 for certain IR signatures. So I don't think just removing RAX from the list does the right thing.

Having a register be both preserved and used to store a return value is obviously a contradiction. However, it's consistent to say that a register is preserved if and only if it is not used to return a value. (Not sure off the top of my head what changes are required to make that work correctly.)

The C calling convention uses RAX/RDX/XMM0/XMM1 to return values, and LLVM additionally allows using RCX/XMM2/XMM3 for certain IR signatures. So I don't think just removing RAX from the list does the right thing.

Agreed! FPRs is not a problem as the doc says they are not preserved. But we do use RDX sometime, e.g.: https://godbolt.org/z/sWKKbjMvz
So yes, we may need a method to override the preserve list when RAX/RDX are used as return registers. Not sure if the current framework support or not.

FPRs is not a problem as the doc says they are not preserved

MostRegs doesn't, but AllRegs does?

Not sure if the current framework support or not.

X86RegisterInfo::getCalleeSavedRegs has enough information to figure out the correct set of registers; not sure if there's any convenient way to compute it, though. Not sure X86RegisterInfo::getCallPreservedMask has enough information to tell; it looks like the actual signature isn't passed. The callers should have that information available, though.

If this can be fixed without affecting any potential clients, then I am all for it. In hindsight it might have been better to make a use case specific calling convention instead and not a generic one that everyone could depend on. Oen possible solution would be to make a Chromium specific calling convention and also name it as such. This would provide clear expectations what it is intended for and how is supposed to use it. Furthermore this would give you the option of changing it in the future if the need should arise.

Thanks everybody for the feedback! Sorry for the delayed response - I was occupied with the regular work.

However, it's consistent to say that a register is preserved if and only if it is not used to return a value.

I thought about it first, but just found cloberring %rax as a quick fix. But I 100% agree - having a flexible CC would be the best solution:

  • would allow callers to use more registers when callees are void,
  • would stay backward compatible, assuming the clients are not using the CCs with the return value (as otherwise their code would be broken).

MostRegs doesn't, but AllRegs does?

Yeah, all return registers must be addressed..

Oen possible solution would be to make a Chromium specific calling convention and also name it as such.

I thought about this as well, but preserve_most is conceptually already what we need. It would be better to save a chromium-specific CC for a future use case :)

There is already logic that nicely allows to disable CSRs dynamically, which was introduced for the Intel's regcall CC (here https://reviews.llvm.org/D28566). So I'll just piggyback on it.

AntonBikineev retitled this revision from [X86][ABI] Clobber %RAX for preserve_allcc and preserve_mostcc CCs to [X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs.
AntonBikineev edited the summary of this revision. (Show Details)

Ptal. Now the change preserves all the declared callee-saved-regs as long as they are not used to return a value. The change should be backward compatible, assuming the user code was correct.

Looks great!

llvm/test/CodeGen/X86/preserve_allcc64.ll
9

I think it's better to put these 2 tests into a seperate file. It just waste time to run multi times on the other 2 tests.

11–23

This will brings difficulties in future updates, though I think it's rare this file will be affected.

AntonBikineev marked an inline comment as done.Jan 19 2023, 1:11 PM
AntonBikineev added inline comments.
llvm/test/CodeGen/X86/preserve_allcc64.ll
9

Sure (moved tests returning double into separate files)

11–23

What exactly could break? The order of CSRs?

AntonBikineev marked an inline comment as done.

Moved unparametrized tests (i.e. ones returning doubles) into separate files.

(friendly ping...)

We've already started using the CC on void-functions in Chromium/V8, which helped us to improve the runtime (as well as the binary size). Looking forward to apply it more broadly.

(friendly ping)

pengfei accepted this revision.Jan 23 2023, 3:35 AM

LGTM.

This revision is now accepted and ready to land.Jan 23 2023, 3:35 AM

Thanks, everybody!

hans added a subscriber: hans.Jan 28 2023, 7:48 AM

We bisected Chromium crashes to this revision, see https://crbug.com/1410225#c4 for how to reproduce.

I'll revert this for now.

hans added a subscriber: tstellar.Jan 30 2023, 12:54 AM

@tstellar We may want to consider the revert for the 16 branch.