Page MenuHomePhabricator

Disable Callee Saved Registers
ClosedPublic

Authored by oren_ben_simhon on Jan 11 2017, 8:20 AM.

Details

Summary

Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list.

The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.

The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee.

Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span).

The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.

The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC.

Diff Detail

Repository
rL LLVM

Event Timeline

oren_ben_simhon retitled this revision from to Dynamic Callee Saved Registers.
oren_ben_simhon updated this object.
oren_ben_simhon set the repository for this revision to rL LLVM.
oren_ben_simhon added a subscriber: llvm-commits.
oren_ben_simhon retitled this revision from Dynamic Callee Saved Registers to Disable Callee Saved Registers.

Using the MF Allocator to allocate the register mask.

kparzysz edited edge metadata.Jan 12 2017, 6:46 AM

Can't you just change X86FrameLowering::determineCalleeSaves? The registers used to pass arguments to a function are already listed as live-in to the function in MRI.

oren_ben_simhon edited edge metadata.

Populated CalleeSaveDisableRegs list from LiveIns list (if in regcall calling convention).

Thanks Krzysztof for pointing it out. I updated the patch to use LiveIns to populate CalleeSaveDisableRegs List.
Notice that they are not the same because:

  1. Each target may populate different Callee Saved Registers that should be disabled (not necessarily arguments).
  2. For x86, it includes LiveIns and also return values - only for Regcall Calling Convention.

Updated lit test.

A kindly reminder. I will appreciate your inputs on the review.

If there are no additional comments please finish the review.

MatzeB edited edge metadata.Jan 30 2017, 11:10 AM

This is going to make the code confusing. We keep TRI::getCalleeSavedRegs() just that now it doesn't always give you the right answer and you have to check that other thing as well. We really should find a way so that getCalleeSavedRegs() (or some other new function that replaces the former) gives you a single correct answer.

include/llvm/CodeGen/RegisterClassInfo.h
56 ↗(On Diff #84553)

There are several targets out there with more than 256 registers! AMDGPU already has > 1000, and I know of even crazier out-of-tree targets.

I will create a wrapper function to be used instead of getCalleeSavedRegs().

include/llvm/CodeGen/RegisterClassInfo.h
56 ↗(On Diff #84553)

The size here is just the initial size. The size is updated according to the number of registers.

Notice that following code exmaple that actually changes the size (using push_back):

SmallVector<uint8_t, 16> ActualCalleeSaved;
...

for (unsigned I = 0; unsigned Reg = CSR[I]; ++I)
  ActualCalleeSaved.push_back(Reg);

...

CalleeSavedRegs = ActualCalleeSaved;

Implemented comments posted until 01/31

MatzeB added inline comments.Feb 1 2017, 11:00 AM
include/llvm/CodeGen/RegisterClassInfo.h
56 ↗(On Diff #84553)

You hold a register number in an uint8_t with UINT8_MAX==256, don't you?
You should use MCPhysReg (or unsigned if you want to store phys+virtual regs)

oren_ben_simhon marked an inline comment as done.

Implemented comments submitted until 02/01 (Thank you Matthias).

Please assist in reviewing the patch.

qcolombet edited edge metadata.Feb 9 2017, 12:34 PM

Hi,

High level comment: if we are going to use getUpdatedCalleeSavedReg everywhere in place of getCalleeSavedReg, shouldn't we change directly the implementation of getCalleeSavedReg?
I am afraid that having the distinction is going to be confusing and error-prone.

Cheers,
-Quentin

include/llvm/CodeGen/MachineRegisterInfo.h
65 ↗(On Diff #86774)

Don't repeat the name of the field in the comment.

72 ↗(On Diff #86774)

*contrary

> As opposed to?

Erratum: I missed that we switched from TRI to MRI with the new API. I still believe the names are confusing and I would at least expect some comment update on TRI::getCalleeSavedRegs to point out the MRI variant and explain when we should use one or the other.

MatzeB added a comment.Feb 9 2017, 5:32 PM

I completely agree with Quentin that we have to change the comment (or even better the name) of TargetRegisterInfo::getUpdatedCalleeSavedRegs() to discourage users to use the "wrong" function.

include/llvm/CodeGen/MachineRegisterInfo.h
220 ↗(On Diff #86774)

Maybe disableCalleeSavedRegister() and improve the comment to explain the effects this has for a user (rather than explaining that a bit is set in a private datastructure).

222 ↗(On Diff #86774)

Typo: invalif

229–233 ↗(On Diff #86774)

Should be a doxygen comment (///)

This seems to be THE way to get a list of callee saved registers now. So at least the brief description should reflect that and avoid words like "updated" or explanations on what is happening internally. I'd expect something like this (the lazy allocation bits seem like an implementation detail that can be left out):

/// Return list of callee saved registers.
///
/// This returns targets default callee saved register list without the registers explicitely disabled by disableCalleeSavedRegister().
include/llvm/CodeGen/RegisterClassInfo.h
56 ↗(On Diff #86774)

I didn't follow the code completely, but do we still need an extra SmallVector if there is MachineRegisterInfo::getUpdatedCalleeSavedRegs() that allocates storage itself?

lib/CodeGen/AggressiveAntiDepBreaker.cpp
161–162 ↗(On Diff #86774)

The remark in braces is probably not necessary? My understanding is that registers that pass/return arguments simply are not callee saved registers so there shouldn't be a need to mention this here.

lib/CodeGen/CriticalAntiDepBreaker.cpp
69–70 ↗(On Diff #86774)

dito.

lib/CodeGen/MachineRegisterInfo.cpp
550–551 ↗(On Diff #86774)

Could you move the lazy memory allocation to the addDisableCalleeSavedReg() function so this can function can become const?

There are a bunch of not so nice const_casts now because of this.

lib/Target/X86/X86ISelLowering.cpp
3198–3200 ↗(On Diff #86774)
const MachineRegisterInfo &MRI = MF.getRegInfo();
for (std::pair<unsigned,unsigned> P : make_range(MRI.livein_begin(), MRI.livein_end()) { ... }
```?
oren_ben_simhon marked 12 inline comments as done.Feb 15 2017, 8:28 AM

Implemented comments posted until 2/14 (Thank You Matthias & Quentin).

The main difference from last revision is that i moved the allocation and calculation of the updates CSR list.
The calculation will be performed every time a register is disabled and not when the list is requested.
This way we don't need a const cast when retrieving the CSR list.

Hi Guys, I would kindly like to remind of the review. Thanks in advance.

aaboud added inline comments.Mar 1 2017, 8:55 AM
include/llvm/CodeGen/MachineRegisterInfo.h
212 ↗(On Diff #88546)

Typos:

CSR - Callee Saved Registers.
So, do not write CSR register mask, either say "CSR mask", or "callee saved register mask"

"allocating a the updated lisy" -> "allocating the updated list"

include/llvm/CodeGen/RegisterClassInfo.h
59 ↗(On Diff #88546)

should be:

SmallVector<MCPhysReg, 4> CalleeSavedAliases;

In the previous implementation CSRNum[i] was an index into CalleeSaved, and the assumption was that CalleeSaved will not have more than 2^8 = 256 entries.
This is not the case with your new implementation where CalleeSavedAliases[i] is not an index into CalleeSavedRegs, but the actual MCPhysReg.

lib/CodeGen/LivePhysRegs.cpp
180 ↗(On Diff #88546)

You may move this down to line 185, just before the line it is used at.

lib/CodeGen/MachineInstr.cpp
267 ↗(On Diff #88546)

It is still worth to check if both have the same pointer, in such case we can return true.
Only if we have different pointers we need to check each element.

if(getRegMask() == Other.getRegMask())
  return true;
lib/CodeGen/MachineRegisterInfo.cpp
561 ↗(On Diff #88546)

UpdatedCalleeSavedRegs can be of type SmallVector<MCPhysReg, 16>.
And at getUpdatedCalleeSavedRegs just return UpdatedCalleeSavedRegs.data();

In this case you can get rid of all these:

allocateRegisterList
CSRVector
CalleeSaveDisableRegs

Just make sure that last entry in UpdatedCalleeSavedRegs is a zero.
And you can remove registers from this buffer using

UpdatedCalleeSavedRegs.erase(std::remove(UpdatedCalleeSavedRegs.begin(), UpdatedCalleeSavedRegs.end(), Reg)

There will not be many registers to remove, and we only add registers once, at first time we entry this function.

lib/Target/X86/X86ISelLowering.cpp
3721 ↗(On Diff #88546)

Initialize this to nullptr.

3746 ↗(On Diff #88546)

sink this into the above if-else.
This case, there will be no case where we need to do const_cast, and for sure will not be sending the static "Mask" as "RegMask" to the LowerCallResult function.

oren_ben_simhon marked 7 inline comments as done.

Implemented comments posted until 03/11 (Thank You, Amjad).

aaboud edited edge metadata.Mar 13 2017, 2:47 AM

LGTM.
Just few minor comments below.

include/llvm/CodeGen/MachineRegisterInfo.h
70 ↗(On Diff #91496)

These comment lines are not aligned to 80 characters.
Please, run clang-format on this?

209 ↗(On Diff #91496)

Why did you remove the dot at the end of this line?

test/CodeGen/X86/DynamicCalleeSavedRegisters.ll
9 ↗(On Diff #91496)

"retuning"-> "returning"
"used" -> "modified"

10 ↗(On Diff #91496)

"calle" -> "callee"

13 ↗(On Diff #91496)

Can you mentioned how this "hipe CC" function get's its arguments: a0, b0, c0, d0, and e0?

35 ↗(On Diff #91496)

Please explain how the parameters a0, b0, c0, d0, and e0 are expected to be passed to this function, i.e., what register is used for each one?

test/CodeGen/X86/sse-regcall.ll
40 ↗(On Diff #91496)

I believe these tests should be improved, the regular expression is misused in this test.
However, I think it can be done in separate patch.

oren_ben_simhon marked 6 inline comments as done.

Implemented comments posted until 03/13 (Thanks Amjad).

Updated the dynamic CSR test.

The code looks much better now.
Craig, do you have any more comments?
If not, can you approve this patch?

Fixed a typo in dynamic CSR test.

MatzeB added inline comments.Mar 13 2017, 9:40 AM
include/llvm/CodeGen/MachineRegisterInfo.h
217 ↗(On Diff #91547)

Please call this getCalleeSavedRegs() to make it clear that this should be the default function to call now to get the list of callee saved registers!
At the same time please add a sentence in the documentation of TargetRegisterInfo::getCalleeSavedRegs() to state that in most cases people should rather call the version in MachineRegisterInfo.

lib/CodeGen/MachineInstr.cpp
274–278 ↗(On Diff #88546)

This can be simplified to
return std::equal(RegMask, RegMask+RegMaskSize, OtherRegMask);

oren_ben_simhon marked 2 inline comments as done.Mar 13 2017, 12:22 PM

Implemented comments posted by Matthias (Thank you).

MatzeB accepted this revision.Mar 13 2017, 1:15 PM

LGTM.

include/llvm/Target/TargetRegisterInfo.h
429–430 ↗(On Diff #91603)

How about "This function does not consider calling conventions adapting to parameters; consider using MachineRegisterInfo::getCalleeSavedRegs() instead."?

This revision is now accepted and ready to land.Mar 13 2017, 1:15 PM
oren_ben_simhon marked an inline comment as done.Mar 14 2017, 1:16 AM
This revision was automatically updated to reflect the committed changes.