This is an archive of the discontinued LLVM Phabricator instance.

[RegAllocFast] properly handle STATEPOINT instruction.
ClosedPublic

Authored by dantrushin on Mar 24 2021, 11:04 AM.

Details

Summary

STATEPOINT is a fancy and complex pseudo instruction which
has both tied defs and regmask operand.

Basic FastRA algorithm is as follows:

  1. Mark registers used by defs as free
  2. If instruction has regmask operand displace clobbered registers according to regmask.
  3. Assign registers for use operands.

In case of tied defs step 1 is replaced with allocation of registers
for them. But regmask is still processed, which may displace already
allocated registers. As a result, tied use and def will get assigned
to different registers.

This patch makes FastRA to process instruction's RegMask (if any) when
checking for physical registers interference.
That way tied operands won't get registers clobbered by regmask.

Diff Detail

Event Timeline

dantrushin created this revision.Mar 24 2021, 11:04 AM
dantrushin requested review of this revision.Mar 24 2021, 11:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2021, 11:04 AM

Do I understand correctly that with this change tied pair of registers can be assigned to caller saved register?

This is not a correct generally. For statepoint you can probably rely on FixedCallerSaveRegister pass.

I don't think this should special case statepoints. Any regmask with tied operands should behave the same way

llvm/test/CodeGen/X86/statepoint-fastregalloc.mir
4–8

Can remove this. Should also have a comment explaining the test

26–67

Don't need most of this

arsenm requested changes to this revision.Mar 30 2021, 3:20 PM
This revision now requires changes to proceed.Mar 30 2021, 3:20 PM

Address review comments:

  • Do this for all tied operands with regmask instructions;
  • Strip all unnecessary data from test, add comment;
dantrushin marked 2 inline comments as done.Mar 31 2021, 8:11 AM
arsenm added inline comments.Mar 31 2021, 3:08 PM
llvm/lib/CodeGen/RegAllocFast.cpp
1234

I don't like relying on findLiveVirtReg getting called a second time returning the same result as during the actual assignment

1265

Can you track the register from here instead?

1280–1281

Should get a comment explaining the tied behavior

1281

Checking for exact register matches makes me nervous because of register aliases / subregisters. What if you had something like

FOO tied-def $reg0_reg1, tied-use $reg0_reg1, mask-clobber(reg1)?

What is the correct behavior here and does it matter? Should it be a verifier error?

skatkov requested changes to this revision.Mar 31 2021, 11:04 PM

I believe this is not right fix for general problem of handling tied-defs wrt reg mask.

Firs of all, running your test I see that tied def/use now uses the same register rax but it is clabbered by the call. So compilation is not correct at all.
As I said before for statepoint instruction you can rely on post fix-up of this case but it is better to get a full support in fast regalloc itself.

I guess that the right way to fix this should look as follows:

  1. defineLiveThroughVirtReg for tied-def should check not only whether the register is used in instruction but also it is not in regmask.
  2. allocVirtReg should be aware about reg mask and tied operands to avoid allocation of register which is in reg mask.
This revision now requires changes to proceed.Mar 31 2021, 11:04 PM
dantrushin updated this revision to Diff 334698.Apr 1 2021, 8:00 AM

Rework algorithm a bit:

  • Store intruction RegMask (if any) in a global;
  • Look at RegMask when allocating register if we care about physical register use. (Re)Use LookAtPhysRegUses for this.

With this change, registers clobbered by regmask should not be assigned
to tied defs and AssignedTiedDefs set is not strictly needed. I use
it for assertion check, but can eliminate it.

dantrushin marked 2 inline comments as done.Apr 1 2021, 8:09 AM
dantrushin added inline comments.
llvm/lib/CodeGen/RegAllocFast.cpp
1281

My understanding is that RegMask containts all registers and subregisters.
I'm not so sure about aliases.
As for verifier, it was recently enhanced to better track tied registers. There is also in-progress review D97054 which tries to enhance it further...

skatkov accepted this revision.Apr 1 2021, 11:01 PM

lgtm. Please wait for @arsenm feedback.

llvm/lib/CodeGen/RegAllocFast.cpp
1148

you do not need to collect all tied-defs assigned. As soon as it is under an assert, you can just iterate over defs of MI and check that if it is def it is not assigned to clobbered register.
Moreover you can move an assert to end of an allocation and also check that tied-use is allocated to the same register.

If you decide to leave it in this way - please ensure that this declaration and inserting in the set is not used in product mode.

1286

consider usage of all_of instead. So you will not need #ifdef then.

please update the commit message.

skatkov requested changes to this revision.Apr 1 2021, 11:21 PM

BTW, it seems that previous code was non-optimal or it supports that there are may be ore than one machine operand containing the reg mask.
If the second is true, you should support this in your patch. Can you check that?

This revision now requires changes to proceed.Apr 1 2021, 11:21 PM

BTW, it seems that previous code was non-optimal or it supports that there are may be ore than one machine operand containing the reg mask.
If the second is true, you should support this in your patch. Can you check that?

I've never seen instructions with multiple RegMasks and I do not see a reason for multiple RegMasks. This is bitfield big enough to hold all target's physical registers.
After grepping LLVM sources I have no definitive answer either. One can't tell code handling multiple masks from code handling just one, but iterating over all operands just because there is no other way to find regmask (it has no fixed position in MI).

@arsenm : what do you think?

dantrushin edited the summary of this revision. (Show Details)Apr 2 2021, 6:46 AM

ok, I took a look in different usage of MO.getRegMask and it looks like several regmask may happen: see InsteRefBasedImpl.cpp.
So it would be better to re-write the code to support several regmasks - it should not be too complex.

dantrushin updated this revision to Diff 335848.Apr 7 2021, 9:44 AM
dantrushin edited the summary of this revision. (Show Details)

Support multiple RegMasks per instruction.

skatkov added inline comments.Apr 7 2021, 8:53 PM
llvm/lib/CodeGen/RegAllocFast.cpp
164

any_of?

1148

you do not need to collect all tied-defs assigned. As soon as it is under an assert, you can just iterate over defs of MI and check that if it is def it is not assigned to clobbered register.
Moreover you can move an assert to end of an allocation and also check that tied-use is allocated to the same register.

If you decide to leave it in this way - please ensure that this declaration and inserting in the set is not used in product mode.

Did not get any answer on this comment.

1288

all_of?

dantrushin updated this revision to Diff 336039.Apr 8 2021, 2:24 AM

Address comments:

  • use any_of
  • get rid of AssignedTiedDefs vector
dantrushin marked 6 inline comments as done.Apr 8 2021, 2:26 AM
skatkov accepted this revision.Apr 8 2021, 2:42 AM

@arsenm, any comments from your side?

llvm/lib/CodeGen/RegAllocFast.cpp
1287

nit (up to you):

for (const LiveReg &LR : LiveVirtRegs)
  if (MCPhysReg PhysReg = LR.PhysReg)
    if (isClobberedByRegMasks(PhysReg))
      displacePhysReg(MI, PhysReg);

@arsenm, Is this OK for you?

arsenm added inline comments.Apr 27 2021, 5:55 AM
llvm/test/CodeGen/X86/statepoint-fastregalloc.mir
26–67

Can you craft a testcase for multiple regmasks? Other than that LGTM

dantrushin added inline comments.Apr 27 2021, 9:04 AM
llvm/test/CodeGen/X86/statepoint-fastregalloc.mir
26–67

I don't know any instruction which can have multiple reg masks.
I can try to add another arbitrary reg mask to STATEPOINT instruction in this test and see if it'll work (and will pass through verifier).
Would it be OK?

arsenm added inline comments.Apr 27 2021, 10:27 AM
llvm/test/CodeGen/X86/statepoint-fastregalloc.mir
26–67

Regmasks are treated like implicit operands, so you can freely add them to any instruction. You could take a NOP and add whatever implicit uses/defs and regmasks you want

Added test for instruction with two RegMask operands.

Still it is not quite clear to me what multiple RegMasks mean?
This patch basically ANDs them.

arsenm accepted this revision.May 6 2021, 5:27 PM
This revision is now accepted and ready to land.May 6 2021, 5:27 PM
This revision was automatically updated to reflect the committed changes.