[RFC] AMDGPU: Add MachineInstr::Initiator and ::Terminator flags
Needs ReviewPublic

Authored by nhaehnle on Sep 3 2016, 8:40 AM.



Control flow in AMDGPU is lowered not just via branches, but also via bit
manipulation of the EXEC mask. This means every basic block might have
some ALU instructions at the beginning and end to setup the EXEC mask
correctly. Bad things can happen when target-independent passes like
scheduling or (especially) register allocation mess with these parts. This
has become more of an issue now that control flow pseudo instructions are
lowered earlier (which is how things should be).

For example, a conditional branch from an if-statement might be lowered as

s_and_saveexec_b64 s[10:11], vcc
s_xor_b64 s[10:11], exec, s[10:11]
; mask branch BB5_2

As long as only the ; mask branch pseudo instruction is a terminator, the
register allocator might decide to spill registers after the s_and_saveexec
instruction, which (in the case of a vector register) means that the
register will not be spilled correctly.

Another problem is that we want to move the AMDGPU-specific Whole Quad Mode
pass to after machine scheduling. That pass needs to introduce its own
exec-manipulating instructions while being aware of the meaning of the
already existing exec-instructions. That meaning is currently lost.

So here's a suggestion for dealing with all that: Allow arbitrary
instructions to be marked as Terminator (end of BB) or Initiator (beginning
of BB) instructions.

The intention is that target-independent passes will mess with those
instructions as little as possible: no scheduling changes, and register
spilling and restoring happens outside those regions whenever possible.

The diff here is not complete, but it shows the direction I want to go in.
It passes all the AMDGPU lit tests except for spill-m0.ll, which needs
RegAllocFast to be fixed to become aware of the Initiator/Terminator

Diff Detail

nhaehnle retitled this revision from to [RFC] AMDGPU: Add MachineInstr::Initiator and ::Terminator flags.Sep 3 2016, 8:40 AM
nhaehnle updated this object.
nhaehnle added reviewers: arsenm, tstellarAMD.
nhaehnle added a subscriber: llvm-commits.
arsenm added a comment.Sep 3 2016, 9:42 AM

I have a patch that fixes the control flow spilling problem already. This won't actually solve that part

arsenm added a comment.Sep 3 2016, 9:54 AM

I've been moving more in the direction of not considering mask branches as branches at all, and their own separate concept. My patch adds a handful of terminator instruction aliases that are replaced after register allocation with the regular instructions, since it's only used to get correct spill code placement.