This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Prevent spills before exec mask is restored
ClosedPublic

Authored by rampitec on Dec 20 2016, 1:10 PM.

Details

Summary

Inline spiller can decide to move a spill as early as possible in the basic block. It will skip phis and label, but we also need to make sure it skips instructions in the basic block prologue which restore exec mask.

Added isPositionLike callback in TargetInstrInfo to detect instructions which shall be skipped in addition to common phis, labels etc.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec updated this revision to Diff 82139.Dec 20 2016, 1:10 PM
rampitec retitled this revision from to [AMDGPU] Prevent spills before exec mask is restored.
rampitec updated this object.
rampitec added a reviewer: arsenm.
rampitec set the repository for this revision to rL LLVM.
arsenm edited edge metadata.Dec 20 2016, 1:32 PM
arsenm added a subscriber: llvm-commits.
arsenm added inline comments.Dec 20 2016, 1:39 PM
lib/Target/AMDGPU/SIInstrInfo.cpp
3647–3650 ↗(On Diff #82139)

The modifiesRegister check should be last. This can also be simplified to return X instead of if (X) return true. You also might be able to combine this with isSchedulingBoundary

qcolombet added inline comments.Dec 20 2016, 1:59 PM
include/llvm/Target/TargetInstrInfo.h
1515 ↗(On Diff #82139)

For the comment, I would use the same terminology as MachineInstr::isPosition. Right now, the comment seems off compared to the function name.
Also, would it make sense to have this flag exposed in the TDs?

rampitec updated this revision to Diff 82151.Dec 20 2016, 2:07 PM
  1. Changed comment on the callback.
  2. Rearranged conditions check in the SIInstrInfo::isPositionLike()
rampitec marked an inline comment as done.Dec 20 2016, 2:12 PM
rampitec added inline comments.
include/llvm/Target/TargetInstrInfo.h
1515 ↗(On Diff #82139)

Fixed comment.
I thought about exposing the flag in TD, but it is not always can be expressed that way. The same instruction can be used at the end of a basic block, so this in fact does not necessarily depend on the instruction only.

rampitec marked an inline comment as done.Dec 20 2016, 2:27 PM
rampitec added inline comments.
include/llvm/Target/TargetInstrInfo.h
1515 ↗(On Diff #82139)

And then instructions themselves are just a scalar or/and/xor etc, they just happen to write exec mask register. If exposed through a bit in TD all of them would need to be duplicated as pseudo and expanded past RA, which is clearly more definitions and more code.

MatzeB edited edge metadata.Jan 2 2017, 5:32 AM

In general I find rules like "no spill may be before a isPositionLike() instruction" dangerous in that they are not at all obvious to people writing generic codegen passes and you run the danger that even if you fix the register allocator other passes may disregard the new rule.

Have you considered alternative solutions? Like having a late pass that moves your exec mask back to the top in a late pass so that the exec mask rules only need to be known by that one pass?

In general I find rules like "no spill may be before a isPositionLike() instruction" dangerous in that they are not at all obvious to people writing generic codegen passes and you run the danger that even if you fix the register allocator other passes may disregard the new rule.

Have you considered alternative solutions? Like having a late pass that moves your exec mask back to the top in a late pass so that the exec mask rules only need to be known by that one pass?

I have started with exploring alternative solutions which do not require any target callbacks. That turns to be pretty difficult and dangerous. Not only that is extremely tricky to recover, but also creates some paradoxes, like if I need a register being spilled before the new insertion point. Also instructions which operate exec mask are all the same at the beginning of a BB and at the end, and in case of a small BB that would be too late to distinguish. The safest solution is not to recover from broken IR, but prevent the problem at the first place.

Regarding other passes which might do the same, I'm not aware of them. Machine instructions have all bits like side effects and register use-def dependencies set, so no pass (like LICM for example) can illegally rearrange them. That is in particular difficult to recover from wrong spilling, because I would have to perform formally illegal transformations. Only the regalloc is a problem because it does not try to rearrange instructions, it just inserts new.

In general I find rules like "no spill may be before a isPositionLike() instruction" dangerous in that they are not at all obvious to people writing generic codegen passes and you run the danger that even if you fix the register allocator other passes may disregard the new rule.

Have you considered alternative solutions? Like having a late pass that moves your exec mask back to the top in a late pass so that the exec mask rules only need to be known by that one pass?

I have started with exploring alternative solutions which do not require any target callbacks. That turns to be pretty difficult and dangerous. Not only that is extremely tricky to recover, but also creates some paradoxes, like if I need a register being spilled before the new insertion point. Also instructions which operate exec mask are all the same at the beginning of a BB and at the end, and in case of a small BB that would be too late to distinguish. The safest solution is not to recover from broken IR, but prevent the problem at the first place.

Regarding other passes which might do the same, I'm not aware of them. Machine instructions have all bits like side effects and register use-def dependencies set, so no pass (like LICM for example) can illegally rearrange them. That is in particular difficult to recover from wrong spilling, because I would have to perform formally illegal transformations. Only the regalloc is a problem because it does not try to rearrange instructions, it just inserts new.

Ok so if I understand this correctly the dependencies are actually visible by means of defs/uses of some mask register, this only affects spill placement code just because it is not obvious that a spill instruction will read the mask register and before actually creating it?

I don't like the name isPositionLike(). The semantic do not seem to be too similar to isPosition() which are instructions that mark a place in a function that is somehow visible elsewhere (because of exception handling tables, ...), this is not the case here. Also if the only use of this function is fixing a spill code placement problem, then the documentation comment should mention that.

Ok so if I understand this correctly the dependencies are actually visible by means of defs/uses of some mask register, this only affects spill placement code just because it is not obvious that a spill instruction will read the mask register and before actually creating it?

Yes.

I don't like the name isPositionLike(). The semantic do not seem to be too similar to isPosition() which are instructions that mark a place in a function that is somehow visible elsewhere (because of exception handling tables, ...), this is not the case here.

Will the "isBasicBlockPrologue()" name be better?

Also if the only use of this function is fixing a spill code placement problem, then the documentation comment should mention that.

OK, will add comment.

rampitec updated this revision to Diff 83287.Jan 5 2017, 12:43 PM
rampitec edited edge metadata.
  • Renamed callback.
  • Modified comment.
arsenm added inline comments.Jan 19 2017, 10:42 AM
lib/Target/AMDGPU/SIInstrInfo.cpp
3644 ↗(On Diff #83287)

There are some weird graphics intrinsics that end up with exec modifications in the middle of the block (I was working towards splitting the blocks at these points, but I'm not sure it's generally possible. For the purposes of this hook, it's probably good enough. I'm not sure I have any better name ideas

3645–3648 ↗(On Diff #83287)

You can access the RI member directly in SIInstrInfo

rampitec updated this revision to Diff 85027.Jan 19 2017, 1:56 PM

Use existing SIInstrInfo::RI instead of querying TRI.

rampitec marked an inline comment as done.Jan 19 2017, 1:57 PM
MatzeB accepted this revision.Jan 19 2017, 4:38 PM

All of this feels a bit odd still, but I don't have any better suggestion right now, so LGTM.

This revision is now accepted and ready to land.Jan 19 2017, 4:38 PM
This revision was automatically updated to reflect the committed changes.