This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Do not annotate an else branch if there is a kill
ClosedPublic

Authored by critson on Feb 24 2021, 5:09 PM.

Details

Summary

As llvm.amdgcn.kill is lowered to a terminator it can cause
else branch annotations to end up in the wrong block.
Do not annotate conditionals as else branches where there is
a kill to avoid this.

Diff Detail

Event Timeline

critson created this revision.Feb 24 2021, 5:09 PM
critson requested review of this revision.Feb 24 2021, 5:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2021, 5:09 PM

With respect to the searching that this adds, I ran this on 14069 graphics pipelines from them this invoked the search 13735 times.
Mean 5.5 instructions were visited during the search and the worst case was 33 instructions.
So I think the cost of doing this is low.

I am still in two minds about whether to do this or stop lowering kills to block terminators, but this is the smaller change.

Alternatively, you could create the set of blocks containing a kill in advance - by looking at function declarations in the module, and if a kill is found then checking its users to find blocks. Whether that would be better depends on how often hasKill() is called, though.

llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
187

Bail out if calling convention is not CallingConv::AMDGPU_PS ?

What about kills in transitive successors? I really don't like how kill isn't considered a terminator.

Thank you Piotr and Matt for your comments.

Since I don't really like kills being terminators either, I am going to investigate changing that before pushing this further.
I will return to this if changing kill behaviour seem in tractable.

llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp
187

Kills are valid in other calling conventions.

Thank you Piotr and Matt for your comments.

Since I don't really like kills being terminators either, I am going to investigate changing that before pushing this further.

No, no. The exact opposite. My problem is the IR kill intrinsic is not a terminator. In the MIR exec modifications should be terminators as that's the only real mechanism for ensuring correct spill placement around them

Since I don't really like kills being terminators either, I am going to investigate changing that before pushing this further.

No, no. The exact opposite. My problem is the IR kill intrinsic is not a terminator. In the MIR exec modifications should be terminators as that's the only real mechanism for ensuring correct spill placement around them

Ah sorry, I did indeed misread your "isn't" as "is".
Can intrinsics even be terminators?
Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?

Since I don't really like kills being terminators either, I am going to investigate changing that before pushing this further.

No, no. The exact opposite. My problem is the IR kill intrinsic is not a terminator. In the MIR exec modifications should be terminators as that's the only real mechanism for ensuring correct spill placement around them

Ah sorry, I did indeed misread your "isn't" as "is".
Can intrinsics even be terminators?

Almost. Recently the callbr instruction was added. I was thinking about using it for kills, but I haven't thought about this too carefully

Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?

My initial thought is splitting blocks in the IR wouldn't be helpful. IR transforms are more likely to glue these blocks right back together again. Overall I'm more interested in moving towards the wave transform control flow lowering rather than thinking about how to improve the current IR pass flow

! In D97427#2587524, @arsenm wrote:

! In D97427#2587522, @critson wrote:
Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?

My initial thought is splitting blocks in the IR wouldn't be helpful. IR transforms are more likely to glue these blocks right back together again. Overall I'm more interested in moving towards the wave transform control flow lowering rather than thinking about how to improve the current IR pass flow

I agree that focusing our energy on wave transform control flow is the way to go.
However, this is a legitimate bug causing Piglet test failures for Mesa so it would be good to fix it.

Returning to your question about "transitive successors".
I do not think this is an issue -- the problem is with else being put in the same block as a kill causing that kill to executed as if it was part of the preceding if-branch.
If we place an else in a transitive successor then that implies the kill was genuinely part of the if-branch, which is fine.

arsenm accepted this revision.Mar 10 2021, 5:54 AM
This revision is now accepted and ready to land.Mar 10 2021, 5:54 AM

Returning to your question about "transitive successors".
I do not think this is an issue -- the problem is with else being put in the same block as a kill causing that kill to executed as if it was part of the preceding if-branch.
If we place an else in a transitive successor then that implies the kill was genuinely part of the if-branch, which is fine.

You could theoretically have a split block that doesn't actually change control flow (i.e. one that just unconditionally branches to another block that only has the one predecessor). That's unlikely to happen in practice since those will get put back together again. Given that at this point we're just sort of waiting for this scheme to die, I wouldn't worry too much about this edge case