As llvm.amdgcn.kill is lowered to a terminator it can cause
else branch annotations to end up in the wrong block.
Do not annotate conditionals as else branches where there is
a kill to avoid this.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
With respect to the searching that this adds, I ran this on 14069 graphics pipelines from them this invoked the search 13735 times.
Mean 5.5 instructions were visited during the search and the worst case was 33 instructions.
So I think the cost of doing this is low.
I am still in two minds about whether to do this or stop lowering kills to block terminators, but this is the smaller change.
Alternatively, you could create the set of blocks containing a kill in advance - by looking at function declarations in the module, and if a kill is found then checking its users to find blocks. Whether that would be better depends on how often hasKill() is called, though.
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
187 | Bail out if calling convention is not CallingConv::AMDGPU_PS ? |
What about kills in transitive successors? I really don't like how kill isn't considered a terminator.
Thank you Piotr and Matt for your comments.
Since I don't really like kills being terminators either, I am going to investigate changing that before pushing this further.
I will return to this if changing kill behaviour seem in tractable.
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
187 | Kills are valid in other calling conventions. |
No, no. The exact opposite. My problem is the IR kill intrinsic is not a terminator. In the MIR exec modifications should be terminators as that's the only real mechanism for ensuring correct spill placement around them
Ah sorry, I did indeed misread your "isn't" as "is".
Can intrinsics even be terminators?
Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?
Almost. Recently the callbr instruction was added. I was thinking about using it for kills, but I haven't thought about this too carefully
Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?
My initial thought is splitting blocks in the IR wouldn't be helpful. IR transforms are more likely to glue these blocks right back together again. Overall I'm more interested in moving towards the wave transform control flow lowering rather than thinking about how to improve the current IR pass flow
! In D97427#2587524, @arsenm wrote:
! In D97427#2587522, @critson wrote:
Or are you suggesting that we should split the block at the kill in the IR level very early in the backend?My initial thought is splitting blocks in the IR wouldn't be helpful. IR transforms are more likely to glue these blocks right back together again. Overall I'm more interested in moving towards the wave transform control flow lowering rather than thinking about how to improve the current IR pass flow
I agree that focusing our energy on wave transform control flow is the way to go.
However, this is a legitimate bug causing Piglet test failures for Mesa so it would be good to fix it.
Returning to your question about "transitive successors".
I do not think this is an issue -- the problem is with else being put in the same block as a kill causing that kill to executed as if it was part of the preceding if-branch.
If we place an else in a transitive successor then that implies the kill was genuinely part of the if-branch, which is fine.
You could theoretically have a split block that doesn't actually change control flow (i.e. one that just unconditionally branches to another block that only has the one predecessor). That's unlikely to happen in practice since those will get put back together again. Given that at this point we're just sort of waiting for this scheme to die, I wouldn't worry too much about this edge case
Bail out if calling convention is not CallingConv::AMDGPU_PS ?