Allows popping the same basic block multiple times off the stack.
Such scenarios can happen if two conditional branches use the same destination
basic block, e.g., the "false" basic block is the same.
Details
Diff Detail
Event Timeline
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
354 | Could you construct a situation where the block appears more times, and not consecutively? It may be better to change the stack to a SetVector | |
llvm/test/CodeGen/AMDGPU/si-annotate-control-flow-condition-common-blocks.ll | ||
4 | Add a brief comment explaining what this tests. Also, can you add an llc run line? The control flow pass pipeline is a bit fragile and subject to change |
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
354 | The answer is probably no since this is supposed to be structurized already |
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
354 | I managed to create a similarly problematic situation by changing the function below so the second branch uses the common basic block for the "true" instead of the false. We end up with BB22 buried under BB21 and it cant pop it off the stack because of that: |
llvm/lib/Target/AMDGPU/SIAnnotateControlFlow.cpp | ||
---|---|---|
354 | To be more specific, I wrote two additional tests:
The first test is completely fine, but the second fails because bb22 gets buried under bb21 as the latter gets visited before it. bb22 gets trapped on the stack, never popped, and the pass fails. |
I added a special case to "bury" BBs under the last stack entry if the "then" is top of the stack. I feel like it's cheating a bit but I'm not sure how else we can deal with this.
All tests pass now with this change at least.
Hi @Pierre-vh Thanks for working on this. I have to say that the si-annotate-control-flow was not designed to run separately. The control flow lowering process for AMDGPU was completed by several llvm passes running one by one. Also as @arsenm said, the control flow lowering process might change in the future, so a llc test is more helpful.
For this specific issue, I think this might share the same root-cause with D131181. I have leaved some comments there. My point is the si-annotate-control-flow assume the CFG of the input function should already being structurized, but this case happened to skip the structurization process because the unify-function-divergent-exit tries to optimize for uniformly reachable Unreachable block. I am not sure whether we have any workload depend on such optimization. Changing the behavior of unify-function-divergent-exit might regress quality of generated code in some cases. But teaching the existing structurizer to work on such case might need some non-trivial effort. Any comment? @arsenm
I can remove the opt line then, that's okay. The test is fine either way (fails before the patch, passes after)
If I understand correctly, you think this patch might make codegen worse in some cases? I am not sure what you mean by unify-function-divergent-exit
Does D131181 make this patch obsolete?
Hi @Pierre-vh, If you confirm D131181 help solve your problem, I think may be you can just drop the change.
Could you construct a situation where the block appears more times, and not consecutively? It may be better to change the stack to a SetVector