Also improve the check for SALU instructions so that it additionally ignores
implicit_def and other fake instructions.
Details
- Reviewers
rampitec
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp | ||
---|---|---|
144 | Yes, in the case where the block is preceded only by itself. We do not generate code this way, but I can write such IR manually. Anyway, a layout successor means there will be no branch. | |
test/CodeGen/AMDGPU/collapse-endcf.mir | ||
146 | I am not sure the autogenerated test really tests anything, as there is no GCN-NEXT. The copy could easily remain and go undetected. |
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp | ||
---|---|---|
144 | It doesn't mean there's no branch. It still allows S_CBRANCH <fallthrough block>. If both used unconditional branches, it shouldn't be any different. |
test/CodeGen/AMDGPU/collapse-endcf.mir | ||
---|---|---|
146 | The important part is the number of s_or_b64 |
test/CodeGen/AMDGPU/collapse-endcf.mir | ||
---|---|---|
146 | But how can you be sure about the number if there can be other instances in between the checks? |
lib/Target/AMDGPU/SIOptimizeExecMaskingPreRA.cpp | ||
---|---|---|
144 | I think what's really necessary here is something like: end_cf sources are both si_if. This requires moving the pass before the control flow pseudos are lowered. | |
test/CodeGen/AMDGPU/collapse-endcf.mir | ||
146 | I think it's pretty unlikely this can break with everything checked. update_mir_test_checks should probably be emitting -NEXT, but that's a bigger problem to work on |
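The disagreement above about plain checks versus -NEXT checks can be illustrated with a small model. This is only a sketch of the matching semantics, not FileCheck's implementation: the function names `plain_checks_pass` and `next_checks_pass` are hypothetical, patterns are treated as plain substrings, and the MIR-like lines are made up for illustration.

```python
def plain_checks_pass(checks, lines):
    """Model plain check lines: patterns must match in order, but
    unrelated lines are allowed between consecutive matches."""
    pos = 0
    for pattern in checks:
        # Skip forward until some line contains the pattern.
        while pos < len(lines) and pattern not in lines[pos]:
            pos += 1
        if pos == len(lines):
            return False
        pos += 1
    return True


def next_checks_pass(checks, lines):
    """Model a check followed by -NEXT checks: the first pattern may
    match anywhere, every later one must match the very next line."""
    if not checks:
        return True
    # Find the first line that matches the first pattern.
    start = next((i for i, l in enumerate(lines) if checks[0] in l), None)
    if start is None:
        return False
    for offset, pattern in enumerate(checks[1:], start=1):
        idx = start + offset
        if idx >= len(lines) or pattern not in lines[idx]:
            return False
    return True


# Hypothetical output where the redundant S_OR_B64 was collapsed.
collapsed = ["$exec = S_OR_B64 $exec, %0", "S_ENDPGM 0"]
# Hypothetical output where an extra S_OR_B64 was left behind.
not_collapsed = [
    "$exec = S_OR_B64 $exec, %1",
    "$exec = S_OR_B64 $exec, %0",
    "S_ENDPGM 0",
]
checks = ["S_OR_B64 $exec", "S_ENDPGM"]

print(plain_checks_pass(checks, collapsed))      # True
print(plain_checks_pass(checks, not_collapsed))  # True: extra instruction unnoticed
print(next_checks_pass(checks, collapsed))       # True
print(next_checks_pass(checks, not_collapsed))   # False: extra instruction caught
```

In this simplified model the plain checks accept both outputs, which is the concern raised about the autogenerated test. When every instruction in the block is checked, as the reply points out, a leftover instruction is less likely to slip through, though only -NEXT style checks rule it out by construction.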