This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Apply pre-emit s_cbranch_vcc optimization to more patterns
Closed, Public

Authored by critson on Jul 12 2020, 1:30 AM.

Details

Summary

Depends on D83637 for test correctness, but not for operation.
Add handling of s_andn2 and of a mask value of 0.
This eliminates code from uniform control flow.
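
The folding described in the summary can be modelled roughly as follows. This is a minimal standalone sketch, not the code in SIPreEmitPeephole.cpp: the names `AndOp`, `Branch`, `Folded`, and `foldVccBranch` are hypothetical, and it assumes the branch condition is computed as vcc = exec & mask (s_and) or vcc = exec & ~mask (s_andn2) with a constant mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model types; none of these exist in LLVM.
enum class AndOp { And, AndN2 };           // s_and_b* or s_andn2_b*
enum class Branch { VccZ, VccNZ };         // s_cbranch_vccz or s_cbranch_vccnz
enum class Folded { ExecZ, ExecNZ, Unconditional, Removed, Unchanged };

// Decide what a vcc-conditional branch can become when
// vcc = exec & mask (And) or vcc = exec & ~mask (AndN2).
Folded foldVccBranch(AndOp Op, int64_t Mask, Branch Br) {
  // Only all-zeros and all-ones masks are foldable in this model.
  if (Mask != 0 && Mask != -1)
    return Folded::Unchanged;

  // s_andn2 uses the inverted mask, so normalise to the s_and form.
  int64_t Effective = (Op == AndOp::AndN2) ? ~Mask : Mask;
  assert(Effective == 0 || Effective == -1);

  if (Effective == 0) {
    // vcc is always zero: vccz is always taken, vccnz is never taken.
    return Br == Branch::VccZ ? Folded::Unconditional : Folded::Removed;
  }
  // vcc equals exec, so the branch depends only on exec.
  return Br == Branch::VccZ ? Folded::ExecZ : Folded::ExecNZ;
}
```

Under these assumptions, the mask-of-0 cases are the ones that remove a conditional branch entirely, which is where the savings in uniform control flow would come from.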

Event Timeline

critson created this revision. Jul 12 2020, 1:30 AM
rampitec added inline comments. Jul 13 2020, 9:59 AM
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
129

Technically here MaskValue can be anything, not just -1 or 0.

136

... and here you squash it. I think it needs a check.

llvm/test/CodeGen/AMDGPU/infinite-loop.ll
161

Looks like s_mov_b64 vcc, 0?

llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
14–34

Would you mind pre-committing the whitespace changes?

critson marked 3 inline comments as done. Jul 13 2020, 8:21 PM
critson added inline comments.
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
129

Line 102 ensures MaskValue is 0 or -1.
I can add an assertion here as well.

llvm/test/CodeGen/AMDGPU/infinite-loop.ll
161

Yes, which is how this becomes an unconditional branch.
Are you suggesting that we add a peephole to clean up "s_and_b* vcc, exec, 0" -> "s_mov_b* vcc, 0" in this case?
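
The clean-up being discussed here relies on x & 0 == 0 for any x, so the AND with exec can be replaced by a move of 0. A rough sketch of that rewrite on a toy instruction record follows. The `Inst` struct and `simplifyAndWithZero` are hypothetical and are not part of LLVM or of this patch. A real peephole would also have to account for s_and writing SCC while s_mov does not, which this sketch ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical toy instruction: dst = opcode(src0, imm).
struct Inst {
  std::string Opcode;
  std::string Dst;
  std::string Src0;
  int64_t Imm;
};

// Rewrite "s_and_b* dst, src0, 0" into "s_mov_b* dst, 0".
// Returns true if the instruction was changed.
bool simplifyAndWithZero(Inst &I) {
  if (I.Imm != 0)
    return false;
  if (I.Opcode == "s_and_b64")
    I.Opcode = "s_mov_b64";
  else if (I.Opcode == "s_and_b32")
    I.Opcode = "s_mov_b32";
  else
    return false;
  // The result no longer depends on the first source operand.
  I.Src0.clear();
  return true;
}
```

For example, an `Inst` holding `s_and_b64 vcc, exec, 0` would come out as `s_mov_b64 vcc, 0`, while the same instruction with a mask of -1 would be left alone.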

llvm/test/CodeGen/AMDGPU/sgpr-control-flow.ll
14–34

Will do.

critson updated this revision to Diff 277691. Jul 14 2020, 12:42 AM

Rebase on top of test pre-commit.
Add assertions relating to MaskValue.

foad added a subscriber: foad. Jul 14 2020, 2:20 AM
arsenm added inline comments. Jul 14 2020, 4:59 AM
llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
131

llvm_unreachable

llvm/test/CodeGen/AMDGPU/insert-skip-from-vcc.mir
343–344

Negative checks make me nervous. Can you generate these?

critson marked 3 inline comments as done. Jul 14 2020, 5:14 AM
critson added inline comments.
llvm/test/CodeGen/AMDGPU/insert-skip-from-vcc.mir
343–344

We could switch the entire test to being generated, but generated MIR tests don't use CHECK-NEXT, so these would still fall through the cracks.

critson updated this revision to Diff 277784. Jul 14 2020, 5:14 AM

Change assertion to llvm_unreachable.

rampitec accepted this revision. Jul 14 2020, 10:23 AM

LGTM with a nit.

llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
129

OK, I see it. Thanks.

137

This is MaskValue = ~MaskValue, right? I think it is just clearer with negation.
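
To illustrate the nit, here is a small sketch comparing an explicit case-by-case inversion with bitwise negation. Both function names are hypothetical, the explicit form is only a guess at the general shape under review rather than the patch's actual code, and `std::abort()` stands in for `llvm_unreachable`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Explicit inversion: only 0 and -1 are expected here.
int64_t invertMaskExplicit(int64_t MaskValue) {
  switch (MaskValue) {
  case 0:
    return -1;
  case -1:
    return 0;
  default:
    std::abort(); // stand-in for llvm_unreachable in this sketch
  }
}

// Bitwise negation: same result for 0 and -1, and well defined for any mask.
int64_t invertMaskBitwise(int64_t MaskValue) { return ~MaskValue; }
```

On the two values the earlier check allows (0 and -1) the two forms agree, so the negation is shorter and states the s_andn2 semantics directly.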

llvm/test/CodeGen/AMDGPU/infinite-loop.ll
161

Yes, but not in this change.

This revision is now accepted and ready to land. Jul 14 2020, 10:23 AM
This revision was automatically updated to reflect the committed changes.