This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Only select VOP3 forms of VOP2 instructions
Closed, Public

Authored by foad on Nov 19 2021, 8:49 AM.

Details

Summary

Change VOP_PAT_GEN so that, by default, it generates an instruction
selection pattern only for the VOP3 (e64) form of an instruction and
not for the VOP2 (e32) form. This gives SIFoldOperands maximum freedom
to fold copies into the operands of an instruction before
SIShrinkInstructions tries to shrink it back to the smaller encoding.
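
As a rough illustration of why the fold-then-shrink ordering matters, here is a toy Python model. It is not LLVM code. It assumes one simplified legality rule (the e32 form needs a VGPR in src1, the e64 form accepts a VGPR, SGPR, or inline constant in either source) and ignores constant bus limits, literals, and operand commuting, all of which the real passes handle. The names Operand, Inst, is_legal, fold_copies, and shrink are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Operand:
    kind: str      # "v" = VGPR, "s" = SGPR, "imm" = inline constant
    value: object  # register name or constant value


@dataclass(frozen=True)
class Inst:
    opcode: str    # e.g. "v_and_b32"
    encoding: str  # "e32" (VOP2) or "e64" (VOP3)
    src0: Operand
    src1: Operand


def is_legal(inst):
    # Assumed toy rule: e32 requires src1 to be a VGPR; e64 accepts anything.
    if inst.encoding == "e32":
        return inst.src1.kind == "v"
    return True


def fold_copies(inst, copies):
    # Stand-in for SIFoldOperands: replace a VGPR that is defined by a copy
    # with the copy's source, but only if the result is still legal.
    for name in ("src0", "src1"):
        op = getattr(inst, name)
        if op.kind == "v" and op.value in copies:
            candidate = replace(inst, **{name: copies[op.value]})
            if is_legal(candidate):
                inst = candidate
    return inst


def shrink(inst):
    # Stand-in for SIShrinkInstructions: use e32 when the operands allow it.
    if inst.encoding == "e64":
        candidate = replace(inst, encoding="e32")
        if is_legal(candidate):
            return candidate
    return inst


# v2 is defined by a copy of the inline constant 15.
copies = {"v2": Operand("imm", 15)}
v1, v2 = Operand("v", "v1"), Operand("v", "v2")

# Selected as e32: the constant cannot be folded into src1.
as_e32 = shrink(fold_copies(Inst("v_and_b32", "e32", v1, v2), copies))

# Selected as e64: the fold succeeds, and the instruction stays e64.
as_e64 = shrink(fold_copies(Inst("v_and_b32", "e64", v1, v2), copies))

# Selected as e64 with the copy in src0: fold succeeds, then shrinks to e32.
shrunk = shrink(fold_copies(Inst("v_and_b32", "e64", v2, v1), copies))
```

In this model, selecting the e32 form up front blocks the fold into src1, while selecting e64 lets the fold happen first and leaves the encoding decision to the later shrink step.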

This affects the following VOP2 instructions:
v_min_i32
v_max_i32
v_min_u32
v_max_u32
v_and_b32
v_or_b32
v_xor_b32
v_lshr_b32
v_ashr_i32
v_lshl_b32

A further cleanup could simplify or remove VOP_PAT_GEN, since its
optional second argument is never used.

Event Timeline

foad created this revision. Nov 19 2021, 8:49 AM
foad requested review of this revision. Nov 19 2021, 8:49 AM
Herald added a project: Restricted Project. Nov 19 2021, 8:49 AM
foad added inline comments.
llvm/test/CodeGen/AMDGPU/ctpop16.ll, line 776:
Small win here.

llvm/test/CodeGen/AMDGPU/flat-scratch.ll, line 505:
The 15 has been folded here, which I think is good, even though it didn't save any instructions or registers.

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wqm.demote.ll, line 164:
Small win here.

llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll, line 576:
More folding here.

llvm/test/CodeGen/AMDGPU/ssubsat.ll, line 616:
Small win here.

arsenm accepted this revision. Nov 23 2021, 3:03 PM
This revision is now accepted and ready to land. Nov 23 2021, 3:03 PM
This revision was automatically updated to reflect the committed changes.