Detailed description: We currently have a set of patterns that select ISD::FNEG and ISD::FABS to bitwise operations. They need to be predicated so that the VALU or SALU variant of the bitwise operation is selected according to the SDNode divergence bit.
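
For readers less familiar with why FNEG and FABS can be selected to bitwise operations at all, here is a minimal C++ sketch of the underlying bit-level identity. It assumes `float` is IEEE-754 binary32; the helper names (`bitsOf`, `floatOf`, `fnegViaXor`, `fabsViaAnd`) are hypothetical and are not taken from the patch, which expresses the same idea as TableGen selection patterns.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: float is IEEE-754 binary32, so bit 31 is the sign bit.
constexpr std::uint32_t SignMask32 = 0x80000000u;

// Reinterpret a float as its 32-bit pattern (memcpy avoids aliasing issues).
inline std::uint32_t bitsOf(float F) {
  std::uint32_t B;
  std::memcpy(&B, &F, sizeof(B));
  return B;
}

// Reinterpret a 32-bit pattern as a float.
inline float floatOf(std::uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof(F));
  return F;
}

// fneg: flip the sign bit with XOR.
inline float fnegViaXor(float F) { return floatOf(bitsOf(F) ^ SignMask32); }

// fabs: clear the sign bit with AND.
inline float fabsViaAnd(float F) { return floatOf(bitsOf(F) & ~SignMask32); }
```

Because both operations reduce to a single XOR or AND with a constant, either a scalar or a vector bitwise instruction can implement them, which is why the choice can be driven by the divergence bit.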
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1509 | I think all of that should use VOP3 forms. That way you will avoid the copy from SGPR to VGPR which appears in your test. |
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1509 | The VOP2 form allows an SGPR as the first operand, and the patterns are written to take advantage of that. The sequence `t32: f16,ch = load ...`; `t33: f16 = fneg t32` is legalized to `t42: i32 = and t40, Constant:i32<65535>`; `t38: f32 = fp16_to_fp t42`; `t39: f32 = fneg t38`; `t45: i32 = fp_to_fp16 t39`. That, in turn, gets combined to `t47: i32,ch = load ...`; `t49: i32 = xor t47, Constant:i32<32768>`. The problem here is that whatever operand order we use in the combiner, SelectionDAG::getNode canonicalizes it so that the constant is the RHS, so we always get `xor t47, Constant:i32<32768>`. For fp16-capable subtargets the explicit pattern is used and there is no SGPR-to-VGPR COPY. For now, I am going to update the test to check fp16 with the gfx900 subtarget. |
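
The combined form in the comment above can be illustrated with a short C++ sketch. It assumes the f16 value is an IEEE-754 binary16 bit pattern held in the low 16 bits of a 32-bit integer, as in the quoted DAG; the names `LowHalfMask`, `HalfSignBit`, and `fnegHalfBits` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Constants as they appear in the quoted DAG nodes.
constexpr std::uint32_t LowHalfMask = 65535u; // Constant:i32<65535>
constexpr std::uint32_t HalfSignBit = 32768u; // Constant:i32<32768>, bit 15

// Negate an f16 whose bits live in the low half of an i32:
// the whole fp16_to_fp / fneg / fp_to_fp16 chain collapses to one XOR.
inline std::uint32_t fnegHalfBits(std::uint32_t Loaded) {
  return Loaded ^ HalfSignBit;
}

// XOR is commutative, so placing the constant on the RHS
// does not change the computed value.
inline bool sameEitherOrder(std::uint32_t X) {
  return (X ^ HalfSignBit) == (HalfSignBit ^ X);
}
```

For example, `0x3C00` is the binary16 encoding of 1.0, and `fnegHalfBits(0x3C00)` yields `0xBC00`, the encoding of -1.0. The operand order matters only for instruction selection (which operand may be an SGPR), not for the result.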
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1509 | I see. We prefer to use VOP3 form anyway to allow more potential operand variants. It will be shrunk later if possible. But thanks for updating the test. |
GlobalISel tests were updated to make them really auto-generatable.
update_mir_test_checks.py doesn't work if the prefixes in different RUN lines are the same.