Note that tests for DOT opcodes have not been updated. DPP variants of DOT instructions seem to have an incorrect encoding.
Diff Detail
Event Timeline
llvm/test/MC/AMDGPU/gfx11_asm_vop3p_dpp16.s | ||
---|---|---|
8 | I agree something is weird here with v_dot2_f32_f16 and v_dot2_f32_bf16, I will try to clarify the desired behavior. | |
20 | I'm not sure where you're getting op_sel_hi=7 for v_fma_mix* instructions. From my reading of the spec and comparison with sp3, all values of op_sel and op_sel_hi are supported. The behavior is non-standard for mix opcodes, but the values should be accepted. |
llvm/test/MC/AMDGPU/gfx11_asm_vop3p_dpp16.s | ||
---|---|---|
20 | I remember that SPG stated the following at some point: When using DPP with VOP3/VOP3P, the OPSEL must be set such that the low result only uses low inputs, and the high result only uses high inputs. I always thought that this was a suspicious statement, but see the original tests gfx11_asm_vop3p_dpp8.s below, they stated essentially the same: // For test purpose only. OP_SEL has to be set to all 0 and OP_SEL_HI has to be set to all 1 | |
llvm/test/MC/AMDGPU/gfx11_asm_vop3p_dpp8.s | ||
13 | Here is the original comment. |
Remove FIXME note regarding use of op_sel for fma_mix instructions. I assume op_sel is legal in this context.
llvm/test/MC/AMDGPU/gfx11_asm_vop3p_dpp16.s | ||
---|---|---|
20 | Oh I see. In my previous reply I was just looking at the rules for vop3p by mistake. The existing comment was actually correct. There is a special case for vop3p with dpp. When using DPP with VOP3P, OPSEL has to be set to all 0 and OPSEL_HI has to be set to all 1 And in GCNDPPCombine we actually check those rules. So Codegen is doing the right thing. Sorry for the noise. |
When using DPP with VOP3P, OPSEL has to be set to all 0 and OPSEL_HI has to be set to all 1
Does it mean that DPP variants of fma_mix opcodes are encoded incorrectly? Both llvm and SP3 assemblers do not set op_sel_hi to all 1:
v_fma_mix_f32 v255, v1, v2, v3 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0x20,0xcc,0xe9,0x04,0x0e,0x04,0x01,0x77,0x39,0x05] encoding: VOP3P.DPP8 clamp 0 encoding 0xcc fi8 0x01 lane_sel_0 0x07 lane_sel_1 0x06 lane_sel_2 0x05 lane_sel_3 0x04 lane_sel_4 0x03 lane_sel_5 0x02 lane_sel_6 0x01 lane_sel_7 0 neg 0 neg_hi 0 op 0x20 v_fma_mix_f32_e64_dpp op_sel 0 op_sel_hi 0 op_sel_hi_2 0 src0 0x01 src1 0x102 src2 0x103 vdst 0xff
I'll file a ticket on this issue then. I think we should not update tests till the issue is fixed.
It is true that there is no check verifying that desired property in the assembler. Filing an issue makes sense, though adding a check for the property seems low priority.
I agree something is weird here with v_dot2_f32_f16 and v_dot2_f32_bf16, I will try to clarify the desired behavior.