This avoids regressions in a future patch. I'm confused by the use of
the gfx9 usage legacy_mad. Was this a pointless instruction rename, or
uses fmul_legacy handling? Why is regular mac avilable in that case?
Details
Diff Detail
Event Timeline
Looks reasonable to me but I'd prefer if it was reviewed by someone who understands the difference between mad / mad_legacy / mac better than I do.
It was used in the div expansion for some non obvious reason. As far as I know it means 0.0 * x = 0.0, but I am not sure it is needed in this case.
Please run PSDB on this change, if it passes then LGTM.
It’s used in the div expansion because we want it regardless of the denormal mode. My confusion is for the f16 case, where the instruction was renamed or changed for some reason
In GFX8, the opcodes below would write their result as: { 16’h0, result[15:0] }
MAD_F16
MAD_U16
MAD_I16
FMA_F16
DIV_FIXUP_F16
INTERP_P2_F16
GFX9 adds the “OPSEL” field to VOP3 (which was all zeros in GFX8), and the behavior of those instructions changes to: { previous_gpr_value[31:16], result[15:0] } when used with VOP3. VOP1/VOP2 will write zero to unused bits unless SDWA specifies otherwise, and VOP1/VOP2 ops encoded as VOP3 will write zero.
In order to support apps written for GFX8, the opcodes above will be renamed with the word “legacy” added:
MAD_LEGACY_F16
MAD_LEGACY_U16
MAD_LEGACY_I16
FMA_LEGACY_F16
DIV_FIXUP_LEGACY_F16
INTERP_P2_LEGACY_F16
New opcodes with the names of the original opcodes have been created which support this new “preserve” behavior. Also, all new 16-bit opcodes for GFX9 support the new “preserve” behavior.
We don't have tests that run with f16 denormals enabled. We also don't use the f16 intrinsic for div expansion (although I suppose we could in some cases?)
I
I meant disabled. They're always enabled, so we have no useful tests for the case where they're forced off
It's a shame that FMADPat and FMADModsPat take the same arguments but in a different order.