This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: keep track of modifiers when converting v_mac to v_mad
Closed, Public

Authored by hakzsam on Mar 7 2017, 1:38 PM.

Details

Summary

Since v_max_f32_e64/v_max_f16_e64 can be folded if the target
instruction supports the clamp bit, we also need to maintain
modifiers when converting v_mac to v_mad.

This fixes a rendering issue with Dirt Rally because a v_mac
instruction with the clamp bit set was converted to a v_mad
but that bit was lost during the conversion.

Fixes: e184e01dd79 ("AMDGPU: Fold FP clamp as modifier bit")
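
The failure mode described above can be illustrated with a toy model. This is only a sketch, not the code in SIInstrInfo.cpp: `ToyInst`, `ToyOpcode`, `convertDroppingModifiers` and `convertKeepingModifiers` are hypothetical names. It assumes the pre-patch conversion emitted zeroed clamp/omod operands on the new v_mad.

```cpp
#include <cstdint>

// Hypothetical toy model of a VALU instruction; not an LLVM type.
enum class ToyOpcode { Mac, Mad };

struct ToyInst {
  ToyOpcode Opcode;
  int64_t Clamp; // output clamp modifier immediate (0 = off, 1 = on)
  int64_t OMod;  // output modifier immediate (multi-valued, not a flag)
};

// Assumed pre-patch behaviour: the mad is built with hard-coded zero
// modifiers, so a clamp folded into the mac is silently dropped.
inline ToyInst convertDroppingModifiers(const ToyInst &Mac) {
  (void)Mac; // the source modifiers are never read
  return ToyInst{ToyOpcode::Mad, /*Clamp=*/0, /*OMod=*/0};
}

// Intended behaviour: copy the modifier immediates from the mac.
inline ToyInst convertKeepingModifiers(const ToyInst &Mac) {
  return ToyInst{ToyOpcode::Mad, Mac.Clamp, Mac.OMod};
}
```

Given a `ToyInst` with `Clamp` set to 1, the first helper returns a mad with `Clamp` equal to 0, while the second keeps it at 1.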

Event Timeline

hakzsam created this revision. Mar 7 2017, 1:39 PM
arsenm edited edge metadata. Mar 7 2017, 3:44 PM

Needs tests

lib/Target/AMDGPU/SIInstrInfo.cpp
1779

hasModifiersSet won't preserve the value; you need to keep the whole operand. These aren't simple booleans.

1785–1786

This should just add the value. omod is not a simple boolean, so that is also broken.
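
To make the point about non-boolean modifiers concrete, here is a minimal sketch. The helper names are hypothetical stand-ins, not LLVM APIs, and the omod encoding in the comments (0 = none, 1 = multiply by 2, 2 = multiply by 4, 3 = divide by 2) is an assumption made for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for an "is any modifier set?" style query.
inline bool toyHasModifiersSet(int64_t Imm) { return Imm != 0; }

// Lossy forwarding: the immediate is squashed to 0 or 1, so an omod of
// 2 or 3 would be re-emitted as 1.
inline int64_t forwardAsBool(int64_t Imm) {
  return toyHasModifiersSet(Imm) ? 1 : 0;
}

// Lossless forwarding: the immediate is passed through unchanged.
inline int64_t forwardAsImm(int64_t Imm) { return Imm; }
```

Under the assumed encoding, `forwardAsBool(2)` yields 1 (multiply by 2 instead of multiply by 4), whereas `forwardAsImm(2)` yields 2.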

arsenm added inline comments. Mar 7 2017, 3:45 PM
lib/Target/AMDGPU/SIInstrInfo.cpp
1783

This is also unnecessary: there is no src2_modifiers for mac, so this can keep adding 0.

hakzsam updated this revision to Diff 91379. Mar 10 2017, 10:54 AM

v2:

  • preserve the value by using getNamedOperand()->getImm() instead
  • add v_clamp_mac_to_mad test in clamp-modifier.ll
  • add v_omod_mac_to_mad test in omod.ll

arsenm accepted this revision. Mar 10 2017, 11:29 AM

LGTM

This revision is now accepted and ready to land. Mar 10 2017, 11:29 AM
arsenm closed this revision. Mar 10 2017, 9:52 PM

r297556