Page MenuHomePhabricator

AMDGPU: Remove broken ternary i16 patterns

Authored by jvesely on Jul 25 2018, 11:55 PM.



Fixup test to check for GCN prefix
These patterns always zero extend the result even though it might need sign extension.
This has been broken since the addition of i16 support.
It has popped up in mad_sat(char) test since min(max()) combination is turned into v_med3, resulting in the following (incorrect) sequence:

v_mad_i16 v2, v10, v9, v11
v_med3_i32 v2, v2, v8, v7

Fixes mad_sat(char) piglit on VI.

Diff Detail


Event Timeline

jvesely created this revision.Jul 25 2018, 11:55 PM
arsenm added inline comments.Jul 26 2018, 1:46 AM

These can be fixed, they don't need to be dropped.

It's also more complicated because since gfx9 there is a control bit for whether the high 16-bits are zeroed or preserved which we haven't implemented

jvesely added inline comments.Jul 26 2018, 10:00 AM

There is no reason to preserve and fix them since they'd need to be the same instructions which end up being used anyway. There is nothing to be gained from having the extension part in the pattern.

the gfx9 part does not apply you can drop zero_extend if v_mad_i16 clears those bits, but that should not be handled in this pattern.

jvesely added inline comments.Jul 26 2018, 12:10 PM

In fact including the extension op here would be damaging. Leaving it outside exposes it to other patterns that could replace or eliminate it.

Can we just have the fix in, and worry about optimizing i16 extends later?

arsenm accepted this revision.Aug 7 2018, 12:56 AM



instnamer on these

This revision is now accepted and ready to land.Aug 7 2018, 12:56 AM
jvesely updated this revision to Diff 159595.Aug 7 2018, 2:14 PM

rename numbered operations

jvesely marked an inline comment as done.Aug 7 2018, 2:15 PM
This revision was automatically updated to reflect the committed changes.