This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Fix incorrect sinking of a truncate into the operand of a shift.
ClosedPublic

Authored by andreadb on Sep 1 2016, 1:25 PM.

Details

Summary

This patch fixes a regression introduced by revision 268094.

Revision 268094 added the following dag combine rule:
// trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2

That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2).

The problem is that the constraint on the shift count is incorrect. So, the rule doesn't work well in some cases involving vector types.

Example:

;;
define <8 x i16> @trunc_shift(<8 x i32> %a) {
entry:

%shl = shl <8 x i32> %a, <i32 17, i32 17, i32 17, i32 17, i32 17, i32 17, i32 17>
%conv = trunc <8 x i32> %shl to <8 x i16>
ret <8 x i16> %conv

}
;;

According to the above mentioned rule, it is valid to convert the trunc+shift-by-constant into a shift-by-constant of a truncated value.

(v8i16 (trunc (shl (v8i32 %a, <17,17,17,17,17,17,17,17>)))

-->

(v8i16 (shl  (v8i16 (trunc v8i32 %a)), <17,17,17,17,17,17,17,17>)

The problem is that the new "shl" is undefined (the shift count is bigger than the vector element size). So, the dag combiner would later on replace the shift node with an 'undef' value.

The combine rule should have been written instead like this:

// trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits()

Basically, if K is smaller than the "scalar size in bits" of the truncated value, then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined.

This patch fixes the check on the shift count, and adds a test case to show that we no longer fold the entire computation to undef.

Please let me know if this is okay to commit.

Thanks,
Andrea

Diff Detail

Event Timeline

andreadb updated this revision to Diff 70053.Sep 1 2016, 1:25 PM
andreadb retitled this revision from to [DAGCombiner] Fix incorrect sinking of a truncate into the operand of a shift..
andreadb updated this object.
andreadb added reviewers: arsenm, RKSimon, spatel.
andreadb added a subscriber: llvm-commits.
arsenm edited edge metadata.Sep 1 2016, 2:37 PM

LGTM, but can you add a scalar case with the original test I added for this which hits the new limit of the full scalar bit width behavior instead of / 2

hfinkel accepted this revision.Sep 1 2016, 4:40 PM
hfinkel added a reviewer: hfinkel.
This revision is now accepted and ready to land.Sep 1 2016, 4:40 PM

LGTM, but can you add a scalar case with the original test I added for this which hits the new limit of the full scalar bit width behavior instead of / 2

Thanks for the quick review.
Sure, I will add extra scalar test to reduce-trunc-shl.ll.
I can add tests that are similar to those that you have added in test AMDGPU/shift-i64-opts.ll. I will also add tests for the case where the shift count is equal to the scala size in bits.

Cheers,
Andrea

This revision was automatically updated to reflect the committed changes.