This is an archive of the discontinued LLVM Phabricator instance.

[X86] Lowering PACK*S (pack with saturation) intrinsics to native IR (clang side)
AbandonedPublic

Authored by mike.dvoretsky on Apr 17 2018, 7:13 AM.

Details

Summary

This patch lowers the X86 vector packing with saturation intrinsics to native LLVM IR. Comes with an LLVM patch (D45721).

Diff Detail

Event Timeline

mike.dvoretsky created this revision.Apr 17 2018, 7:13 AM
craig.topper added inline comments.Apr 17 2018, 10:06 AM
lib/CodeGen/CGBuiltin.cpp
8420 ↗(On Diff #142768)

Why can't these just be APInts instead of uint64_t? Is this so that APInt widths don't have to match RTy below? I'd rather you just created the narrow APInt and then called sext/zext on it to get it to the right width.

8420 ↗(On Diff #142768)

Pre-select the 8 or 16 based on IsDW. Then you don't need to check IsDW 4 times. You just need to pass the right width.

8432 ↗(On Diff #142768)

Clearing isn't necessary if you just created it.

8433 ↗(On Diff #142768)

This loop could probably use some comments. The multiple variables make the logic hard to follow

8443 ↗(On Diff #142768)

Why arent' these unsigned compares for Unsigned?

8446 ↗(On Diff #142768)

If you have the 8 or 16 selected above, you can use getIntNTy here I think.

mike.dvoretsky added inline comments.Apr 18 2018, 1:15 AM
lib/CodeGen/CGBuiltin.cpp
8443 ↗(On Diff #142768)

The compares are signed on purpose. PACKUS assumes that the input elements are signed, then uses unsigned saturation. So, for instance, an 0xffff value must be evaluated as -1 and saturated to 0, rather than to 0xff as it would be with unsigned comparisons.

Updated per comments.

This revision is now accepted and ready to land.Apr 18 2018, 2:49 PM

Changed the shuffle mask emission code to match D45721.

mike.dvoretsky abandoned this revision.Jun 5 2018, 5:17 AM

Closing this due to failure of D45721.