Page MenuHomePhabricator

[X86] Lowering PACK*S (pack with saturation) intrinsics to native IR (clang side)

Authored by mike.dvoretsky on Apr 17 2018, 7:13 AM.



This patch lowers the X86 vector packing with saturation intrinsics to native LLVM IR. Comes with an LLVM patch (D45721).

Diff Detail

Event Timeline

mike.dvoretsky created this revision.Apr 17 2018, 7:13 AM
craig.topper added inline comments.Apr 17 2018, 10:06 AM
8420 ↗(On Diff #142768)

Why can't these just be APInts instead of uint64_t? Is this so that APInt widths don't have to match RTy below? I'd rather you just created the narrow APInt and then called sext/zext on it to get it to the right width.

8420 ↗(On Diff #142768)

Pre-select the 8 or 16 based on IsDW. Then you don't need to check IsDW 4 times. You just need to pass the right width.

8432 ↗(On Diff #142768)

Clearing isn't necessary if you just created it.

8433 ↗(On Diff #142768)

This loop could probably use some comments. The multiple variables make the logic hard to follow

8443 ↗(On Diff #142768)

Why arent' these unsigned compares for Unsigned?

8446 ↗(On Diff #142768)

If you have the 8 or 16 selected above, you can use getIntNTy here I think.

mike.dvoretsky added inline comments.Apr 18 2018, 1:15 AM
8443 ↗(On Diff #142768)

The compares are signed on purpose. PACKUS assumes that the input elements are signed, then uses unsigned saturation. So, for instance, an 0xffff value must be evaluated as -1 and saturated to 0, rather than to 0xff as it would be with unsigned comparisons.

Updated per comments.

This revision is now accepted and ready to land.Apr 18 2018, 2:49 PM

Changed the shuffle mask emission code to match D45721.

mike.dvoretsky abandoned this revision.Jun 5 2018, 5:17 AM

Closing this due to failure of D45721.