This is an archive of the discontinued LLVM Phabricator instance.

[X86] Lowering PACK*S (pack with saturation) intrinsics to native IR (clang side)
AbandonedPublic

Authored by mike.dvoretsky on Apr 17 2018, 7:13 AM.

Details

Summary

This patch lowers the X86 vector packing with saturation intrinsics to native LLVM IR. Comes with an LLVM patch (D45721).

Diff Detail

Repository
rC Clang

Event Timeline

mike.dvoretsky created this revision.Apr 17 2018, 7:13 AM
craig.topper added inline comments.Apr 17 2018, 10:06 AM
lib/CodeGen/CGBuiltin.cpp
8420

Why can't these just be APInts instead of uint64_t? Is this so that APInt widths don't have to match RTy below? I'd rather you just created the narrow APInt and then called sext/zext on it to get it to the right width.

8420

Pre-select the 8 or 16 based on IsDW. Then you don't need to check IsDW 4 times. You just need to pass the right width.

8432

Clearing isn't necessary if you just created it.

8433

This loop could probably use some comments. The multiple variables make the logic hard to follow

8443

Why arent' these unsigned compares for Unsigned?

8446

If you have the 8 or 16 selected above, you can use getIntNTy here I think.

mike.dvoretsky added inline comments.Apr 18 2018, 1:15 AM
lib/CodeGen/CGBuiltin.cpp
8443

The compares are signed on purpose. PACKUS assumes that the input elements are signed, then uses unsigned saturation. So, for instance, an 0xffff value must be evaluated as -1 and saturated to 0, rather than to 0xff as it would be with unsigned comparisons.

Updated per comments.

This revision is now accepted and ready to land.Apr 18 2018, 2:49 PM

Changed the shuffle mask emission code to match D45721.

mike.dvoretsky abandoned this revision.Jun 5 2018, 5:17 AM

Closing this due to failure of D45721.