This patch lowers the X86 vector packing with saturation intrinsics to native LLVM IR. Comes with an LLVM patch (D45721).
Details
Diff Detail
Event Timeline
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
8420 ↗ | (On Diff #142768) | Why can't these just be APInts instead of uint64_t? Is this so that APInt widths don't have to match RTy below? I'd rather you just created the narrow APInt and then called sext/zext on it to get it to the right width. |
8420 ↗ | (On Diff #142768) | Pre-select the 8 or 16 based on IsDW. Then you don't need to check IsDW 4 times. You just need to pass the right width. |
8432 ↗ | (On Diff #142768) | Clearing isn't necessary if you just created it. |
8433 ↗ | (On Diff #142768) | This loop could probably use some comments. The multiple variables make the logic hard to follow |
8443 ↗ | (On Diff #142768) | Why arent' these unsigned compares for Unsigned? |
8446 ↗ | (On Diff #142768) | If you have the 8 or 16 selected above, you can use getIntNTy here I think. |
lib/CodeGen/CGBuiltin.cpp | ||
---|---|---|
8443 ↗ | (On Diff #142768) | The compares are signed on purpose. PACKUS assumes that the input elements are signed, then uses unsigned saturation. So, for instance, an 0xffff value must be evaluated as -1 and saturated to 0, rather than to 0xff as it would be with unsigned comparisons. |