This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Add support for X86ISD::PACKSS to ComputeNumSignBitsForTargetNode
ClosedPublic

Authored by RKSimon on Sep 11 2017, 4:10 AM.

Details

Summary

Helps improve combineLogicBlendIntoPBLENDV support by allowing us to peek through PACKSS truncations of vector comparison results.
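To illustrate why sign-bit information can survive a PACKSS, here is a minimal scalar sketch in C++. It is not the code from this patch: the helper names (`packss16to8`, `numSignBits`, `packssSignBitsLowerBound`) are hypothetical, and the bound shown is an assumption based on how signed saturation behaves. If a source element has more sign bits than the number of bits being dropped, the pack acts as a plain truncation and the surplus sign bits are preserved. Otherwise only one sign bit can be guaranteed.

```cpp
#include <cassert>
#include <cstdint>

// Signed saturation of a 16-bit element to 8 bits (scalar model of one
// PACKSSWB lane). Hypothetical helper, for illustration only.
inline int8_t packss16to8(int16_t v) {
    if (v > INT8_MAX) return INT8_MAX;
    if (v < INT8_MIN) return INT8_MIN;
    return static_cast<int8_t>(v);
}

// Counts how many of the top bits of a `bits`-wide value equal its sign bit
// (the sign bit itself is included, so the result is at least 1).
inline unsigned numSignBits(int32_t v, unsigned bits) {
    const uint32_t u = static_cast<uint32_t>(v);
    const unsigned sign = (u >> (bits - 1)) & 1u;
    unsigned n = 1;
    for (int bit = static_cast<int>(bits) - 2; bit >= 0; --bit) {
        if (((u >> bit) & 1u) != sign) break;
        ++n;
    }
    return n;
}

// Assumed lower bound on the sign bits of the packed result, given the
// number of sign bits known for the source element.
inline unsigned packssSignBitsLowerBound(unsigned srcSignBits,
                                         unsigned srcBits,
                                         unsigned dstBits) {
    assert(dstBits <= srcBits);
    const unsigned dropped = srcBits - dstBits;
    return srcSignBits > dropped ? srcSignBits - dropped : 1;
}
```

For a vector comparison result every element is all-ones or all-zeros, so a 16-bit source element has 16 sign bits and the packed 8-bit element keeps all 8, which is the property a blend-by-sign-bit combine needs.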

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Sep 11 2017, 4:10 AM
delena edited edge metadata. Sep 11 2017, 4:44 AM

I want to suggest a test case without an intrinsic:
define <16 x i8> @vselect_packss_opt(<16 x i16> %a0, <16 x i16> %a1, <16 x i8> %a2, <16 x i8> %a3) {
  %1 = icmp eq <16 x i16> %a0, %a1
  %x2 = sext <16 x i1> %1 to <16 x i8>
  %x6 = and <16 x i8> %x2, %a2
  %x7 = xor <16 x i8> %x2, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
  %x8 = and <16 x i8> %x7, %a3
  %x9 = or <16 x i8> %x6, %x8
  ret <16 x i8> %x9
}
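The and/xor/and/or sequence in this test is a bitwise select. As a rough per-byte illustration in C++ (hypothetical helper names, not taken from the patch), the sketch below compares that logic form against a select that looks only at the top bit of the mask byte, which is how a PBLENDVB-style blend picks its lanes. The two agree whenever the mask byte is all-ones or all-zeros, and can differ otherwise.

```cpp
#include <cstdint>

// Bitwise select as written in the IR above: (mask & a) | (~mask & b).
inline uint8_t logicSelect(uint8_t mask, uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((mask & a) | (~mask & b));
}

// Sign-bit select: only bit 7 of the mask byte decides which input is used.
// This is a scalar model of one blend lane, for illustration only.
inline uint8_t signBitSelect(uint8_t mask, uint8_t a, uint8_t b) {
    return (mask & 0x80u) ? a : b;
}
```

With a mask of 0x80 and inputs 0x0F and 0x00, the logic form yields 0x00 while the sign-bit form yields 0x0F, so a rewrite into a blend is only safe once every mask element is known to consist entirely of sign bits.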

RKSimon updated this revision to Diff 114574. Sep 11 2017, 5:27 AM

Added general test cases (not directly calling PACKSS intrinsic).

I added v16i32/v16i64 cases, although these need some further work before combineLogicBlendIntoPBLENDV can handle them.

> I want to suggest a test case without an intrinsic:

Done - I added vselect_packss_v16i32 / vselect_packss_v16i64 cases as well. Interestingly, avx512vl gets caught in an infinite loop on vselect_packss_v16i32, both with and without this patch - I'll raise a bug.

delena accepted this revision. Sep 11 2017, 6:06 AM
This revision is now accepted and ready to land. Sep 11 2017, 6:06 AM
This revision was automatically updated to reflect the committed changes.