This is an archive of the discontinued LLVM Phabricator instance.

[x86] improve codegen for bit-masked vector compare and select (PR46531)
Closed · Public

Authored by spatel on Jul 2 2020, 12:03 PM.

Details

Summary

We canonicalize patterns like:

%s = lshr i32 %a0, 1
%t = trunc i32 %s to i1

to:

%a = and i32 %a0, 2
%c = icmp ne i32 %a, 0

...in IR, but the original bit-shifting sequence may be better for x86 vector codegen.
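As a quick sanity check of why the two forms above are interchangeable, here is a minimal scalar sketch in C++. It mirrors the IR line by line; the function names (`bit1_via_shift`, `bit1_via_mask`, `forms_agree`) are made up for illustration and do not appear in the patch.

```cpp
#include <cstdint>

// Shift form:  %s = lshr i32 %a0, 1
//              %t = trunc i32 %s to i1
// Truncating to i1 keeps only the lowest bit of the shifted value.
bool bit1_via_shift(uint32_t a0) {
    uint32_t s = a0 >> 1;   // lshr i32 %a0, 1
    return (s & 1u) != 0;   // trunc i32 %s to i1
}

// Mask form:   %a = and i32 %a0, 2
//              %c = icmp ne i32 %a, 0
bool bit1_via_mask(uint32_t a0) {
    uint32_t a = a0 & 2u;   // and i32 %a0, 2
    return a != 0;          // icmp ne i32 %a, 0
}

// Hypothetical helper: checks that both forms agree on a few sample inputs.
bool forms_agree() {
    const uint32_t samples[] = {0u, 1u, 2u, 3u, 0x80000000u,
                                0xFFFFFFFDu, 0xFFFFFFFFu};
    for (uint32_t v : samples) {
        if (bit1_via_shift(v) != bit1_via_mask(v)) return false;
    }
    return true;
}
```

Both functions test bit 1 of the input: shifting right by one moves that bit into position 0, while masking with 2 isolates it in place.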

I tried several variants of the transform, and it is tricky to avoid inducing regressions. In particular, I did not find a clean way to handle non-splat constants, so I've left that as a TODO item (negative tests for those cases are included). AVX512 produced some diffs, but they didn't look meaningful, so I left that target out too. Some of the 256-bit AVX1 diffs are questionable, but they are close enough that the difference is probably not meaningful.
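To illustrate the splat versus non-splat distinction, here is a lane-wise C++ sketch of a bit-masked compare and select, written once with a mask test and once with a shift that moves the tested bit into the sign bit. This is an assumption-laden model, not the patch's implementation: the type `V4` and all function names are hypothetical, and each mask lane is assumed to be a nonzero power of two.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;  // hypothetical 4 x i32 vector model

// Reference form: per lane, pick x if (a & mask) != 0, otherwise y.
V4 select_by_mask(const V4& a, const V4& mask, const V4& x, const V4& y) {
    V4 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = ((a[i] & mask[i]) != 0) ? x[i] : y[i];
    return out;
}

// Hypothetical helper: left-shift amount that moves the single set bit of
// pow2 into bit 31. Precondition: pow2 is a nonzero power of two.
unsigned shift_to_sign_bit(uint32_t pow2) {
    unsigned n = 0;
    while ((pow2 & 0x80000000u) == 0) { pow2 <<= 1; ++n; }
    return n;
}

// Shift form: move the tested bit into the sign bit and select on the sign.
V4 select_by_sign(const V4& a, const V4& mask, const V4& x, const V4& y) {
    V4 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        uint32_t shifted = a[i] << shift_to_sign_bit(mask[i]);
        out[i] = ((shifted & 0x80000000u) != 0) ? x[i] : y[i];
    }
    return out;
}

// A splat mask has the same constant in every lane, so a single shift
// amount serves the whole vector; a non-splat mask needs one per lane.
bool is_splat(const V4& mask) {
    for (uint32_t m : mask)
        if (m != mask[0]) return false;
    return true;
}
```

With a splat mask such as `{2, 2, 2, 2}`, every lane uses the same shift amount (30 here). With a non-splat mask such as `{1, 2, 4, 8}`, the shift amounts differ per lane (31, 30, 29, 28), which is the case left as a TODO above.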

Event Timeline

spatel created this revision. Jul 2 2020, 12:03 PM
RKSimon accepted this revision. Jul 3 2020, 7:32 AM

LGTM, cheers - this is OK as a first step, but getting nonuniform cases working is going to be necessary as well.

This revision is now accepted and ready to land. Jul 3 2020, 7:32 AM