This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add a custom legalization for (i16 (bitcast v16i1)) and (i32 (bitcast v32i1)) without AVX512 to prevent scalarization
ClosedPublic

Authored by craig.topper on Feb 21 2018, 1:56 PM.

Details

Summary

We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.

This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.
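To illustrate why a sign extend followed by MOVMSK yields the same value as the bitcast, here is a minimal scalar model in C++. It is not the code from this patch: `movmskModel` and `bitcastV16I1` are hypothetical names, and the loop only imitates what the SSE2 PMOVMSKB instruction does in hardware (gather the top bit of each of 16 bytes into bits 0..15).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of PMOVMSKB: bit I of the result is the
// sign bit of byte I of the input vector.
inline std::uint16_t movmskModel(const std::array<std::int8_t, 16> &Bytes) {
  std::uint16_t Mask = 0;
  for (std::size_t I = 0; I < Bytes.size(); ++I)
    if (Bytes[I] < 0) // sign bit set
      Mask = static_cast<std::uint16_t>(Mask | (1u << I));
  return Mask;
}

// Model of (i16 (bitcast v16i1)) computed as sign extend + MOVMSK.
inline std::uint16_t bitcastV16I1(const std::array<bool, 16> &Bits) {
  std::array<std::int8_t, 16> Ext{};
  for (std::size_t I = 0; I < Bits.size(); ++I)
    Ext[I] = Bits[I] ? std::int8_t{-1} : std::int8_t{0}; // sext i1 -> i8
  return movmskModel(Ext);
}
```

Because sign extension copies each i1 into the top bit of its byte, element I of the v16i1 ends up as bit I of the i16, which is what the bitcast requires, without any per-element extracts.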

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Feb 21 2018, 1:56 PM

Fix the attribute in one more place

Grr.. Mixed up my reviews.

So there's no way to extend combineBitcastvxi1 to split wider types like we do for AVX/PSUBUS/PMADDWD?

It could probably be done. This was just easier to implement because I didn't have to think about all the extends, shifts, and ors that would need to be done for that.

And maybe I just want the type legalizer to be my friend.

RKSimon accepted this revision. Feb 26 2018, 4:24 AM

LGTM - ideally it'd be great to pull out this 'MOVMSK' concatenation code into a helper that we can reuse.
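As a rough sketch of what such a concatenation helper computes, assuming a v32i1 is handled as two 16-element halves that each produce a 16-bit MOVMSK result: the low half supplies bits 0..15 and the high half is shifted into bits 16..31. The name `concatMovmsk` is hypothetical and this is plain scalar C++, not the SelectionDAG code from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: combine the MOVMSK results of the low and high
// 16-element halves into one 32-bit mask using a shift and an or.
inline std::uint32_t concatMovmsk(std::uint16_t Lo, std::uint16_t Hi) {
  return static_cast<std::uint32_t>(Lo) |
         (static_cast<std::uint32_t>(Hi) << 16);
}
```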

This revision is now accepted and ready to land. Feb 26 2018, 4:24 AM
This revision was automatically updated to reflect the committed changes.