Page MenuHomePhabricator

[X86] Teach combineBitcastvxi1 to prefer movmsk on avx512 in more cases

Authored by craig.topper on Jun 5 2020, 10:58 PM.



If the input to the bitcast is a sign bit test, it makes sense to
directly use vpmovmskb or vmovmskps/pd. This removes the need to
copy the sign bits to a k-register and then to a GPR.

Fixes PR46200.

Diff Detail

Event Timeline

craig.topper created this revision.Jun 5 2020, 10:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2020, 10:58 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
RKSimon added inline comments.Jun 5 2020, 11:53 PM

I don't recognise what (c/c++) intrinsic this would have come from - is this likely to be a problem? Same for the other *-intrinsics-upgrade.ll cases

craig.topper marked an inline comment as done.Jun 6 2020, 12:16 AM
craig.topper added inline comments.

I believe this is _mm256_movepi8_mask which is the intrinsic for vpmovb2m. The intrinsic itself is defined to return unsigned short which introduced the bitcast. If it got used with another intrinsic that new how to bitcast back those bitcasts would cancel.

But I guess we have a bunch of intrinsics that take scalar integer as arguments for mask and introduce a bitcast to vXi1 in intrinsic lowering. Which is too late to cancel the bitcast that would form MOVMSK. Maybe we need a combine to reverse MOVMSK? I guess I need to do some experiments.

Add a reversing transform for bitcasts that appear during intrinsic lowering. This
required a new test file. The output for that file shown here is the current output
before this patch. No existing tests in tree demonstrated the need for the reversing

RKSimon added inline comments.Jun 11 2020, 6:51 AM

Add a common --check-prefixes=AVX prefix


Add --check-prefixes=AVX,AVX512 common prefixes to KNL/SKX now that they lower to the same code.

RKSimon accepted this revision.Jun 13 2020, 2:19 PM

LGTM - cheers

This revision is now accepted and ready to land.Jun 13 2020, 2:19 PM
This revision was automatically updated to reflect the committed changes.