This is an archive of the discontinued LLVM Phabricator instance.

[X86] Enable shuffle combining for AVX512 unless the root is used by a vselect
ClosedPublic

Authored by craig.topper on Apr 10 2020, 8:13 PM.

Details

Summary

A lot of vectorized code doesn't use masks so we shouldn't penalize them by not doing shuffle combining on avx512 targets.

I've added support for VALIGNQ/VALIGND and 512-bit SHUF128 to prevent some regressions. I also prevented recombining 256-bit SHUF128 to PERM2X128. We may not need to add 256-bit SHUF128 support, but I don't think I found any cases requiring that in my testing.

Diff Detail

Event Timeline

craig.topper created this revision.Apr 10 2020, 8:13 PM

Thanks for doing this - I've been putting it off for a long time!

llvm/lib/Target/X86/X86ISelLowering.cpp
33845

// Attempt to match against VALIGN element rotate.

Also - why not merge the 32/64 bit checks?

33849

Use the isAnyZero helper? (It's recent so isn't used everywhere it could be yet).

34071

Do we need to peek through bitcasts at all?

34163

It might be better to use BaseMask directly - its already dealt with the widening so the mask matching should be easier?

llvm/test/CodeGen/X86/vector-shuffle-combining-avx512bw.ll
230

Maybe rename the test? combine_vpermi2var_32i16_as_rotate ? If you want to you could adjust the masks so this (or a new test) still combines to pshufb.

craig.topper marked 2 inline comments as done.Apr 11 2020, 1:32 PM
craig.topper added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
34071

Maybe, but then element count checks at some of the uses of IsMaskedShuffle would also need to be adjusted? Since NumRootElts wouldn't reflect the bitcast.

34163

That would simplify things if we were looking for a 256-bit wide "element". But we're only looking for repeated lane so there's no guarantee the mask can widened to 2 elements.

Address comments

Herald added a project: Restricted Project. · View Herald TranscriptApr 11 2020, 1:33 PM
RKSimon accepted this revision.Apr 11 2020, 2:34 PM

LGTM

This revision is now accepted and ready to land.Apr 11 2020, 2:34 PM
This revision was automatically updated to reflect the committed changes.