This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX512] Enable variable shuffle combining by default on AVX512 targets
AbandonedPublic

Authored by RKSimon on Dec 20 2017, 5:05 AM.

Details

Summary

As discussed on D41323.

I've avoided binding the FeatureFastVariableShuffle feature to FeatureAVX512 directly, and just made it part of combineX86ShuffleChain.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon added a subscriber: gadi.haber.

@gadi.haber This change means that KNL will be more aggressive with shuffle combining as well - is that OK?

zvi added a comment.Dec 21 2017, 3:46 AM

@gadi.haber This change means that KNL will be more aggressive with shuffle combining as well - is that OK?

See my reply here, https://reviews.llvm.org/D41323#961494. I don't know of any reason why applying this for KNL would not be beneficial. @gadi.haber, do you have any means of getting an answer to this?

craig.topper added inline comments.Dec 21 2017, 2:08 PM
lib/Target/X86/X86ISelLowering.cpp
28592

Should we just do the OR in the implementation of hasFastVariableShuffle()? That way if we use this method in other places they'll have consistent behavior.

RKSimon added inline comments.Dec 24 2017, 5:00 AM
lib/Target/X86/X86ISelLowering.cpp
28592

I'm happy to do that if the AVX512 gurus agree, or if they want to go all the way and make FeatureAVX512 inherit FeatureFastVariableShuffle - but AFAICT we don't tend to include the fast/slow 'characteristic' features like that with 'hardware' features.

zvi added a comment.Jan 7 2018, 11:37 AM

Still trying to get hold of a KNL expert that will answer whether KNL should be included. Can we for now conservatively assume no and exclude KNL from this patch just so this patch can make progress? I want to follow-up on updating the AVX2 tests with FastVariableShuffle configurations.

In D41436#969481, @zvi wrote:

Still trying to get hold of a KNL expert that will answer whether KNL should be included. Can we for now conservatively assume no and exclude KNL from this patch just so this patch can make progress? I want to follow-up on updating the AVX2 tests with FastVariableShuffle configurations.

Isn't that what we have already? Skylake etc all have FeatureFastVariableShuffle enabled, the issue with this patch was whether we should enable it for the avx512 attribute and not just on a per-cpu basis.

zvi added a comment.Jan 7 2018, 1:22 PM
In D41436#969481, @zvi wrote:

Still trying to get hold of a KNL expert that will answer whether KNL should be included. Can we for now conservatively assume no and exclude KNL from this patch just so this patch can make progress? I want to follow-up on updating the AVX2 tests with FastVariableShuffle configurations.

Isn't that what we have already? Skylake etc all have FeatureFastVariableShuffle enabled, the issue with this patch was whether we should enable it for the avx512 attribute and not just on a per-cpu basis.

Yes, you are right. Then nothing prevents me from updating the tests...

RKSimon added a subscriber: hfinkel.
In D41436#969481, @zvi wrote:

Still trying to get hold of a KNL expert that will answer whether KNL should be included. Can we for now conservatively assume no and exclude KNL from this patch just so this patch can make progress? I want to follow-up on updating the AVX2 tests with FastVariableShuffle configurations.

@hfinkel IIRC you've been using KNL, should we be trying to combine to a variable shuffle to combine 2 shuffles or more?

RKSimon added a subscriber: pcordes.

Adding @pcordes who had some comments about SLOW/FAST variable shuffle mask operation for KNL on D50074

RKSimon abandoned this revision.Jun 23 2020, 10:45 AM

Abandoning old patch - we've never formed a consistent picture of fast/slow variable shuffles for KNL vs SKX families