Page MenuHomePhabricator

[X86] Make v32i16/v64i8 legal types without avx512bw. Use custom splitting instead.

Authored by craig.topper on Mar 15 2020, 11:04 PM.



This moves v32i16/v64i8 to a model more consistent with how we
treat integer types with avx1.

This does change the ABI for types vXi16/vXi8 vectors larger than
512 bits to pass in multiple zmms instead of multiple ymms. We'd
already hacked some code to make v64i8/v32i16 pass in zmm.

Cost model is still a bit of a mess. In some place I tried to
match existing behavior. But really we need to account for
splitting and concating costs. Cost model for shuffles is
especially pessimistic. This has an big effect on reductions since
the generic lowering uses PermuteSingleSrc. But reduction uses a
very specific pattern that can handled by subvector extracts and
shifts, but the default handling doesn't know that.

Diff Detail

Event Timeline

craig.topper created this revision.Mar 15 2020, 11:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 15 2020, 11:04 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
craig.topper marked an inline comment as done.Mar 15 2020, 11:08 PM
craig.topper added inline comments.

This is a bit of a hack, but it was easier than trying to hunt down all the places that can create broadcasts in lowering. I couldn't do this with a isel pattern for the broadcast_load case. So I just handled both here.

rebase on top of D76478?

RKSimon added inline comments.Apr 13 2020, 10:31 AM

Maybe update LowerVectorIntUnary so we can use it here? We just need to relax the requirement that the src/dst vector element types are equal which doesn't seem to really necessary.




Maybe pull out a common LowerVectorIntVSETCC and have Lower256IntVSETCC/Lower512IntVSETCC wrap it with value type assertions?



craig.topper marked an inline comment as done.

-Rebase and unify VSETCC spliting. I didn't both with the wrapper function since its just called from one place for 256 and one for 512 which are already VT checks. Didn't seem worth asserting so I just named it splitIntVSETCC.

craig.topper added inline comments.Apr 13 2020, 3:53 PM

I think that requires us to pass the desired result VT to it so we need to update all call sites. I'm happy to do that, but wanted to make sure that's what you want.

RKSimon added inline comments.Apr 14 2020, 6:31 AM

Yeah, for *_EXTEND_VECTOR_INREG cases thats going to a pain - lets not bother.


For this case we know the #elements will match - its just the dst/src element types that differ, so a more minor adjustment to LowerVectorIntUnary should still work?

Rebase on top of D78176 which makes the unary splitter support different source and destination types

Upload full patch. The previous one was missing context.

RKSimon accepted this revision.Apr 15 2020, 9:35 AM

LGTM with a couple of minors


Add something for the next release notes announcing this is going away?


Someday we'll be consistent with indentation/padding for these things :-) Which do you prefer - readability or clang-format friendliness?

This revision is now accepted and ready to land.Apr 15 2020, 9:35 AM
This revision was automatically updated to reflect the committed changes.