Page MenuHomePhabricator

[X86] Another attempt at support prefer-vector-width function attribute
ClosedPublic

Authored by craig.topper on Jan 9 2018, 6:02 PM.

Details

Summary

This is another attempt at supporting prefer-vector-width attribute in the X86 backend. Some of this is based on a conversation I had with Eric Christopher on IRC. This is a subset of the functionality in D41341.

This patch passes the prefer vector width attribute into the X86Subtarget constructor so that we can keep it as a numeric value on the Subtarget and not just a bool flag for each width.

I've add a subtarget feature to specify a preference that should eventually be set by default for skx, cannonlake, and icelake. For now it just makes a convenient command line hook to see how the same function would be generated with and without the preference. Something that can't be done with the prefer-vector-width function attribute.

I've qualifed all known places in the X86 backend that turn 128/256-bit vectors into 512-bit vectors with the preference. Some places will only obey the preference if hasVLX is enabled. Without VLX support there sometimes isn't a way to support the operation, particularly when the operation involves a mask register.

This does not include any changes to constrain the type legalizer which we will probably still need. Hopefully the X86ISelLowering changes in this patch will mean the legalizer change only touches the X86ISelLowering constructor to qualify the setOperationAction calls.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Jan 9 2018, 6:02 PM

Add canExtendTo512DQ and canExtendTo512BW helper methods to encapsulate hasAVX512/hasBWI and prefer vector width check into one place.

Always check VLX before checking prefer vector width. This means KNL/KNM won't honor prefer vector width in codegen, but they aren't know to have a performance problem like SKX. This reduces the number of behavior variations.

Improve testing to add command lines with/without prefer vector width and mixed with command lines without vlx to ensure the vector width preference is ignored without vlx.

RKSimon added inline comments.Jan 17 2018, 1:38 PM
lib/Target/X86/X86ISelLowering.cpp
16580 ↗(On Diff #129323)

Is this right? It isn't the default option for AVX512 anymore.

18558 ↗(On Diff #129323)

We have enough of these that we probbaly need a helper function, we already have similar for unary/binary int-256 cases on AVX1.

lib/Target/X86/X86Subtarget.cpp
260 ↗(On Diff #129323)

Do we need to assert for a sane value here (else in X86TargetMachine.cpp)?

craig.topper added inline comments.Jan 17 2018, 1:46 PM
lib/Target/X86/X86ISelLowering.cpp
16580 ↗(On Diff #129323)

I need to rebase this to use ISD::TRUNCATE is that what you meant?

lib/Target/X86/X86Subtarget.cpp
260 ↗(On Diff #129323)

What do you consider a sane value? There's no bounds checking on the attribute coming in.

Rebase to fix the outdated trunc code. Add helper for splitting and extending v16i1 to v16i16/v16i8

RKSimon added inline comments.Jan 18 2018, 12:25 PM
lib/Target/X86/X86Subtarget.cpp
260 ↗(On Diff #129323)

assert(IsPowerOf2(PreferVectorWidthOverride) && PreferVectorWidthOverride >= 128 )?

craig.topper added inline comments.Jan 18 2018, 12:49 PM
lib/Target/X86/X86Subtarget.cpp
260 ↗(On Diff #129323)

It comes from an unchecked command line argument passed to clang that was intended to be target independent. There's no guarantee of any particular value or range.

echristo accepted this revision.Jan 18 2018, 2:45 PM

In general some of the code generation is a bit confusing and not directly extensible to, say "prefer 128 bits", or even feels a bit awkward around what ISA to use rather than what vector length to do our computation in.

That said, making you fix all fo that before would be ridiculous :)

LGTM with the resolution of a couple of the inline comments. I don't feel strongly about them, but would be good to have something.

-eric

lib/Target/X86/X86ISelLowering.cpp
16576–16578 ↗(On Diff #130271)

This comment could probably use some clarification: i.e. we're going to want to use blah blah blah.

Also how is this going to work for a preferred 128 bit vector?

lib/Target/X86/X86TargetMachine.cpp
263 ↗(On Diff #130271)

Bikeshed: "preferred-vector-width"?

This revision is now accepted and ready to land.Jan 18 2018, 2:45 PM
craig.topper added inline comments.Jan 19 2018, 3:09 PM
lib/Target/X86/X86ISelLowering.cpp
16576–16578 ↗(On Diff #130271)

128 bit vectors here would probably have to split several times using shuffles to move higher elements to the lower elements, then sign_extend_vector_inreg, testd, and concat the results.

lib/Target/X86/X86TargetMachine.cpp
263 ↗(On Diff #130271)

It got named this way because gcc implemented their command line option as -mprefer-vector-width= so then I matched that in clang and kept the attribute name matching the command line option.

@craig.topper

LLVM currently prefers 256-bit operations even on recent Intel processors.
https://godbolt.org/z/dffKPxo53

Is there a reason why there are no TuningPrefer512Bit yet (source). Should it be added?

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 5 2022, 6:57 AM

I raised this on D111029 - its still not confirmed which avx512 capable targets work better with 512-bit vectors (other than KNL of course).

I raised this on D111029 - its still not confirmed which avx512 capable targets work better with 512-bit vectors (other than KNL of course).

Thx for the pointer!