Page MenuHomePhabricator

[X86] Disable 512-bit vectors during type legalization for prefer-vector-width
AbandonedPublic

Authored by craig.topper on Dec 17 2017, 10:31 PM.

Details

Summary

This continues the work started in D41096

This patch adds a new subtarget feature to indicate that there are no 512-bit vectors present in the function. When combined the prefer-avx256 feature, it will disable 512-bit vectors in the legalizer. I intend to set this subtarget feature in getSubtargetImpl when the Function has a function attribute indicating the required vector width for the function is less than 512 bits.

I looked into trying to add a 512-bit register feature that could be disabled instead like D41096 proposed, but I couldn't find a good way to make the existing command line options work. The tablegen generated subtarget feature system just doesn't allow for a wrapper/alias feature that implies other features.

I had to allow VK32 to be legal with BWI and (VLX || 512-bit vectors). VK64 is only legal with BWI and 512-bit vectors. VK64 is only needed when v64i8 is legal.

I know there are still places in the code that extend narrow vectors to 512-bit to do an operation on wider vector elements and truncate back. Those will need to be split before extending instead.

I plan to add some basic tests for this as well and I'll be adding tests as I fix the various extending lowerings mentioned above.

Diff Detail

Event Timeline

Add some tests.

Fix LowerMULH and LowerShift to not extend to 512-bits when its not legal

I looked into trying to add a 512-bit register feature that could be disabled instead like D41096 proposed, but I couldn't find a good way to make the existing command line options work. The tablegen generated subtarget feature system just doesn't allow for a wrapper/alias feature that implies other features.

Thanks for experimenting.

lib/Target/X86/X86Subtarget.h
584

This sits on top of D41096? I thought it would replace it. Do we need to prefer AVX2 if we have AVX-512 without using zmm?

craig.topper added inline comments.Dec 19 2017, 10:42 PM
lib/Target/X86/X86Subtarget.h
584

It's not prefer AVX2. It's prefer 256-bit vectors. The name could be better..

From our previous discussions we still wanted to "prefer 256-bit" even when the user uses 512-bit explicitly unless they pass -mprefer-vector-width=512. And we need the prefer flag to be a property of the affected CPUs. So this flag represents those two things.

We can also use this flag to do targeted fixes to disable extensions to 512-bit when the CPU prefers 256-bit, but we weren't able to disable the legalizer. I think the LowerShift and LowerMULH fixes in this patch might want to be qualified with only "prefer 256-bit" rather than 512-bit types are illegal.

This patch now contains fixes for all of the known issues with lowering trying to use 512-bit types for extending to make operations legal.

I've modified the logic for disable 512-bit types a little. I now only disable 512-bit types if the ABI doesn't require them, the CPU prefers 256-bit, and we have the AVX512VL(the feature that enables masking on 128/256-bit registers) instruction set. This is fine for SKX, but means -mprefer-vector-width=256 will not disable the use of 512-bit regs on KNL as it doesn't have AVX512VL. But KNL doesn't have the frequency issue either so its probably ok. This was mainly done to avoid having to deal with a situatiion where we would support masking on scalar operations, but not being able to doing any masking on vectors. By introducing the VLX requirements before disabling 512-bit vectors we are able support ensure we always have vector+scalar masking and we can continue widening to 512-bits for masking when AVX512VL isn't available.

I've added an experimental pass just befoer isel that detects intrinsics and ABI requirements for needing 512-bit vectors and adds the appropriate require-vector-width attribute based on what it finds. This should hopefully avoid any miscompiles or isel failures for testing this feature. It currently only sets the required width to 256 or 512, but it can be made more generic in the future.

I found all of the lowering fixes by enabling the pass and forcing the prefer-vector-width-256 feature on for all CPUs. Then looking for assertion failures and llvm_unreahables on the X86 codegen lit tests.

I plan to add more directed tests for each of the lowering fixes.

Added test cases for all places where I had to prevent extending to 512-bit to legalize an operation.

Fixed a bug in zext and sext lowering that created a v8i8 node and forced scalarization.

craig.topper retitled this revision from [X86] WIP disable 512-bit vectors during type legalization for prefer-vector-width to [X86] Disable 512-bit vectors during type legalization for prefer-vector-width.Jan 8 2018, 10:24 AM
echristo added inline comments.Jan 8 2018, 5:50 PM
lib/Target/X86/X86Subtarget.h
590

I think I'd rather a preferred-vector-width attribute rather than the combination of 128/256/etc features.

Thoughts?

craig.topper abandoned this revision.Feb 4 2018, 8:30 PM