This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG][X86] Relax restriction on the width of an input to *_EXTEND_VECTOR_INREG. Use them and regular *_EXTEND to replace the X86 specific VSEXT/VZEXT opcodes
ClosedPublic

Authored by craig.topper on Nov 9 2018, 12:33 PM.

Details

Summary

Previously, the extend_vector_inreg opcode required their input register to be the same total width as their output. But this doesn't match up with how the X86 instructions are defined. For X86 the input just needs to be a legal type with at least enough elements to cover the output.

This patch weakens the check on these nodes and allows them to be used as long as they have more input elements than output elements. I haven't changed type legalization behavior so it will still create them with matching input and output sizes.

X86 will custom legalize these nodes by shrinking the input to be a 128 bit vector and once we've done that we treat them as legal operations. We still have one case during type legalization where we must custom handle v64i8 on avx512f targets without avx512bw where v64i8 isn't a legal type. In this case we will custom type legalize to a *extend_vector_inreg with a v16i8 input. After that the input is a legal type so type legalization should ignore the node and doesn't need to know about the relaxed restriction. We are no longer allowed to use the default expansion for these nodes during vector op legalization since the default expansion uses a shuffle which required the widths to match. Custom legalization for all types will prevent us from reaching the default expansion code.

I believe DAG combine works correctly with the released restriction because it doesn't check the number of input elements.

The rest of the patch is changing X86 to use either the vector_inreg nodes or the regular zero_extend/sign_extend nodes. I had to add additional isel patterns to handle any_extend during isel since simplifydemandedbits can create them at any time so we can't legalize to zero_extend before isel. We don't yet create any_extend_vector_inreg in simplifydemandedbits.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Nov 9 2018, 12:33 PM
craig.topper added inline comments.
lib/Target/X86/X86ISelLowering.cpp
40087 ↗(On Diff #173412)

I just flipped this in case we ever wanted to send ANY_EXTEND_VECTOR_INREG in here.

40100 ↗(On Diff #173412)

It seems this code is unnecessary either due to this change or before this change. I didn't check which. I removed it assuming I'd need to put some of it back when tests failed but nothing did so I left it out.

Please can you add context?

lib/Target/X86/X86ISelLowering.cpp
1345 ↗(On Diff #173412)

Move these into the for loop?

5477 ↗(On Diff #173412)

I think there is an InVT type in this function?

19830 ↗(On Diff #173412)

InVT?

40100 ↗(On Diff #173412)

NFC pre-removal?

Address comments and hopefully arcanist adds context.

Rebase to remove a commented out line in combineVSZext

RKSimon accepted this revision.Nov 13 2018, 5:57 AM

LGTM

lib/Target/X86/X86ISelLowering.cpp
40087 ↗(On Diff #173412)

Do this as a separate pre-commit? Not sure if its worth it or not - but seems only tangentially related.

This revision is now accepted and ready to land.Nov 13 2018, 5:57 AM
This revision was automatically updated to reflect the committed changes.