Page MenuHomePhabricator

[CodeGen][SVE] Legalisation of extends with scalable types

Authored by kmclaughlin on May 7 2020, 9:54 AM.



This patch adds legalisation of extensions where the operand
of the extend is a legal scalable type but the result is not.

EXTRACT_SUBVECTOR is used to split the result, before
being replaced by target-specific [S|U]UNPK[HI|LO] operations.

For example:

zext <vscale x 16 x i8> %a to <vscale x 16 x i16>

should emit:

uunpklo z2.h, z0.b
uunpkhi z1.h, z0.b

Diff Detail

Event Timeline

kmclaughlin created this revision.May 7 2020, 9:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 7 2020, 9:54 AM

Can you explain why target-independent legalization isn't suitable here? I'd prefer not to have target-specific type legalization for every operation on vscale'ed types. (The target-independent name of AArch64ISD::SUNPKLO is ISD::SIGN_EXTEND_VECTOR_INREG.)

Hi Kerry, I think we do already have some suitable test files where perhaps you could add your tests instead of creating new files? For example, there are:

CodeGen/AArch64/sve-int-arith.ll (perhaps integer divides and shifts could live there?)
CodeGen/AArch64/sve-int-div-pred.ll (some divides already in here)
CodeGen/AArch64/sve-sext-zext.ll (extend tests recently added)


Hi @efriedma, is there a target-independent equivalent of SUNPKHI? From a quick glance at the codebase where X86 uses ISD::SIGN_EXTEND_VECTOR_INREG it seems vector shuffles are still required for the Hi part, which is fine for fixed length vectors I guess.

For sunpckhi... no, not really. You'd need to either add a new opcode, or add a new shuffle operation of some sort.

kmclaughlin edited the summary of this revision. (Show Details)
  • Removed ReplaceExtensionResults and instead try to use extract_subvector as much as possible to legalise the result
  • Added ReplaceExtractSubVectorResults, which replaces extract_subvector with unpack operations
  • Changed performSignExtendInRegCombine to replace a sign extend of an unsigned unpack with a signed unpack
  • Removed changes to unrelated test files

Hi Kerry, just a couple of comments about the use of getVectorNumElements() - we're trying to remove calls to this function so it would be good if you could use getVectorElementCount() instead. Thanks!


I think we want to move away from calling getVectorNumElements(), so might need to change this to something like

ElementCount ResEC = VT.getVectorElementCount();
if (InVT.getVectorElementCount() != (ResEC * 2))


And here you could then change this to:

if ((Index != 0) && (Index != ResEC.Min))

Sorry I forgot to mention I think we have an existing test file for extends:


It might be worth adding these cases to that file instead of creating a new one?

  • Replaced uses of getVectorNumElements() with getVectorElementCount()
  • Moved the new tests into the existing sve-sext-zext.ll file
kmclaughlin marked 2 inline comments as done.May 29 2020, 10:08 AM
efriedma added inline comments.May 29 2020, 12:07 PM

In general, there are four possibilities for legalization of the operand vector:

  1. Legal. In this case, I guess the operation you want is essentially ANY_EXTEND_VECTOR_INREG. You're currently handling this in target-specific code; I guess that's okay for now.
  2. Promote. In this case, you can promote both the operand and the result, what you're doing here.
  3. Widen. In this case, you just widen the operand.
  4. Split. In this case, you pick the correct half (or construct it with a shuffle), and extract from that.

Currently, you're assuming the the result of GetPromotedInteger() has an element type narrower or equal to the element type of NOutVT. I'm not sure that's true on all targets, but I guess it's true for SVE. Better to explicitly check, I think.


Please put a report_fatal_error here for scalable vectors, so we don't end up with some obscure invalid BUILD_VECTOR error.

Alternatively, I guess you could call ExpandExtractFromVectorThroughStack().


Can you restrict this to specifically the types you're interested in handling? It looks like you only implemented handling for the following types: nxv8i8, nxv4i16, nxv2i32.


I think you can just assert the type isn't scalable? It shouldn't be possible to get here. You're only marking EXTRACT_SUBVECTOR Custom for illegal types.


Do you need to truncate ExtVal somewhere, so you don't end up with a DUP with an over-wide constant?


It seems a little fragile to assume the inner VT of the SIGN_EXTEND_INREG is exactly the type you're expecting here. Probably worth at least adding an assertion to encode the assumptions you're making.


"Bubble truncates to illegal types to the surface" is an optimization?


Please just generate the checks with

kmclaughlin marked 7 inline comments as done.
  • Restricted the illegal types which should be lowered for EXTRACT_SUBVECTOR to those handled in this patch (nxv8i8, nxv4i16 & nxv2i32)
  • Removed unnecessary changes in ReplaceExtractSubVectorResults
  • Updated the tests with
  • Added a check on expected types in performSignExtendInRegCombine & PromoteIntRes_EXTRACT_SUBVECTOR
kmclaughlin marked 2 inline comments as done.Jun 1 2020, 12:31 PM

Thanks for taking another look at this, @efriedma!


I've changed the call to getNode below that creates the DUP to use DAG.getAnyExtOrTrunc (similar to what we do in LowerSPLAT_VECTOR)


I've added an assert above here to make sure the sign_extend_inreg and unpack types match, is this the assumption you were referring to?


Removed - this was not required for this patch.

efriedma added inline comments.Jun 1 2020, 2:05 PM

I'm specifically concerned that you could end up with something like (nxv16i8 (dup (i32 0x12345678))).


We assert that SIGN_EXTEND_INREG has valid operand/result types elsewhere.

I was more concerned about the inner VT (cast<VTSDNode>(N->getOperand(1))->getVT();). You could end up creating an invalid SIGN_EXTEND_INREG if the type is something weird, like a non-byte-size integer type.

kmclaughlin marked 2 inline comments as done.
  • Added a truncate of ExtVal in performSVEAndCombine
  • Changed the assert added to performSignExtendInRegCombine in the previous revision
kmclaughlin marked an inline comment as not done.Jun 2 2020, 10:20 AM
kmclaughlin added inline comments.

I see what you mean - I've added a truncate of Dup->getOperand(0) for this, which will truncate the constant to the type of UnpkOp


Removed my previous check on the operand & result types and added an assert for the type of VT.

efriedma added inline comments.Jun 2 2020, 1:20 PM

It's not legal to generate operations with type EltTy after legalization. You get away with it here because it's guaranteed to constant-fold... but probably less confusing to use APInt::trunc instead.

  • Use APInt::trunc to truncate the constant in performSVEAndCombine
efriedma accepted this revision.Jun 4 2020, 1:23 PM


This revision is now accepted and ready to land.Jun 4 2020, 1:23 PM
This revision was automatically updated to reflect the committed changes.