Page MenuHomePhabricator

[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
Needs ReviewPublic

Authored by paulwalker-arm on May 8 2022, 1:52 PM.



This patch starts small, only detecting sequences of the form <0, n, 2n, 3n, ...> where n is a ConstantSDNode.

Diff Detail

Event Timeline

paulwalker-arm created this revision.May 8 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald Transcript
paulwalker-arm requested review of this revision.May 8 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 8 2022, 1:52 PM
Matt added a subscriber: Matt.May 25 2022, 2:10 PM

Changed isConstantSequence to return and optional.
Added tests.

paulwalker-arm retitled this revision from [WIP][SVE] Use INDEX to generate matching instances of BUILD_VECTOR. to [SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR..
paulwalker-arm edited the summary of this revision. (Show Details)
david-arm added inline comments.Tue, Jun 14, 2:30 AM

Maybe I've missed something, but don't you also need to ensure the first value is zero too, i.e.

if (!ExpectedValue.isZero() || PossibleStride.isZero())
  return None;

Would this be more efficient (and possibly simpler?) if you just did:

if (!ExpectedValue.isZero() || PossibleStride.isZero())
  return None;

for (unsigned i = 2; i < getNumOperands(); i++) {
  if (!isa<ConstantSDNode>(getOperand(i)))
    return None

  APInt Val = getConstantOperandAPInt(i).trunc(EltSize);
  if (Val != (i * PossibleStride))
    return None;

I don't think there are any tests that exercise the case where we find a valid sequence as a result of truncating values to the element size. I'm imagining something like this:

<i64 0x4000000000000000, 0x4000000000000002, 0x4000000000000004, ...>

If it's not too much trouble is it possible to write a unit test for this case to make sure we do the right thing?

paulwalker-arm added inline comments.Tue, Jun 14, 5:24 AM

I had thought the "starting at 0" restriction was baked in. Given this is not the case I think it's likely just as easy to support it as disable it so I'll do the former and thus fix one of the missed optimisations I've highlighted with the tests.


The trunc is really a quirk of type legalisation whereby a vector of i8s is legal but i8 scalars are not. The example you're proposing is likely to be optimised away before we get this far. The code itself is unconditional though so there is no untested code path.

Extended to allow an arbitrary start index.

I'm posting this update for comment but it is my intention to split the patch because the findMoreOptimalIndexType change is not great and better done without needing isConstantSequence.


I know the comments are out of sync. I'll update them when splitting the patch.