This is an archive of the discontinued LLVM Phabricator instance.

[IR] move shuffle mask queries from TTI to ShuffleVectorInst
ClosedPublic

Authored by spatel on Jun 15 2018, 1:31 PM.

Details

Summary

The optimizer is getting smarter (eg, D47986) about differentiating shuffles based on its mask values, so we should make queries on the mask constant operand generally available to avoid code duplication.

We'll probably use this soon in the vectorizers and instcombine (D48023 and https://bugs.llvm.org/show_bug.cgi?id=37806).

We might clean up TTI a bit more once all of its current 'SK_*' options are covered.

Diff Detail

Event Timeline

spatel created this revision.Jun 15 2018, 1:31 PM
lebedev.ri added inline comments.Jun 16 2018, 6:01 AM
lib/IR/Instructions.cpp
1693–1694

Negative values represent undef?

1704–1705

Just as a sanity check, i would have expected to see an assert here:

assert((FoundLHS ^ FoundRHS) && "Should have selected from exactly one source");
1735–1738

This method was declared with:

/// Return true if this shuffle mask swaps the order of elements from exactly
/// one source vector.
/// Example: <7,6,undef,4>
/// This assumes that vector operands are the same length as the mask.

You are sure this does what that comment says it does?

1739–1740

Just as a sanity check, i would have expected to see an assert here:

assert((ReverseLHS ^ ReverseRHS) && "Should have selected from exactly one source");
1757–1758

Just as a sanity check, i would have expected to see an assert here:

assert(!(ShuffleLHS && ShuffleRHS) && "Should not have selected from both sources");
1773–1774

Just as a sanity check, i would have expected to see an assert here:

assert((SplatLHS || SplatRHS) && "Should have splatted from at least one of sources");
1783–1787
unsigned NumSourceElts = Op<0>()->getType()->getVectorNumElements();

But this does not seem to take the NumSourceElts & 1 into account, like the comment says?

spatel added inline comments.Jun 16 2018, 1:33 PM
lib/IR/Instructions.cpp
1693–1694

Good question/catch! Officially, only -1 represents undef:

/// Return the shuffle mask value for the specified element of the mask.
/// Return -1 if the element is undef.
static int getMaskValue(Constant *Mask, unsigned Elt);

...and I wish we weren't using a magic number rather than a named constant, but that's a separate patch.

This code was copied/translated from the existing code, but I'll fix it.

1704–1705

Agreed, and we should assert other assumptions (eg, the input Constant is a vector).

1735–1738

This was copied from the existing code, so unless I introduced a typo, that's the correct logic.

Note that this patch is intended to be no-functional-change for any existing users of the TTI code, so there are existing regression tests in place (though if typical, they are far from thorough).

I'll definitely add more unit tests to exercise this further.

1783–1787

I don't understand this comment. Can you provide more detail/example of the possible logic bug?

lebedev.ri added inline comments.Jun 16 2018, 1:43 PM
lib/IR/Instructions.cpp
1704–1705

Yes, assertions[/contracts] are nice.

1735–1738

I understand that it is meant to be a NFC, but since all the relevant changes
are nicely layed out in a single diff, it is really easy to look them through,
and note possible discrepancies.

Otherwise, if this is fully NFC, you could just commit as is :)

1783–1787

I can try.
The comment states:

// 2. The first element of the mask must be either a zero (for the
// even-numbered vector elements) or a one (for the odd-numbered vector
// elements).

But the check is:

if (getMaskValue(Mask, 0) != 0 && getMaskValue(Mask, 0) != 1)
  return false;

I.e. if the first element is not 0 and not 1, we bailout.
But the comment suggests that the logic should be:

unsigned NumSourceElts = Op<0>()->getType()->getVectorNumElements();
if (!(((NumSourceElts % 2 == 0) && (getMaskValue(Mask, 0) == 0)) ||
      ((NumSourceElts % 2 == 1) && (getMaskValue(Mask, 0) == 1))))
    return false;

Is this any clearer?

Or, the comment is wrong.

lebedev.ri added inline comments.Jun 17 2018, 5:54 AM
lib/IR/Instructions.cpp
1783–1787

In other words, either the comment should not talk about even/odd -numbered vector elements,
or the check should actually consider even/odd -ness.

spatel updated this revision to Diff 151644.Jun 17 2018, 7:02 AM
spatel marked 7 inline comments as done.

Patch updated:

  1. Fixed undef checks to be "x == -1" rather than "x < 0".
  2. Added assertions for inputs and logic.
  3. Added more unit tests (the last group is based on the mask examples that I added to the header file code comments...would be extra embarrassing to get those wrong!).
  4. Reduced confusing comment for transpose (and restated the canonical TRN1 / TRN2 examples for clarity).
  5. Changed all 'unsigned' types to 'int' so we don't have to cast (this is always an annoyance with mask values).
  6. Changed variable names and added locals to try to make the logic clearer and more uniform.

Ok, this looks about right.
A few high-level remarks:

  • There is an obvious pattern in these is*Mask() functions. This makes them slightly self-duplicating. I wonder if it would be possible to refactor them somehow to reduce duplication.
  • There is some correlation between those functions. I would personally add an exhaustive test (with googletest's Combine, if it is not disabled?) to check for those correlations.
include/llvm/IR/Instructions.h
2467

All these functions should be able to take const Constant *Mask.

lib/IR/Instructions.cpp
1740–1741

This is slightly inconsistent, why not use helper variables here?

bool IsLHS = MaskEltVal == (LastEltIndex - i);
bool IsRHS = MaskEltVal == (NumElts + LastEltIndex - i);
UsesLHS &= IsLHS;
UsesRHS &= IsRHS;

I haven't had a chance to go through this yet, but it'd be useful to be able to call these through ArrayRef<int> style shuffle masks (see the TODO I added for SLP to D48023)

Plus, we probably need to think about input/output size mismatches a little more - Identity and Broadcast for instance don't really care - adding insert/extract subvector pattern matches would be a start.

spatel updated this revision to Diff 151649.Jun 17 2018, 4:17 PM
spatel marked 2 inline comments as done.

Patch updated:
Redefine all of the Constant* functions in terms of ArrayRef<int> parent implementations. This makes the code slightly cleaner and might open up the possibility of removing some redundant code from the backend.

I know we can do better to share code, test, and add functionality, but I'd like to get this committed so I'm not holding up functional changes and existing cleanup that is gated on this first step.

spatel updated this revision to Diff 151650.Jun 17 2018, 4:25 PM

Uploaded the wrong draft - went overboard on the 'const &' on that last one. An ArrayRef is a const ref.

lebedev.ri added inline comments.Jun 19 2018, 3:35 AM
include/llvm/IR/Instructions.h
2466

i think it will be less confusing to replace -1 with undef.

2470

From where does this 16 come from?
Magical expected maximal number of elements in one vector? (AVX512 / 16 elements = 32 bit/element)
Might be better to static constexpr int it beforehand.

2486

i think it will be less confusing to replace -1 with undef.

2512

i think it will be less confusing to replace -1 with undef.

2531

i think it will be less confusing to replace -1 with undef.

2550

i think it will be less confusing to replace -1 with undef.

lib/IR/Instructions.cpp
1726–1743

Hm, why is this so different from isIdentityMask()?
I'm not sure why it couldn't be a copy of isIdentityMask() with just changed Is{LR}HS logic.

1763–1777

Same here. The foundation looks very similar, yet the approach seems different.

RKSimon accepted this revision.Jun 19 2018, 3:38 AM

LGTM with a slight addition to the comments - SLP for example supports alternate/permute cases for different sized inputs

include/llvm/IR/Instructions.h
2513

Make this comment in each case a TODO - at some point we're going to have to (optionally) support length changing shuffle patterns

This revision is now accepted and ready to land.Jun 19 2018, 3:38 AM
spatel marked 8 inline comments as done.Jun 19 2018, 10:38 AM
spatel added inline comments.
include/llvm/IR/Instructions.h
2470

Yes - copy/paste of magic constant. This is typical SmallVector usage throughout LLVM. OK, if we make that a follow-up cleanup and zap all of them (at least in this file)?

lib/IR/Instructions.cpp
1726–1743

The symmetry will be clear if we have everyone call isSingleSourceMask as a preliminary step. Then the basic logic will be identical across everything except for the isTranspose oddball.

spatel updated this revision to Diff 151942.Jun 19 2018, 10:43 AM
spatel marked an inline comment as done.

Patch updated:

  1. Improved header comment examples (use 'undef' rather than -1).
  2. Reimplemented logic to maximize code reuse (make use of isSingleSourceMask() internally).
  3. Added 'TODO' notes about allowing size-changing shuffles.
lebedev.ri accepted this revision.Jun 19 2018, 11:04 AM

Nice!
I like this.

include/llvm/IR/Instructions.h
2470

Yep, absolutely, just pointing out.

lib/IR/Instructions.cpp
1691

Oh, you can define multiple variables in init of the for loop?
Somehow i never realized that..

This revision was automatically updated to reflect the committed changes.