This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Fix comparator for cmp instruction vectorization.
ClosedPublic

Authored by ABataev on Dec 7 2021, 10:51 AM.

Details

Summary

The comparator for the sort functions should provide strict weak
ordering relation between parameters. Current solution causes compiler
crash with some standard c++ library implementations, because it does
not meet this criteria. Tried to fix it + it improves the iverall
vectorization result.

Diff Detail

Event Timeline

ABataev created this revision.Dec 7 2021, 10:51 AM
ABataev requested review of this revision.Dec 7 2021, 10:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 7 2021, 10:51 AM
vporpo added a comment.Dec 7 2021, 4:52 PM

Do you think you could add a small test that triggers the issue?

Do you think you could add a small test that triggers the issue?

The test test/Transforms/SLPVectorizer/X86/vectorize-cmps.ll triggers it for us with MSVC, debug build.

vporpo added a comment.Dec 7 2021, 5:07 PM

Hmm, I am not seeing any difference in the output with/without the patch. Could you pre-commit it?

ABataev added a comment.EditedDec 7 2021, 5:11 PM

Hmm, I am not seeing any difference in the output with/without the patch. Could you pre-commit it?

It is not a difference, it is a crash in debug build on Windows, MSVC. MSVC Standard c++ lib has a check for strict weak ordering and crashes because current comparator does not meet the criterion.

The check is simple:

if (compare(v1, v2)) 
  assert(! compare(v2, v1));

i.e. if v1 is "less than" v2, v2 must not be "less" than v1.

vporpo added a comment.Dec 7 2021, 5:30 PM

Ah OK got it, it is stable_sort() in tryToVectorizeSequence() that crashes, not a run-failure when the standard library is compiled with the current compiler. Thanks for the explanation.

Ah OK got it, it is stable_sort() in tryToVectorizeSequence() that crashes, not a run-failure when the standard library is compiled with the current compiler. Thanks for the explanation.

Yep, right, forgot to mention that it is for sorting.

vporpo added a comment.Dec 7 2021, 7:20 PM

It is probably worth splitting the patch, separating the fix for the crash from the improvements. It would also be nice to have a few small tests that check for these improvements to avoid regressions in the future. What do you think?

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9582

Shouldn't this loop skip the condition operand (operand 0) ?

9583

If I understand correctly, at this point the predicates are either equal or opposites. You comparing them like Pred1 <= Pred2 and Pred1 >= Pred2, which I think is equivalent to something like:

bool PredsAreOpposite = Pred1 != Pred2;
for (int I = 0, E = CI1->getNumOperands(); I < E; ++I) {
  auto *Op0 = CI1->getOperand(I);
  auto *Op1 = CI2->getOperand(!PredAreOpposite ? I : E - I - 1);

I think this is what you did in line 9576 below.

9583

Perhaps it is worth moving the operand comparison code in a separate function as it shares a lot with the operand comparison code in CompareSorter() above. It could return an enum like LT, GT, EQ (corresponding to return true, return false and continue).

9584

nit: Same as above: Op1, Op2 instead of Op0, Op1.

It is probably worth splitting the patch, separating the fix for the crash from the improvements. It would also be nice to have a few small tests that check for these improvements to avoid regressions in the future. What do you think?

The fix alone causes test regressions.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9582

It is not for SelectInst but for CmpInst, it has only 2 operands and constant predicate

9583

Yes, but in this form, it does not meet the criterion again for some cases, tried it. The provided form provides strict weak ordering.

9583

No, the logic is different

ABataev added inline comments.Dec 8 2021, 4:42 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9583

It does not work for these 2 cmp instructions: icmp slt 0, x and icmp sgt x, 0 (x is an instruction). Compare returns true for compare(v1, v2) and compare(v2, v1) and thus does not meet the requirement for the weak strict ordering.

ABataev updated this revision to Diff 392734.Dec 8 2021, 6:05 AM

Address comments

vporpo added inline comments.Dec 8 2021, 12:33 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9582

Oops my bad. I think I got confused by the checks in line 9523. They are checking operand 0 only, and they take place before the rest of the operand comparisons.
Should these checks be incorporated into the new operand checks instead of having them as separate checks before the predicate comparisons?

9583

That makes sense. Anyway, I would still factor out the comparison out of the loop and use a variable, just to make it easier to read, something like:

bool SwapOperands1 = Pred1 < Pred2;
bool SwapOperands2 = Pred2 < Pred1;
for (int I = 0, E = CI1->getNumOperands(); I < E; ++I) {
  auto *Op1 = CI1->getOperand(SwapOperands1 ? I : E - I - 1);
  auto *Op2 = CI2->getOperand(SwapOperands2 ? I : E - I - 1);

but I don't feel too strongly about this.

9583

My main concern is that if we make changes in the logic of CompareSorter() we should also make changes here too, and if we don't I think the compare instructions will be silently skipped. Perhaps it is worth mentioning this in a comment?

ABataev added inline comments.Dec 8 2021, 12:41 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9583

That’s inevitable here. We have different logic for sorting comparator and for compatible instructions. They are similar but different, have different number of instructions. I will try but no promises.

ABataev updated this revision to Diff 392925.Dec 8 2021, 1:46 PM

Address comments

vporpo added inline comments.Dec 8 2021, 3:13 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9489–9490

Since this is only used in AreCompatibleCompares() I think it makes sense to move it there. Do we expect to have identical values in CompareSorter() ?

9497

I think this value represents what we return when we find a definite less-than. So a better name for it would be something like: RetValOnLessThan. We should also probably set a default value for it = true, since this is what the comparator would return by default.

9531

This is the value we return if we could not find a definite less-than or greater-than value. So I think it would make sense to use either a separate template variable for this, or a constant, something like DefaultRetVal:

template <bool RetValOnLessThan, bool DefaultRetVal = !RetValOnLessThan>
static bool compareCmp(Value *V, Value *V2, ...

or:

constexpr const bool DefaultRetVal = !RetValOnLessThan;

In this way CompareSorter() would call this function as:

return compareCmp</*RetValOnLessThan=*/true>(V, V2, ...);

and AreCompatibleCompares() would call this like:

return compareCmp</*RetValOnLessThan=*/false>(V1, V2,...);

What do you think?

ABataev added inline comments.Dec 8 2021, 3:26 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

I have a different idea behind IsCompatibility flag. If it is true, we check for compatible operations, if false - check for the ordering. I prefer to have a single flag rather than having several flags, which may affect the whole logic independently. This is a potential source of bugs.

ABataev updated this revision to Diff 392955.Dec 8 2021, 3:30 PM

Address comments

ABataev updated this revision to Diff 392960.Dec 8 2021, 3:46 PM

Added a comment for the function.

vporpo added inline comments.Dec 8 2021, 3:48 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

But this does not explain why the other checks are not returning values based on IsCompatibility. I mean, isn't if (BasePred1 > BasePred2) also checking for compatibility? Why is it not returning a value like !IsCompatibility ?

What I am trying to say is that we should try to be more explicit about what is the purpose of IsCompatibility, otherwise this code is a bit cryptic. My understanding is that it is used for two separate things:

  1. the return value of when we find a definite less-than, and
  2. the default return value if we could not find a definite less-than or greater-than.

We could still use a single flag for those, but I don't think using two is too bad either.

ABataev added inline comments.Dec 8 2021, 4:02 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

Do not treat this flag as a flag for the return value, treat it as a flag for the logic of the function.

vporpo added inline comments.Dec 8 2021, 5:07 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

OK suppose somebody else wants to extend the comparator function to go even deeper through the operands and add more comparisons. How would IsCompatibility guide them in writing the code? They could as well treat it as a flag and do a multi-versioning implementation like: if (IsCompatibility) { do the logic for the compatibility comparison; } else { do the comparison logic; }.

If instead we are more clear about how we expect the flag to be used and name it like ReturnValOnLessThan, then they would have no choice but to use it as expected.

ABataev added inline comments.Dec 8 2021, 5:16 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

OK suppose somebody else wants to extend the comparator function to go even deeper through the operands and add more comparisons. How would IsCompatibility guide them in writing the code? They could as well treat it as a flag and do a multi-versioning implementation like: if (IsCompatibility) { do the logic for the compatibility comparison; } else { do the comparison logic; }.

Ok, no problem.

If instead we are more clear about how we expect the flag to be used and name it like ReturnValOnLessThan, then they would have no choice but to use it as expected.

And someone will definitely use the return values flags in the wrong way.

I prefer existing solution.

ABataev added inline comments.Dec 8 2021, 5:17 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
9497

OK suppose somebody else wants to extend the comparator function to go even deeper through the operands and add more comparisons. How would IsCompatibility guide them in writing the code? They could as well treat it as a flag and do a multi-versioning implementation like: if (IsCompatibility) { do the logic for the compatibility comparison; } else { do the comparison logic; }.

Ok, no problem.

If instead we are more clear about how we expect the flag to be used and name it like ReturnValOnLessThan, then they would have no choice but to use it as expected.

And someone will definitely use the return values flags in the wrong way.

I prefer existing solution.

Or the original solution with 2 different lambdas for different purposes.

Ping! Need to land it ASAP to fix a crash, all functional questions are answered, design decisions can be discussed later.

vporpo accepted this revision.Dec 9 2021, 10:20 AM

LGTM

This revision is now accepted and ready to land.Dec 9 2021, 10:20 AM
This revision was landed with ongoing or failed builds.Dec 9 2021, 10:59 AM
This revision was automatically updated to reflect the committed changes.