The comparator for the sort functions should provide strict weak
ordering relation between parameters. Current solution causes compiler
crash with some standard c++ library implementations, because it does
not meet this criteria. Tried to fix it + it improves the iverall
vectorization result.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The test test/Transforms/SLPVectorizer/X86/vectorize-cmps.ll triggers it for us with MSVC, debug build.
Hmm, I am not seeing any difference in the output with/without the patch. Could you pre-commit it?
It is not a difference, it is a crash in debug build on Windows, MSVC. MSVC Standard c++ lib has a check for strict weak ordering and crashes because current comparator does not meet the criterion.
The check is simple:
if (compare(v1, v2)) assert(! compare(v2, v1));
i.e. if v1 is "less than" v2, v2 must not be "less" than v1.
Ah OK got it, it is stable_sort() in tryToVectorizeSequence() that crashes, not a run-failure when the standard library is compiled with the current compiler. Thanks for the explanation.
It is probably worth splitting the patch, separating the fix for the crash from the improvements. It would also be nice to have a few small tests that check for these improvements to avoid regressions in the future. What do you think?
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9582 | Shouldn't this loop skip the condition operand (operand 0) ? | |
9583 | If I understand correctly, at this point the predicates are either equal or opposites. You comparing them like Pred1 <= Pred2 and Pred1 >= Pred2, which I think is equivalent to something like: bool PredsAreOpposite = Pred1 != Pred2; for (int I = 0, E = CI1->getNumOperands(); I < E; ++I) { auto *Op0 = CI1->getOperand(I); auto *Op1 = CI2->getOperand(!PredAreOpposite ? I : E - I - 1); I think this is what you did in line 9576 below. | |
9583 | Perhaps it is worth moving the operand comparison code in a separate function as it shares a lot with the operand comparison code in CompareSorter() above. It could return an enum like LT, GT, EQ (corresponding to return true, return false and continue). | |
9584 | nit: Same as above: Op1, Op2 instead of Op0, Op1. |
The fix alone causes test regressions.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9582 | It is not for SelectInst but for CmpInst, it has only 2 operands and constant predicate | |
9583 | Yes, but in this form, it does not meet the criterion again for some cases, tried it. The provided form provides strict weak ordering. | |
9583 | No, the logic is different |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9583 | It does not work for these 2 cmp instructions: icmp slt 0, x and icmp sgt x, 0 (x is an instruction). Compare returns true for compare(v1, v2) and compare(v2, v1) and thus does not meet the requirement for the weak strict ordering. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9582 | Oops my bad. I think I got confused by the checks in line 9523. They are checking operand 0 only, and they take place before the rest of the operand comparisons. | |
9583 | That makes sense. Anyway, I would still factor out the comparison out of the loop and use a variable, just to make it easier to read, something like: bool SwapOperands1 = Pred1 < Pred2; bool SwapOperands2 = Pred2 < Pred1; for (int I = 0, E = CI1->getNumOperands(); I < E; ++I) { auto *Op1 = CI1->getOperand(SwapOperands1 ? I : E - I - 1); auto *Op2 = CI2->getOperand(SwapOperands2 ? I : E - I - 1); but I don't feel too strongly about this. | |
9583 | My main concern is that if we make changes in the logic of CompareSorter() we should also make changes here too, and if we don't I think the compare instructions will be silently skipped. Perhaps it is worth mentioning this in a comment? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9583 | That’s inevitable here. We have different logic for sorting comparator and for compatible instructions. They are similar but different, have different number of instructions. I will try but no promises. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9489–9490 | Since this is only used in AreCompatibleCompares() I think it makes sense to move it there. Do we expect to have identical values in CompareSorter() ? | |
9497 | I think this value represents what we return when we find a definite less-than. So a better name for it would be something like: RetValOnLessThan. We should also probably set a default value for it = true, since this is what the comparator would return by default. | |
9531 | This is the value we return if we could not find a definite less-than or greater-than value. So I think it would make sense to use either a separate template variable for this, or a constant, something like DefaultRetVal: template <bool RetValOnLessThan, bool DefaultRetVal = !RetValOnLessThan> static bool compareCmp(Value *V, Value *V2, ... or: constexpr const bool DefaultRetVal = !RetValOnLessThan; In this way CompareSorter() would call this function as: return compareCmp</*RetValOnLessThan=*/true>(V, V2, ...); and AreCompatibleCompares() would call this like: return compareCmp</*RetValOnLessThan=*/false>(V1, V2,...); What do you think? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 | I have a different idea behind IsCompatibility flag. If it is true, we check for compatible operations, if false - check for the ordering. I prefer to have a single flag rather than having several flags, which may affect the whole logic independently. This is a potential source of bugs. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 | But this does not explain why the other checks are not returning values based on IsCompatibility. I mean, isn't if (BasePred1 > BasePred2) also checking for compatibility? Why is it not returning a value like !IsCompatibility ? What I am trying to say is that we should try to be more explicit about what is the purpose of IsCompatibility, otherwise this code is a bit cryptic. My understanding is that it is used for two separate things:
We could still use a single flag for those, but I don't think using two is too bad either. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 | Do not treat this flag as a flag for the return value, treat it as a flag for the logic of the function. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 | OK suppose somebody else wants to extend the comparator function to go even deeper through the operands and add more comparisons. How would IsCompatibility guide them in writing the code? They could as well treat it as a flag and do a multi-versioning implementation like: if (IsCompatibility) { do the logic for the compatibility comparison; } else { do the comparison logic; }. If instead we are more clear about how we expect the flag to be used and name it like ReturnValOnLessThan, then they would have no choice but to use it as expected. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 |
Ok, no problem.
And someone will definitely use the return values flags in the wrong way. I prefer existing solution. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9497 |
Or the original solution with 2 different lambdas for different purposes. |
Ping! Need to land it ASAP to fix a crash, all functional questions are answered, design decisions can be discussed later.
Since this is only used in AreCompatibleCompares() I think it makes sense to move it there. Do we expect to have identical values in CompareSorter() ?