We need to exit the reordering process early if a perfect or shuffled match is found in the operands. Such a pattern results in unprofitable reordering because of (false-positive) external uses of scalars.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | Isn't VLOperands a better place for this logic? Perhaps a method like: isDiamondMatch() ? | |
1666 | nit: Perhaps a TODO here? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | I am not sure I follow why moving this logic to a member method in VLOperands won't work in this case. Could you elaborate a bit on this? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | I have the same problem because this function reorder() is a member of VLOperands class :) |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | Or do you suggest transforming this lambda into a member function? If so, I think keeping the lambda is better because it does not increase the number of interfaces of the class. If (or when) we have several users of this functionality, it can be outlined into a private member function. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | Yes, I was suggesting moving the loops of this lambda to a function. I understand that this is the only use so it is not really needed, but if we need this same functionality in the future it will be hard to remember that this code already exists in this lambda. So we will probably end up re-implementing it. Anyway, regarding your earlier comment, sorry, I still don't understand what issue you are referring to about the first iteration of reordering. Are you referring to the Passes in the for loop? I am confused. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | Yes, I mean the passes in the loop. Even if a pass fails, we still do some reordering and may end up with a diamond match. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | OK, but why do you want to avoid it if it happens in a later pass? Is it going to produce a worse reordering? Do you think the diamond case could be handled by adding a new ReorderingMode::Diamond that disables reordering for those operand indexes (or perhaps disables reordering completely)? This could be set in the loop under // Initialize the modes., line 1627. This would also fit with the existing design and won't look like a workaround. What do you think? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | I think it may produce a worse reordering. As for ReorderingMode::Diamond, I am not sure we can do it: we perform the analysis lane by lane, but here we need to perform it in the orthogonal order, operand by operand. |
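To make the lane-by-lane versus operand-by-operand distinction concrete, here is a minimal sketch in plain C++. It is not the SLPVectorizer code: `Value`, `OperandTable`, `valuesInLane` and `valuesInOperand` are hypothetical names, and integers stand in for scalar values.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a scalar value; not an LLVM type.
using Value = int;
// Ops[OpIdx][Lane]: the value used as operand OpIdx in lane Lane.
using OperandTable = std::vector<std::vector<Value>>;

// Lane-by-lane view: the values that all operands take within one lane.
std::set<Value> valuesInLane(const OperandTable &Ops, std::size_t Lane) {
  std::set<Value> Result;
  for (const std::vector<Value> &Op : Ops)
    Result.insert(Op[Lane]);
  return Result;
}

// Operand-by-operand view: the values one operand takes across all lanes.
std::set<Value> valuesInOperand(const OperandTable &Ops, std::size_t OpIdx) {
  return std::set<Value>(Ops[OpIdx].begin(), Ops[OpIdx].end());
}
```

For two operands holding {1, 2, 3, 4} and {4, 3, 2, 1}, the per-operand sets are identical, which is the shuffled-match property under discussion; a single lane only exposes a pair such as {1, 4}, so a purely per-lane scan does not reveal it.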
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | Are you planning to move the bail-out check before the pass loop then? Also, I would rename the lambda to something like SkipReordering because it is actually looking for corner cases where it should bail out; it is not looking for cases where it should apply reordering. And please make sure the comment makes it clear that this is the place where any code related to reordering bail-outs should go. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
1656–1665 | I intentionally moved it into the loop because, even if we fail to reorder all the operands/lanes, some of them might still be reordered, and after the failed reordering attempt we may still have a diamond match. |
Isn't VLOperands a better place for this logic? Perhaps a method like: isDiamondMatch() ?
This will also help separate the temporary check UniqueValues.size() == 2 || !isPowerOf2_32(UniqueValues.size()). What do you think?
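The check discussed in this thread could look roughly like the following self-contained sketch. It assumes the lambda collects the unique values of the first operand and verifies that every other operand uses only those values. It also assumes the temporary condition quoted above means that the early exit is not taken when there are exactly two unique values or a non-power-of-two number of them. `Value`, `skipReordering` and `isPowerOf2` are hypothetical stand-ins (the last one for `isPowerOf2_32`); this is not the code from the patch.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for a scalar value; not an LLVM type.
using Value = int;
// One entry per operand index, each holding that operand's value per lane.
using OperandLanes = std::vector<Value>;

// Stand-in for isPowerOf2_32: true for 1, 2, 4, 8, ...
static bool isPowerOf2(std::size_t N) { return N != 0 && (N & (N - 1)) == 0; }

// Returns true when the operands form a perfect or shuffled match, i.e.
// every operand uses only values that already appear in the first operand.
bool skipReordering(const std::vector<OperandLanes> &OpsVec) {
  if (OpsVec.empty())
    return false;
  std::set<Value> UniqueValues(OpsVec.front().begin(), OpsVec.front().end());
  for (std::size_t I = 1; I < OpsVec.size(); ++I)
    for (Value V : OpsVec[I])
      if (UniqueValues.count(V) == 0)
        return false; // A value outside the first operand: no match.
  // Assumed reading of the temporary check: do not skip for exactly two
  // unique values or for a non-power-of-two number of unique values.
  return UniqueValues.size() != 2 && isPowerOf2(UniqueValues.size());
}
```

Keeping the temporary size condition as the final `return` leaves it in one place, so it can be dropped later without touching the matching loops.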