This patch fixes a regression caused by the operand-reordering refactoring patch https://reviews.llvm.org/D59973.
The fix switches the reordering strategy from Opcode to Splat when broadcast opportunities are found.
Please see the lit test for some examples.
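As a rough illustration of the strategy switch (a simplified sketch, not the code from this patch), the check could look something like the following. It assumes that a value counts as a broadcast opportunity when it shows up among the still-unclaimed operands of every other lane. `OperandData`, `Lanes`, `shouldBroadcast` and `pickMode` are illustrative names used only here, and values are modelled as strings rather than `llvm::Value *`.

```cpp
#include <string>
#include <vector>

// Illustrative subset of reordering strategies.
enum class ReorderingMode { Opcode, Splat };

// Stand-in for one operand slot; V models the operand value.
struct OperandData {
  std::string V;
  bool IsUsed;
};

// Lanes[Lane][OpIdx]: the operands of each scalar instruction in the bundle.
using Lanes = std::vector<std::vector<OperandData>>;

// Returns true if Op can be found as an unclaimed operand in every lane
// other than Lane. Matching operands are marked IsUsed so that a later
// query does not pick them up again (marks are kept even on failure).
bool shouldBroadcast(Lanes &L, const std::string &Op, unsigned Lane) {
  for (unsigned Ln = 0; Ln != L.size(); ++Ln) {
    if (Ln == Lane)
      continue;
    bool Found = false;
    for (OperandData &D : L[Ln]) {
      if (D.IsUsed || D.V != Op)
        continue;
      D.IsUsed = true;
      Found = true;
      break;
    }
    if (!Found)
      return false;
  }
  return true;
}

// Assumed policy: prefer Splat when the first lane's operand at OpIdx
// can be broadcast, otherwise fall back to Opcode.
ReorderingMode pickMode(Lanes &L, unsigned OpIdx) {
  return shouldBroadcast(L, L[0][OpIdx].V, 0) ? ReorderingMode::Splat
                                              : ReorderingMode::Opcode;
}
```

With four lanes that each contain `x` on one side and a distinct value on the other, `pickMode` selects Splat for the operand index holding `x` in the first lane and Opcode for the other index.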
Event Timeline
test/Transforms/SLPVectorizer/X86/broadcast.ll | ||
---|---|---|
18–28 | Hmm, it is very strange that the new result is considered more cost-effective than the previous one. We replace 2 insertelements, 1 vector sub and, actually, one shuffle (2 currently) with 2 scalar subs and 8 insertelements (2 broadcasts). Is this really more cost-effective? |
test/Transforms/SLPVectorizer/X86/broadcast.ll | ||
---|---|---|
18–28 | I am not sure what you mean. The cost of 2 insertelements + one 2-wide sub should be the same as that of the 2 scalar subs, so the old code has no advantage there. Next, the old code has one shuffle for the left input of the 4-wide add and another shuffle for the right input. This is more expensive than the 2 broadcasts in the new code, because broadcasts are cheaper than shuffles. |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
923 | It modifies 'IsUsed' so that we skip the operands that have already been identified as a broadcast, so it needs to call the non-const getData() and therefore cannot be const. |
932 | Hmm, currently there is no restriction on undefs, as Data.V can be any value. But maybe undefs should be matched with a lower priority? |
The function can be made const, I think.
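For readers following the const discussion, here is a minimal sketch of the general C++ constraint, not the SLPVectorizer code itself. A member function that marks operands as used through a non-const accessor cannot be declared const, while a pure query can. The `Operands` class and its `contains` and `claim` members are hypothetical names made up for this illustration.

```cpp
#include <vector>

struct OperandData {
  int V;       // stand-in for the operand value
  bool IsUsed; // set once the operand has been claimed
};

class Operands {
  std::vector<OperandData> Ops;

public:
  explicit Operands(const std::vector<int> &Values) {
    for (int V : Values)
      Ops.push_back(OperandData{V, false});
  }

  // Non-const accessor: hands out a mutable reference.
  OperandData &getData(unsigned I) { return Ops[I]; }
  // Const accessor: read-only view, callable from const member functions.
  const OperandData &getData(unsigned I) const { return Ops[I]; }

  // Pure query: only reads, so it can be const.
  bool contains(int V) const {
    for (unsigned I = 0; I != Ops.size(); ++I)
      if (getData(I).V == V)
        return true;
    return false;
  }

  // Claims the first unused operand equal to V. It writes IsUsed through
  // the non-const getData(), so declaring it const would not compile.
  bool claim(int V) {
    for (unsigned I = 0; I != Ops.size(); ++I) {
      OperandData &D = getData(I);
      if (!D.IsUsed && D.V == V) {
        D.IsUsed = true;
        return true;
      }
    }
    return false;
  }
};
```

Making `IsUsed` a `mutable` member would let `claim` be const, but that hides a state change behind a const interface, so leaving the function non-const is the more conventional choice.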