This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix regression in broadcasts caused by operand reordering patch D59973.
ClosedPublic

Authored by vporpo on May 24 2019, 2:11 PM.

Details

Summary

This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 .
The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found.
Please see the lit test for some examples.

Diff Detail

Event Timeline

vporpo created this revision.May 24 2019, 2:11 PM
vporpo updated this revision to Diff 201369.May 24 2019, 5:24 PM

Minor comment update.

@vporpo I've committed the test case with current codegen at rL361805 - please can you rebase so this patch shows the codegen delta?

rcorcs added a subscriber: rcorcs.May 28 2019, 8:34 AM

Thanks for committing the test @RKSimon . I will follow up on this soon.

ABataev added inline comments.May 30 2019, 7:54 AM
test/Transforms/SLPVectorizer/X86/broadcast.ll
18

Hmm, it is very strange that the new result is more cost effective than the previous one. We change 2 insertelements, 1 vector sub and, actually, one shuffle (2currently), to 2 scalar subs and 8 insertelements (2 broadcasts). Is this really so cost effective?

vporpo marked an inline comment as done.May 30 2019, 12:16 PM
vporpo added inline comments.
test/Transforms/SLPVectorizer/X86/broadcast.ll
18

I am not sure what you mean. The cost of 2 insert elements + one 2-wide sub should be the same as the 2 scalar subs, so nothing to gain from the old code so far. Next, in the old code we have one shuffle for the left input to the 4-wide add and another shuffle for the right input. This is more expensive than the 2 broadcasts we have in the new code, because broadcasts are cheaper than shuffles.

ABataev added inline comments.May 30 2019, 12:41 PM
lib/Transforms/Vectorize/SLPVectorizer.cpp
923

The function can be made const, I think.

924

auto->bool

931

auto &->const OperandData &

932

What if the Data.V is Undef? I think, we can use broadcast in this case too.

test/Transforms/SLPVectorizer/X86/broadcast.ll
18

Ah, yes, I see, you're right.

vporpo marked 5 inline comments as done.May 30 2019, 1:31 PM
vporpo added inline comments.
lib/Transforms/Vectorize/SLPVectorizer.cpp
923

It is modifying the 'IsUsed' so that we skip the operands that have already been identified as a broadcast, so it needs to call the non-const getData(), therefore it cannot be const.

932

Hmm, currently there is no restriction on Undefs, as Data.V can be any value. But maybe undefs should be matched with a lower priority ?

vporpo updated this revision to Diff 202285.May 30 2019, 1:34 PM

Addressed the review comments.

looks good to me.

This revision is now accepted and ready to land.Jun 4 2019, 7:04 AM

Thank you for the reviews. Please commit it if @ABataev is also happy with it.

This revision was automatically updated to reflect the committed changes.