This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix order of `insertelement`/`insertvalue` seed operands
ClosedPublic

Authored by anton-afanasyev on Jul 14 2020, 8:18 AM.

Details

Summary

This patch takes the indices operands of insertelement/insertvalue
into account while generation of seed elements for findBuildAggregate().
This function has kept the original order of inserts before.
Also this patch optimizes findBuildAggregate() preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2020, 8:18 AM

Add test?

Sure, thanks!

ABataev added inline comments.Jul 15 2020, 5:46 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7087

Why need a parameter here? Why always return 2?

7096–7097

Just return 0;, no need for else

7098–7108

No need for else.

7103

Use actual type and i->I

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

anton-afanasyev marked 7 inline comments as done.

Update tests

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7087

Fixed, it was temporary dummy.

7096–7097

Fixed, thanks.

7103

Fixed, thanks.

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

No, actually it's not NFC (though instcombine has done the same work previously).

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7093

No need for else here, just unconditional block, since then part is terminated with return

7127

No need for else here

7129

Remove else here

7140

Enclose in braces

7198

.empty()

7387

Very strange check, why we need it? Why did you change the previous one?

7401

Same question here

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209 ↗(On Diff #278479)

A regression?

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296 ↗(On Diff #278479)

Regression?

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–135

Regression?

anton-afanasyev marked 18 inline comments as done.Jul 24 2020, 1:22 PM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7093

Thanks, fixed

7127

Fixed

7129

Removed

7140

Enclosed

7198

Changed

7387

The way the findBuildAggregate() works has changed: it outputs vector of Values, ordered by inserts indices, so it possible this vector contains gaps. But I've changed findBuildAggregate() again, splitting it to recursive and non-recursive parts, and making this check inside the latter one.

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209 ↗(On Diff #278479)

Fixed by allowing to findBuildAggregate() support of partial vector filling.

llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296 ↗(On Diff #278479)

Fixed

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–135

Yes, that's regression, but that's pay for the optimization by reducing shufflevectors at the more common cases.

anton-afanasyev marked 9 inline comments as done.

Update. Splitted findBuildAggregate() to recursive and non-recursive parts.

ABataev added inline comments.Jul 28 2020, 12:15 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7092

returns true for the function, that shall return unsigned

7100

Type *

7124

Just cast

7141

What about VectorType here if it is allowed in the previous function?

7161

cast<Instruction>

7164–7167

No need for else here

7171–7174

Better to transform it a condition of the loop and return false after the end of the loop

7210–7218

Just llvm::erase_value(<container>, nullptr)

7221–7222

Tabs

7327

Why you don't need this?

anton-afanasyev marked 10 inline comments as done.Jul 29 2020, 9:00 AM

Addressed @ABataev's comments.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7092

Thanks, I've forgotten to change return type.

7100

Fixed

7124

Fixed

7141

No need to do this: insertvalue's indices can never point to vector elements, only to structure and array ones. Function getAggregateSize() doesn's use InsertValueInst::indices() and computes aggregate size by getNumElements() multiplication, whereas getOperandIndex() just returns insertvalue's operand offset. The vector can be addressed by insertvalue only as final atomic element of structure. Function findBuildAggregate_rec() processes this element separately using OperandIndex as base offset.

7161

Fixed

7164–7167

No, this else matters here: it embraces the block which is executed only if InsertedOperand is not insert. There is the second nested if inside the first one.

7171–7174

Transformed.

7210–7218

Thanks, changed. Used llvm::erase_if(<container>, <predicate>), didn't found llvm::erase_value().

7221–7222

Hmm, that's actually spaces, not tabs.

7327

This is checked inside findBuildAggregate() itself now.

anton-afanasyev marked 10 inline comments as done.

Updated

Ping. Rebase.

ABataev added inline comments.Aug 5 2020, 5:10 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7087

Maybe better to return Optional<unsigned> here rather than return bool and have AggregateSize return param/

7119–7120

Same, Optional<unsigned>

anton-afanasyev marked 2 inline comments as done.

Addressed comments

anton-afanasyev added a comment.EditedAug 5 2020, 6:40 AM

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7087

Sounds good, fixed

7119–7120

Ok, done

A couple of minor style comments - trivial scope reductions

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7090

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

7094

unsigned AggregateSize = 1;

7121

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

anton-afanasyev marked 3 inline comments as done.

Addressed

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7090

Sure, fixed.

7094

Fixed.

7121

Fixed.

RKSimon added inline comments.Aug 6 2020, 6:05 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7089

VectorType::getNumElements() will be going away soon - you need to replace this with cast<FixedVectorType>(IE->getType())->getNumElements()

7104

dyn_cast<FixedVectorType>

7120

cast<FixedVectorType>(IE->getType())

anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 8:08 AM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7089

Replaced

7104

Fixed

7120

Fixed

anton-afanasyev marked 3 inline comments as done.

Addressed comments

Thanks, no more comments from me - the reverse_hadd_v4f32 fix will be very useful to x86 backend hadd formation!

@ABataev any more comments?

ABataev added inline comments.Aug 6 2020, 9:24 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–431

This change also does not look like caused by the patch

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Drop this change or commit it separately as NFC.

RKSimon added inline comments.Aug 6 2020, 10:03 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–431
anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 10:33 AM
anton-afanasyev added inline comments.
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–431

Thanks, dropped this change.

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Dropped.

anton-afanasyev marked 2 inline comments as done.

Addressed comments

This revision is now accepted and ready to land.Aug 6 2020, 10:35 AM
This revision was landed with ongoing or failed builds.Aug 6 2020, 12:10 PM
This revision was automatically updated to reflect the committed changes.