This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix order of `insertelement`/`insertvalue` seed operands
ClosedPublic

Authored by anton-afanasyev on Jul 14 2020, 8:18 AM.

Details

Summary

This patch takes the indices operands of insertelement/insertvalue
into account while generation of seed elements for findBuildAggregate().
This function has kept the original order of inserts before.
Also this patch optimizes findBuildAggregate() preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2020, 8:18 AM

Add test?

Sure, thanks!

ABataev added inline comments.Jul 15 2020, 5:46 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7086

Why need a parameter here? Why always return 2?

7095–7096

Just return 0;, no need for else

7097–7107

No need for else.

7102

Use actual type and i->I

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

anton-afanasyev marked 7 inline comments as done.

Update tests

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7086

Fixed, it was temporary dummy.

7095–7096

Fixed, thanks.

7102

Fixed, thanks.

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

No, actually it's not NFC (though instcombine has done the same work previously).

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7092

No need for else here, just unconditional block, since then part is terminated with return

7126

No need for else here

7128

Remove else here

7139

Enclose in braces

7202

.empty()

7399

Very strange check, why we need it? Why did you change the previous one?

7413

Same question here

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209 ↗(On Diff #278479)

A regression?

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296 ↗(On Diff #278479)

Regression?

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–135

Regression?

anton-afanasyev marked 18 inline comments as done.Jul 24 2020, 1:22 PM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7092

Thanks, fixed

7126

Fixed

7128

Removed

7139

Enclosed

7202

Changed

7399

The way the findBuildAggregate() works has changed: it outputs vector of Values, ordered by inserts indices, so it possible this vector contains gaps. But I've changed findBuildAggregate() again, splitting it to recursive and non-recursive parts, and making this check inside the latter one.

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209 ↗(On Diff #278479)

Fixed by allowing to findBuildAggregate() support of partial vector filling.

llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296 ↗(On Diff #278479)

Fixed

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–135

Yes, that's regression, but that's pay for the optimization by reducing shufflevectors at the more common cases.

anton-afanasyev marked 9 inline comments as done.

Update. Splitted findBuildAggregate() to recursive and non-recursive parts.

ABataev added inline comments.Jul 28 2020, 12:15 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7091

returns true for the function, that shall return unsigned

7099

Type *

7123

Just cast

7140

What about VectorType here if it is allowed in the previous function?

7160

cast<Instruction>

7163–7166

No need for else here

7170–7173

Better to transform it a condition of the loop and return false after the end of the loop

7214–7222

Just llvm::erase_value(<container>, nullptr)

7225–7226

Tabs

7326

Why you don't need this?

anton-afanasyev marked 10 inline comments as done.Jul 29 2020, 9:00 AM

Addressed @ABataev's comments.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7091

Thanks, I've forgotten to change return type.

7099

Fixed

7123

Fixed

7140

No need to do this: insertvalue's indices can never point to vector elements, only to structure and array ones. Function getAggregateSize() doesn's use InsertValueInst::indices() and computes aggregate size by getNumElements() multiplication, whereas getOperandIndex() just returns insertvalue's operand offset. The vector can be addressed by insertvalue only as final atomic element of structure. Function findBuildAggregate_rec() processes this element separately using OperandIndex as base offset.

7160

Fixed

7163–7166

No, this else matters here: it embraces the block which is executed only if InsertedOperand is not insert. There is the second nested if inside the first one.

7170–7173

Transformed.

7214–7222

Thanks, changed. Used llvm::erase_if(<container>, <predicate>), didn't found llvm::erase_value().

7225–7226

Hmm, that's actually spaces, not tabs.

7326

This is checked inside findBuildAggregate() itself now.

anton-afanasyev marked 10 inline comments as done.

Updated

Ping. Rebase.

ABataev added inline comments.Aug 5 2020, 5:10 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7086

Maybe better to return Optional<unsigned> here rather than return bool and have AggregateSize return param/

7118–7119

Same, Optional<unsigned>

anton-afanasyev marked 2 inline comments as done.

Addressed comments

anton-afanasyev added a comment.EditedAug 5 2020, 6:40 AM

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7086

Sounds good, fixed

7118–7119

Ok, done

A couple of minor style comments - trivial scope reductions

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7089

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

7093

unsigned AggregateSize = 1;

7120

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

anton-afanasyev marked 3 inline comments as done.

Addressed

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7089

Sure, fixed.

7093

Fixed.

7120

Fixed.

RKSimon added inline comments.Aug 6 2020, 6:05 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7088

VectorType::getNumElements() will be going away soon - you need to replace this with cast<FixedVectorType>(IE->getType())->getNumElements()

7103

dyn_cast<FixedVectorType>

7119

cast<FixedVectorType>(IE->getType())

anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 8:08 AM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7088

Replaced

7103

Fixed

7119

Fixed

anton-afanasyev marked 3 inline comments as done.

Addressed comments

Thanks, no more comments from me - the reverse_hadd_v4f32 fix will be very useful to x86 backend hadd formation!

@ABataev any more comments?

ABataev added inline comments.Aug 6 2020, 9:24 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–432

This change also does not look like caused by the patch

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Drop this change or commit it separately as NFC.

RKSimon added inline comments.Aug 6 2020, 10:03 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–432
anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 10:33 AM
anton-afanasyev added inline comments.
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
428–432

Thanks, dropped this change.

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Dropped.

anton-afanasyev marked 2 inline comments as done.

Addressed comments

This revision is now accepted and ready to land.Aug 6 2020, 10:35 AM
This revision was landed with ongoing or failed builds.Aug 6 2020, 12:10 PM
This revision was automatically updated to reflect the committed changes.