This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix order of `insertelement`/`insertvalue` seed operands
ClosedPublic

Authored by anton-afanasyev on Jul 14 2020, 8:18 AM.

Details

Summary

This patch takes the indices operands of insertelement/insertvalue
into account while generation of seed elements for findBuildAggregate().
This function has kept the original order of inserts before.
Also this patch optimizes findBuildAggregate() preventing it from
redundant temporary vector allocations and its multiple reversing.

Fixes llvm.org/pr44067

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2020, 8:18 AM

Add test?

Sure, thanks!

ABataev added inline comments.Jul 15 2020, 5:46 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7078

Why need a parameter here? Why always return 2?

7087–7088

Just return 0;, no need for else

7089–7099

No need for else.

7094

Use actual type and i->I

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

anton-afanasyev marked 7 inline comments as done.

Update tests

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7078

Fixed, it was temporary dummy.

7087–7088

Fixed, thanks.

7094

Fixed, thanks.

Add test?

I assume this is actually NFC, it just tries to optimize the function findBuildAggregate.

No, actually it's not NFC (though instcombine has done the same work previously).

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7084

No need for else here, just unconditional block, since then part is terminated with return

7118

No need for else here

7120

Remove else here

7131

Enclose in braces

7161

.empty()

7363

Very strange check, why we need it? Why did you change the previous one?

7378

Same question here

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209

A regression?

ABataev added inline comments.Jul 16 2020, 7:59 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296

Regression?

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–134

Regression?

anton-afanasyev marked 18 inline comments as done.Jul 24 2020, 1:22 PM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7084

Thanks, fixed

7118

Fixed

7120

Removed

7131

Enclosed

7161

Changed

7363

The way the findBuildAggregate() works has changed: it outputs vector of Values, ordered by inserts indices, so it possible this vector contains gaps. But I've changed findBuildAggregate() again, splitting it to recursive and non-recursive parts, and making this check inside the latter one.

llvm/test/Transforms/SLPVectorizer/AArch64/gather-root.ll
206–209

Fixed by allowing to findBuildAggregate() support of partial vector filling.

llvm/test/Transforms/SLPVectorizer/X86/alternate-cast.ll
293–296

Fixed

llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll
129–134

Yes, that's regression, but that's pay for the optimization by reducing shufflevectors at the more common cases.

anton-afanasyev marked 9 inline comments as done.

Update. Splitted findBuildAggregate() to recursive and non-recursive parts.

ABataev added inline comments.Jul 28 2020, 12:15 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7083

returns true for the function, that shall return unsigned

7091

Type *

7115

Just cast

7132

What about VectorType here if it is allowed in the previous function?

7152

cast<Instruction>

7155–7158

No need for else here

7162–7165

Better to transform it a condition of the loop and return false after the end of the loop

7205–7213

Just llvm::erase_value(<container>, nullptr)

7216–7217

Tabs

7318

Why you don't need this?

anton-afanasyev marked 10 inline comments as done.Jul 29 2020, 9:00 AM

Addressed @ABataev's comments.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7083

Thanks, I've forgotten to change return type.

7091

Fixed

7115

Fixed

7132

No need to do this: insertvalue's indices can never point to vector elements, only to structure and array ones. Function getAggregateSize() doesn's use InsertValueInst::indices() and computes aggregate size by getNumElements() multiplication, whereas getOperandIndex() just returns insertvalue's operand offset. The vector can be addressed by insertvalue only as final atomic element of structure. Function findBuildAggregate_rec() processes this element separately using OperandIndex as base offset.

7152

Fixed

7155–7158

No, this else matters here: it embraces the block which is executed only if InsertedOperand is not insert. There is the second nested if inside the first one.

7162–7165

Transformed.

7205–7213

Thanks, changed. Used llvm::erase_if(<container>, <predicate>), didn't found llvm::erase_value().

7216–7217

Hmm, that's actually spaces, not tabs.

7318

This is checked inside findBuildAggregate() itself now.

anton-afanasyev marked 10 inline comments as done.

Updated

Ping. Rebase.

ABataev added inline comments.Aug 5 2020, 5:10 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7078

Maybe better to return Optional<unsigned> here rather than return bool and have AggregateSize return param/

7110–7111

Same, Optional<unsigned>

anton-afanasyev marked 2 inline comments as done.

Addressed comments

anton-afanasyev added a comment.EditedAug 5 2020, 6:40 AM

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7078

Sounds good, fixed

7110–7111

Ok, done

A couple of minor style comments - trivial scope reductions

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7081

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

7085

unsigned AggregateSize = 1;

7112

if (auto *IE = dyn_cast<InsertElementInst>(InsertInst))

anton-afanasyev marked 3 inline comments as done.

Addressed

Fixed

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7081

Sure, fixed.

7085

Fixed.

7112

Fixed.

RKSimon added inline comments.Aug 6 2020, 6:05 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7080

VectorType::getNumElements() will be going away soon - you need to replace this with cast<FixedVectorType>(IE->getType())->getNumElements()

7095

dyn_cast<FixedVectorType>

7111

cast<FixedVectorType>(IE->getType())

anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 8:08 AM
anton-afanasyev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7080

Replaced

7095

Fixed

7111

Fixed

anton-afanasyev marked 3 inline comments as done.

Addressed comments

Thanks, no more comments from me - the reverse_hadd_v4f32 fix will be very useful to x86 backend hadd formation!

@ABataev any more comments?

ABataev added inline comments.Aug 6 2020, 9:24 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
351–355

This change also does not look like caused by the patch

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Drop this change or commit it separately as NFC.

RKSimon added inline comments.Aug 6 2020, 10:03 AM
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
351–355
anton-afanasyev marked 3 inline comments as done.Aug 6 2020, 10:33 AM
anton-afanasyev added inline comments.
llvm/test/Transforms/SLPVectorizer/X86/alternate-int.ll
351–355

Thanks, dropped this change.

llvm/test/Transforms/SLPVectorizer/X86/load-merge.ll
59

Dropped.

anton-afanasyev marked 2 inline comments as done.

Addressed comments

This revision is now accepted and ready to land.Aug 6 2020, 10:35 AM
This revision was landed with ongoing or failed builds.Aug 6 2020, 12:10 PM
This revision was automatically updated to reflect the committed changes.