This is an archive of the discontinued LLVM Phabricator instance.

Differential D26962

[LoadStoreVectorizer] Split the chain if the prefix is empty
Abandoned, Public

Authored by volkan on Nov 22 2016, 4:03 AM.

Details

Reviewers

asbirlea
tstellarAMD
jlebar
arsenm

Summary

Currently, the vectorizer discards all of the instructions in the chain if
the vectorizable prefix is empty. Instead, we can try to vectorize smaller
chunks in order to increase the number of vector instructions generated.
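
As a rough illustration of the split described above, here is a minimal sketch outside of LLVM. `Chain` is a plain `std::vector<int>` standing in for `ArrayRef<Instruction *>`, and `splitOnEmptyPrefix` is a hypothetical helper name, not something in the patch; it models only the empty-prefix branch, under the assumption that the caller has already rejected `VF < 2`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for ArrayRef<Instruction *>; the ints play the role of
// instructions. This is an assumption for illustration, not an LLVM type.
using Chain = std::vector<int>;

// Hypothetical helper modeling the empty-prefix branch of the patch.
// Returns false when the chain is too short to split (size <= VF);
// otherwise fills Left with the first VF elements and Right with the rest,
// mirroring Chain.slice(0, VF) and Chain.slice(VF).
bool splitOnEmptyPrefix(const Chain &C, std::size_t VF, Chain &Left,
                        Chain &Right) {
  assert(VF >= 2 && "the caller is assumed to have rejected VF < 2 already");
  if (C.size() <= VF)
    return false;
  const auto Mid = C.begin() + static_cast<std::ptrdiff_t>(VF);
  Left.assign(C.begin(), Mid);
  Right.assign(Mid, C.end());
  return true;
}
```

With a 10-element chain and `VF = 4`, the sketch yields a 4-element `Left` and a 6-element `Right`; each slice would then be retried on its own.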

Event Timeline

volkan updated this revision to Diff 78847. Nov 22 2016, 4:03 AM

volkan retitled this revision to [LoadStoreVectorizer] Split the chain if the prefix is empty.

volkan updated this object.

volkan added reviewers: jlebar, asbirlea, arsenm.

volkan added a subscriber: llvm-commits.

Herald added subscribers: wdng, mzolotukhin. Nov 22 2016, 4:03 AM

Could you please add some test cases?

Added a test case.

Herald added a reviewer: tstellarAMD. Nov 22 2016, 6:10 AM

Herald added a subscriber: nhaehnle.

Updated the test case.

Herald edited edge metadata. Nov 22 2016, 8:12 AM

jlebar added inline comments. Nov 22 2016, 9:55 AM

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
741	"less than or equal to VF" is redundant with `<= VF` in the code. Can we say "is too small" or "is smaller than the vector width"?
749	Don't we know a priori that Left will not vectorize, because its vectorizable prefix equals Chain's vectorizable prefix? Which leads to another question: Why do we chop off VF elements rather than just one?
750	These return bools, so can we use `||`?
test/Transforms/LoadStoreVectorizer/AMDGPU/empty-prefix.ll
4	(On Diff #78877)	Can we write this comment in the form of "Check that we X", something that does not rely on implementation details of the LSV? That is, can we avoid the term "vectorizable prefix", or, if we must use it, can we define it here?
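
One C++ detail is relevant to the `||` suggestion on line 750: `|` evaluates both operands, while `||` skips the right-hand call once the left-hand one returns true. The sketch below shows only that language behavior. `tryVectorize` is a hypothetical stand-in that records each call; it is not the real `vectorizeStoreChain`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vectorize call: records that it ran and
// returns the given result.
inline bool tryVectorize(int Id, bool Result, std::vector<int> &Calls) {
  Calls.push_back(Id);
  return Result;
}

// Bitwise OR: both operands are always evaluated.
inline std::size_t callsMadeWithBitwiseOr() {
  std::vector<int> Calls;
  bool Changed = tryVectorize(0, true, Calls) | tryVectorize(1, true, Calls);
  (void)Changed;
  return Calls.size();
}

// Logical OR: the second call is skipped when the first returns true.
inline std::size_t callsMadeWithLogicalOr() {
  std::vector<int> Calls;
  bool Changed = tryVectorize(0, true, Calls) || tryVectorize(1, true, Calls);
  (void)Changed;
  return Calls.size();
}
```

In this toy setup the bitwise form makes two calls and the logical form makes one, so switching operators would change whether the right-hand slice is attempted after the left-hand one succeeds.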

The problem you found is real, but this is not the right solution for it.
The issue is in "getVectorizablePrefix". When a load that aliases the store is found, it gives up trying to find a prefix. In reality, all stores that precede the aliasing load (you can view this as a memory barrier) could be safely vectorized.
I'll work on a patch to fix this.

Side note: The test case you added is fairly large. The problem can be showcased with a much smaller example along these lines: 4 stores, 4 loads, 4 stores, with the 8 stores forming a chain. "getVectorizablePrefix" should not give up after the first store, but allow all 4 stores that precede the aliasing load to be vectorized.
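
To make the small example concrete, here is a toy model of the behavior asked for above. It is not the LLVM implementation. The encoding is an assumption for illustration: a basic block is a string in which `S` is a store from the chain and `L` is a load that aliases it, and `storesBeforeFirstAliasingLoad` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Counts the chain stores that precede the first aliasing load, treating
// that load as a barrier. 'S' = store in the chain, 'L' = aliasing load;
// any other character is ignored. The encoding is an assumption of this
// sketch.
inline std::size_t storesBeforeFirstAliasingLoad(const std::string &Block) {
  std::size_t Count = 0;
  for (char Op : Block) {
    if (Op == 'L')
      break; // Stores after the barrier are not part of the safe prefix.
    if (Op == 'S')
      ++Count;
  }
  return Count;
}
```

For the block `"SSSSLLLLSSSS"` (4 stores, 4 loads, 4 stores) the model reports a safe prefix of 4 stores rather than giving up.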

asbirlea mentioned this in D27008: [LoadStoreVectorizer] Enable vectorization of stores in the presence of an aliasing load. Nov 22 2016, 3:32 PM

volkan abandoned this revision. Nov 23 2016, 3:31 AM

asbirlea mentioned this in rL287781: [LoadStoreVectorizer] Enable vectorization of stores in the presence of an…. Nov 23 2016, 9:53 AM

Revision Contents

Path: lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
Size: 30 lines

Diff 78847

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp

bool Vectorizer::vectorizeStoreChain(

   if (!isPowerOf2_32(Sz) || VF < 2 || ChainSize < 2) {
     InstructionsProcessed->insert(Chain.begin(), Chain.end());
     return false;
   }

   ArrayRef<Instruction *> NewChain = getVectorizablePrefix(Chain);
   if (NewChain.empty()) {
-    // No vectorization possible.
-    InstructionsProcessed->insert(Chain.begin(), Chain.end());
-    return false;
+    // No vectorization possible if the number of instructions
+    // is less than or equal to VF.
+    if (ChainSize <= VF) {
+      InstructionsProcessed->insert(Chain.begin(), Chain.end());
+      return false;
+    }
+    // Split the chain and try each slice seperately in order
+    // to increase the number of vector instructions generated.
+    ArrayRef<Instruction *> Left = Chain.slice(0, VF);
+    ArrayRef<Instruction *> Right = Chain.slice(VF);
+    return vectorizeStoreChain(Left, InstructionsProcessed) |
+           vectorizeStoreChain(Right, InstructionsProcessed);
   }
   if (NewChain.size() == 1) {
     // Failed after the first instruction. Discard it and try the smaller chain.
     InstructionsProcessed->insert(NewChain.front());
     return false;
   }

   // Update Chain to the valid vectorizable subchain.
   Chain = NewChain;

bool Vectorizer::vectorizeLoadChain(

   if (!isPowerOf2_32(Sz) || VF < 2 || ChainSize < 2) {
     InstructionsProcessed->insert(Chain.begin(), Chain.end());
     return false;
   }

   ArrayRef<Instruction *> NewChain = getVectorizablePrefix(Chain);
   if (NewChain.empty()) {
-    // No vectorization possible.
-    InstructionsProcessed->insert(Chain.begin(), Chain.end());
-    return false;
+    // No vectorization possible if the number of instructions
+    // is less than or equal to VF.
+    if (ChainSize <= VF) {
+      InstructionsProcessed->insert(Chain.begin(), Chain.end());
+      return false;
+    }
+    // Split the chain and try each slice seperately in order
+    // to increase the number of vector instructions generated.
+    ArrayRef<Instruction *> Left = Chain.slice(0, VF);
+    ArrayRef<Instruction *> Right = Chain.slice(VF);
+    return vectorizeLoadChain(Left, InstructionsProcessed) |
+           vectorizeLoadChain(Right, InstructionsProcessed);
   }
   if (NewChain.size() == 1) {
     // Failed after the first instruction. Discard it and try the smaller chain.
     InstructionsProcessed->insert(NewChain.front());
     return false;
   }

   // Update Chain to the valid vectorizable subchain.
   Chain = NewChain;