This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/CodeGen/
-
CodeGen/
-
CGLoopInfo.cpp
-
test/CodeGenCXX/
-
CodeGenCXX/
-
pragma-loop-predicate.cpp

Differential D65776

[Clang] Pragma vectorize_predicate implies vectorize
ClosedPublic

Authored by SjoerdMeijer on Aug 5 2019, 2:19 PM.

Download Raw Diff

Details

Reviewers

Meinersbur
hsaito
fhahn
hfinkel
reames

Commits

rG535efab2e533: [Clang] Pragma vectorize_predicate implies vectorize
rC368970: [Clang] Pragma vectorize_predicate implies vectorize
rL368970: [Clang] Pragma vectorize_predicate implies vectorize

Summary

New pragma "vectorize_predicate(enable)" now implies "vectorize(enable)",
and it is ignored when vectorization is disabled with e.g.
"vectorize(disable) vectorize_predicate(enable)".

Diff Detail

Event Timeline

SjoerdMeijer created this revision.Aug 5 2019, 2:19 PM

Forgot that this requires a doc change too. Will add that once we're happy with the proposed behaviour.

Mmmh, I would have expected this to work the same way as vectorize_width. According to the docs:

The following example implicitly enables vectorization and interleaving by specifying a vector width and interleaving count:
#pragma clang loop vectorize_width(2) interleave_count(2)
for(...) {
...
}

However, vectorize_width does not automatically set llvm.loop.vectorize.enable. Neither does llvm.loop.vectorize.width > 1 imply LoopVectorizeHints::getForce(). At some places they are checked together, such as LoopVectorizeHints::allowReordering(). Other places, notably LoopVectorizationCostModel::selectVectorizationFactor(), only queries getForce(). That is, vectorize_width(2) does not implicitly force vectorization in practice.

Thanks for pointing this all out!

I am not entirely sure yet what to think about all this as I am new to the loop pragma business, but I think it looks inconsistent to me!

I think I find allowReordering() a little bit ugly, because it is also checking getWidth() > 1, and I am probably expecting the allowReordering decision to be based on 1 thing (the Force). But in the current implementation that probably makes sense though if vectorizer_width doesn't set vectorizer.enable (again, which is not what I would expect). Thus, it makes sense to me that LoopVectorizationCostModel::selectVectorizationFactor() only checks getForce()

Mmmh, I would have expected this to work the same way as vectorize_width.

So I don't see at this moment how it could work in the same way. And by working in the same way I think you mean adding vectorize_predicate checks to different places where vectorize_width is also checked, but that doesn't sound ideal to me.

Could a conclusion be that I was lucky? Lucky, because without all this knowledge that I have now, this patch looks to do what we would expect?

In D65776#1615834, @Meinersbur wrote:

Mmmh, I would have expected this to work the same way as vectorize_width. According to the docs:

The following example implicitly enables vectorization and interleaving by specifying a vector width and interleaving count:
#pragma clang loop vectorize_width(2) interleave_count(2)
for(...) {
...
}

However, vectorize_width does not automatically set llvm.loop.vectorize.enable. Neither does llvm.loop.vectorize.width > 1 imply LoopVectorizeHints::getForce(). At some places they are checked together, such as LoopVectorizeHints::allowReordering(). Other places, notably LoopVectorizationCostModel::selectVectorizationFactor(), only queries getForce(). That is, vectorize_width(2) does not implicitly force vectorization in practice.

The current behavior with respect to vectorize_width seems a bit counterintuitive. The docs say vectorize_width implicitly enables vectorization, I would expect it to behave as if vectorize(enable) was also passed. There's also a PR about that mismatch: https://bugs.llvm.org/show_bug.cgi?id=27643

IMO it would make sense to have the more specific pragmas imply vectorize(enable) here (or update the docs accordingly).

Hi Florian, thanks for your input!

IMO it would make sense to have the more specific pragmas imply vectorize(enable) here (or update the docs accordingly).

Yep, fully agree with that, as I also wrote in my previous comment. And thanks for digging up that PR. I think that supports this case and that the proposed behaviour in this patch is what one would expect. If we agree on that, I wouldn't mind fixing that PR too as that seems low hanging-fruit to me.

And to be honest, at the moment I have no clue what the meaning is, or could/should be, when a more specific pragma like vector_width or vector_predicate is set, without also setting/implying vectorize(enable). I need to look into that...

Looking at the similar situation of unroll(enable)/unroll_count(4), unroll_count also does not set llvm.loop.unroll.enable, but it is handled by the LoopUnroll pass itself:

bool ExplicitUnroll = PragmaCount > 0 || PragmaFullUnroll ||
                      PragmaEnableUnroll || UserUnrollCount;

(LoopUnrollPass.cpp line 770f)

I do not know whether/how "setting a transformation option implicitly enables the transformation" should be implemented, maybe we should discuss this. It is currently inconsistent. Also consider that non-Clang frontends and .ll files in the wild might also expect a specific behavior.

I do not know whether/how "setting a transformation option implicitly enables the transformation" should be implemented, maybe we should discuss this.

Yep, agreed. I've sent a mail to the dev list:
http://lists.llvm.org/pipermail/cfe-dev/2019-August/063054.html

thanks @Meinersbur and @fhahn for the discussions on this, also on the dev list.
It looks like we are happy with this behaviour?

LGTM, thanks! Please wait a few days with committing, in case any additional comments come in via the mailing list.

This revision is now accepted and ready to land.Aug 13 2019, 12:30 PM

Thanks for your help! And I will wait a few days.

After this, I will look at that PR, will have a look at diagnostics, and then at the LLVM side of things.

Meinersbur accepted this revision.Aug 13 2019, 7:43 PM

Closed by commit rL368970: [Clang] Pragma vectorize_predicate implies vectorize (authored by SjoerdMeijer). · Explain WhyAug 14 2019, 11:23 PM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2019, 11:23 PM

Herald added a subscriber: llvm-commits. · View Herald Transcript

rscottmanley added a subscriber: rscottmanley.Sep 10 2019, 7:52 AM

Revision Contents

Path

Size

clang/

lib/

CodeGen/

CGLoopInfo.cpp

21 lines

test/

CodeGenCXX/

pragma-loop-predicate.cpp

41 lines

Diff 213462

clang/lib/CodeGen/CGLoopInfo.cpp

Show First 20 Lines • Show All 247 Lines • ▼ Show 20 Lines	MDNode *Followup = createUnrollAndJamMetadata(Attrs, FollowupLoopProperties,
FollowupHasTransforms);		FollowupHasTransforms);

SmallVector<Metadata *, 4> Args;		SmallVector<Metadata *, 4> Args;
TempMDTuple TempNode = MDNode::getTemporary(Ctx, None);		TempMDTuple TempNode = MDNode::getTemporary(Ctx, None);
Args.push_back(TempNode.get());		Args.push_back(TempNode.get());
Args.append(LoopProperties.begin(), LoopProperties.end());		Args.append(LoopProperties.begin(), LoopProperties.end());

// Setting vectorize.predicate		// Setting vectorize.predicate
if (Attrs.VectorizePredicateEnable != LoopAttributes::Unspecified) {		bool IsVectorPredicateEnabled = false;
		if (Attrs.VectorizePredicateEnable != LoopAttributes::Unspecified &&
		Attrs.VectorizeEnable != LoopAttributes::Disable &&
		Attrs.VectorizeWidth < 1) {

		IsVectorPredicateEnabled =
		(Attrs.VectorizePredicateEnable == LoopAttributes::Enable);

Metadata *Vals[] = {		Metadata *Vals[] = {
MDString::get(Ctx, "llvm.loop.vectorize.predicate.enable"),		MDString::get(Ctx, "llvm.loop.vectorize.predicate.enable"),
ConstantAsMetadata::get(ConstantInt::get(		ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt1Ty(Ctx),
llvm::Type::getInt1Ty(Ctx),		IsVectorPredicateEnabled))};
(Attrs.VectorizePredicateEnable == LoopAttributes::Enable)))};
Args.push_back(MDNode::get(Ctx, Vals));		Args.push_back(MDNode::get(Ctx, Vals));
}		}

// Setting vectorize.width		// Setting vectorize.width
if (Attrs.VectorizeWidth > 0) {		if (Attrs.VectorizeWidth > 0) {
Metadata *Vals[] = {		Metadata *Vals[] = {
MDString::get(Ctx, "llvm.loop.vectorize.width"),		MDString::get(Ctx, "llvm.loop.vectorize.width"),
ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt32Ty(Ctx),		ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt32Ty(Ctx),
Attrs.VectorizeWidth))};		Attrs.VectorizeWidth))};
Args.push_back(MDNode::get(Ctx, Vals));		Args.push_back(MDNode::get(Ctx, Vals));
}		}

// Setting interleave.count		// Setting interleave.count
if (Attrs.InterleaveCount > 0) {		if (Attrs.InterleaveCount > 0) {
Metadata *Vals[] = {		Metadata *Vals[] = {
MDString::get(Ctx, "llvm.loop.interleave.count"),		MDString::get(Ctx, "llvm.loop.interleave.count"),
ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt32Ty(Ctx),		ConstantAsMetadata::get(ConstantInt::get(llvm::Type::getInt32Ty(Ctx),
Attrs.InterleaveCount))};		Attrs.InterleaveCount))};
Args.push_back(MDNode::get(Ctx, Vals));		Args.push_back(MDNode::get(Ctx, Vals));
}		}

// Setting vectorize.enable		// Setting vectorize.enable
if (Attrs.VectorizeEnable != LoopAttributes::Unspecified) {		if (Attrs.VectorizeEnable != LoopAttributes::Unspecified \|\|
		IsVectorPredicateEnabled) {
Metadata *Vals[] = {		Metadata *Vals[] = {
MDString::get(Ctx, "llvm.loop.vectorize.enable"),		MDString::get(Ctx, "llvm.loop.vectorize.enable"),
ConstantAsMetadata::get(ConstantInt::get(		ConstantAsMetadata::get(ConstantInt::get(
llvm::Type::getInt1Ty(Ctx),		llvm::Type::getInt1Ty(Ctx),
(Attrs.VectorizeEnable == LoopAttributes::Enable)))};		IsVectorPredicateEnabled
		? true
		: (Attrs.VectorizeEnable == LoopAttributes::Enable)))};
Args.push_back(MDNode::get(Ctx, Vals));		Args.push_back(MDNode::get(Ctx, Vals));
}		}

if (FollowupHasTransforms)		if (FollowupHasTransforms)
Args.push_back(MDNode::get(		Args.push_back(MDNode::get(
Ctx,		Ctx,
{MDString::get(Ctx, "llvm.loop.vectorize.followup_all"), Followup}));		{MDString::get(Ctx, "llvm.loop.vectorize.followup_all"), Followup}));

▲ Show 20 Lines • Show All 488 Lines • Show Last 20 Lines

clang/test/CodeGenCXX/pragma-loop-predicate.cpp

	Show All 21 Lines
	// CHECK-LABEL: @{{.}}test2{{.}}(			// CHECK-LABEL: @{{.}}test2{{.}}(
	// CHECK: br label {{.}}, !llvm.loop ![[LOOP2:.]]			// CHECK: br label {{.}}, !llvm.loop ![[LOOP2:.]]

	#pragma clang loop vectorize(enable) vectorize_predicate(disable)			#pragma clang loop vectorize(enable) vectorize_predicate(disable)
	for (int i = 0; i < Length; i++)			for (int i = 0; i < Length; i++)
	List[i] = i * 2;			List[i] = i * 2;
	}			}

				// vectorize_predicate(enable) implies vectorize(enable)
				void test3(int *List, int Length) {
				// CHECK-LABEL: @{{.}}test3{{.}}(
				// CHECK: br label {{.}}, !llvm.loop ![[LOOP3:.]]

				#pragma clang loop vectorize_predicate(enable)
				for (int i = 0; i < Length; i++)
				List[i] = i * 2;
				}

				// Check that disabling vectorization means a vectorization width of 1, and
				// also that vectorization_predicate isn't enabled.
				void test4(int *List, int Length) {
				// CHECK-LABEL: @{{.}}test4{{.}}(
				// CHECK: br label {{.}}, !llvm.loop ![[LOOP4:.]]

				#pragma clang loop vectorize(disable)
				for (int i = 0; i < Length; i++)
				List[i] = i * 2;
				}

				// Check that vectorize and vectorize_predicate are disabled.
				void test5(int *List, int Length) {
				// CHECK-LABEL: @{{.}}test5{{.}}(
				// CHECK: br label {{.}}, !llvm.loop ![[LOOP5:.]]

				#pragma clang loop vectorize(disable) vectorize_predicate(enable)
				for (int i = 0; i < Length; i++)
				List[i] = i * 2;
				}


	// CHECK: ![[LOOP0]] = distinct !{![[LOOP0]], !3}			// CHECK: ![[LOOP0]] = distinct !{![[LOOP0]], !3}
	// CHECK-NEXT: !3 = !{!"llvm.loop.vectorize.enable", i1 true}			// CHECK-NEXT: !3 = !{!"llvm.loop.vectorize.enable", i1 true}

	// CHECK-NEXT: ![[LOOP1]] = distinct !{![[LOOP1]], !5, !3}			// CHECK-NEXT: ![[LOOP1]] = distinct !{![[LOOP1]], !5, !3}
	// CHECK-NEXT: !5 = !{!"llvm.loop.vectorize.predicate.enable", i1 true}			// CHECK-NEXT: !5 = !{!"llvm.loop.vectorize.predicate.enable", i1 true}

	// CHECK-NEXT: ![[LOOP2]] = distinct !{![[LOOP2]], !7, !3}			// CHECK-NEXT: ![[LOOP2]] = distinct !{![[LOOP2]], !7, !3}
	// CHECK-NEXT: !7 = !{!"llvm.loop.vectorize.predicate.enable", i1 false}			// CHECK-NEXT: !7 = !{!"llvm.loop.vectorize.predicate.enable", i1 false}

				// CHECK-NEXT: ![[LOOP3]] = distinct !{![[LOOP3]], !5, !3}

				// CHECK-NEXT: ![[LOOP4]] = distinct !{![[LOOP4]], !10}
				// CHECK-NEXT: !10 = !{!"llvm.loop.vectorize.width", i32 1}

				// CHECK-NEXT: ![[LOOP5]] = distinct !{![[LOOP5]], !10}