Details

Reviewers

jmorse
ABataev

Commits

rGac9b9e3aad6b: [SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize

Summary

We don't want the existence of debug instructions to affect codegen, so we now
ignore debug instructions and other "isAssumeLikeIntrinsics" in the
"extend schedule region" search loop in
BoUpSLP::BlockScheduling::extendSchedulingRegion.
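
As a rough illustration of the behaviour described in the summary, here is a minimal sketch that models the lock-step up/down search on a plain `std::vector` instead of a real `BasicBlock`. The `Kind` enum, `isIgnored`, and `canExtendRegion` are hypothetical names invented for this example; only the overall loop shape (count one budget unit per step, skip ignored instructions with `std::find_if_not`) follows the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Toy stand-in for an LLVM instruction; only the kind matters here.
enum class Kind { Real, DbgValue, Assume };

// Hypothetical predicate playing the role of an "assume-like" check.
bool isIgnored(Kind K) { return K != Kind::Real; }

// Returns true if the instruction at index Target can be reached from the
// region [Start, End) without exceeding Limit search steps. When SkipIgnored
// is false, every instruction is charged against the budget.
bool canExtendRegion(const std::vector<Kind> &BB, std::size_t Start,
                     std::size_t End, std::size_t Target, int Limit,
                     bool SkipIgnored) {
  auto Skip = [SkipIgnored](Kind K) { return SkipIgnored && isIgnored(K); };
  // Up starts at the element just above Start; Down starts at End.
  auto Up = std::make_reverse_iterator(BB.begin() + Start);
  auto Down = BB.begin() + End;
  Up = std::find_if_not(Up, BB.rend(), Skip);
  Down = std::find_if_not(Down, BB.end(), Skip);
  const Kind *T = &BB[Target];
  int Size = 0;
  while (Up != BB.rend() && Down != BB.end() && &*Up != T && &*Down != T) {
    if (++Size > Limit)
      return false; // Budget exceeded before the target was found.
    ++Up;
    ++Down;
    Up = std::find_if_not(Up, BB.rend(), Skip);
    Down = std::find_if_not(Down, BB.end(), Skip);
  }
  return true;
}
```

With a block that places four debug entries between the region and the target, a budget of 3 is enough only when the ignored entries are skipped, which is the debug-invariance property the patch is after.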

Diff Detail

Event Timeline

uabelho created this revision. Jun 8 2023, 6:20 AM

Herald added a project: Restricted Project. Jun 8 2023, 6:20 AM

Herald added subscribers: vporpo, bjope, hiraditya.

uabelho requested review of this revision. Jun 8 2023, 6:20 AM

Herald added a project: Restricted Project. Jun 8 2023, 6:20 AM

Herald added subscribers: llvm-commits, pcwang-thead.

@jmorse : I saw you did a bunch of nice debug info cleanups in https://reviews.llvm.org/D151419 but I don't think this problem is fixed there?

I stumbled upon this problem in downstream testing where debug info made us hit the ScheduleRegionSize limit thing.

What is the difference between the results of the runs? It looks like it gets vectorized in all cases. Also, please clean up the metadata and attributes.

bjope added inline comments. Jun 8 2023, 6:50 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget.ll
8 ↗	(On Diff #529575)	So this test case is kind of "broken" already on trunk? (Looks like commit 352c46e70716061e99cae2009daddbfc78380fda changed this test in a way so that the first loads are vectorized even with a budget at 16.)

Harbormaster completed remote builds in B237481: Diff 529575. Jun 8 2023, 6:54 AM

Many thanks for the patch, it looks good to me although I'm not very familiar with the vectorisers -- if no-one else reviews deeply then I can give it a shot later.

Would it be simpler to replace the increment lines with calls to getNextNonDebugInstruction rather than add a new predicate / set of calls? This has the benefit of being slightly smaller, and will make cleanup easier once we've suppressed debug intrinsics entirely.

@jmorse : I saw you did a bunch of nice debug info cleanups in https://reviews.llvm.org/D151419 but I don't think this problem is fixed there?

Those are just the least contentious changes, there's more at [0], and probably a few other patches too. Although it looks like I didn't detect this particular defect at all!

[0] https://github.com/jmorse/llvm-project/commit/2d3354323ae83df2b00dc327645435fae668f94c

ABataev added inline comments. Jun 8 2023, 7:51 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget.ll
8 ↗	(On Diff #529575)	Not broken, just changed. You'd better add a new test with budget limits and debug info.

bjope added inline comments. Jun 8 2023, 8:16 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget.ll
8 ↗	(On Diff #529575)	Just thinking that this comment, as well as the later comment about " ; Don't vectorize these loads.", seem to explain the expected outcome of the test. But then the CHECK:s seem to validate something else. That is confusing. But maybe uabelho can just remove the RUN line with budget=16. One could just use budget=18 instead, as that would also show the debug-variance problem (e.g. by pre-committing such a slightly modified test).

ABataev added inline comments. Jun 8 2023, 8:30 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget.ll
8 ↗	(On Diff #529575)	The comments need to be removed in a separate patch. Or the budget limit adjusted, whatever.

Created a new testcase instead of fiddling with the existing one.
Minor cleanup in metadata, removed dbg.value attributes.

In D152441#4405841, @jmorse wrote:

Many thanks for the patch, it looks good to me although I'm not very familiar with the vectorisers -- if no-one else reviews deeply then I can give it a shot later.

Would it be simpler to replace the increment lines with calls to getNextNonDebugInstruction rather than add a new predicate / set of calls? This has the benefit of being slightly smaller, and will make cleanup easier once we've suppressed debug intrinsics entirely.

I don't know. The searches use both forward and reverse iterators, so I thought changing to getNextNonDebugInstruction/getPrevNonDebugInstruction would make the change larger, since I guess I'd have to get rid of the reverse iterator then (?). Since I'm not familiar with this code, I tried keeping the change as small as possible to not risk messing things up.

But I can try to go that route again if we prefer that.
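
To make the iterator-direction point concrete, the following is a small sketch (not LLVM code) showing that one `std::find_if_not` helper can serve both a forward and a reverse iterator, which is why the patch could keep the existing `reverse_iterator`. The `Inst` struct and `skipDebug` helper are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy instruction: a name plus a flag marking it as debug-only.
struct Inst {
  std::string Name;
  bool IsDebug;
};

// Advances First past any debug-only entries. Because it is a template over
// the iterator type, the same code handles forward and reverse traversal.
template <typename It> It skipDebug(It First, It Last) {
  return std::find_if_not(First, Last,
                          [](const Inst &I) { return I.IsDebug; });
}
```

Called with `std::next(BB.begin())` it walks downwards, and with `std::next(BB.rbegin())` it walks upwards; in both cases it stops at the first non-debug entry or at the end of the range.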

Harbormaster completed remote builds in B237664: Diff 529826. Jun 8 2023, 11:30 PM

But I can try to go that route again if we prefer that.

Not a strong opinion, it works just as well as it is.

ABataev added inline comments. Jun 10 2023, 5:25 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget_debug_info.ll
179	Try to simplify the metadata; I rather doubt you need all the debug info specified here.

Rebase and remove as much metadata as I can without getting complaints from e.g. update_test_checks.py.

uabelho marked an inline comment as done. Jun 12 2023, 4:48 AM

ABataev added inline comments. Jun 12 2023, 4:57 AM

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget_debug_info.ll
1	Create a separate NFC patch with this test only.

Precommit testcase in https://reviews.llvm.org/D152705 so this patch shows how the testcase is improved.

Harbormaster completed remote builds in B238163: Diff 530474. Jun 12 2023, 5:30 AM

uabelho added a parent revision: D152705: [test][SLPVectorizer] Precommit testcase showing debug info affects codegen. Jun 12 2023, 5:31 AM

Rebase

Harbormaster completed remote builds in B238427: Diff 530821. Jun 13 2023, 3:39 AM

ABataev added inline comments. Jun 13 2023, 7:31 AM

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
11460	I think it is worth extending it not only to debug intrinsics, but to all ephemeral instructions, like lifetime, assume, etc.

uabelho added inline comments. Jun 14 2023, 1:33 AM

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
11460	I can do that, e.g. to check isAssumeLikeIntrinsic(). However, then the patch is starting to drift away from just fixing the "dbg info affects codegen" problem I was aiming at so then we start fixing/changing something else as well. Do you still want that in the same patch?

ABataev added inline comments. Jun 14 2023, 2:45 AM

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
11460	Yes, the change is small, you can do it here. And rename the title.

Ignore isAssumeLikeIntrinsic and not just debug intrinsics.
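
The effect of widening the predicate can be sketched with a toy model. Everything below is an assumption made for illustration: the `Kind` enum, `isDebugOnly`, `isAssumeLikeToy`, and `countCharged` are invented names, and the toy predicate covers only the kinds mentioned in this thread (debug, assume, lifetime), not the exact set that LLVM's `isAssumeLikeIntrinsic()` recognises.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction kinds; real LLVM IR has many more.
enum class Kind { Load, Store, Call, DbgValue, Assume, LifetimeStart, LifetimeEnd };

// Narrow predicate: only debug intrinsics are ignored.
bool isDebugOnly(Kind K) { return K == Kind::DbgValue; }

// Wider toy predicate: also ignores assume and lifetime markers.
bool isAssumeLikeToy(Kind K) {
  switch (K) {
  case Kind::DbgValue:
  case Kind::Assume:
  case Kind::LifetimeStart:
  case Kind::LifetimeEnd:
    return true;
  default:
    return false;
  }
}

// Counts the instructions that would be charged against a budget, i.e. those
// the given predicate does not ignore.
template <typename Pred>
std::size_t countCharged(const std::vector<Kind> &BB, Pred Ignore) {
  return static_cast<std::size_t>(std::count_if(
      BB.begin(), BB.end(), [&](Kind K) { return !Ignore(K); }));
}
```

For a block mixing real instructions with assume and lifetime markers, the wider predicate charges fewer instructions, so the same budget reaches further.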

uabelho marked 2 inline comments as done. Jun 14 2023, 3:46 AM

This revision is now accepted and ready to land. Jun 14 2023, 3:55 AM

This revision was landed with ongoing or failed builds. Jun 14 2023, 4:03 AM

Closed by commit rGac9b9e3aad6b: [SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize (authored by uabelho).

This revision was automatically updated to reflect the committed changes.

uabelho added a commit: rGac9b9e3aad6b: [SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize.

Harbormaster completed remote builds in B238743: Diff 531250. Jun 14 2023, 4:24 AM

Diff 530464

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

   if (!ScheduleStart) {
     [...]
     if (isOneOf(S, I) != I)
       CheckScheduleForI(I);
     assert(ScheduleEnd && "tried to vectorize a terminator?");
     LLVM_DEBUG(dbgs() << "SLP: initialize schedule region to " << *I << "\n");
     return true;
   }
   // Search up and down at the same time, because we don't know if the new
   // instruction is above or below the existing scheduling region.
+  // Ignore debug info so that's not counted against the budget. Otherwise
+  // debug info could affect codegen.
   BasicBlock::reverse_iterator UpIter =
       ++ScheduleStart->getIterator().getReverse();
   BasicBlock::reverse_iterator UpperEnd = BB->rend();
   BasicBlock::iterator DownIter = ScheduleEnd->getIterator();
   BasicBlock::iterator LowerEnd = BB->end();
+  auto IsDbgInstr = [](const Instruction &I) {
+    return isa<DbgInfoIntrinsic>(&I);
+  };
+  UpIter = std::find_if_not(UpIter, UpperEnd, IsDbgInstr);
+  DownIter = std::find_if_not(DownIter, LowerEnd, IsDbgInstr);
   while (UpIter != UpperEnd && DownIter != LowerEnd && &*UpIter != I &&
          &*DownIter != I) {
     if (++ScheduleRegionSize > ScheduleRegionSizeLimit) {
       LLVM_DEBUG(dbgs() << "SLP: exceeded schedule region size limit\n");
       return false;
     }

     ++UpIter;
     ++DownIter;
+
+    UpIter = std::find_if_not(UpIter, UpperEnd, IsDbgInstr);
+    DownIter = std::find_if_not(DownIter, LowerEnd, IsDbgInstr);
   }
   if (DownIter == LowerEnd || (UpIter != UpperEnd && &*UpIter == I)) {
     assert(I->getParent() == ScheduleStart->getParent() &&
            "Instruction is in wrong basic block.");
     initScheduleData(I, ScheduleStart, nullptr, FirstLoadStoreInRegion);
     ScheduleStart = I;
     if (isOneOf(S, I) != I)
       CheckScheduleForI(I);
     [...]

llvm/test/Transforms/SLPVectorizer/X86/schedule_budget_debug_info.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -passes=slp-vectorizer -S -slp-schedule-budget=3 -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s -check-prefix VECTOR_DBG
				; RUN: opt < %s -strip-debug -passes=slp-vectorizer -S -slp-schedule-budget=3 -mtriple=x86_64-apple-macosx10.8.0 -mcpu=corei7-avx | FileCheck %s -check-prefix VECTOR_NODBG

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.9.0"

				; Verify that we get vectorization with -slp-schedule-budget=3. We should
				; get vectorization even if there happens to be some dbg.value calls since they
				; should be ignored, to not let debug information affect the code we get.

				declare void @unknown()

				define void @test(ptr %a, ptr %b, ptr %c, ptr %d) {
				; VECTOR_DBG-LABEL: @test(
				; VECTOR_DBG-NEXT: entry:
				; VECTOR_DBG-NEXT: [[TMP0:%.*]] = load <4 x float>, ptr [[A:%.*]], align 4
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @unknown()
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3:![0-9]+]], metadata !DIExpression()), !dbg [[DBG5:![0-9]+]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: call void @llvm.dbg.value(metadata i16 1, metadata [[META3]], metadata !DIExpression()), !dbg [[DBG5]]
				; VECTOR_DBG-NEXT: store <4 x float> [[TMP0]], ptr [[B:%.*]], align 4
				; VECTOR_DBG-NEXT: [[TMP1:%.*]] = load <4 x float>, ptr [[C:%.*]], align 4
				; VECTOR_DBG-NEXT: store <4 x float> [[TMP1]], ptr [[D:%.*]], align 4
				; VECTOR_DBG-NEXT: ret void
				;
				; VECTOR_NODBG-LABEL: @test(
				; VECTOR_NODBG-NEXT: entry:
				; VECTOR_NODBG-NEXT: [[TMP0:%.*]] = load <4 x float>, ptr [[A:%.*]], align 4
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: call void @unknown()
				; VECTOR_NODBG-NEXT: store <4 x float> [[TMP0]], ptr [[B:%.*]], align 4
				; VECTOR_NODBG-NEXT: [[TMP1:%.*]] = load <4 x float>, ptr [[C:%.*]], align 4
				; VECTOR_NODBG-NEXT: store <4 x float> [[TMP1]], ptr [[D:%.*]], align 4
				; VECTOR_NODBG-NEXT: ret void
				;
				entry:
				%l0 = load float, ptr %a
				%a1 = getelementptr inbounds float, ptr %a, i64 1
				%l1 = load float, ptr %a1
				%a2 = getelementptr inbounds float, ptr %a, i64 2
				%l2 = load float, ptr %a2
				%a3 = getelementptr inbounds float, ptr %a, i64 3
				%l3 = load float, ptr %a3

				; some unrelated instructions inbetween to enlarge the scheduling region
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()
				call void @unknown()

				; The dbg.values should not affect vectorization.
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5

				store float %l0, ptr %b

				; The dbg.values should not affect vectorization.
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				call void @llvm.dbg.value(metadata i16 1, metadata !3, metadata !DIExpression()), !dbg !5
				%b1 = getelementptr inbounds float, ptr %b, i64 1
				store float %l1, ptr %b1
				%b2 = getelementptr inbounds float, ptr %b, i64 2
				store float %l2, ptr %b2
				%b3 = getelementptr inbounds float, ptr %b, i64 3
				store float %l3, ptr %b3

				%l4 = load float, ptr %c
				%c1 = getelementptr inbounds float, ptr %c, i64 1
				%l5 = load float, ptr %c1
				%c2 = getelementptr inbounds float, ptr %c, i64 2
				%l6 = load float, ptr %c2
				%c3 = getelementptr inbounds float, ptr %c, i64 3
				%l7 = load float, ptr %c3

				store float %l4, ptr %d
				%d1 = getelementptr inbounds float, ptr %d, i64 1
				store float %l5, ptr %d1
				%d2 = getelementptr inbounds float, ptr %d, i64 2
				store float %l6, ptr %d2
				%d3 = getelementptr inbounds float, ptr %d, i64 3
				store float %l7, ptr %d3

				ret void
				}

				declare void @llvm.dbg.value(metadata, metadata, metadata)

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2}

				!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1)
				!1 = !DIFile(filename: "foo.c", directory: "/")
				!2 = !{i32 2, !"Debug Info Version", i32 3}
				!3 = !DILocalVariable(scope: !4)
				!4 = distinct !DISubprogram(unit: !0)
				!5 = !DILocation(scope: !4)

This is an archive of the discontinued LLVM Phabricator instance.

[SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize
ClosedPublic