This is an archive of the discontinued LLVM Phabricator instance.

Differential D90539

Make CallInst::updateProfWeight emit i32 weights instead of i64
Closed · Public

Authored by aeubanks on Oct 31 2020, 12:43 PM.

Details

Reviewers

asbirlea
ychen
wmi
mtrofin

Commits

rG3d1149c6fe48: Make CallInst::updateProfWeight emit i32 weights instead of i64

Summary

Typically branch_weights are i32, not i64.
This fixes entry_counts_cold.ll under NPM.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aeubanks created this revision. Oct 31 2020, 12:43 PM

Herald added a project: Restricted Project. Oct 31 2020, 12:43 PM

Herald added subscribers: llvm-commits, dexonsmith, wenlei, hiraditya.

aeubanks requested review of this revision. Oct 31 2020, 12:43 PM

Harbormaster completed remote builds in B77152: Diff 302109. Oct 31 2020, 1:31 PM

asbirlea added reviewers: wmi, mtrofin. Nov 3 2020, 6:38 PM

Would we rather want them to be i64, or is there a fundamental reason they should be i32?

Ideally they'd be i64, see https://reviews.llvm.org/D88609 where I tried doing that, but it was reverted multiple times due to uint64_t overflows and I gave up trying to fix the various issues. There are multiple instances of using uint64_t to do uint32_t arithmetic and checking for overflows, and I eventually got annoyed with all those and gave up.

In D90539#2372744, @aeubanks wrote:

Ideally they'd be i64, see https://reviews.llvm.org/D88609 where I tried doing that, but it was reverted multiple times due to uint64_t overflows and I gave up trying to fix the various issues. There are multiple instances of using uint64_t to do uint32_t arithmetic and checking for overflows, and I eventually got annoyed with all those and gave up.

Ah... iiuc, with 32 bit, overflow can be detected more easily?

Yeah, just do 32-bit arithmetic with uint64_t and you can see if the final result overflows. One example is here: https://github.com/llvm/llvm-project/blob/7ba3293691beb9a2c6ea4a81064c24580afe5816/llvm/lib/Analysis/BranchProbabilityInfo.cpp#L486

The main issue is that this is done in multiple places around LLVM.
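
The widening trick described above can be sketched in isolation. This is a minimal illustration rather than code taken from LLVM: `addWeightsOverflows` is a hypothetical helper that adds two 32-bit weights in `uint64_t` and reports whether the true sum exceeds `UINT32_MAX`.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): adds two 32-bit branch weights.
// Both operands are widened to uint64_t first, so the sum itself cannot
// wrap. Returns true if the sum does not fit in 32 bits; otherwise stores
// the sum in Result and returns false.
bool addWeightsOverflows(uint32_t A, uint32_t B, uint32_t &Result) {
  uint64_t Sum = static_cast<uint64_t>(A) + static_cast<uint64_t>(B);
  if (Sum > UINT32_MAX)
    return true; // the 32-bit addition would have overflowed
  Result = static_cast<uint32_t>(Sum);
  return false;
}
```

The same check is not available once the weights themselves are 64 bits wide, because there is no wider built-in type to carry the intermediate result, which is presumably why the i64 migration ran into trouble in several places.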

Ping
We can always fix everything to use i64 in a follow-up change, but for now I'd like to fix entry_counts_cold.ll under the NPM.

In D90539#2389866, @aeubanks wrote:

Ping
We can always fix everything to use i64 in a follow-up change, but for now I'd like to fix entry_counts_cold.ll under the NPM.

I agree this should be fixed separately from updating the world (that could be before or after this patch). I'm also a bit skeptical that it's valuable to have more than i32 for branch weights.

llvm/lib/IR/Instructions.cpp
Lines 563–565: I think you need `getLimitedValue(UINT32_MAX)` here. Can you add a test that covers that as well?

getLimitedValue(UINT32_MAX) and add test
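
The reason for the clamp can be shown with a small standalone sketch. It is not the LLVM implementation: `scaleWeight` is a hypothetical stand-in for the branch_weights path of `CallInst::updateProfWeight`, and it assumes a compiler that provides `unsigned __int128` (GCC and Clang do) in place of the 128-bit `APInt` used in the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the branch_weights scaling step: computes
// Weight * S / T and saturates the result to 32 bits.
// Assumptions: T != 0, and the compiler supports unsigned __int128.
uint32_t scaleWeight(uint32_t Weight, uint64_t S, uint64_t T) {
  // A 32-bit value times a 64-bit value needs at most 96 bits, so the
  // product cannot wrap in a 128-bit intermediate.
  unsigned __int128 Scaled = static_cast<unsigned __int128>(Weight) * S / T;
  // Clamp instead of truncating, which is the effect of
  // APInt::getLimitedValue(UINT32_MAX) in the patch.
  return Scaled > UINT32_MAX ? UINT32_MAX : static_cast<uint32_t>(Scaled);
}
```

With the inputs used by the new unit test, `scaleWeight(20000, 10000000, 1)` saturates to `UINT32_MAX`. Without the clamp, a value that does not fit would be handed to a 32-bit constant.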

Harbormaster completed remote builds in B78840: Diff 305294. Nov 13 2020, 7:47 PM

ping

Unblocking the fixing of the tests under NPM. Additional changes can be done as follow ups.

This revision is now accepted and ready to land. Nov 24 2020, 4:36 PM

This revision was landed with ongoing or failed builds. Nov 24 2020, 6:28 PM

Closed by commit rG3d1149c6fe48: Make CallInst::updateProfWeight emit i32 weights instead of i64 (authored by aeubanks).

This revision was automatically updated to reflect the committed changes.

aeubanks added a commit: rG3d1149c6fe48: Make CallInst::updateProfWeight emit i32 weights instead of i64.

Revision Contents

Path                                                             Size
llvm/lib/IR/Instructions.cpp                                     5 lines
llvm/test/Transforms/Inline/prof-update-sample-alwaysinline.ll   10 lines
llvm/test/Transforms/Inline/prof-update-sample.ll                10 lines
llvm/test/Transforms/SampleProfile/entry_counts_cold.ll          3 lines
llvm/test/Transforms/SampleProfile/inline-mergeprof.ll           2 lines
llvm/unittests/IR/InstructionsTest.cpp                           24 lines

Diff 307491

llvm/lib/IR/Instructions.cpp

@@ ... @@ void CallInst::updateProfWeight(uint64_t S, uint64_t T) {
   APInt APS(128, S), APT(128, T);
   if (ProfDataName->getString().equals("branch_weights") &&
       ProfileData->getNumOperands() > 0) {
     // Using APInt::div may be expensive, but most cases should fit 64 bits.
     APInt Val(128, mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(1))
                        ->getValue()
                        .getZExtValue());
     Val *= APS;
-    Vals.push_back(MDB.createConstant(ConstantInt::get(
-        Type::getInt64Ty(getContext()), Val.udiv(APT).getLimitedValue())));
+    Vals.push_back(MDB.createConstant(
+        ConstantInt::get(Type::getInt32Ty(getContext()),
+                         Val.udiv(APT).getLimitedValue(UINT32_MAX))));
[Inline comment from dexonsmith: I think you need `getLimitedValue(UINT32_MAX)` here. Can you add a test that covers that as well?]
   } else if (ProfDataName->getString().equals("VP"))
     for (unsigned i = 1; i < ProfileData->getNumOperands(); i += 2) {
       // The first value is the key of the value profile, which will not change.
       Vals.push_back(ProfileData->getOperand(i));
       // Using APInt::div may be expensive, but most cases should fit 64 bits.
       APInt Val(128,
                 mdconst::dyn_extract<ConstantInt>(ProfileData->getOperand(i + 1))
                     ->getValue()

llvm/test/Transforms/Inline/prof-update-sample-alwaysinline.ll

@@ ... @@
 !8 = !{!"NumCounts", i64 2}
 !9 = !{!"NumFunctions", i64 2}
 !10 = !{!"DetailedSummary", !11}
 !11 = !{!12, !13, !14}
 !12 = !{i32 10000, i64 100, i32 1}
 !13 = !{i32 999000, i64 100, i32 1}
 !14 = !{i32 999999, i64 1, i32 2}
 !15 = !{!"function_entry_count", i64 1000}
-!16 = !{!"branch_weights", i64 2000}
-!17 = !{!"branch_weights", i64 400}
+!16 = !{!"branch_weights", i32 2000}
+!17 = !{!"branch_weights", i32 400}
 !18 = !{!"VP", i32 0, i64 140, i64 111, i64 80, i64 222, i64 40, i64 333, i64 20}
 attributes #0 = { alwaysinline }
 ; CHECK: ![[ENTRY_COUNT]] = !{!"function_entry_count", i64 600}
-; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i64 2000}
-; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i64 1200}
+; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i32 2000}
+; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i32 1200}
 ; CHECK: ![[COUNT_IND_CALLEE]] = !{!"VP", i32 0, i64 84, i64 111, i64 48, i64 222, i64 24, i64 333, i64 12}
-; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i64 800}
+; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i32 800}
 ; CHECK: ![[COUNT_IND_CALLER]] = !{!"VP", i32 0, i64 56, i64 111, i64 32, i64 222, i64 16, i64 333, i64 8}

llvm/test/Transforms/Inline/prof-update-sample.ll

@@ ... @@
 !8 = !{!"NumCounts", i64 2}
 !9 = !{!"NumFunctions", i64 2}
 !10 = !{!"DetailedSummary", !11}
 !11 = !{!12, !13, !14}
 !12 = !{i32 10000, i64 100, i32 1}
 !13 = !{i32 999000, i64 100, i32 1}
 !14 = !{i32 999999, i64 1, i32 2}
 !15 = !{!"function_entry_count", i64 1000}
-!16 = !{!"branch_weights", i64 2000}
-!17 = !{!"branch_weights", i64 400}
+!16 = !{!"branch_weights", i32 2000}
+!17 = !{!"branch_weights", i32 400}
 !18 = !{!"VP", i32 0, i64 140, i64 111, i64 80, i64 222, i64 40, i64 333, i64 20}
 ; CHECK: ![[ENTRY_COUNT]] = !{!"function_entry_count", i64 600}
-; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i64 2000}
-; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i64 1200}
+; CHECK: ![[COUNT_CALLEE1]] = !{!"branch_weights", i32 2000}
+; CHECK: ![[COUNT_CALLEE]] = !{!"branch_weights", i32 1200}
 ; CHECK: ![[COUNT_IND_CALLEE]] = !{!"VP", i32 0, i64 84, i64 111, i64 48, i64 222, i64 24, i64 333, i64 12}
-; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i64 800}
+; CHECK: ![[COUNT_CALLER]] = !{!"branch_weights", i32 800}
 ; CHECK: ![[COUNT_IND_CALLER]] = !{!"VP", i32 0, i64 56, i64 111, i64 32, i64 222, i64 16, i64 333, i64 8}

llvm/test/Transforms/SampleProfile/entry_counts_cold.ll

@@ ... @@
 ; RUN: opt < %s -sample-profile -sample-profile-file=%S/Inputs/entry_counts_cold.prof -S | FileCheck %s
+; RUN: opt < %s -passes=sample-profile -sample-profile-file=%S/Inputs/entry_counts_cold.prof -S | FileCheck %s
 ; ModuleID = 'temp.bc'
 source_filename = "temp.c"
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx10.14.0"

 ; Function Attrs: nounwind ssp uwtable
 ; CHECK: define i32 @top({{.*}} !prof [[TOP:![0-9]+]]
 define i32 @top(i32* %p) #0 !dbg !8 {
@@ ... @@

 !llvm.dbg.cu = !{!0}
 !llvm.module.flags = !{!3, !4, !5, !6}
 !llvm.ident = !{!7}

 ; CHECK: [[TOP]] = !{!"function_entry_count", i64 101}
 ; CHECK: [[FOO]] = !{!"function_entry_count", i64 151}
 ; CHECK: [[BAR]] = !{!"function_entry_count", i64 303}
-; CHECK: [[BAZ]] = !{!"branch_weights", i64 303}
+; CHECK: [[BAZ]] = !{!"branch_weights", i32 303}

 !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 8.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2, nameTableKind: GNU)
 !1 = !DIFile(filename: "temp.c", directory: "llvm/test/Transforms/SampleProfile")
 !2 = !{}
 !3 = !{i32 2, !"Dwarf Version", i32 4}
 !4 = !{i32 2, !"Debug Info Version", i32 3}
 !5 = !{i32 1, !"wchar_size", i32 4}
 !6 = !{i32 7, !"PIC Level", i32 2}

llvm/test/Transforms/SampleProfile/inline-mergeprof.ll

@@ ... @@
 !15 = !DILocation(line: 6, scope: !12)
 !16 = distinct !DISubprogram(name: "sub", scope: !1, file: !1, line: 20, type: !7, scopeLine: 20, virtualIndex: 6, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
 !17 = !DILocation(line: 20, scope: !16)
 !18 = !DILocation(line: 21, scope: !16)

 ; SCALE: name: "sum"
 ; SCALE-NEXT: {!"function_entry_count", i64 46}
 ; SCALE: !{!"branch_weights", i32 11, i32 2}
-; SCALE: !{!"branch_weights", i64 20}
+; SCALE: !{!"branch_weights", i32 20}
 ; SCALE: name: "sub"
 ; SCALE-NEXT: {!"function_entry_count", i64 -1}

 ; MERGE: name: "sum"
 ; MERGE-NEXT: {!"function_entry_count", i64 46}
 ; MERGE: !{!"branch_weights", i32 11, i32 23}
 ; MERGE: !{!"branch_weights", i32 10}
 ; MERGE: name: "sub"
 ; MERGE-NEXT: {!"function_entry_count", i64 3}

llvm/unittests/IR/InstructionsTest.cpp

@@ ... @@ ASSERT_TRUE(M);
     EXPECT_EQ(I2->getDebugLoc().getLine(), 2U);
     I2->dropLocation();
     EXPECT_EQ(I2->getDebugLoc().getLine(), 0U);
     EXPECT_EQ(I2->getDebugLoc().getScope(), Scope);
     EXPECT_EQ(I2->getDebugLoc().getInlinedAt(), nullptr);
   }
 }

+TEST(InstructionsTest, BranchWeightOverflow) {
+  LLVMContext C;
+  std::unique_ptr<Module> M = parseIR(C,
+                                      R"(
+      declare void @callee()
+
+      define void @caller() {
+        call void @callee(), !prof !1
+        ret void
+      }
+
+      !1 = !{!"branch_weights", i32 20000}
+  )");
+  ASSERT_TRUE(M);
+  CallInst *CI =
+      cast<CallInst>(&M->getFunction("caller")->getEntryBlock().front());
+  uint64_t ProfWeight;
+  CI->extractProfTotalWeight(ProfWeight);
+  ASSERT_EQ(ProfWeight, 20000U);
+  CI->updateProfWeight(10000000, 1);
+  CI->extractProfTotalWeight(ProfWeight);
+  ASSERT_EQ(ProfWeight, UINT32_MAX);
+}
+
 } // end anonymous namespace
 } // end namespace llvm