Details

Reviewers

xur
wenlei

Commits

rG958a3d8e2dec: [FS-AFDO] Do not load non-FS profile in MIR loader.

Summary

I was seeing a regression when enabling FS discriminators on a non-FS CSSPGO build. This is because a probe can get a zero-valued discriminator at a specific pass, which could lead to accidentally loading the corresponding base counter in the non-FS profile, while a non-zero discriminator would end up getting zero samples. This could in turn undo the sample distribution effort done by previous BFI maintenance work and the probe distribution factor work for pseudo probes specifically. To mitigate that, I'm disabling loading a non-FS profile against FS discriminators. The problem should also exist with non-CS AutoFDO, so I'm doing this for it too.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hoy created this revision. May 1 2023, 10:21 AM

Herald added a project: Restricted Project. May 1 2023, 10:21 AM

Herald added subscribers: wlei, modimo, wenlei and 2 others.

hoy requested review of this revision. May 1 2023, 10:21 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B229262: Diff 518485. May 1 2023, 10:22 AM

hoy added reviewers: xur, wenlei. May 1 2023, 10:22 AM

hoy retitled this revision from [FS-AFDO] Do not load non-FS profil in MIR loader. to [FS-AFDO] Do not load non-FS profile in MIR loader..

hoy added a parent revision: D148584: [FS-AFDO] Load pseudo probe profile on MIR.

@xur, I'm wondering if you have seen such a regression for non-CS AutoFDO. I think in theory it could be there, but I never measured it.

I was seeing a regression when enabling FS discriminators on a non-FS CSSPGO build. This is because a probe can get a zero-valued discriminator at a specific pass and that could lead to accidentally loading the corresponding base counter in the non-FS profile

For a non-FS profile, probes don't have discriminators. Is there no way to differentiate a zero discriminator from no discriminator? If a probe in the profile doesn't have a discriminator, it should not be considered for FS profile loading. In this case, if the input profile is not FS, no probe would have a discriminator, hence nothing should be loaded in MIR. Is such checking missing at the per-probe level?

llvm/lib/CodeGen/MIRSampleProfile.cpp
Line 297: The profile itself isn't really invalid; it's just not suitable for FS loading, so piggybacking on ProfileIsValid seems hacky. I suggest a more dedicated way to skip. Also, please add a comment explaining why only FS-enabled profiles are accepted for MIR loading.

In D149597#4328418, @wenlei wrote:

I was seeing a regression when enabling FS discriminators on a non-FS CSSPGO build. This is because a probe can get a zero-valued discriminator at a specific pass and that could lead to accidentally loading the corresponding base counter in the non-FS profile

For a non-FS profile, probes don't have discriminators. Is there no way to differentiate a zero discriminator from no discriminator? If a probe in the profile doesn't have a discriminator, it should not be considered for FS profile loading. In this case, if the input profile is not FS, no probe would have a discriminator, hence nothing should be loaded in MIR. Is such checking missing at the per-probe level?

The body sample map uses id+discriminator as the key, so currently we don't have a way on the function level to tell if a function profile is FS or not. We can tell on the program level, basically using the FS flag.

Updating D149597: [FS-AFDO] Do not load non-FS profile in MIR loader.

Harbormaster completed remote builds in B230953: Diff 520805. May 9 2023, 4:12 PM

In D149597#4330531, @hoy wrote:

The body sample map uses id+discriminator as the key, so currently we don't have a way on the function level to tell if a function profile is FS or not. We can tell on the program level, basically using the FS flag.

So the situation here is that the input profile isn't FS, but the current build has the FS flag on. At MIR profile loading, we get line+FSdiscriminator from the IR and try to look it up in the BodySamples map via findSamplesAt. How does that end up with the base profile, given that findSamplesAt uses line+FSdiscriminator as the key, which comes from the DIL on the IR? When the input profile is not FS, I assume the difference (compared to when the input profile is FS) is that the matching line+FSdiscriminator won't exist in BodySamples, hence it will find and load nothing for that location?

In D149597#4332879, @wenlei wrote:

So the situation here is that the input profile isn't FS, but the current build has the FS flag on. At MIR profile loading, we get line+FSdiscriminator from the IR and try to look it up in the BodySamples map via findSamplesAt. How does that end up with the base profile, given that findSamplesAt uses line+FSdiscriminator as the key, which comes from the DIL on the IR? When the input profile is not FS, I assume the difference (compared to when the input profile is FS) is that the matching line+FSdiscriminator won't exist in BodySamples, hence it will find and load nothing for that location?

Yes, in most cases the lines will end up loading nothing, and the sample loader doesn't do anything further in terms of annotation and inference (https://github.com/llvm/llvm-project/blob/516e301752560311d2cd8c2b549493eb0f98d01b/llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h#L877). However for lines that do have a zero FSdiscriminator, line+FSdiscriminator can accidentally match the base counter while all other lines still load nothing. This creates a situation where only a few blocks in a function have samples from their base counter while all other blocks have a zero count. Inference from there will be problematic.
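The partial match described in this comment can be sketched with a plain map keyed by (id, discriminator). This is a simplified illustration, not the actual LLVM data structures: `LocKey`, `BodySampleMap`, `countMatches`, the two `make...` helpers, and all counts and discriminator values are made up for the example, and `findSamplesAt` here is a stand-in that only borrows the name used in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function's body samples, keyed by
// (line or probe id, discriminator) as described in the thread.
using LocKey = std::pair<uint32_t, uint32_t>;
using BodySampleMap = std::map<LocKey, uint64_t>;

// Simplified lookup: a count is returned only on an exact key match.
std::optional<uint64_t> findSamplesAt(const BodySampleMap &Samples,
                                      uint32_t Id, uint32_t Discriminator) {
  auto It = Samples.find({Id, Discriminator});
  if (It == Samples.end())
    return std::nullopt;
  return It->second;
}

// Counts how many of the queried locations find any samples at all.
unsigned countMatches(const BodySampleMap &Samples,
                      const std::vector<LocKey> &Locs) {
  unsigned Matches = 0;
  for (const LocKey &Loc : Locs)
    if (findSamplesAt(Samples, Loc.first, Loc.second))
      ++Matches;
  return Matches;
}

// A non-FS profile: every counter is a base counter under discriminator 0.
BodySampleMap makeNonFSProfile() {
  return {{{1, 0}, 1000}, {{2, 0}, 900}, {{3, 0}, 50}};
}

// Locations as seen at MIR loading time in an FS-enabled build (made-up
// values). Only id 1 happens to carry a zero FS discriminator.
std::vector<LocKey> makeMIRLocations() {
  return {{1, 0}, {2, 0x1500}, {3, 0x2a00}};
}
```

With these inputs, `countMatches(makeNonFSProfile(), makeMIRLocations())` finds samples for only one of the three locations: id 1 picks up the full base counter while ids 2 and 3 get nothing, which is the lopsided input to inference that the comment warns about.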

However for lines that do have a zero FSdiscriminator, line+FSdiscriminator can accidentally match the base counter while all other lines still load nothing.

Can this also happen when the input profile is actually an FS profile, which would also be bad? Or is that less of a problem when the input profile is FS, because we will also load other samples whose FSDiscriminator is non-zero?

In D149597#4333284, @wenlei wrote:

However for lines that do have a zero FSdiscriminator, line+FSdiscriminator can accidentally match the base counter while all other lines still load nothing.

Can this also happen when the input profile is actually an FS profile, which would also be bad? Or is that less of a problem when the input profile is FS, because we will also load other samples whose FSDiscriminator is non-zero?

I think it's not a problem for FS profiles. Given an FS profile, a zero discriminator is a valid FS discriminator, which means the counter is not aggregated and there will be no more than one line loading that counter on MIR. For a non-FS profile, we cannot tell whether a zero is a valid FS discriminator unless the file-level flag is checked.

In D149597#4333318, @hoy wrote:

I think it's not a problem for FS profiles. Given an FS profile, a zero discriminator is a valid FS discriminator, which means the counter is not aggregated and there will be no more than one line loading that counter on MIR. For a non-FS profile, we cannot tell whether a zero is a valid FS discriminator unless the file-level flag is checked.

OK, that makes sense, since the profile reader actually masks discriminators based on the current FSDiscriminator mask.
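As a rough illustration of what masking a discriminator per pass means, here is a minimal sketch. The bit widths (`BaseBits`, `PassBits`) and the helper names are assumptions chosen for this example and are not taken from the review; the real layout lives in LLVM's discriminator support code and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit layout, assumed for this sketch only: the low BaseBits
// bits hold the base discriminator and each FS pass appends PassBits more.
constexpr unsigned BaseBits = 8;
constexpr unsigned PassBits = 6;

// Mask keeping the bits assigned up to and including FS pass number Pass
// (Pass == 0 keeps only the base discriminator bits).
constexpr uint32_t maskUpToPass(unsigned Pass) {
  unsigned Bits = BaseBits + Pass * PassBits;
  return Bits >= 32 ? 0xFFFFFFFFu : ((1u << Bits) - 1u);
}

// What a reader could do to a discriminator recorded in the profile before
// using it as a lookup key at a given pass.
constexpr uint32_t maskDiscriminator(uint32_t Discriminator, unsigned Pass) {
  return Discriminator & maskUpToPass(Pass);
}
```

Under these assumed widths, `maskDiscriminator(0x4000, 1)` is zero: a discriminator whose only set bits belong to later passes legitimately becomes a zero key at an earlier pass. That fits the point made above that, in an FS profile, zero is a valid FS discriminator rather than a sign of a missing one.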

This revision is now accepted and ready to land. May 10 2023, 3:18 PM

Moving the check from doInitialization into runOnFunction, since the former doesn't stop MIRProfileLoader, which is a function pass, from continuing.
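The reason for the move can be shown with a toy model. This is not LLVM's pass manager: `ToyLoader`, `runToyPipeline`, and the flags are hypothetical, and the only property modeled is the one stated in the update, namely that an early return from doInitialization does not keep runOnFunction from being called.

```cpp
#include <cassert>

// Toy stand-in for a function pass with a profile-kind check that can live
// in either hook. All names here are hypothetical.
struct ToyLoader {
  bool ProfileIsFS;          // what the (imaginary) reader reports
  bool CheckInRunOnFunction; // where the FS check is placed
  unsigned Loaded = 0;       // number of functions that got a profile

  // An early return here only ends initialization; it is not a veto.
  bool doInitialization() {
    if (!CheckInRunOnFunction && !ProfileIsFS)
      return false;
    return true;
  }

  bool runOnFunction() {
    if (CheckInRunOnFunction && !ProfileIsFS)
      return false; // skip loading for this function
    ++Loaded;
    return true;
  }
};

// Toy driver: the result of doInitialization is not used to skip the pass.
unsigned runToyPipeline(ToyLoader &Pass, unsigned NumFunctions) {
  (void)Pass.doInitialization();
  for (unsigned I = 0; I < NumFunctions; ++I)
    Pass.runOnFunction();
  return Pass.Loaded;
}
```

In this model, `ToyLoader{false, false}` still loads every function despite the failed check in doInitialization, while `ToyLoader{false, true}` loads none, which mirrors why the guard ended up at the top of runOnFunction.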

Updating D149597: [FS-AFDO] Do not load non-FS profile in MIR loader.

This revision was landed with ongoing or failed builds. May 10 2023, 4:39 PM

Closed by commit rG958a3d8e2dec: [FS-AFDO] Do not load non-FS profile in MIR loader. (authored by hoy).

This revision was automatically updated to reflect the committed changes.

hoy added a commit: rG958a3d8e2dec: [FS-AFDO] Do not load non-FS profile in MIR loader..

Harbormaster completed remote builds in B231205: Diff 521137. May 10 2023, 5:23 PM

hoy added a child revision: D150625: [PseudoProbe] Only emit discriminstor in FS-AFDO mode.. May 15 2023, 4:34 PM

Diff 521155

llvm/include/llvm/ProfileData/SampleProfReader.h

[477 lines not shown; enclosing context: public:]
   bool profileIsProbeBased() const { return ProfileIsProbeBased; }

   /// Whether input profile is fully context-sensitive.
   bool profileIsCS() const { return ProfileIsCS; }

   /// Whether input profile contains ShouldBeInlined contexts.
   bool profileIsPreInlined() const { return ProfileIsPreInlined; }

+  /// Whether input profile is flow-sensitive.
+  bool profileIsFS() const { return ProfileIsFS; }
+
   virtual std::unique_ptr<ProfileSymbolList> getProfileSymbolList() {
     return nullptr;
   };

   /// It includes all the names that have samples either in outline instance
   /// or inline instance.
   virtual std::vector<StringRef> *getNameTable() { return nullptr; }
   virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) { return false; };
[385 lines not shown]

llvm/lib/CodeGen/MIRSampleProfile.cpp

[288 lines not shown; enclosing context: bool MIRProfileLoader::doInitialization(Module &M) {]
   if (std::error_code EC = ReaderOrErr.getError()) {
     std::string Msg = "Could not open profile: " + EC.message();
     Ctx.diagnose(DiagnosticInfoSampleProfile(Filename, Msg));
     return false;
   }

   Reader = std::move(ReaderOrErr.get());
   Reader->setModule(&M);
   ProfileIsValid = (Reader->read() == sampleprof_error::success);

   // Load pseudo probe descriptors for probe-based function samples.
   if (Reader->profileIsProbeBased()) {
     ProbeManager = std::make_unique<PseudoProbeManager>(M);
     if (!ProbeManager->moduleIsProbed(M)) {
       return false;
     }
   }

   return true;
 }

 bool MIRProfileLoader::runOnFunction(MachineFunction &MF) {
+  // Do not load non-FS profiles. A line or probe can get a zero-valued
+  // discriminator at certain pass which could result in accidentally loading
+  // the corresponding base counter in the non-FS profile, while a non-zero
+  // discriminator would end up getting zero samples. This could in turn undo
+  // the sample distribution effort done by previous BFI maintenance and the
+  // probe distribution factor work for pseudo probes.
+  if (!Reader->profileIsFS())
+    return false;
+
   Function &Func = MF.getFunction();
   clearFunctionData(false);
   Samples = Reader->getSamplesFor(Func);
   if (!Samples || Samples->empty())
     return false;

   if (FunctionSamples::ProfileIsProbeBased) {
     if (!ProbeManager->profileIsValid(MF.getFunction(), *Samples))
[79 lines not shown]

llvm/test/CodeGen/X86/fsafdo_test2.ll

 ; REQUIRES: asserts
 ; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false < %s | FileCheck %s --check-prefixes=V0,V01
 ; RUN: llvm-profdata merge --sample -profile-isfs -o %t0.afdo %S/Inputs/fsloader.afdo
 ; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false -fs-profile-file=%t0.afdo -show-fs-branchprob -disable-ra-fsprofile-loader=false -disable-layout-fsprofile-loader=false < %s 2>&1 | FileCheck %s --check-prefixes=LOADERV0,LOADER
 ; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true < %s | FileCheck %s --check-prefixes=V1,V01
 ; RUN: llvm-profdata merge --sample -profile-isfs -o %t1.afdo %S/Inputs/fsloader_v1.afdo
 ; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true -fs-profile-file=%t1.afdo -show-fs-branchprob -disable-ra-fsprofile-loader=false -disable-layout-fsprofile-loader=false < %s 2>&1 | FileCheck %s --check-prefixes=LOADERV1,LOADER
-;
+; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true -fs-profile-file=%S/Inputs/fsloader_v1.afdo -profile-isfs -show-fs-branchprob -disable-ra-fsprofile-loader=false -disable-layout-fsprofile-loader=false < %s 2>&1 | FileCheck %s --check-prefixes=LOADERV1,LOADER
+; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true -fs-profile-file=%S/Inputs/fsloader_v1.afdo -show-fs-branchprob -disable-ra-fsprofile-loader=false -disable-layout-fsprofile-loader=false < %s 2>&1 | FileCheck %s --check-prefixes=NOLOAD
 ;;
 ;; C source code for the test (compiler at -O3):
 ;; // A test case for loop unroll.
 ;;
 ;; __attribute__((noinline)) int bar(int i){
 ;; volatile int j;
 ;; j = i;
 ;; return j;
[60 lines not shown]
 ; LOADER: Set branch fs prob: MBB (12 -> 13): unroll.c:24:11 W=283590 0x50000000 / 0x80000000 = 62.50% --> 0x7dfedaf9 / 0x80000000 = 98.43%
 ; LOADERV0: Set branch fs prob: MBB (14 -> 16): unroll.c:22:11-->unroll.c:24:11 W=283590 0x40000000 / 0x80000000 = 50.00% --> 0x0a5856e1 / 0x80000000 = 8.08%
 ; LOADERV1: Set branch fs prob: MBB (14 -> 16): unroll.c:22:11-->unroll.c:24:11 W=283590 0x40000000 / 0x80000000 = 50.00% --> 0x7aca7894 / 0x80000000 = 95.93%
 ; LOADERV0: Set branch fs prob: MBB (14 -> 15): unroll.c:22:11 W=283590 0x40000000 / 0x80000000 = 50.00% --> 0x75a7a91f / 0x80000000 = 91.92%
 ; LOADERV1: Set branch fs prob: MBB (14 -> 15): unroll.c:22:11 W=283590 0x40000000 / 0x80000000 = 50.00% --> 0x0535876c / 0x80000000 = 4.07%
 ; LOADER: Set branch fs prob: MBB (16 -> 18): unroll.c:24:11-->unroll.c:19:3 W=283590 0x30000000 / 0x80000000 = 37.50% --> 0x16588166 / 0x80000000 = 17.46%
 ; LOADER: Set branch fs prob: MBB (16 -> 17): unroll.c:24:11 W=283590 0x50000000 / 0x80000000 = 62.50% --> 0x69a77e9a / 0x80000000 = 82.54%

+;; Check that the profile is not loaded since the reader doesn't know it is a FS profile.
+; NOLOAD-NOT: Set branch fs prob
+
 target triple = "x86_64-unknown-linux-gnu"

 @sum = dso_local local_unnamed_addr global i32 0, align 4

 declare i32 @bar(i32 %i) #0
 declare void @work(i32 %i) #2

 define dso_local void @foo() #0 !dbg !29 {
[173 lines not shown]

This is an archive of the discontinued LLVM Phabricator instance.

[FS-AFDO] Do not load non-FS profile in MIR loader.
Closed · Public
