This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
ClosedPublic

Authored by xur on Aug 10 2021, 9:59 PM.

Details

Summary

This patch implements Flow Sensitive Sample FDO (FSAFDO) profile loader.
We have two additional profile loaders for FS profile, one before RegAlloc and one before BlockPlacement.

Use "-enable-fs-discriminator=true -disable-ra-fsprofile-loader=false -disable-layout-fsprofile-loader=false" to turn on the FS profile loaders, if sample-profile-use is specified.

Diff Detail

Event Timeline

xur created this revision.Aug 10 2021, 9:59 PM
xur requested review of this revision.Aug 10 2021, 9:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2021, 9:59 PM
wmi added inline comments.Aug 12 2021, 12:40 PM
llvm/include/llvm/IR/DebugInfoMetadata.h
2229–2230

EnableFSDiscriminator has been handled above already?

llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
276 ↗(On Diff #365656)

We may want to set remapping file otherwise remapping cannot be effective enough when FSAFDO is enabled.

wmi added inline comments.Aug 12 2021, 12:40 PM
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
165 ↗(On Diff #365656)

What is its difference with the generic version of SampleProfileLoaderBaseImpl<BT>::buildEdges? They look similar.

wmi added inline comments.Aug 12 2021, 2:56 PM
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
242 ↗(On Diff #365656)

Looks like the BB prob update shouldn't be guarded by ShowFSBranchProb.

245–248 ↗(On Diff #365656)

Nit: Diff = std::abs(OldProb - NewProb);

249–265 ↗(On Diff #365656)

Enclose the block of debug output with LLVM_DEBUG({ ... }).

xur marked 4 inline comments as done.Aug 12 2021, 4:08 PM
xur added inline comments.
llvm/include/llvm/IR/DebugInfoMetadata.h
2229–2230

Good catch. This is from a bad merge. Fixed.

llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
165 ↗(On Diff #365656)

There are basically the same. We have this just because IR and MIR have different interfaces for predecessors and successors. In IR, there is a utility iteration while MIR's iterators are with CFG.
I could add utilities in MIR to remove this specialization.

242 ↗(On Diff #365656)

Yes. that's correct. I added this option just before sending the review to suppress the debug output. It's not added correctly. fixed

276 ↗(On Diff #365656)

OK. Seems that I need to pass the remap file name the same way as profile file name. Will fix.

wenlei added inline comments.Aug 12 2021, 5:00 PM
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
291 ↗(On Diff #365656)

FSProfileLoader inherits from SampleProfileLoaderBaseImpl<MachineBasicBlock>, so it's for MIR only, why do we still have a templated runOnFunction?

If we want something that serves both IR and MIR, perhaps it can go into base. Or more generally, how do we plan to support IR FS-loader?

293 ↗(On Diff #365656)

Why do we keep dominator tree and loop info?

305 ↗(On Diff #365656)

When do we want to run FS loader but not setting branch probabilities?

llvm/lib/CodeGen/TargetPassConfig.cpp
1159

nit: for readability, the EnableFSDiscriminator check inside getFSProfileFile can be removed, and loader pass can be guarded under the same if check for disc pass.

1161

I'm wondering if there's a need for users to control which pass to have profile loading independently, at least for tuning purpose? i.e. some switch like disable-ra-fsprofile-load, disable-layout-fsprofile-load.

hoy added inline comments.Aug 13 2021, 12:12 AM
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
34 ↗(On Diff #365656)

typo: senstive

35 ↗(On Diff #365656)

Wondering why needs this switch. The whole point of FSProfileLoader is to overwrite branch weights?

llvm/lib/Passes/PassBuilder.cpp
1121 ↗(On Diff #365656)

Can this be moved into the constructor of PGOOptions so that it looks more consistent with the rest of PGO setup?

xur marked 2 inline comments as done.Aug 13 2021, 1:04 PM

Thanks all for the review comments! My replies are inlined.

llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
34 ↗(On Diff #365656)

Fixed

35 ↗(On Diff #365656)

some of the benefits are from the improved profile (i.e. using sum rather max) that helps SampleLoder (inline for example).
This option help to measure it.
I tested it in many of my experiments. But I agree that it does not need to expose to the users.
I will remove this.

291 ↗(On Diff #365656)

Not sure I follow the question here. Basically this is MIR version of runOnFunction.
APIs etc are different from IR version of runOnFunction. If we want IR FS-loader, that should be based on SampleProfileLoader (need to remove the inline related code).

293 ↗(On Diff #365656)

Profile loader does not invalid these infos (this is different in SampleLoader, where inline will change it)

305 ↗(On Diff #365656)

Sorry. This is a list minute bug (as I mentioned in answering Wei's comments)

llvm/lib/CodeGen/TargetPassConfig.cpp
1159

OK. I will change this.

1161

Yes. that's a good suggestion. I havd them in my testing implementation but got it out when sending for review. I will add them back.

llvm/lib/Passes/PassBuilder.cpp
1121 ↗(On Diff #365656)

Do you mean we still using FSProfileFile but just move the assignment to PGOOptions constructor? This should do doable.

hoy added inline comments.Aug 13 2021, 1:24 PM
llvm/lib/Passes/PassBuilder.cpp
1121 ↗(On Diff #365656)

Yes, that's what I meant. Currently FSProfileFile is set in postlink for LTO. It shouldn't matter if it is always set, since no prelink pass consumes that now?

wenlei added inline comments.Aug 13 2021, 1:50 PM
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
110–111 ↗(On Diff #365656)

This is more like a MIRProfileLoader as opposed to FSProfileLoader. If we do IR FS profile loading, it will be handled in existing SampleLoader, the name FSProfileLoader would suggest this one handles all FS profile loading for both IR and MIR. Same for FSProfileLoaderPass. We could follow the naming/structure of MIRAddFSDiscriminators too and call it MIRProfileLoader.

291 ↗(On Diff #365656)

Basically I was wondering why we use runOnFunction(FunctionT &F) instead of runOnFunction(MachineFunction &F) since this is only meant for MIR? The IR loader has runOnFunction(Function &F..).

293 ↗(On Diff #365656)

This is still all per-function info, so it needs to be cleared before we use the profile loader to process the next function, right? or did i miss anything?

xur marked an inline comment as done.Aug 13 2021, 4:04 PM
xur added inline comments.
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
110–111 ↗(On Diff #365656)

OK. I totally agree what you said. I will change the name.

291 ↗(On Diff #365656)

OK. I see what you meant. This is my bad. I should have used runOnFunction(MachineFunction &F). Note that runOnFucntion is not a templated function: after we propagate "using FunctionT=MachineFunction", this function will become runOnFunction(MachineFunction &F). But I agree this is confusing. I will fix. Thanks!

293 ↗(On Diff #365656)

We initialize these data structures per function in setInitVals(). Each function will have its own value and no need to clear.

llvm/lib/Passes/PassBuilder.cpp
1121 ↗(On Diff #365656)

Got it. I'll try to fix this.

xur updated this revision to Diff 366762.Aug 16 2021, 4:26 PM
xur marked an inline comment as done.

Integrated review comments from Wei, Wenlei and Hongtao. Thanks for all the suggestions/comments!

The only suggested was not applied was the one to use std::abs to BranchProbability -- we don't have the abs operator for BranchProbability. I could add but does not seem worthy.

-Rong

hoy added inline comments.Aug 17 2021, 10:31 AM
llvm/lib/CodeGen/MIRSampleProfile.cpp
283

Wondering whether this is needed when EnableFSBranchProb is false, where the weights computed are saved in an internal buffer and not used anywhere?

llvm/lib/Transforms/IPO/SampleProfile.cpp
371

Nit: pls remove this piece.

wmi added inline comments.Aug 17 2021, 10:47 AM
llvm/include/llvm/Passes/PassBuilder.h
38–40

It adds a dependency of LLVMProfileData to LLVMPasses.

Maybe it is better to pass the related information to codegen through param instead of through internal flag? The interface to the codegen is LLVMTargetMachine::addPassesToEmitFile. An example is its param "CodeGenFileType FileType" which is got from configuration in clang/lto backend/thinlto backend/...

xur marked an inline comment as done.Aug 17 2021, 12:14 PM
xur added inline comments.
llvm/include/llvm/Passes/PassBuilder.h
38–40

OK. I also find it a bit awkward to use internal flag here. Let me try to use this suggested way.

llvm/lib/CodeGen/MIRSampleProfile.cpp
283

Indeed. I think I will remove this option.

llvm/lib/Transforms/IPO/SampleProfile.cpp
371

Sorry. I thought I cleaned the code before sending for the review.

xur updated this revision to Diff 367058.Aug 17 2021, 4:43 PM
xur marked an inline comment as done.

Integrated recent comments from Wei and Hongtao.

wmi added inline comments.Aug 17 2021, 5:52 PM
llvm/include/llvm/Target/TargetMachine.h
115–118

Feel it is better to move the whole PGOOpt here. Similar as OptLevel above, PGOOpt shows us the major PGO related configuration and it will be available for the MIR passes after it is included in TargetMachine class.

xur added inline comments.Aug 17 2021, 6:05 PM
llvm/include/llvm/Target/TargetMachine.h
115–118

I'm OK with that. But for now, only these two fields are used in MIR (at lease for this patch).
I will make the change if I don't hear objection.

hoy added inline comments.Aug 18 2021, 9:28 AM
llvm/include/llvm/Target/TargetMachine.h
115–118

Moving PGOOpt here sounds good to time. It may be extended for MIR profiling use in the future. Maybe the EnableFSDiscriminator switch can be added into PGOOpt as well?

xur updated this revision to Diff 367236.Aug 18 2021, 9:57 AM

Followed Wei's suggestion to put PGOOptions to TargetMachine.

We could not pass TM to PGOOptions because that would create a circular dependency. We have to set PGOOptions outside of the constructor.
I also split out declaration of PGOptions to a separated head-file, because I don't want include unnecessary declamations to TargetMachine.

I just sent out the patch that puts PGOOptions to TargetMachine before
getting this message.
So EnableFSDiscriminator is not included.

If we want to include EnableFSDiscrinator there, we'd better to promote
this to an user visible option (as of now, it's an internal option).
I was thinking this option only serves in the transitional phrase. In the
long term, we will make it default and get rid of it.
So, I did not add it as a user level flag.

If people think we should keep it, i'd happy to change it to a user level
flag and include it in PGOOptions.

hoy added a subscriber: xur.Aug 18 2021, 10:35 AM

Thanks working on including PGOOptions in TargetMachine. It sounds good to me to leave EnableFSDiscrinator as an llvm switch for now.

Thanks,
Hongtao

From: Rong Xu <xur@google.com>
Date: Wednesday, August 18, 2021 at 10:15 AM
To: Hongtao Yu <reviews+D107878+public+a10d6f6e79d09f89@reviews.llvm.org>
Cc: wmi@google.com <wmi@google.com>, Wenlei He <wenlei@fb.com>, Hongtao Yu <hoy@fb.com>, stevenwu@apple.com <stevenwu@apple.com>, matthew.voss@sony.com <matthew.voss@sony.com>, mgorny@gentoo.org <mgorny@gentoo.org>, davidxl@google.com <davidxl@google.com>, pengfei.wang@intel.com <pengfei.wang@intel.com>, dexonsmith@apple.com <dexonsmith@apple.com>, llvm-commits@lists.llvm.org <llvm-commits@lists.llvm.org>, bhuvanendra.kumarn@amd.com <bhuvanendra.kumarn@amd.com>, yanliang.mu@intel.com <yanliang.mu@intel.com>, dougpuob@gmail.com <dougpuob@gmail.com>, david.green@arm.com <david.green@arm.com>, yuanfang.chen@sony.com <yuanfang.chen@sony.com>, ruiling.song@amd.com <ruiling.song@amd.com>
Subject: Re: [PATCH] D107878: [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader
I just sent out the patch that puts PGOOptions to TargetMachine before getting this message.
So EnableFSDiscriminator is not included.

If we want to include EnableFSDiscrinator there, we'd better to promote this to an user visible option (as of now, it's an internal option).
I was thinking this option only serves in the transitional phrase. In the long term, we will make it default and get rid of it.
So, I did not add it as a user level flag.

If people think we should keep it, i'd happy to change it to a user level flag and include it in PGOOptions.

xur updated this revision to Diff 367289.Aug 18 2021, 11:54 AM

Fix some issues in last patch set.

wmi accepted this revision.Aug 18 2021, 12:58 PM

LGTM. Thanks.

This revision is now accepted and ready to land.Aug 18 2021, 12:58 PM
hoy accepted this revision.Aug 18 2021, 1:07 PM

LGTM.

wenlei accepted this revision.Aug 18 2021, 1:36 PM
wenlei added inline comments.
llvm/lib/CodeGen/FlowSensitiveSampleProfile.cpp
293 ↗(On Diff #365656)

Ok, but resetting it isn't going to cause problem either, right? Just trying to understand if there's a correctness issue that requires not resetting these fields. If not, might as well not adding the special case and always reset/clear. Seem unnecessary to skip reset, but not a big deal though.

xur added a subscriber: zturner.Aug 18 2021, 1:52 PM

Yes. Resetting them will not cause correctness problems.

Note that SampleProfileLoader is a module scope object -- we initialize
it in a module pass. LoopInfo, DOM etc are a module level analysis obj.
After inline, we need to invalidate and recompute.

This is different for MIRProfileLoader (which is a function level object).
The loopInfo and Dom etc are specific to this function. Since we do not do
any transformation (except changing the BranchProb). I don't think we need
to invalidate them either.

xur updated this revision to Diff 367321.Aug 18 2021, 2:07 PM
xur edited the summary of this revision. (Show Details)

Discussed with Wei offline: it's better to set fs profile loader as off by default so people can opt-in (rather opt-out).
Once we are done with more perf testing, we will switch the default to on.

hoy added a comment.Aug 18 2021, 2:23 PM

Discussed with Wei offline: it's better to set fs profile loader as off by default so people can opt-in (rather opt-out).
Once we are done with more perf testing, we will switch the default to on.

Sounds good to me.

Discussed with Wei offline: it's better to set fs profile loader as off by default so people can opt-in (rather opt-out).
Once we are done with more perf testing, we will switch the default to on.

For FS-AFDO to be the default, we need profile generation part to be available to everyone as well (ideally in llvm-profgen).

This revision was landed with ongoing or failed builds.Aug 18 2021, 6:44 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2021, 6:44 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
lenary added a subscriber: lenary.Aug 19 2021, 9:14 AM
lenary added inline comments.
llvm/test/CodeGen/X86/fsafdo_test2.ll
3

I think this REQUIRES: asserts if you're checking the debug output?

wenlei added inline comments.Aug 19 2021, 9:15 AM
llvm/test/CodeGen/X86/fsafdo_test2.ll
3

Should be fixed in D108364

hoy added inline comments.Aug 19 2021, 9:16 AM
llvm/test/CodeGen/X86/fsafdo_test2.ll
3

That's right. It was just fixed by https://reviews.llvm.org/D108364.

JDevlieghere added inline comments.
llvm/include/llvm/CodeGen/MIRSampleProfile.h
59

You're instantiating a forward-declared type. This breaks the modules build: https://green.lab.llvm.org/green/job/lldb-cmake/34450/console

llvm/lib/CodeGen/MIRSampleProfile.cpp
289

Why is this outside the llvm namespace?

dexonsmith added inline comments.Aug 19 2021, 10:32 AM
llvm/lib/CodeGen/MIRSampleProfile.cpp
289

I think it's common style in LLVM to have function definitions outside of namespaces -- IMO, the odd thing here is that the preceding function definitions were inside the namespace.

RKSimon added inline comments.
llvm/lib/LTO/LTOBackend.cpp
234

@xur Why does TM need a null check - afaict it's already been dereferenced in the call tree, and will be again below.

Noticed in a scan-build warning: https://llvm.org/reports/scan-build/report-LTOBackend.cpp-runNewPMPasses-34-35dfe9.html#EndPath