This is an archive of the discontinued LLVM Phabricator instance.

SamplePGO ThinLTO ICP fix for local functions.
ClosedPublic

Authored by danielcdh on Mar 8 2017, 12:55 PM.

Details

Summary

In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places of where the mismatch can happen:

  1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.
  2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName

This patch tries at the best effort to find the GUID from the original local function name (in profile), and use that in ICP promotion, and in SamplePGO matching that happens in the backend after importing/inlining:

  1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
  2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName.
  3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote.

Event Timeline

danielcdh created this revision.Mar 8 2017, 12:55 PM
tejohnson edited edge metadata.Mar 9 2017, 9:23 AM

In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because the function name will mismatch as ThinLTO added module ID in the local function name. This patch tries at the best effort to find the GUID from the unpromoted local function name (in profile), and use that in ICP promotion.

Want to clarify what the situation is that we're trying to fix here. The function name used to compute the GUID in the index for ThinLTO for functions with local linkage is 'SourceFilePath + ":" + FunctionName'. I'm not sure how collecting on a non-LTO binary matters here. Note that ThinLTO will sometimes promote local functions to global scope and rename them, but that is a completely different naming scheme, and is done in the ThinLTO Backends ('FunctionName + ".llvm." + ModuleHash').

Normally the PGOFunctionName used in profiles for indirect call targets (at least with instrumentation FDO) is the same 'SourceFilePath + ":" + FunctionName'., so the GUID computed for ThinLTO is the same as the GUID in the profile metadata for the hot targets. Is that not the case for AutoFDO indirect call targets?

danielcdh edited the summary of this revision. (Show Details)Mar 9 2017, 1:14 PM

Thanks for updating the description, it makes more sense now. A few wording suggestions below, and comments on the code following that.

In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because the function name will mismatch as ThinLTO added module ID in the local function name. There are two places of where the mismatch can happen:

As you note below it isn't really a single problem of adding a module ID. I would just change this to say that ThinLTO adjusts local function names to avoid conflicts.

  1. thin-link appends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier)

s/appends/prepends/. Also add the fact that unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.

  1. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive UpdatedFunctionName

Clearer to call this PromotedFunctionName.

This patch tries at the best effort to find the GUID from the original local function name (in profile), and use that in ICP promotion:

(and in SamplePGO matching that happens in the backend after importing/inlining)

  1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
  2. in backend compiler, if sample profile reader cannot find a profile match for UpdatedFunctionName, it will try to find if there is a match for OriginalFunctionName.
  3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of UpdatedFunctionName, so that ICP can find the correct target to promote.
include/llvm/IR/ModuleSummaryIndex.h
564

The fact that ValueGUID may be modified should be documented. But I have a suggestion about this down in selectCallee where it is invoked.

include/llvm/ProfileData/SampleProfReader.h
288

Needs comment

lib/ProfileData/InstrProf.cpp
302

Needs comment.

lib/Transforms/IPO/FunctionImport.cpp
180

I think it would be better to detect whether we need to update the GUID in the caller, in computeImportForFunction, and explicitly update the GUID there near the start of the loop. It will make what is going on clearer. Also, there is at least one check done based on the GUID early in that loop (to see if it is already in the dest module), that should be done on the updated GUID.

test/Bitcode/thinlto-function-summary-originalnames.ll
17 ↗(On Diff #91061)

Why do we need to emit these in the combined index? We have the mappings available via the COMBINED_ORIGINAL_NAME if we need to recreate the relationship.

test/Transforms/PGOProfile/thinlto_samplepgo_icp.ll
7

Maybe, "calls to static target functions..." (I assume the promotion you are referring to is the ICP on the call, not the promotion to global of the static function?).
I assume that the GUID in the value profile metadata is from the original name - add comment to that effect?

danielcdh updated this revision to Diff 91251.Mar 9 2017, 6:46 PM
danielcdh marked 5 inline comments as done.

update

danielcdh edited the summary of this revision. (Show Details)Mar 9 2017, 6:50 PM
tejohnson accepted this revision.Mar 14 2017, 9:25 AM

LGTM, with some suggestions on comments below.

include/llvm/ProfileData/SampleProfReader.h
287

Are they stripped, or just not yet promoted and renamed?

lib/Bitcode/Writer/BitcodeWriter.cpp
3706

Needs comment about how the call edge may have been added based on indirect call profiles, in which case if it was for SamplePGO it would have used the OriginalID for local functions.

lib/Transforms/IPO/FunctionImport.cpp
200

Similar comment required here.

test/Transforms/PGOProfile/thinlto_samplepgo_icp.ll
7

This one looks not done yet.

This revision is now accepted and ready to land.Mar 14 2017, 9:25 AM
danielcdh updated this revision to Diff 91742.Mar 14 2017, 9:54 AM
danielcdh marked 3 inline comments as done.

update

danielcdh added inline comments.Mar 14 2017, 10:14 AM
include/llvm/IR/ModuleSummaryIndex.h
573

I also added a call to addOriginalName here as it should be the same as the other instance.

include/llvm/ProfileData/SampleProfReader.h
287

They should be all stripped. As discussed offline, if they are not stripped, we need to have a reliable way to map the unstripped name back to IR, which is undesirable.

Still LGTM except the new comments don't seem quite right to me.

include/llvm/ProfileData/SampleProfReader.h
287

Ok, right.

lib/Bitcode/Writer/BitcodeWriter.cpp
3707

This is not the name stripping in the profile issue though. This is because the SamplePGO doesn't use the PGOFuncName scheme of computing the GUID based on "SourceFileName:FunctionName" for locals, right?

lib/Transforms/IPO/FunctionImport.cpp
202

Ditto

danielcdh updated this revision to Diff 91748.Mar 14 2017, 10:41 AM

update comments

danielcdh edited the summary of this revision. (Show Details)Mar 14 2017, 10:42 AM
danielcdh closed this revision.Mar 14 2017, 10:45 AM