This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Fix profile mismatch in Comdat function with pre-inliner
ClosedPublic

Authored by xur on Jul 20 2016, 2:59 PM.

Details

Summary

Pre-instrumentation inline (pre-inliner) greatly improves the IR instrumentation code performance, among other benefits. One issue of the pre-inliner is it can introduce CFG-mismatch for COMDAT functions. This is due to the fact that the same COMDAT function may have different early inline decisions across different modules -- that means different copies of COMDAT functions will have different CFG checksum.

A simple example:

COMDAT function foo() --> bar()

foo() is defined in f1.cc and f2.cc. If bar() is available in f1.cc and is small enough to inline in pre-inliner, and it's not available in f2.cc. After pre-inline pass, f1.cc and f2.cc each has its own foo() with different IR (different CFG). If the version in f1.cc is chosen, we only have this version of counters in the profile (as we merge the COMDAT profile variables). We will have checksum mismatch when do a profile-use compilation in f2.cc.

In this patch, we propose a partially renaming the COMDAT group and its member function/variable so we have different profile counter for each version. We will post-fix the COMDAT function and the group name with its FunctionHash.

For the above case, if the foo() in f1.cc has a hash of 123456 and the foo() in f2.cc has a hash of 234567. foo() in f1.cc will be rename to foo.123456() (as well as the COMDAT name, and the profile variables in that function). foo() in f2.cc will be rename to foo.234567() (as well as the COMDAT name, and the profile variables in that function).

There are cases where two functions with the same FunctionHash might not have the same IR. For example,
foo() { bar(); goo(); }
bar() { a++; }
goo() { b++; }
in f1.cc bar() is inlined while in f2.cc goo() is inlined. From edge profile point of view, nothing will go wrong as both version has the same profile variables and all the counter updates will be captured.

The only potential problem is the indirect-call profiling. For example, if bar() contains an indirect-call. The version of foo() in f1.cc will also contain an indirect-call counter, while the version in f2.cc does not have. This will create a mismatch when reading value profiles. To address this, we add the number of indirect-calls to the function hash.

This is not bullet-proof solution. As same number of indirect-calls does not guarantee the indirect-calls are the same. For example, if goo() contains a different indirect-call. Both version in f1.cc and f2.cc has one indirect call, even though they have different call-sites. Our proposed method will treat the two versions of foo() as identical as they have the same function hash. But we believe this happens rarely. Also note that the worse case here is that we will promote a wrong indirect-target. There are not correctness issues.

Some implementation details:
(1) The mismatch also applies to AvailableExternallyLinkage functions.
We will convert AvailableExternallyLinkage functions to COMDAT functions, but not ExternalWeakLinkage functions.

(2) COMDAT group with multiple functions.
We only handle COMDAT group that having one COMDAT function to reduce the complexity. If a COMDAT group has multiple functions, we need to have a unique post-fix for all the functions. To do this, it requires to collect all the member functions and their hash, which is costly.

An alternate way is to do a post instrumentation fix-up on the instrumentation intrinsics. This is costly too.

We find multiple functions COMDAT groups are relatively rare (mostly in global static initializer). So we decided to only handle single member COMDAT groups.

Future work:
(1) Reduce the number of renamings
One optimization that reduces the number of renamings is to only apply the renaming to the COMDAT functions that preinline occurs. Unfortunately, currently we does not have an attribute for this.

(2) Darwin
Darwin does not use COMDAT, instead, it uses LinkOnce linkage. We will have a separated patch to deal the mismatch introduced by pre-inliner.

Diff Detail

Repository
rL LLVM

Event Timeline

xur updated this revision to Diff 64772.Jul 20 2016, 2:59 PM
xur retitled this revision from to [PGO] Fix profile mismatch in Comdat function with pre-inliner.
xur updated this object.
xur added reviewers: davidxl, silvas, tejohnson.
xur added subscribers: llvm-commits, xur.

adding Jake

jakev edited edge metadata.Jul 20 2016, 5:38 PM

Just a couple nits from me.

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
121 ↗(On Diff #64772)

Does this need ZeroOrMore?

122 ↗(On Diff #64772)

functin -> function

286 ↗(On Diff #64772)

Can this just be ComdatMembers.size()?

325 ↗(On Diff #64772)

Can this be called 'canRenameComdat'?

888 ↗(On Diff #64772)

comdat.in -> comdat in

silvas accepted this revision.Jul 21 2016, 12:14 AM
silvas edited edge metadata.

I think this is a good solution. LGTM although I'd like to wait for other reviewers to chime in.

FWIW, one alternative solution to this issue (and the static var name issue that Jake fixed recently) is to make it so that multiple functions with same name and different hash can coexist naturally. I haven't looked closely, but this would require changing quite a bit (e.g. indexed format, various API's, etc.). So at the moment I don't think it is worth it since this patch + Jake's patch are relatively small.

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
988 ↗(On Diff #64772)

Is this variable Builder needed?

This revision is now accepted and ready to land.Jul 21 2016, 12:14 AM
xur marked 5 inline comments as done.Jul 21 2016, 11:23 AM
xur added inline comments.
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
121 ↗(On Diff #64772)

Should be Optional. Removed ZeroOrMore.

286 ↗(On Diff #64772)

good suggestion. changed.

325 ↗(On Diff #64772)

changed.

988 ↗(On Diff #64772)

This is from a leftover of earlier merge. It should not be here. Thanks for catching this.

xur updated this revision to Diff 64931.Jul 21 2016, 11:24 AM
xur edited edge metadata.
xur marked 3 inline comments as done.

Integrated the review suggestions from Jake and Sean.

Thanks,

-Rong

davidxl added inline comments.Jul 21 2016, 11:38 AM
lib/ProfileData/InstrProf.cpp
784 ↗(On Diff #64931)

Please separate the restructuring into a NFC patch

lib/Transforms/Instrumentation/PGOInstrumentation.cpp
120 ↗(On Diff #64931)

Change the name to "DoComdatRenaming"

xur updated this revision to Diff 64946.Jul 21 2016, 1:14 PM

Address David's review comments:
(1) change option name.
(2) split the refactoring work into NFC patch:
https://reviews.llvm.org/D22643
D22643 [PGO] Make needsComdatForCounter() available (NFC)

davidxl added inline comments.Jul 21 2016, 1:47 PM
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
338 ↗(On Diff #64946)

Perhaps add an assert about the linkage?

341 ↗(On Diff #64946)

Add // FIXME and some explanation of the limitation.

342 ↗(On Diff #64946)

have have --> have

344 ↗(On Diff #64946)

Also explain why having other variables are ok?

365 ↗(On Diff #64946)

--> change the linkage to LinkOnceODR and put them into comdat. This is because after renaming, there is no backup external copy available for the function.

367 ↗(On Diff #64946)

Add assertion about availableExternally linkage.

test/Transforms/PGOProfile/comdat_internal.ll
7 ↗(On Diff #64946)

Can you use wild card and defined a variable for the hash val? There is no need to test the actual alue.

test/Transforms/PGOProfile/indirect_call_profile.ll
16 ↗(On Diff #64946)

Same here

xur marked 7 inline comments as done.Jul 21 2016, 4:59 PM
xur added inline comments.
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
344 ↗(On Diff #64946)

It turns out having variable in the comdat group is not safe to rename: if we rename the variable, we will create different copy of variable which is wrong. if we don't rename, we might run into duplicated symbol in linking.
So in the new patch, I disable the rename of the comdat groups with variables.

It's ok to have aliases though.

xur updated this revision to Diff 64997.Jul 21 2016, 5:02 PM

This new patch addressed David's comments:
It refines the handling of comdat group with variables and aliases, and adds a test case to check handled and unhandled cases.

davidxl added inline comments.Jul 22 2016, 12:06 PM
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
348 ↗(On Diff #64997)

(2) variables can not be renamed, so we can not rename comdat function in a group including global vars.

393 ↗(On Diff #64997)

Add assert that the alias target is still F

test/Transforms/PGOProfile/comdat_rename.ll
2 ↗(On Diff #64997)

you may want to try this test case on other platform such as coff to make sure it works:

-mtriple=x86_64-pc-win32-coff

xur marked an inline comment as done.Jul 22 2016, 1:50 PM
xur added inline comments.
lib/Transforms/Instrumentation/PGOInstrumentation.cpp
393 ↗(On Diff #64997)

I'll assert GA's comdat is the same as OrigComdat.

test/Transforms/PGOProfile/comdat_rename.ll
2 ↗(On Diff #64997)

The test failed for the AvailableExternallyLinkage case. The reason is in needsComdatForCounter().

For AvailableExternallyLinkage, since this is not ELF format target, we return false.

I'm wondering if we should move the checking for TargetTriple() after the linkage check. The target can have Comdat support for nonELF targets after all.

davidxl added inline comments.Jul 22 2016, 1:56 PM
test/Transforms/PGOProfile/comdat_rename.ll
2 ↗(On Diff #64997)

you should explicitly add -mtripple (covering elf and coff). For coff case, do not check availableExternally (which is not supported yet with this change).

xur updated this revision to Diff 65179.Jul 22 2016, 3:26 PM

Integrated David's comments.

Thanks,

-Rong

davidxl accepted this revision.Jul 25 2016, 10:11 AM
davidxl edited edge metadata.

As a follow up, perfhaps a new internal option can be introduced to force 'privatize a comdat function': when a comdat function has module context specific profile, it is can be useful to let it have its own copy of profile counters.

lgtm

This revision was automatically updated to reflect the committed changes.

Thanks David for the close review! (and of course thanks to Rong for the patch!)