This is an archive of the discontinued LLVM Phabricator instance.

[AutoFDO] Use flattened profiles for profile staleness metrics
ClosedPublic

Authored by wlei on Mar 20 2023, 1:24 PM.

Details

Summary

For profile staleness report, before it only counts for the top-level function samples in the nested profile, the samples in the inlinees are ignored. This could affect the quality of the metrics when there are heavily inlined functions. This change adds a feature to flatten the nested profile and we're changing to use flatten profile as the input for stale profile detection and matching.
Example for profile flattening:

Original profile:
_Z3bazi:20301:1000
 1: 1000
 3: 2000
 5: inline1:1600
   1: 600
   3: inline2:500
     1: 500

Flattened profile:
_Z3bazi:18701:1000
 1: 1000
 3: 2000
 5: 600 inline1:600
inline1:1100:600
 1: 600
 3: 500 inline2: 500
inline2:500:500
 1: 500

This feature could be useful for offline analysis, like understanding the hotness of each individual function. So I'm adding the support to llvm-profdata merge under --gen-flattened-profile.

Diff Detail

Event Timeline

wlei created this revision.Mar 20 2023, 1:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 1:24 PM
wlei requested review of this revision.Mar 20 2023, 1:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 1:24 PM
wlei retitled this revision from [AutoFDO] Use flattened profiles for stale profile metrics to [AutoFDO] Use flattened profiles for profile staleness metrics.Mar 20 2023, 1:52 PM
wlei edited the summary of this revision. (Show Details)
wlei added reviewers: hoy, wenlei, xur, davidxl.
wlei edited the summary of this revision. (Show Details)Mar 20 2023, 1:55 PM

Document the new llvm-profdata option in the Commandline guide.

llvm/include/llvm/ProfileData/SampleProf.h
1293

Remove 'CS'

llvm/lib/Transforms/IPO/SampleProfile.cpp
145

Can you add a test case of using the new option for state profile detection?

mingmingl added inline comments.Mar 20 2023, 11:33 PM
llvm/include/llvm/ProfileData/SampleProf.h
1316

nit: const SampleProfileMap

1319

nit: I'm wondering if it's simpler to pass 'ProfileMap' as opposed to use std::move

llvm/lib/Transforms/IPO/SampleProfile.cpp
2077–2078

for my information, is it intended that flattening is true if it's not set (num occurrences = 0)? ask since it seems to change existing behavior.

2089–2093

If I read correctly, FunctionSamples::ProfileIsCS is a per-reader instance bool (correct me if I miss anything). So it might be useful to add tests for context-sensitive profile and non-CS profiles.

wlei marked an inline comment as done.Mar 21 2023, 5:32 PM
wlei added inline comments.
llvm/include/llvm/ProfileData/SampleProf.h
1316

As mentioned below, it's intentional to change the input.

1319

It's a bit tricky to update the map while iterating its element, it could insert a new FunctionSamples. We had an internal patch to do it in-place, it ended up using an additional map, later we still needed to make copies to the original ProfileMap. IMO, that's more complicated than this version.

llvm/lib/Transforms/IPO/SampleProfile.cpp
145

Test added

2077–2078

Yeah, it's intended to set true, maybe I can just set the default value to true. I was thinking to use different value for pre-link and post-link here, but that's not ready, will leave it later.

2089–2093

Sounds good, full CS test is added.

wlei updated this revision to Diff 507189.Mar 21 2023, 5:33 PM
wlei edited the summary of this revision. (Show Details)

addressing reviewers' feedback: updated docs, added tests, other fixs.

hoy added inline comments.Mar 21 2023, 6:03 PM
llvm/include/llvm/ProfileData/SampleProf.h
1293

Update the comment correspondingly.

1293–1296

nit: name it convertCSProfiles .

1322

nit: swap the parameter declarations

llvm/lib/Transforms/IPO/SampleProfile.cpp
461

nit: name it getFlattenedSamples?

2089

Does it make sense to incorporate the CS path into ProfileConverter::flattenProfile?

wlei added inline comments.Mar 22 2023, 11:15 AM
llvm/lib/Transforms/IPO/SampleProfile.cpp
2089

Good idea, done! I just found it needs to clear the context to make llvm-profdata sample writer pass the context check. Also added the llvm-profdata test for full cs profile.

wlei updated this revision to Diff 507438.Mar 22 2023, 11:15 AM

addressing Hongtao's feedback

wenlei added inline comments.Mar 22 2023, 6:01 PM
llvm/include/llvm/ProfileData/SampleProf.h
1339

This function doesn't have check on whether input is CS or profile or not, and it relies on caller to pass in the right kind of profile. Should we make it private and only expose flattenProfile as API which can take any kind of profile as input.

llvm/tools/llvm-profdata/llvm-profdata.cpp
1039

this check seems unnecessary

1249

do we need to make sure gen-flattened-profile and gen-cs-nested-profile won't/can't be used at the same time?

also wondering if we should consolidate them into one flag, something like profile-structure=<cs-nested|flat>.

wenlei added inline comments.Mar 22 2023, 6:08 PM
llvm/include/llvm/ProfileData/SampleProf.h
1359–1362

Is the total samples populated this way going to be consistent with total samples from profiles generated directly from llvm-profgen?

hoy added inline comments.Mar 23 2023, 10:19 AM
llvm/include/llvm/ProfileData/SampleProf.h
1357

nit: assert CallsiteSamples is empty?

llvm/tools/llvm-profdata/llvm-profdata.cpp
1249

also wondering if we should consolidate them into one flag, something like profile-structure=<cs-nested|flat>.

This is a good point. There's a similar term SampleProfileLayout introduced https://reviews.llvm.org/D121651 . Consider unifying them like below?

enum SampleProfileLayout {
  SPL_None = 0,
  SPL_Nest = 0x1,
  SPL_Flat = 0x2,
  SPL_CSFlat = 0x3,
};
wlei added inline comments.Mar 23 2023, 10:00 PM
llvm/include/llvm/ProfileData/SampleProf.h
1339

Sounds good, moved to private scope.

1357

Sounds good, assertion added

1359–1362

I think so. Supposing TotalSamples = TotalBodySamples + TotalCallsiteSamples. Remembering we discussed this while implementing llvm-profgen, the TotalBodySamples usually is not the sum of all bodySamples, we can't sum the bodySamples at the end for this, so here use the reversed way: use TotalSamples minus TotalCallsiteSamples to recover the totalBodySamples.

llvm/tools/llvm-profdata/llvm-profdata.cpp
1039

removed.

1249

Good idea! The implementation in https://reviews.llvm.org/D121651 looks good to me, so I copied that (except the "CSFlat", will leave it to the original patch), but I have scratching my head with the term nest, CS-nested, flat, cs-flat...I will follow your decision.

wlei updated this revision to Diff 507959.Mar 23 2023, 10:01 PM

Updating D146452: [AutoFDO] Use flattened profiles for profile staleness metrics

wenlei accepted this revision.Mar 24 2023, 8:25 AM

lgtm, thanks.

This revision is now accepted and ready to land.Mar 24 2023, 8:25 AM
hoy accepted this revision.Mar 24 2023, 10:04 AM
hoy added inline comments.
llvm/tools/llvm-profdata/llvm-profdata.cpp
1249

Thanks. Nest and Flat sound good to me for now.

mingmingl added inline comments.Mar 28 2023, 4:30 PM
llvm/include/llvm/ProfileData/SampleProf.h
1319

thanks! It's true that updating map while iterating the elements are complicated; and the current way of writing looks reasonable then (I tend to be cautious when seeing std::move but obviously over cautious here :))

llvm/lib/Transforms/IPO/SampleProfile.cpp
2077–2078

thanks for confirming!

bjope added a subscriber: bjope.Mar 31 2023, 1:22 AM
bjope added inline comments.
llvm/test/tools/llvm-profdata/sample-flatten-profile.test
6

Downstream I got this when building with expensive checks:

Command Output (stderr):
--
/repo/bbiswjenk/fem2s10-eiffel176/workspace/llvm/llvm-main-expensive-checks/llvm/test/tools/llvm-profdata/sample-flatten-profile.test:13:14: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: !CFGChecksum: 1
             ^
<stdin>:6:7: note: scanning from here
 10: 1
      ^
<stdin>:7:2: note: possible intended match here
 !CFGChecksum: 0
 ^

Input file: <stdin>
Check file: /repo/bbiswjenk/fem2s10-eiffel176/workspace/llvm/llvm-main-expensive-checks/llvm/test/tools/llvm-profdata/sample-flatten-profile.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           1: baz:169:10 
           2:  1: 10 
           3:  3: 20 
           4:  5: 20 foo:20 
           5:  6: 2 bar:2 
           6:  10: 1 
next:13'0           X error: no match found
           7:  !CFGChecksum: 0 
next:13'0     ~~~~~~~~~~~~~~~~~
next:13'1      ?                possible intended match
           8: foo:134:21 
next:13'0     ~~~~~~~~~~~
           9:  1: 21 
next:13'0     ~~~~~~~
          10:  3: 12 bar:11 
next:13'0     ~~~~~~~~~~~~~~
          11:  4: 1 
next:13'0     ~~~~~~
          12:  !CFGChecksum: 3 
next:13'0     ~~~~~~~~~~~~~~~~~
           .
           .
           .
>>>>>>

Although I haven't seen that in any upstream buildbot.

Could it perhaps be that the order in which things are outputted is sensitive to iteration order of some maps?
FWIW, SampleProfileMap is an UnorderedMap and you are at least iterating over such maps in a few places (and I think SampleProfReader/SampleProfWriter is using sortFuncProfiles to sort map entries when needed).

wlei added inline comments.Mar 31 2023, 10:24 AM
llvm/test/tools/llvm-profdata/sample-flatten-profile.test
6

Thanks for the comment. You're right, the checksum num should be the same for the same func name profile, I missed adding one checksum, it should be fixed by https://reviews.llvm.org/rG3520f6d66645.