This is an archive of the discontinued LLVM Phabricator instance.

[CSSPGO][llvm-profgen] Compress recursive cycles in calling context
ClosedPublic

Authored by wlei on Dec 18 2020, 10:42 AM.

Details

Summary

This change compresses the context string by removing cycles due to recursive function for CS profile generation. Removing recursion cycles is a way to normalize the calling context which will be better for the sample aggregation and also make the context promoting deterministic.
Specifically for implementation, we recognize adjacent repeated frames as cycles and deduplicated them through multiple round of iteration.
For example:
Considering a input context string stack:
[“a”, “a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For first iteration,, it removed all adjacent repeated frames of size 1:
[“a”, “b”, “c”, “a”, “b”, “c”, “b”, “c”, “d”]
For second iteration, it removed all adjacent repeated frames of size 2:
[“a”, “b”, “c”, “a”, “b”, “c”, “d”]
So in the end, we get compressed output:
[“a”, “b”, “c”, “d”]

Compression will be called in two place: one for sample's context key right after unwinding, one is for the eventual context string id in the ProfileGenerator.
Added a switch compress-recursion to control the size of duplicated frames, default -1 means no size limit.
Added unit tests and regression test for this.

Diff Detail

Event Timeline

wlei created this revision.Dec 18 2020, 10:42 AM
wlei requested review of this revision.Dec 18 2020, 10:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 18 2020, 10:42 AM
wlei edited the summary of this revision. (Show Details)Dec 18 2020, 11:28 AM
wlei added reviewers: wmi, davidxl, hoy, wenlei.
wlei updated this revision to Diff 313134.Dec 21 2020, 9:49 AM
wlei edited the summary of this revision. (Show Details)

rebase

wmi added inline comments.Jan 10 2021, 9:29 PM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #313134)

Could you show the result of --compress-recursion=0 (I assume no compression will happen) so it is easy to check which part of the output is compressed for --compress-recursion=-1?

llvm/tools/llvm-profgen/ProfileGenerator.cpp
450

Use SmallVectorImpl<std::string>& as a parameter type instead of SmallVector<std::string, 16>&. There are some other places with the same issue.

473

Use SmallVectorImpl<std::string>& as a parameter type instead of SmallVector<std::string, 16>&.

501

Nit: Change it to ContextStrStackCopy to make the vector copy more explicit.

llvm/tools/llvm-profgen/ProfileGenerator.h
94–95

Is p a variable or does it refer to some other variable?

106–108

Could you give an example of what the sequence looks like after the population?

llvm/unittests/tools/llvm-profgen/ContextCompressionTest.cpp
15

How about have a similar test for pseudo probe?

wlei updated this revision to Diff 315966.Jan 11 2021, 5:12 PM

Addressing Wei's feedback

wlei added inline comments.Jan 11 2021, 5:15 PM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #313134)

Good idea, added the uncompressed test case

llvm/tools/llvm-profgen/ProfileGenerator.cpp
450

Yeah, thanks for the suggestions, fixed all the issues.

501

Good catch, changed to use a new variable ContextStrStackCopy to make it more explicit.

llvm/tools/llvm-profgen/ProfileGenerator.h
94–95

Sorry, I just miss to update the comments. 'p' is actually the Left. fixed the comments.

106–108

Sure, an example is added in the code comments

llvm/unittests/tools/llvm-profgen/ContextCompressionTest.cpp
15

Do you mean to add compression test for the element whose type is PseudoProbe?
The pseudo probe in the context key is represented as a PseudoProbe pointer, i.e an integer, so it should be same as the string. The only thing for the type T is implementing the == operator.

wmi added inline comments.Jan 12 2021, 4:43 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Thanks for the example. I can understand where are the redundent comparisons but I havn't understood how to skip the windows by doing the population (std::copy). I don't see the std::copy change anything for the example below. Could you clarify?

Similarly I don't see how the duplicated str is removed from Context sequence when Duplication is found above (if (Right - Left == I) is true). Could you also clarify that?

llvm/unittests/tools/llvm-profgen/ContextCompressionTest.cpp
15

Ok, thanks for the clarification.

wlei updated this revision to Diff 316530.Jan 13 2021, 4:09 PM

rewrote the comments, adding more examples

wlei added inline comments.Jan 13 2021, 4:10 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Yeah, our original design is to use a new vector to store the compressed result, then we changed to use in-place algorithm to reduce the memcpy which make readability worse.
rewrote the code comments, added another example, please see this version is clear or not?

hoy added inline comments.Jan 14 2021, 11:13 AM
llvm/tools/llvm-profgen/ProfileGenerator.h
61

Moving the implementation to ProfileGenerator.cpp so that RecursionCompression can be checked instead of using an extra global?

63

typo: sequence

wmi added inline comments.Jan 14 2021, 3:10 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Thanks, now I understand it better. However, I am surprised seems you doesn't remove the redundent comparison which I thought you tried to remove.

suppose you have the following sequence:
a b c d b c d

Considering I==3.
In the first iteration. abc and dbc are compared, so Left==0 and Right==2, The algorithm find the first non-common place is Left==0. Then it updates Right to be Left + I = 3 at the end of the iteration. Note in this iteration, it have compared the common parts so it already knows the 2th/5th char in the string are the same --> 'b', and 3th/6th chars in the string are the same --> 'd'.

In the next iteration, the algorithm executes the comparison loop again:

uint32_t Left = Right;
while (Right - Left < I && Context[Left] == Context[Left + I]) {
   // Find the longest suffix inside the window. When stops, Left points
   // at the diverging point in the current sequence.
   Left--;
}

With Right==3, it will compare 3 chars before it can confirm there is duplication found, so it will compare 2th char and 5th char, 4th char and 6th char again which it already know they are the same in the first iteration.

Do I understand it correctly?

wmi added inline comments.Jan 14 2021, 3:15 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Sorry: some typos:

2th/5th char in the string are the same --> 'b', and 3th/6th chars in the string are the same --> 'd'.

'd' ==> 'c'

so it will compare 2th char and 5th char, 4th char and 6th char again which it already know they are the same in the first iteration.

4th char and 6th char again ==> 3th char and 6th char again

wlei added inline comments.Jan 14 2021, 3:52 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Yes, your'are right. I think we miss this, your case is another optimizing opportunity.

This code's redundancy removing is like below:

0 1 2 3 4 5 6 7 8
a b c a b d a b d
I = 3

For 1st iteration, abc and abd are compared, we know c and d is different. So Left is 2 and Right is 2. Then Right will be Left + I = 5. It means to avoid comparing [bca, bda] and [cab, dab] since we already know the comparison between c and d is false.

The 2nd iteration will just compare [abd, abd]

So let me implement it based on your idea, thanks for your feedback.

hoy added inline comments.Jan 14 2021, 4:08 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Yes, that's a missing opportunity. Once a common suffix is found, it will become the common prefix for the next compare. Then the next compare will not need to compare the common prefix again. The optimization was not done because it adds complexity and does not change the complexity, still O(n).

Thanks for pointing this out, Wei and thanks for giving it a shot Lei!

wmi added inline comments.Jan 14 2021, 4:20 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

The optimization was not done because it adds complexity and does not change the complexity, still O(n).

Agree. I don't know how much cpu and memory overhead are spent on compression for large case. What is your experience on it? That will help to decide the tradeoff between optimization and complexity.

hoy added inline comments.Jan 14 2021, 4:45 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

We haven't measured that. But I think this optimization is worse pursuing. We just need a variable to store the length of the last common suffix and stop the comparing when the rest of string to compare is right in that length.

wlei added inline comments.Jan 14 2021, 4:58 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
106–108

Yeah, I think that shouldn't need too much work. Let me try it.

wlei updated this revision to Diff 317006.Jan 15 2021, 9:53 AM

Remove more redundancy cases during deduplication

wlei added inline comments.Jan 15 2021, 9:57 AM
llvm/tools/llvm-profgen/ProfileGenerator.h
61

Yes, I tried this but failed because of the unit test which need to link the cpp file and the dependence need to include many other files. Not sure other ways to link the a Tool dir to the unit test, so I chose to put it into header file. what do you think?

63

fixed

106–108

Implemented, added a new variable LeftBoundary for the check and the Left pointer will end at it not the Right - I.

hoy added inline comments.Jan 15 2021, 4:41 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
61

I see. Can CompressionSize be made a static field of the class and initialized in the cpp file?

Also, it sounds to me compressRecursionContext should be a function of CSProfileGenerator since it is CS-specific. What do you think?

106–108

looks good, thanks!

wlei updated this revision to Diff 317604.Jan 19 2021, 9:27 AM

addressing Hongtao's feedback

wlei added inline comments.Jan 19 2021, 9:29 AM
llvm/tools/llvm-profgen/ProfileGenerator.h
61

Yes, good suggestions, thanks. made CompressionSize static and moved compressRecursionContext into CSProfileGenerator

hoy added inline comments.Jan 19 2021, 4:01 PM
llvm/tools/llvm-profgen/ProfileGenerator.h
20

Remove this?

221

Nit: add a comment about the special value -1?

wlei updated this revision to Diff 317725.Jan 19 2021, 4:44 PM

addressing Hongtao's feedback

llvm/tools/llvm-profgen/ProfileGenerator.h
20

Good catch!

221

Good point!

hoy accepted this revision.Jan 20 2021, 11:03 AM
This revision is now accepted and ready to land.Jan 20 2021, 11:03 AM
wenlei added inline comments.Jan 20 2021, 11:09 AM
llvm/test/tools/llvm-profgen/recursion-compression.test
69 ↗(On Diff #317725)

nit: missed a CHECK?

1 ↗(On Diff #313134)

Would it be better to place the check for compressed form side by side with uncompressed form for readability?

wenlei added inline comments.Jan 20 2021, 12:05 PM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #317725)

A test case to cover line-based profile too? That goes through slightly different path for handling the inline context expansion, so e2e test would be good (doesn't have to be as through as the one for probe though).

llvm/tools/llvm-profgen/ProfileGenerator.h
96

We could remove this redirector, and just use default parameter?

wlei updated this revision to Diff 318247.Jan 21 2021, 9:53 AM

addressing Wenlei's feedback, added line-based test

wlei added inline comments.Jan 21 2021, 10:00 AM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #317725)

Good catch, the line-based test added

69 ↗(On Diff #317725)

Good catch!

1 ↗(On Diff #313134)

Just tried but the problem here is the result is sorted by the total sample, if the order changed after compressed and merged, it couldn't properly match the line. So how about keep the current one or I can give some comments for readability?

llvm/tools/llvm-profgen/ProfileGenerator.h
96

Yeah, good to know this, we can use uint32_t CompressionSize = MaxCompressionSize as the default parameter.

wenlei added inline comments.Jan 21 2021, 12:27 PM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #317725)

Test for --compress-recursion=0 is gone? and I don't see new test case here?

1 ↗(On Diff #313134)

I see.. then we can leave it. thanks..

wlei added inline comments.Jan 21 2021, 12:33 PM
llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #317725)

Oh, I guess you need to refer the latest commit. I removed this file, so the code here is outdated.

wenlei accepted this revision.Jan 21 2021, 12:36 PM

lgtm, thanks!

llvm/test/tools/llvm-profgen/recursion-compression.test
1 ↗(On Diff #317725)

Oops.. sorry i was indeed looking at the wrong version. thanks for adding test for line-based profile.

wmi accepted this revision.Jan 21 2021, 7:16 PM

LGTM. Thanks.

This revision was landed with ongoing or failed builds.Feb 3 2021, 6:53 PM
This revision was automatically updated to reflect the committed changes.