This is an archive of the discontinued LLVM Phabricator instance.

[AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code
ClosedPublic

Authored by wlei on Sep 13 2021, 12:11 PM.

Details

Summary

In order to be consistent with compiler that interprets zero count as unexecuted(cold), this change reports zero-value count for unexecuted part of function code. For the implementation, it leverages the range counter, initializes all the executed function range with the zero-value. After all ranges are merged and converted into disjoint ranges, the remaining zero count will indicates the unexecuted(cold) part of the function.

This change also extends the current findDisjointRanges method which now can support adding zero-value range.

Diff Detail

Event Timeline

wlei created this revision.Sep 13 2021, 12:11 PM
wlei requested review of this revision.Sep 13 2021, 12:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2021, 12:11 PM
wlei retitled this revision from [AutoFDO][llvm-profgen] Report zero count for unexecuted function code to [AutoFDO][llvm-profgen] Report zero count for unexecuted part of function code.Sep 13 2021, 12:19 PM
wlei edited the summary of this revision. (Show Details)
wlei added reviewers: hoy, wenlei, wmi.
wlei updated this revision to Diff 372319.Sep 13 2021, 12:28 PM
wlei edited the summary of this revision. (Show Details)

fix one return type error

hoy added a comment.Sep 14 2021, 4:36 PM

Thanks for the change. I'm wondering instead of extending findDisjointRanges if we can just add a post-processing pass that adds a zero sample to all line offsets. We did this for probe-based CS profile.

wlei added a comment.Sep 14 2021, 4:56 PM

Thanks for the change. I'm wondering instead of extending findDisjointRanges if we can just add a post-processing pass that adds a zero sample to all line offsets. We did this for probe-based CS profile.

The advantage to do it by extending findDisjointRanges is it can save some operations and adding zero sample for line-number is heavier than probe because it need to travels the inline callstack.

For extreme case say if we have counter like [1,9999] : 1. and the function range is [1,10000], post-processing will do both for [1,9999] : 1, [1,10000]: 0 . it's almost double the cost. pre-processing will do [1,9999] : 1 and [10000,10000]:0.

hoy added a comment.Sep 14 2021, 6:27 PM

Thanks for the change. I'm wondering instead of extending findDisjointRanges if we can just add a post-processing pass that adds a zero sample to all line offsets. We did this for probe-based CS profile.

The advantage to do it by extending findDisjointRanges is it can save some operations and adding zero sample for line-number is heavier than probe because it need to travels the inline callstack.

For extreme case say if we have counter like [1,9999] : 1. and the function range is [1,10000], post-processing will do both for [1,9999] : 1, [1,10000]: 0 . it's almost double the cost. pre-processing will do [1,9999] : 1 and [10000,10000]:0.

Sounds good. Leveraging the range calculator to avoid redundant scanning is smart.

hoy added inline comments.Sep 15 2021, 5:39 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
141–142

Add a comment for this? Does it stand for a point that only have a zero count?

208–209

This initialization is probably not needed since you are adding the default constructor.

226

Should this always be true? addBeginCount is called every time a new boundary point is created?

295

Can you remind me of what could lead to a zero count previously? I'm wondering if that kind of zero count should be reported.

325

nit: preProcess -> preprocess

llvm/tools/llvm-profgen/ProfiledBinary.cpp
400–401

Does the initializer list require c++14? emplace_back should work if you see a warning for that.

llvm/tools/llvm-profgen/ProfiledBinary.h
167–169

Nit: rename FuncStartAddrMap to FuncStartOffsetMap?

Please adjust comment to include end offset.

hoy added inline comments.Sep 15 2021, 5:48 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
226

nvm, I get that.

hoy added inline comments.Sep 15 2021, 5:55 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
242

Add a comment for this? Suppose we never have a zero-count range, does this change the default behavior?

wlei updated this revision to Diff 372995.Sep 16 2021, 10:40 AM

Addressing Hongtao's feedback.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
141–142

Comments added, not a zero count only, it's also used for the overlapping point. like

[----10 ----]
[--------- 0 ----------]
A            B         C

Point A's count is 10 and the isZeroRangeBegin is true.

208–209

Sounds good, refactor the code to use emplace

242

Comment added. It will change our previous logic a little bit, i, e. previous one can generate new zero-count range.
For example, supposing we have two non-overlapping ranges

[<---- 10 ---->]  
                            [<----20 ----->]
A              B             C             D

Previously, it will add a zero range [B+1, C-1]. (Anyway, this extra zero range will be filtered later as you can see the code below).

After this change, the begin point B+1 is marked invalid(UINT64_MAX), so no extra zero range is added.

295

See the above comment. that's from the gap between two non-overlapping ranges.

325

Fixed!

llvm/tools/llvm-profgen/ProfiledBinary.cpp
400–401

The value here is a std::pair, I guess you mean emplace?

llvm/tools/llvm-profgen/ProfiledBinary.h
167–169

Good catch!

hoy added inline comments.Sep 16 2021, 12:44 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
208–209

I actually meant you can just do something like below. Sorry for the confusion.

Boundaries[Begin].addBeginCount(Count);
Boundaries[End].addEndCount(Count);
if (Count == 0) {
  BeginPoint.IsZeroRangeBegin = true;
  EndPoint.IsZeroRangeEnd = true;
}
242

Thanks for the explanation. It makes sense to me now. Would be great to include your example in the comment.

wlei updated this revision to Diff 373047.Sep 16 2021, 1:11 PM

addressing Hongtao's feedback.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
208–209

I see, that's clearer. Thanks!

242

example added!

hoy added a comment.Sep 16 2021, 1:44 PM

lgtm, thanks.

hoy accepted this revision.Sep 16 2021, 1:44 PM
This revision is now accepted and ready to land.Sep 16 2021, 1:44 PM
wenlei accepted this revision.Sep 22 2021, 10:12 PM

lgtm, thanks.

For the implementation, it leverages the range counter, initializes all the executed function range with the zero-value.

This should be quite efficient.

llvm/tools/llvm-profgen/ProfileGenerator.cpp
148

curious why we need this?

221

Since we only add zero ranges for functions, there won't be any overlapping zero range and the Depth is going to be either 0 or 1, right?

wlei added inline comments.Sep 22 2021, 10:46 PM
llvm/tools/llvm-profgen/ProfileGenerator.cpp
148

uh, I just wrote it unintentionally, it can be removed.

221

Yes, in our case, function range apparently won't overlap and also there is no zero count in the sample.

This revision was landed with ongoing or failed builds.Sep 24 2021, 2:16 PM
This revision was automatically updated to reflect the committed changes.