This is an archive of the discontinued LLVM Phabricator instance.

[SampleFDO] Add symbol whitelist in the profile and use it when profile-sample-accurate is enabled
Needs ReviewPublic

Authored by wmi on Aug 16 2019, 4:56 PM.

Details

Reviewers
davidxl
mtrofin
Summary

For sampleFDO, because the optimized build uses profile generated from previous release, we cannot tell a function without profile is truely cold or just newly created. Currently we treat them conservatively and put them in .text section instead of .text.unlikely. The result is when we persuing the best performance by locking .text.hot and .text in memory, and use huge pages for them, we waste a lot of memory and huge pages for functions actually being cold.

The patch add a new section to the profile. The section contains a list of function names showing up in the binary used to generate the current profile. During profile use compilation, when profile-sample-accurate is enabled, a function without profile will be regarded as cold only when it is contained in that list.

In order to add the symbol whitelist to the profile, I extend the profile format so it is easier to add a new section or extend an existing section in the future.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi created this revision.Aug 16 2019, 4:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 16 2019, 4:56 PM
Herald added a subscriber: mgrang. · View Herald Transcript

Can you split the format extension change into a separate patch?

wmi added a comment.Aug 19 2019, 8:44 AM

Can you split the format extension change into a separate patch?

Ok, I will do that.

mgrang added inline comments.Aug 19 2019, 2:51 PM
lib/ProfileData/SampleProf.cpp
222

Please use a range-based llvm::sort function here. See https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements.

llvm::sort(SortedList);
245

Please use a range-based llvm::sort function here. See https://llvm.org/docs/CodingStandards.html#beware-of-non-deterministic-sorting-order-of-equal-elements.

llvm::sort(SortedList);
wmi added a comment.Aug 20 2019, 9:09 PM

Split the patch of format extension into https://reviews.llvm.org/D66513. The symbol whitelist patch is on top of D66513 and will be sent out later.

davidxl added inline comments.Aug 22 2019, 3:29 PM
include/llvm/ProfileData/SampleProf.h
116

Name it 'SectEdgeProfile' or 'SecLBRProfile'?

Also move this after the WhiteList and reserve some enum range between WhiteList and the firstProfile

enum SecType {

SecProfileSummary = 0,
SecNameTable = 1,
SecWhiteList = 2,
SecFuncProfileFirst = 0x10,   // marker for the first type of profile.
SecLBRProfile = SecFuncProfileFirst

}

We should shoot for one profile type per section, but the hybrid mode you mentioned can also be supported if it is a size win.

wmi added a comment.Aug 26 2019, 12:30 PM
In D66374#1638720, @wmi wrote:

Split the patch of format extension into https://reviews.llvm.org/D66513. The symbol whitelist patch is on top of D66513 and will be sent out later.

The second half is here: https://reviews.llvm.org/D66766

mtrofin added inline comments.Jun 15 2020, 8:46 AM
include/llvm/ProfileData/SampleProf.h
121

I'd initialize these to something, and perhaps have an invalid value for the Type you can use here for initialization. This type is used in a std::vector; so it'd be possible for one to write at a point "vector.resize()" and end up with garbage.

576

SymbolAllowList? (may be easier to understand)

580

Copy not copy

588

bool contains(..) const

same for size()

595

why not return size_t, which is what Syms.size() is?

597

Nit: why not just pass a StringRef with length CompressSize, and then rename UncompressSize to ExpectedUncompressedSize - I think it would clarify the responsibility of the function

604

is Syms expected to be large? IIUC, DenseSet is good for small sets, it's "quadratically probed". (https://llvm.org/docs/ProgrammersManual.html#llvm-adt-denseset-h)

605

can you add a comment explaining the role of Allocator? I think it answers my question about ownership below, too.

include/llvm/ProfileData/SampleProfReader.h
499

nit: I would find "AllowList" easier to understand (less term decoding)

include/llvm/ProfileData/SampleProfWriter.h
59

should this be passed in the ctor? also, if not, why virtual - would it ever do anything else than set a field (tradeoff: less virtual calls, for extra pointer field)

83

this could be const if Format were passed in the ctor - would Format ever mutate?

162

Please init field values - safe to do here, at definition (maintainability - wise)

lib/ProfileData/SampleProf.cpp
204

who takes ownership of Buffer?

lib/ProfileData/SampleProfReader.cpp
497

nit: define BufStart + Entry.Offset + EntrySize as EntryEnd, and then reuse below, too (easier to understand)

623

I not i

also, ++

why uint32_t and not size_t?

1175

why the braces? seems unrelated

lib/ProfileData/SampleProfWriter.cpp
287

I not i
++I
why not size_t?

same below

300

static_cast<uint64_t>?

304

i->I, ++I

312

static_cast?

490

would format passing make sense in the ctor of the profile writer? it's logically a const field of that, correct?

lib/Transforms/IPO/SampleProfile.cpp