This change adds duplication factor multiplier while accumulating body samples for line-number based profile. The body sample count will be duplication-factor * count. Base discriminator and duplication factor is decoded from the raw discriminator, this requires some refactor works.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/tools/llvm-profgen/duplication-factor.test | ||
---|---|---|
6 ↗ | (On Diff #373087) | Can you comment on the profile line where the duplication factor is being changed to 10? and what's the count if there's no duplication factor? |
30 ↗ | (On Diff #373087) | I think it's better to have a test that can naturally generate duplication factor. That way it's easier to update the test case in the future if needed. To do that we can grab one of the test cases from loop unrolling, and if we enable fdebug-info-for-profiling, i think we should see duplication factor. |
53 ↗ | (On Diff #373087) | is this intentional? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
547 | nit: Count *= getDuplicationFactor(LeafLoc->Callsite.Discriminator)? Same for other places. | |
llvm/tools/llvm-profgen/ProfiledBinary.cpp | ||
224 | Current this function is only used for dwarf based profile, but is it guaranteed that this path will never be used by pseudo probe? we will need to handle discriminator differently for probe case (extractProbeIndex). | |
543–546 | By removing this part, we will have duplication factor as part of location in NameStrings and call stack, is that what we want? Logically, duplication factor does not belong to location. |
rebase and addressing some review feedbacks
llvm/test/tools/llvm-profgen/duplication-factor.test | ||
---|---|---|
30 ↗ | (On Diff #373087) | I see, After adding the bigger test case(quick sort program), I found that will naturally generate a duplication factor case. So I just removed this test. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
547 | Fixed! | |
llvm/tools/llvm-profgen/ProfiledBinary.cpp | ||
224 | Here the code is guaranteed only used by line-number based because this is the function to generate the StringBasedCtxKey. std::shared_ptr<StringBasedCtxKey> FrameStack::getContextKey() { std::shared_ptr<StringBasedCtxKey> KeyStr = std::make_shared<StringBasedCtxKey>(); KeyStr->Context = Binary->getExpandedContext(Stack, KeyStr->WasLeafInlined); ... } | |
543–546 | I checked that our internal tools also used duplication factor this way, i. e decoding it when we need it in profile generation. Alternatively, we can also have a hash map like {offset, duplication factor} offset2DuplicationFactorMap which can be built on-demand as well, which one do you think is better? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
547 | Am I not on the latest version? This seems not fixed. | |
llvm/tools/llvm-profgen/ProfiledBinary.cpp | ||
543–546 | Looks like the duplication factor is dropped during profile key generation (for CS it's dropped in getExpandedContext), while it is used when computing sample counts. So it should work. Adding a map is cleaner, but I'm a bit concerned about its efficiency. |
Updating D109934: [llvm-profgen] Add duplication factor for line-number based profile
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
400 | It looks like we get all duplication factors from callsite duplication factor, which is a bit weird given duplication factor applies to normal locations too. However this is a confusion cause by misleading naming. The name SampleContextFrame::Callsite is misleading because when used to represent leaf, there's no call site. It's FuncName/Location, rather than CallerName/Callsite.. I think the struct name was rename from Callsite into SampleContextFrame, but member names were kept as call site. Perhaps we should rename too, Hongtao? | |
llvm/tools/llvm-profgen/ProfiledBinary.cpp | ||
543–546 | I guess practically we only need to keep duplication factor for leaf frame, but not for middle call sites.. That way we don't need extra map. And call sites won't have different location due to duplication factor. |
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
46 | nit: getBaseDiscriminator? (thought I commented on this before, but looks like I missed..) |
getDiscriminator --> getBaseDiscriminator
llvm/tools/llvm-profgen/ProfileGenerator.h | ||
---|---|---|
46 | Fixed! | |
llvm/tools/llvm-profgen/ProfiledBinary.cpp | ||
543–546 | I found that for CS line-number based context, the leaf frame of one address can be the middle of the whole context. In that case, we still need to manually decode the base discriminator for that frame. This might cause some inconsistencies. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
400 | Agreed. CallerName/Callsite is misleading as a field in SampleContextFrame. FuncName/Location sounds better. There are probably a lot places using them. Refactoring can be done in a separate change. struct SampleContextFrame { StringRef CallerName; LineLocation Callsite; SampleContextFrame() : Callsite(0, 0) {} SampleContextFrame(StringRef CallerName, LineLocation Callsite) : CallerName(CallerName), Callsite(Callsite) {} |
nit: getBaseDiscriminator? (thought I commented on this before, but looks like I missed..)