Change to use UINT64_MAX for sample count of dangling probe from llvm-profgen side. The compiler will identify this for further process, please refer to https://reviews.llvm.org/D95962 for the details.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
512 | Do we still want to count samples on dangling probe towards total samples? Without deduplication for dangling probes, we could have multiple dangling probes in the same block and counting samples covering these probe repeatedly may cause bloated total samples? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
512 | Good point. Since the samples collected on dangling probes are invalid, I would not count it against total samples. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
512 | I see, thanks for your suggestion. How about the count for isEntry for line506, I guess it's the original count not the zero? Or we won't have dangling probe which is the entry probe? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
512 | That's a good point. It's possible to have a dangling entry probe and we should bail out here. Or we can just return right after adding body samples for dangling probes. |
remove count towards total sample for dangling probes
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
512 | I see. Since the sample count are invalid, we shouldn't count for total sample nor using it to infer inliner's sample count. Thanks for your clarification. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | Thanks for the sharing. So for profgen side, we can keep this patch, right? see line 501, it only adds the body sample and doesn't accumulate the total sample. For compiler side, we need to avoid the addition when meeting the UINT64_MAX |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | May need to check dangling in CSProfileGenerator::populateFunctionBoundarySamples when updating the callsite target samples? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | I was wrong. The current implementation looks good. We are not propagating UINT64_MAX anywhere except for adding it as a body sample. On the compiler side, the merging of dangling probes are taken care of here: https://reviews.llvm.org/D95962 . Please rebase this change on top of D95962 to share the definition of FunctionSamples::InvalidProbeCount. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | Thanks for your clarification. Will rebase it after D95962 landed. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | Since we are using addBodySamples here, can we have probe count not equal to UINT64_MAX after the addition? Can we add assertion? |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | Good catch. Changed to update the count only it doesn't exist and add the assertion. |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | This is a good point. Probes with same ID may have both dangling and non-dangling copies. Using SampleRecord::merge should be safe, like R->merge(unctionSamples::InvalidProbeCount, 1) |
llvm/tools/llvm-profgen/ProfileGenerator.cpp | ||
---|---|---|
501 | Thanks for your reminding. I see, then we should also consider non-dangling case, so added a helper function addBodySamplesForProbe for this. See whether this looks good for you? |
Actually I UINT64_MAX may cause overflow to total samples. Even if it doesn't, profile merge may overflow too. That's one of the reasons we were using 0 as a special count. @wmi What do you think about keeping using 0?