Added ability to estimate the entry count of the extracted function and the branch probabilities of the exit branches.
Details
- Reviewers
davidxl silvas davide - Commits
- rGf801575fd059: CodeExtractor : Add ability to preserve profile data.
rG620892432332: CodeExtractor : Add ability to preserve profile data.
rL277411: CodeExtractor : Add ability to preserve profile data.
rL277313: CodeExtractor : Add ability to preserve profile data.
Diff Detail
Event Timeline
include/llvm/Analysis/BlockFrequencyInfo.h | ||
---|---|---|
67 | nit: perhaps just freqToProfileCount or getProfileCountFromFreq? | |
include/llvm/Analysis/BlockFrequencyInfoImpl.h | ||
485 | Same here | |
lib/Transforms/Utils/CodeExtractor.cpp | ||
745 | clang-format? add space | |
748 | EntryCount --> EntryFreq | |
803 | nit: OptEntryCount --> EntryCount | |
805 | When EntryCount does not exist, skip it instead of setting it to 0. | |
814 | Move this into a helper function |
Sorry for the delay on getting to this.
lib/Transforms/IPO/PartialInlining.cpp | ||
---|---|---|
155–157 | Can you add a comment before the false showing the name of the argument? E.g. CodeExtractor(ToExtract, &DT, /*TheArgName*/false, BFI).extractCodeRegion(); | |
lib/Transforms/Utils/CodeExtractor.cpp | ||
694 | Can you explain more how we would get into a situation with a weight of zero and why we need to check it? maybe parts of this should be proper methods on BFI? E.g. below you are directly accessing the impl class so it sounds like something here should maybe be a method on BFI. also, a nit: can't | |
743 | Should we change the getBPI method to return non-const? I assume the author made them const for a reason. E.g. BFI might be relying on BPI not changing under it. | |
805 | nit: no space after the open paren | |
test/Transforms/CodeExtractor/2016-07-20-MultipleExitBranchProb.ll | ||
30 ↗ | (On Diff #65625) | Can you do a check with a pattern match and capture a variable, like you do for the other test? You should be able to do something like test/Transforms/PGOProfile/branch1.ll to check the profile metadata on a particular instruction: ; USE: br i1 %cmp, label %if.then, label %if.end ; USE-SAME: !prof ![[BW_ENTRY:[0-9]+]] ; USE-DAG: ![[BW_ENTRY]] = !{!"branch_weights", i32 2, i32 1} |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
697 | This check does not look correct. It is possible for the frequency to be zero | |
802 | The Freq > 0 check can be removed. I don't see how it can be zero (it is not actual count). | |
805 | The check should be here: if optEntryCount has value (hasValue() returns true, not >0), set it to the newFunction entry. hasValue is false when PGO is not enabled. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
697 | Yeah, that was my understanding. Thanks for the sanity check! I need to add the weights without using the current methods given that they do not allow for weights of 0. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
805 | Thanks for the clarification, I was unsure if it should just be for the entry count. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
697 | Although, can you help clarify on what I should be doing in the case of a 0 frequency? I am unsure on whether the branch weight can be set to a hard 0 or if it needs to be modified to always have a non zero weight? |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
697 | zero weight is fine -- BFI will always adds 1 to it if it sees it due to some limitation (it is less than ideal -- a better way to is to fix BPI (which will be fixed in the near future). One the other hand if the total weight is zero, you may want to treat it as invalid. |
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
697 | Awesome. Thanks for the clarification! |
Some nits.
lib/Transforms/Utils/CodeExtractor.cpp | ||
---|---|---|
701 | Missing '.' at the end of this comment. | |
742 | should this be "entry frequency" or "entry count" instead of just "entry"? | |
779 | Add a comment here about what this loop is doing. | |
802 | The function is newly created, so this should probably say "Calculate and set" instead of "Update". |
nit: perhaps just freqToProfileCount or getProfileCountFromFreq?