This allows relative block frequency of call edges to be passed to the
thinlink stage where it will be used to compute synthetic entry counts
of functions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
include/llvm/Bitcode/LLVMBitCodes.h | ||
---|---|---|
259 ↗ | (On Diff #130310) | Could use a description of when this is used. |
include/llvm/IR/ModuleSummaryIndex.h | ||
58 ↗ | (On Diff #130310) | Do we need 64 bits in practice? Especially since these are scaled down by the entry freq. Concerned about the size increase for large indices that have huge numbers of edges. Also, since we'll never have both the Hotness and the RelBlockFreq, it would be great if we could union them, although that would require a flag on the index and I suppose we might want to support mixing modules built with and without profile data? Perhaps some other way to combine them. I.e. use a single field and reserve the lowest 3 bits for the hotness, shifting the relative frequency into the upper n-3 bits... |
lib/Analysis/ModuleSummaryAnalysis.cpp | ||
63 ↗ | (On Diff #130310) | Suggest moving this to the bitcode writer and always adding RelBlockFreq into the in-memory index here. Is there any reason to default off? Why not just default on from the get-go? |
290 ↗ | (On Diff #130310) | Any chance getEntryFreq will be 0? |
292 ↗ | (On Diff #130310) | Suggest saving the ValueInfo already created earlier on line 280. |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
58 ↗ | (On Diff #130310) | From our offline discussions I thought the use of 64 bits doesn't matter as they get compressed. Is the concern about in-memory size? I can reduce the size to, say, 29 bits (uint32_t with 3 bits reserved for the enum). Another thing to note is that only the per-module summary has this information. Is there a way to make use of this? |
lib/Analysis/ModuleSummaryAnalysis.cpp | ||
63 ↗ | (On Diff #130310) | Until this information actually gets used (first propagate the counts and then use the counts in the inliner), this just poses extra compile time and increased summary size if not turned off. Alternatively, I can drop this patch and then send it as part of a larger patch that builds the callgraph and propagates the counts. |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
58 ↗ | (On Diff #130310) | Right - concerned about in-memory size. Why does only the per-module summary contain this? Isn't that just temporary - I thought the idea was to propagate this into the backends? |
lib/Analysis/ModuleSummaryAnalysis.cpp | ||
63 ↗ | (On Diff #130310) | How long do you expect until the other patches to be sent? If not too long I don't know that it matters too much since we are going to incur the extra space/memory/time in the near term anyway. Better to have it tested. |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
58 ↗ | (On Diff #130310) | Made this into a 29 bit bitfield. |
lib/Analysis/ModuleSummaryAnalysis.cpp | ||
63 ↗ | (On Diff #130310) | Most of the patches are adding necessary infrastructure but the final component is tuning the inliner which could take a long time. I have now changed the code to compute relative block frequency unconditionally but have left the default value of the flag to false so it won't be written. |
290 ↗ | (On Diff #130310) | When they are computed they can't be 0, but there is an interface to set block frequency (this is used for incremental update) although in that case BFI should be recalculated after the pass. In any case, since the possibility exists I have guarded the division. |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
63 ↗ | (On Diff #130987) | This shouldn't be here - it is adding overhead to every single call edge. I confirmed that with this here, the new sizeof(CalleeInfo) is 16 bytes (4 bytes for Hotness+RelBlockFreq, 4 bytes for padding to start uint64_t, 8 bytes for uint64_t). Without this defined here the sizeof(CalleeInfo) goes down to 4 bytes as expected. |
lib/Analysis/ModuleSummaryAnalysis.cpp | ||
52 ↗ | (On Diff #130987) | Not needed now? |
include/llvm/IR/ModuleSummaryIndex.h | ||
---|---|---|
63 ↗ | (On Diff #130987) | Good catch. I have made this to a static constexpr and avoided the definition. Let me know if this looks okay. |
The patch caused two issues
- I used CI (of type CallInst*) to get the parent BB, but CI could be null. The fix is to use &BB directly.
- Use of HotnessType Hotness : 3 gave warnings. The fix is to make it uint32_t Hotness : 3 and use static_cast to convert back and forth from enum.
I will update the patch. PTAL.