This PR creates a utility class called ValueProfileOracle that tells PGOInstrumentationGen and PGOInstrumentationUse what to value-profile and where to attach the profile metadata.
It then refactors logic scattered in PGOInstrumentation.cpp into two plugins that plug into the ValueProfileOracle.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
In general, I like the cleanup of extracting out the value profiling handling into a uniform API. I'm not convinced it needs something as complex as the plugin interface though, as opposed to a wrapper class that has an instance of each type and invokes the appropriate run() command given the requested profile type. Although perhaps there isn't a big downside of doing it that way either, not sure of the overhead of the resulting template instantiation.
Comment below about naming conventions.
llvm/lib/Transforms/Instrumentation/ValueProfileOracle.cpp | ||
---|---|---|
35 ↗ | (On Diff #221336) | LLVM coding conventions don't use underscores: https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly |
This generally looks good, but I suggest changing the name of ValueProfileOracle to ValueProfileCollector to make it explicit.
I enclosed the VPOPluginChain template class in an unnamed namespace so that all instantiated member functions (and constructors) get internal linkage rather than external weak, which will hopefully give more reason to inline them and eliminate their definitions. Thanks to Hubert Tong for this suggestion.
This did not make a difference to the symbol export list inside the clang executable or inside the shared library (libLLVMInstrumentation.so.10svn) that contains ValueProfileOracle.cpp, because the build compiler (clang 8.0) I'm using was able to inline all VPOPluginChain member functions and eliminate their defs.
Renamed ValueProfileOracle to ValueProfileCollector. I agree that at the moment, this utility class has a single purpose and that's to collect value profiling candidates; so the new name makes sense.
One more minor nit about naming conventions. Otherwise it looks ok to me, but wait for David to sign off.
llvm/lib/Transforms/Instrumentation/ValueProfileCollector.cpp | ||
---|---|---|
47 ↗ | (On Diff #222219) | No underscore |
- Renamed Plugin_t to PluginT, and checked that no other type name has an underscore.
- Renamed VPCPluginChain to PluginChain
Sorry it takes long to review this.
I have a pending patch that value profiling the expensive MOD/REM instructions. It is more tightly coupled to edge profile (than current mem-size-op and icall value profile).
The difference is that it does not have <valuesite, value> pair. The counts are stored like the edge counts.
It seems to be fine with patch as the instrumentation says in PGOInstrumenation.cpp and only the candidate collecting is extracted out.
Overall, I'm fine with this refactor/improvement.
llvm/lib/Transforms/Instrumentation/ValueProfileCollector.h | ||
---|---|---|
49 ↗ | (On Diff #222227) | Sorry, I don't follow this paragraph. What is 'total" here, the BB count for "if ...", or the total count for memcpy"? We do need a match for the time of instrumentation and use. |
58 ↗ | (On Diff #222227) | For value profile, we usually find the target instruction and instrument right before it. The annotation will be on that instruction. |
Thanks Rong.
Let me know what you think.
llvm/lib/Transforms/Instrumentation/ValueProfileCollector.h | ||
---|---|---|
49 ↗ | (On Diff #222227) | Here I'm trying to demonstrate to the reader that at different points in a user program the value profile info that can be potentially collected for a particular expression are not the same. I understand that the IR has to match at the time of instrumentation and the time of use. |
58 ↗ | (On Diff #222227) | It is true that for the current two plugins (memop and ICP) the Insertion point and Annotated Instruction are the same. |
llvm/lib/Transforms/Instrumentation/ValueProfileCollector.h | ||
---|---|---|
49 ↗ | (On Diff #222227) | OK. The profiled value should reflect the value being profiled (the InsertPt instruction). The use part should know do different things based on what is instrumented. I think you really meant something like takes at runtime and the frequency of each value at the instrumented instructions" |
58 ↗ | (On Diff #222227) | OK. But in theory, the collect does not need to know the annotaedInstr as it will not be used in instrumentation. Right? |
llvm/lib/Transforms/Instrumentation/ValueProfileCollector.h | ||
---|---|---|
49 ↗ | (On Diff #222227) | Yes. the "at the instrumented instructions" addition is necessary to make the statement more precise. |
58 ↗ | (On Diff #222227) | I assume you are talking about the ValueProfileCollector when you say "the collect". The reason all three members are in one struct rather than in two structs (one for each phase) is that it was natural to generate all pieces of information about a particular candidate in one place in the code, and then have each phase use what it needs from it. |