Split out from https://reviews.llvm.org/D51240, see that patch for full context.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | I suggest moving this remapping support code out of the ProfData core library into llvm-profdata.cpp where an user option (for llvm_profdata) is provided so that the profile data with old naming schemes can be converted to new format: something like: llvm_profdata merge -o <new_format_prof_data> -remapping_file=<> <old_profile_data> By so doing, compiler does not need to be made aware of it. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | This approach doesn't really support that workflow: we don't have the ability to form remapped mangled names at all, only the ability to tell if two names are the same modulo a set of equivalences. (As noted on D51248, if we took the "remangling" path we'd add significant complexity and also put a significant maintenance burden on ourselves to keep a remangler up to date with extensions to the demangler, which is not necessary in this equivalence-based approach.) We could do something similar where the tool is given the old file, the remapping file, and a list of symbols in the new program, but that seems cumbersome to use (you'd need additional steps after compilation to extract the list of symbols and remap the profile data, and then you'd need to rerun the compilation with the updated profile data). The idea of this approach is that you merely need to add a remapping file and one flag to your compilations, and they will transparently be able to use profiling data from either before or after a symbol renaming with no change to the steps in the build process. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | Are there libraries to do name mangling and demangling? If that is the case, linking them into the profile tool has a lot of advantage over the approach that requires compiler awareness:
I think the most important point is 2) -- we may need to do it because of the performance reasons anyway. If we can not do mangling/remagnalling, I don't see the option to require passing symbol list too bad either. Note that llvm-cov tool also requires user to pass the execuable path. We can do the same here -- pass a newly compiled executable to the tool so that the tool can extract the newly named symbols. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | As far as I'm aware there is no library out there that supports demangling, mutating the demangled tree, and then re-mangling. I think it's probably possible to write such a library, but as noted, the engineering costs would be substantial. As noted in one of the other review threads, we can handle indirect call targets with the current approach if that's necessary, but I would rather add that functionality as a separate step after this patch series. As for doing this by changing the compiler, I still think that approach is a significantly superior option to requiring complications of the build process by adding an extra profile-data-mutation step. This patch makes it straightforward to make broad changes (such as changing the standard library) and retain applicability of most of the profile data, without requiring detailed knowledge of how each target program is built or how the build system feeds profile data into it, and without adding an extra compilation step to compute the set of new symbol names. I'm happy to teach llvm-profdata to apply remappings (given a remapping file and a symbol list for the new binary), but I would still like to provide the option to do the remapping on the fly. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | The assessment of the convenience to have the support directly in compiler depends on how often this feature is used. If it is used in daily build/workflow, you are absolutely right that this is more superior. However my view is that ABI breakages like this happen *very* rarely. Introducing compiler option for one-off change does not really bring much benefit. Assuming the functionality exists in llvm_profdata, it seems having the duplicated functionality not necessary. Besides, there is also the issue of distributed build system and dependency tracking. With this change in compiler, you will also need to invest time and effort to support it in bazel. On the other hand, doing one time profile conversion probably does not require build system support. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | My approach requires no changes to bazel. We need only add a remapping file to the toolchain and a flag naming it to the compilation flags. Doing profile conversion as a separate step means that whenever we're trying out a large-scale renaming change, we must manually invest in profile conversion, and cannot simply run the existing performance tests. That's the value added by having this feature in the compiler (even if it's also supported by llvm-profdata) -- it makes it easy to try out things that affect a lot of manglings and get an idea of the performance delta without going through the work of updating all the profiles to match. I'll post a patch to add remapping support to llvm-profdata, but I do not think that fully solves the problem I'm trying to solve, and I'd still like to add the on-the-fly translation to the compiler. |
lib/ProfileData/InstrProfReader.cpp | ||
---|---|---|
606 ↗ | (On Diff #162528) | I find it cleaner to make the name mapper a member field of InstrProfIndexReader class. The base mapper can be a dummy mapper while the ItaniumMapper can override the base behavior. |
include/llvm/ProfileData/InstrProfReader.h | ||
---|---|---|
351 ↗ | (On Diff #168896) | Space? template <typename HashTableImpl> |