This patch fixed two issues:
(1) value profile merging did not worked as the strong definition of the merge hook function never worked. We were not doing the merging of value profile. This patch remove the weak attribute and assign the hook dynamically.
(2) in case value profile is not merged, truncated the file so that we don't have garbage in the end of the profile.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | non value profiling data has fixed size, so why is this needed? |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | Because there are still value counts in both in-memory and in-disk profiles. Using the test in the patch as the example (running on my machine). if we don't do online merge, If we run "gen 1 && gen", the result binary is still 352 bytes. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | Without VpMergeHook, we should assert that there is no value profile data -- it should not be dumped in the first place. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | current behavior is to write out the in-memory value profile data and discard the value profile data in the file. if there is not value profile from the file, the result is still correct. I'm not familiar with the use cast the not having libc (where VPMergeHook will be nullptr). Is it not suppose to have any value profile? I can assert VPMergeHook==nullptr and in-memory valuesites are not empty. That will be easy. Is this what you suggested? |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | In lprofGetVPDataReader method, returns null if VPMergeHook is null. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | This will generate incorrect profile: the valuesite in the data section says there are valuesites but VP count is empty. if we do this, the behavior would be: llvm-profdata show or merge command will stop silently at the first function that has the value-site instrumentation. The rest profile will be discarded. Note, of course, this only applies to VPMergeHook==nullptr case. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | On the other hand, it seems safe to do truncation here unconditionally before the writing later. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
230 ↗ | (On Diff #139655) | one correction. When Merge is done, the namedata section will be skipped during dumping to speed up the profile write. Make sure you do the truncation to the right size. |
This is the updated patch. It fixed some issues in previous patch. It also adds a new API for updating value-count in batch mode. Without this, PGO instrumented Clang incurs 20x slow down in building itself (with %9m prefix -- with %m is even greater slowdown ).
lib/profile/InstrProfilingValue.c | ||
---|---|---|
138 ↗ | (On Diff #140479) | Is this call getting inlined in the library build? If not, it will introduce unnecessary overhead. One way to avoid this is to define a static always inline function, and then make the two APIs as wrappers to that common function. |
I checked the assembly in
clang_rt.profile-i386.dir/ and
clang_rt.profile-x86_64.dir/
Both are inlined.
My previous refers to using gcc as the compiler.
With llvm (itself, i.e. trunk) as the compiler, it's not inlined.
I will update the patch using a wrapper.
lib/profile/InstrProfilingValue.c | ||
---|---|---|
135 ↗ | (On Diff #140485) | need to put the attribute((..)) into InstrProfilingPort.h under GNUC section. |
lib/profile/InstrProfilingMerge.c | ||
---|---|---|
40 ↗ | (On Diff #140485) | Those pointers are available via llvm_profile_begin_xxx /llvm_profile_end_xxx. It is better to avoid duplicating the logic of computing the profile data size. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
189 ↗ | (On Diff #140509) | What I suggested is to use public APIs: llvm_profile_begin_data(), llvm_profile_end_data(), llvm_profile_begin_counters(), llvm_profile_end_counters(), so that you don't even need to to assume the raw profile layout here. |
lib/profile/InstrProfilingFile.c | ||
---|---|---|
189 ↗ | (On Diff #140509) | you are right! as we have checked the compatibility, both versions should have the same format. |
Hi,
It appears the commit related to this review has cause a bot failure on
Green Dragon. This failure was masked by an earlier failure which was
reverted a short time ago. Please have a look at:
http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/44030/ and
either revert your commit or notify me if you will be submitting a patch
within the next hour.
Respectfully,
Mike Edwards