Page MenuHomePhabricator

[InstrProfiling] Use !associated metadata for counters, data and values
AcceptedPublic

Authored by phosek on Mar 25 2020, 2:16 PM.

Details

Summary

The !associated metadata may be attached to a global object declaration
with a single argument that references another global object. This
metadata prevents discarding of the global object in linker GC unless
the referenced object is also discarded.

Furthermore, when a function symbol is discarded by the linker, setting
up !associated metadata allows linker to discard counters, data and
values associated with that function symbol. This is not possible today
because there's metadata to guide the linker. This approach is also used
by other instrumentations like sanitizers.

Note that !associated metadata is only supported by ELF, it does not have
any effect on non-ELF targets.

Diff Detail

Event Timeline

phosek created this revision.Mar 25 2020, 2:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2020, 2:16 PM
Herald added subscribers: Restricted Project, hiraditya. · View Herald Transcript

maybe adding.a test at profile runtime as well to check GC actually happens?

phosek planned changes to this revision.Mar 25 2020, 3:01 PM

This is an alternative solution to D64045, but it seems like !associated metadata, which are lowered to SHF_LINK_ORDER are incompatible with section groups which are used for external symbols, so this solution alone isn't sufficient.

Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2020, 12:07 AM
phosek updated this revision to Diff 265615.May 21 2020, 2:58 PM

I've added a test which is analogous to instrprof-darwin-dead-strip.c which tests the same functionality on Darwin. I've also updated the inliner to properly handle !associated metadata in D80186 which is necessary here since instr-profiling pass runs before the inliner.

phosek updated this revision to Diff 265665.May 21 2020, 9:02 PM

@davidxl would it be possible to take a look?

What is the instrumented size improvement for Clang? How about profile size reduction?

compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
37 ↗(On Diff #265665)

why adding the weak attribute ?

40 ↗(On Diff #265665)

what is this change for?

compiler-rt/test/profile/instrprof-gc-sections.c
4

Also check IR PGO?

73

Explain the expected output here?

phosek updated this revision to Diff 268083.Jun 3 2020, 1:28 AM
phosek marked 2 inline comments as done.

What is the instrumented size improvement for Clang? How about profile size reduction?

I tried building Clang using -fprofile-instr-generate with this change and without it and the size went from 1.7G to 820M, so roughly 50% reduction. I also tried compiling a trivial C file and the generated profile was 155M with this change and 978M without it.

compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
37 ↗(On Diff #265665)

See on the left.

40 ↗(On Diff #265665)

!associated metadata is lowered to SHF_LINK_ORDER and linker requires that all sections that are merged together have the same flags. Since these symbols are placed into the same section as the compiler generated symbols, and don't set the SHF_LINK_ORDER flag, we get a link-time error:

/usr/bin/ld: __llvm_prf_cnts has both ordered [`__llvm_prf_cnts' in /tmp/instrprof-value-prof-real-2c3254.o] and unordered [`__llvm_prf_cnts' in /src/clang-llvm/llvm-build/debug/lib/clang/11.0.0/lib/linux/libclang_rt.profile-x86_64.a(InstrProfilingPlatformLinux.c.o)] sections

One possible way around this is to define these symbols using inline assembly (since there's no way to set SHF_LINK_ORDER in pure C/C++):

asm(
  ".pushsection " INSTR_PROF_DATA_SECT_NAME ",\"ao\",@progbits,.text\n\t"
  ".popsection\n"
)

Another option is to avoid these symbols altogether, and instead make the __start/__stop symbols weak which means that they'll end up as NULL if those sections are empty, and any loop that iterates over those section contents using those __start/__stop will be no-op. This seemed like a cleaner solution to me, but I'm happy to change if you prefer. One alternative I considered would be to introduce a new attribute which would allow setting the SHF_LINK_ORDER flag from C/C++, but that's going to take a bit more work.

compiler-rt/test/profile/instrprof-gc-sections.c
73

I've already tried to do that in Note: comments above.

phosek updated this revision to Diff 268095.Jun 3 2020, 1:44 AM
phosek updated this revision to Diff 268097.Jun 3 2020, 1:55 AM

I've also tested this change in Fuchsia and saw similar improvements: fuchsia.zbi (the boot image) size reduced from 90M to 42M, fvm.sparse.blk (disk image) size reduced from 3.9G to 2.8G.

(Sorry my last comments was not submitted)

compiler-rt/lib/profile/InstrProfilingPlatformLinux.c
29 ↗(On Diff #268097)

Add a comment here about the need of the weak attribute

Looks good but can you do help with a clang instrumentation build (IR PGO), and see the binary and profile size change?

phosek updated this revision to Diff 269082.Jun 7 2020, 4:49 PM
phosek marked an inline comment as done.

Looks good but can you do help with a clang instrumentation build (IR PGO), and see the binary and profile size change?

With IR PGO, the difference is much smaller: for Clang the size decreased from 141M to 134M, for the generated profile the size decreased from 18M to 17M.

davidxl accepted this revision.Jun 7 2020, 5:47 PM

LGTM (thanks for the test. Will see how this works with very large internal apps.

This revision is now accepted and ready to land.Jun 7 2020, 5:47 PM
This revision was automatically updated to reflect the committed changes.

this broke libprofile tests for i386: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/27806 , also https://ci.chromium.org/p/chromium/builders/try/linux_upload_clang/966? (which has more failures, but look like the same cause)

Can you take a look, and revert if it takes a while to fix?

bsaleil added a subscriber: bsaleil.EditedJun 9 2020, 3:11 PM

Hello,
This patch also broke some of our PowerPC buildbots (e.g. http://lab.llvm.org:8011/builders/clang-ppc64le-rhel/builds/4263).
Can you please push a fix or revert ?
Thanks.

aeubanks added a subscriber: aeubanks.EditedJun 9 2020, 8:09 PM

This is still breaking 37 tests, even after the first attempted fix forward...
ninja check-profile

This revision is now accepted and ready to land.Jun 10 2020, 2:35 AM

I apologize about the late response, I didn't get any email from Phabricator for some reason :/

I'll try to reproduce and address the issue on i386, but I don't have access to any PowerPC machine, @bsaleil is there any way to get access to get access to your PowerPC bots?

I looked into these issue:

phosek updated this revision to Diff 270031.Jun 10 2020, 9:56 PM

I've updated the patch to handle the _IO_stdin_used symbol in the test. I couldn't reproduce this locally, but it's possible that this is due to different libc version on the bot.

I've updated the patch to handle the _IO_stdin_used symbol in the test. I couldn't reproduce this locally, but it's possible that this is due to different libc version on the bot.

I still can reproduce with the latest patch on my workstation

Exit Code: 1

Command Output (stderr):
--
/usr/bin/ld: __llvm_prf_data has both ordered and unordered sections
/usr/bin/ld: final link failed: bad value
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)

I've updated the patch to handle the _IO_stdin_used symbol in the test. I couldn't reproduce this locally, but it's possible that this is due to different libc version on the bot.

I still can reproduce with the latest patch on my workstation

Exit Code: 1

Command Output (stderr):
--
/usr/bin/ld: __llvm_prf_data has both ordered and unordered sections
/usr/bin/ld: final link failed: bad value
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)

This is the other issue I mentioned which also affects Chromium bots. The root cause is the lack of support for the extended semantics of SHF_LINK_ORDER that !associated metadata rely on in bfd.ld. I'm trying to figure out what's the best way forward. We could just disable all affected profile tests when bfd.ld is used as the linker, but that's pretty drastic. Another option would be to gate this feature on a backend flag; targets that use gold or lld as their linker could turn this on by default.

phosek updated this revision to Diff 270263.Jun 11 2020, 4:42 PM

I still can reproduce with the latest patch on my workstation

Exit Code: 1

Command Output (stderr):
--
/usr/bin/ld: __llvm_prf_data has both ordered and unordered sections
/usr/bin/ld: final link failed: bad value
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)

This is the other issue I mentioned which also affects Chromium bots. The root cause is the lack of support for the extended semantics of SHF_LINK_ORDER that !associated metadata rely on in bfd.ld. I'm trying to figure out what's the best way forward. We could just disable all affected profile tests when bfd.ld is used as the linker, but that's pretty drastic. Another option would be to gate this feature on a backend flag; targets that use gold or lld as their linker could turn this on by default.

I've found a workaround for the bfd.ld issue. The problem is that __llvm_prf_data input sections have sh_addralign=8 but size is 4mod8. bfd.ld is considering the alignment padding insertions to be "unordered sections" in the input. Inserting artificial padding in the __llvm_prf_data input section seems to work around the issue. @vitalybuka can you test this locally on your machine please?

Most failures are fixed, thanks! But I still get one failure with this patch:
LLVM Profile Warning: Unable to merge profile data: source profile file is not compatible.
LLVM Profile Error: Profile Merging of file /usr/local/google/home/aeubanks/repos/llvm-project3/build_cmake/projects/compiler-rt/test/profile/Profile-x86_64/Output/instrprof-gc-sections.c.tmp.pgo.profraw/default_15822682845097088506_0.profraw failed: File exists
LLVM Profile Error: Failed to write file "/usr/local/google/home/aeubanks/repos/llvm-project3/build_cmake/projects/compiler-rt/test/profile/Profile-x86_64/Output/instrprof-gc-sections.c.tmp.pgo.profraw/default_15822682845097088506_0.profraw": File exists
warning: /usr/local/google/home/aeubanks/repos/llvm-project3/build_cmake/projects/compiler-rt/test/profile/Profile-x86_64/Output/instrprof-gc-sections.c.tmp.pgo.profraw/default_15822682845097088506_0.profraw: Unsupported instrumentation profile format version
error: No profiles could be merged.

@davidxl Do you have any preference between the current solution which uses the artificial padding in __llvm_prf_data section to workaround the BFD ld bug, and a backend feature flag that could be set to enable this feature?

@davidxl Do you have any preference between the current solution which uses the artificial padding in __llvm_prf_data section to workaround the BFD ld bug, and a backend feature flag that could be set to enable this feature?

The latter might be preferable because there's another bug in trunk BFD ld which causes to enter infinite loop, see D72904 for more details.

@phosek, we just updated the clang-ppc64le-rhel bot to lld 10.0.0, is that recent enough for SHF_LINK_ORDER support you need ?

phosek updated this revision to Diff 283655.Aug 6 2020, 10:43 AM

@davidxl does this look good to you? This feature is now gated on -counter-associated-metadata flag (I'd welcome suggestions for better name if you have some) because we require recent lld. We might be able to set that flag in the frontend when we know that the latest lld is being used in the future.

phosek updated this revision to Diff 283711.Aug 6 2020, 12:58 PM
pcc added a subscriber: pcc.Aug 6 2020, 1:05 PM

@davidxl does this look good to you? This feature is now gated on -counter-associated-metadata flag (I'd welcome suggestions for better name if you have some) because we require recent lld. We might be able to set that flag in the frontend when we know that the latest lld is being used in the future.

Can you find a better way of communicating this infomation than via an llvm::cl flag (e.g. use a function attribute)? The flag won't be compatible with LTO.

phosek added a comment.Aug 6 2020, 2:29 PM
In D76802#2200926, @pcc wrote:

@davidxl does this look good to you? This feature is now gated on -counter-associated-metadata flag (I'd welcome suggestions for better name if you have some) because we require recent lld. We might be able to set that flag in the frontend when we know that the latest lld is being used in the future.

Can you find a better way of communicating this infomation than via an llvm::cl flag (e.g. use a function attribute)? The flag won't be compatible with LTO.

Do I understand your suggestion correctly that we would set function attribute when -fprofile-*-generate -fuse-lld is used that would enable the use of metadata in the backend?

pcc added a comment.Aug 6 2020, 2:37 PM
In D76802#2200926, @pcc wrote:

@davidxl does this look good to you? This feature is now gated on -counter-associated-metadata flag (I'd welcome suggestions for better name if you have some) because we require recent lld. We might be able to set that flag in the frontend when we know that the latest lld is being used in the future.

Can you find a better way of communicating this infomation than via an llvm::cl flag (e.g. use a function attribute)? The flag won't be compatible with LTO.

Do I understand your suggestion correctly that we would set function attribute when -fprofile-*-generate -fuse-lld is used that would enable the use of metadata in the backend?

Yes, that's what I had in mind.