Page MenuHomePhabricator

[PGO] Adjust -vp-counters-per-site.
ClosedPublic

Authored by yamauchi on Dec 4 2020, 9:57 AM.

Details

Summary

Addressing clang bootstrap under the dynamic linking mode running out of static
allocation of value profile nodes, reported in D81682. The vnodes section size
goes from 1501k to 2171k (+670k bytes).

Diff Detail

Event Timeline

yamauchi created this revision.Dec 4 2020, 9:57 AM
yamauchi requested review of this revision.Dec 4 2020, 9:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 4 2020, 9:57 AM
dmajor added a subscriber: dmajor.Dec 4 2020, 10:02 AM

Thanks a lot! Glad to see it's fine with a value of 1.5. (We have been setting it to 2 in our build as a temporary workaround)

how much size increase is in clang instrumented build?

how much size increase is in clang instrumented build?

I take it you mean a clang statically-linked build. It goes from 1699k with 1.0 to 2457k with 1.5, which is about a ~760k bytes increase.

The size increase is significant. It is likely that some very large apps may run into size limit issue at link time with this change.

yamauchi updated this revision to Diff 310684.Dec 9 2020, 3:29 PM

Adjust only for dynamic linking mode. Rebase.

The size increase is significant. It is likely that some very large apps may run into size limit issue at link time with this change.

Ok. Now adjust the flag value only when the PGO instrumented build and dynamic linking mode.

davidxl added inline comments.Dec 9 2020, 5:06 PM
llvm/cmake/modules/HandleLLVMOptions.cmake
923

is version 11 when the option is introduced?

This revision is now accepted and ready to land.Dec 10 2020, 10:01 AM
yamauchi marked an inline comment as done.Dec 10 2020, 1:38 PM
yamauchi added inline comments.
llvm/cmake/modules/HandleLLVMOptions.cmake
923

Version 11 is when the extended value profile buckets was enabled (and presumably this issue started.)

This revision was landed with ongoing or failed builds.Dec 11 2020, 9:43 AM
This revision was automatically updated to reflect the committed changes.
yamauchi marked an inline comment as done.

Hm, we tell users not to use Xclang or mllvm flags because they're implementation details. Furthermore, we control the default value of these things. Could we instead change the default value to just work, instead of needing tweaks?

tambre added a subscriber: tambre.Dec 14 2020, 9:30 AM

In my case 1.5 is better, but not enough. 2.0 is good and is what I've used so far. However, -vp-counters-per-site can only be specified once so this has broken my builds and I've patched it out for now.
Could we maybe parameterize this or add an option to turn it off?

I can add "cl::ZeroOrMore" to "-vp-counters-per-site". Would it be good?

tambre added a comment.EditedDec 14 2020, 10:43 PM

I can add "cl::ZeroOrMore" to "-vp-counters-per-site". Would it be good?

I pass the option using -DCMAKE_CXX_FLAGS_INIT="-mllvm -vp-counters-per-site=2", which means it gets added before the one with 1.5.

Run Build Command(s):/usr/bin/ninja cmTC_353b0 && [1/2] Building C object CMakeFiles/cmTC_353b0.dir/src.c.o
FAILED: CMakeFiles/cmTC_353b0.dir/src.c.o
/usr/bin/cc -D_GNU_SOURCE  -g -fdebug-prefix-map=/opt/deb/llvm=. -Wformat -Werror=format-security -O3 -Wdate-time -march=skylake -fsplit-machine-functions -mllvm -vp-counters-per-site=2 -fPIC -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wdelete-non-virtual-dtor -Wno-comment -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -fprofile-generate="/var/tmp/pgo/clang/ir" -Xclang -mllvm -Xclang -vp-counters-per-site=1.5 -flto=thin -DHAVE_DECL__BITSCANFORWARD  -Werror=unguarded-availability-new -MD -MT CMakeFiles/cmTC_353b0.dir/src.c.o -MF CMakeFiles/cmTC_353b0.dir/src.c.o.d -o CMakeFiles/cmTC_353b0.dir/src.c.o -c src.c
clang (LLVM option parsing): for the --vp-counters-per-site option: may only occur zero or one times!

I don't think that'd result in 2 being used with your proposed fix?

Logically -vp-counters-per-site only makes sense to be specified once. I think parameterizing the LLVM CMake code would be better than changing LLVM itself.
Patch: D93281

D93281 looks good to me.