This is an archive of the discontinued LLVM Phabricator instance.

[WPD] implement -funknown-vtable-visibility-filepaths
AbandonedPublic

Authored by modimo on Jun 12 2023, 12:00 PM.

Details

Reviewers
wenlei
tejohnson
Summary

The current WPD system is opt-in on a per-file basis with -fvisibility=hidden or on the entire LTO unit with --lto-whole-program-visibility. For types that are known to be in prebuilt native binaries there's no good way to exclude them outside of selectively removing -fvisibility=hidden or enumerating all their devirt sites with -mllvm -wholeprogramdevirt-skip=.

The idea behind -funknown-vtable-visibility-filepaths is that the interface between our built from source LTO unit files and the native binaries are header files packaged alongside the native binaries that are usually located in specific directories (e.g. third-party). Identifying these types automatically and excluding them from participating in WPD is a lightweight solution to enabling WPD correctly in these scenarios without any additional changes.

Ultimately, we're aiming for a more robust checked solution but this provides an on-ramp to evaluate and iterate on WPD.

Testing:
ninja check-all
large internal services at Meta that links against native binaries runs correctly with WPD enabled

Diff Detail

Event Timeline

modimo created this revision.Jun 12 2023, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 12:00 PM
Herald added subscribers: hoy, wenlei. · View Herald Transcript
modimo requested review of this revision.Jun 12 2023, 12:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 12:00 PM
modimo retitled this revision from -fskip-vtable-filepaths to [WPD] implement -fskip-vtable-filepaths.Jun 12 2023, 12:20 PM
modimo edited the summary of this revision. (Show Details)
modimo added reviewers: wenlei, tejohnson.
modimo edited the summary of this revision. (Show Details)Jun 12 2023, 12:24 PM

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

Also, isn't this hiding possibly necessary info from WPD that might be needed for correct class hierarchy analysis affecting other IR modules? I.e. in the type-metadata-skip-vtable-filepaths.cpp test, what if A was derived from a struct B, which was also defined/used in another module without this skipping option. We would lose information about the override of f in A, and possibly do an incorrect devirtualization elsewhere. It seems like a dangerous option to provide.

It might be better to provide an option that can somehow mark vtables in a given module as unsafe for devirt, and propagate that info to WPD.

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

Also, isn't this hiding possibly necessary info from WPD that might be needed for correct class hierarchy analysis affecting other IR modules? I.e. in the type-metadata-skip-vtable-filepaths.cpp test, what if A was derived from a struct B, which was also defined/used in another module without this skipping option. We would lose information about the override of f in A, and possibly do an incorrect devirtualization elsewhere. It seems like a dangerous option to provide.

It might be better to provide an option that can somehow mark vtables in a given module as unsafe for devirt, and propagate that info to WPD.

That would nicely side-step mismatched flags. Public vcall_visibility describes this case but with --lto-whole-program-visibility there's no a distinction between Public because of deferred vs. Public because the type is known unsafe. Thoughts on an unsafe vcall_visibility to capture the latter notion?

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

I'm confused - how would you pass this new option then? I was assuming you were passing this option to some LLVM built files at the interface of those libraries. In which case not passing -flto-unit and -fwhole-program-visibility should have a similar effect (suppress the type metadata).

Also, isn't this hiding possibly necessary info from WPD that might be needed for correct class hierarchy analysis affecting other IR modules? I.e. in the type-metadata-skip-vtable-filepaths.cpp test, what if A was derived from a struct B, which was also defined/used in another module without this skipping option. We would lose information about the override of f in A, and possibly do an incorrect devirtualization elsewhere. It seems like a dangerous option to provide.

It might be better to provide an option that can somehow mark vtables in a given module as unsafe for devirt, and propagate that info to WPD.

That would nicely side-step mismatched flags. Public vcall_visibility describes this case but with --lto-whole-program-visibility there's no a distinction between Public because of deferred vs. Public because the type is known unsafe. Thoughts on an unsafe vcall_visibility to capture the latter notion?

That would be better I think. Maybe "unknown".

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

I'm confused - how would you pass this new option then? I was assuming you were passing this option to some LLVM built files at the interface of those libraries. In which case not passing -flto-unit and -fwhole-program-visibility should have a similar effect (suppress the type metadata).

Oh I see, I misunderstood. Yes this is being passed to LLVM built files. We want to avoid manual allowlists/blocklists because code changes make it less flexible and scalable than an automatic option. This can also be pretty tricky to do correctly since we can get type metadata from multiple TUs and all of them would need to be opted out for WPD to not kick in.

Also, isn't this hiding possibly necessary info from WPD that might be needed for correct class hierarchy analysis affecting other IR modules? I.e. in the type-metadata-skip-vtable-filepaths.cpp test, what if A was derived from a struct B, which was also defined/used in another module without this skipping option. We would lose information about the override of f in A, and possibly do an incorrect devirtualization elsewhere. It seems like a dangerous option to provide.

It might be better to provide an option that can somehow mark vtables in a given module as unsafe for devirt, and propagate that info to WPD.

That would nicely side-step mismatched flags. Public vcall_visibility describes this case but with --lto-whole-program-visibility there's no a distinction between Public because of deferred vs. Public because the type is known unsafe. Thoughts on an unsafe vcall_visibility to capture the latter notion?

That would be better I think. Maybe "unknown".

Sounds good, I'll rework the patch.

modimo updated this revision to Diff 531149.Jun 13 2023, 7:10 PM

Implement using VCallVisibilityUnknown

modimo updated this revision to Diff 531151.Jun 13 2023, 7:15 PM

Remove unrelated change, fix typo

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

I'm confused - how would you pass this new option then? I was assuming you were passing this option to some LLVM built files at the interface of those libraries. In which case not passing -flto-unit and -fwhole-program-visibility should have a similar effect (suppress the type metadata).

Oh I see, I misunderstood. Yes this is being passed to LLVM built files. We want to avoid manual allowlists/blocklists because code changes make it less flexible and scalable than an automatic option.

It seems like you need allowlists or blocklists in either case - either it is passed as a regex via the option proposed here, or the build system modifies the options for that set of files.

This can also be pretty tricky to do correctly since we can get type metadata from multiple TUs and all of them would need to be opted out for WPD to not kick in.

But clang is presumably compiling a single TU at a time, so your regex needs to cover them all anyway? I'm not sure I understand the distinction between doing something like -fskip-vtable-filepaths=third-party/.* vs something like applying -funknown-vtable-visibility to each third-party/*.cc compile.

I really think the logic for which files to apply this option to belongs in the build system, not in the clang driver - just like any other clang option. It isn't clear to me why this particular option should be applied based on a file regex.

I like the approach of communicating this via vcall visibility, but just think that it should be applied to the current TU whenever a TBD new option is provided.

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

I'm confused - how would you pass this new option then? I was assuming you were passing this option to some LLVM built files at the interface of those libraries. In which case not passing -flto-unit and -fwhole-program-visibility should have a similar effect (suppress the type metadata).

Oh I see, I misunderstood. Yes this is being passed to LLVM built files. We want to avoid manual allowlists/blocklists because code changes make it less flexible and scalable than an automatic option.

It seems like you need allowlists or blocklists in either case - either it is passed as a regex via the option proposed here, or the build system modifies the options for that set of files.

This can also be pretty tricky to do correctly since we can get type metadata from multiple TUs and all of them would need to be opted out for WPD to not kick in.

But clang is presumably compiling a single TU at a time, so your regex needs to cover them all anyway? I'm not sure I understand the distinction between doing something like -fskip-vtable-filepaths=third-party/.* vs something like applying -funknown-vtable-visibility to each third-party/*.cc compile.

The blocklists need to be enforced on internal files that interact with native libraries and those live in many different areas:

; /third-party/include/boost.h

class A {}

; /internal-source/a.cpp
#include "boost.h"

class B : public A

; /third-party/lib/boost.a
#include "boost.h"

class C : public A

That being said, this is something the build system can detect and mark.

I really think the logic for which files to apply this option to belongs in the build system, not in the clang driver - just like any other clang option. It isn't clear to me why this particular option should be applied based on a file regex.

The big advantage of doing this in the FE is that we know which types are actually coming from the native headers. Blocking all types in the TU is overly conservative and also less stable as header changes can effectively turn on/off unrelated large chunks of WPD.

modimo updated this revision to Diff 533048.Jun 20 2023, 2:45 PM

Remove leftover code from original implementation

I think I understand the motivation, but not sure I agree this is the right approach - can you simply not pass -flto-unit and -fwhole-program-vtables for these files?

For our third-party libraries, they're pre-built into native files by GCC so that's unfortunately not an option.

I'm confused - how would you pass this new option then? I was assuming you were passing this option to some LLVM built files at the interface of those libraries. In which case not passing -flto-unit and -fwhole-program-visibility should have a similar effect (suppress the type metadata).

Oh I see, I misunderstood. Yes this is being passed to LLVM built files. We want to avoid manual allowlists/blocklists because code changes make it less flexible and scalable than an automatic option.

It seems like you need allowlists or blocklists in either case - either it is passed as a regex via the option proposed here, or the build system modifies the options for that set of files.

This can also be pretty tricky to do correctly since we can get type metadata from multiple TUs and all of them would need to be opted out for WPD to not kick in.

But clang is presumably compiling a single TU at a time, so your regex needs to cover them all anyway? I'm not sure I understand the distinction between doing something like -fskip-vtable-filepaths=third-party/.* vs something like applying -funknown-vtable-visibility to each third-party/*.cc compile.

The blocklists need to be enforced on internal files that interact with native libraries and those live in many different areas:

; /third-party/include/boost.h

class A {}

; /internal-source/a.cpp
#include "boost.h"

class B : public A

; /third-party/lib/boost.a
#include "boost.h"

class C : public A

That being said, this is something the build system can detect and mark.

I really think the logic for which files to apply this option to belongs in the build system, not in the clang driver - just like any other clang option. It isn't clear to me why this particular option should be applied based on a file regex.

The big advantage of doing this in the FE is that we know which types are actually coming from the native headers. Blocking all types in the TU is overly conservative and also less stable as header changes can effectively turn on/off unrelated large chunks of WPD.

Ok what I missed is that you don't want to apply this to entire TUs, but rather just some paths that are header files, which may be included in many source files. So in your above example, you really only need to apply to the path of third-party/include/boost.h - is that correct? That would mark class A, and therefore anything derived from it won't get devirtualized. I guess in your example above, you are trying to prevent the devirtualization in a.cpp since there are hidden overrides (class C) in boost.a native objects.

The example included with the patch applies the option to the source file of the test case, and therefore its entire TU. It would be helpful to have a test case structured like your example above, where the file path is just that of the header.

Maybe a better option name is something like -funknown-vtable-visibility-filepaths= ? It seems a bit more descriptive.

Ok what I missed is that you don't want to apply this to entire TUs, but rather just some paths that are header files, which may be included in many source files. So in your above example, you really only need to apply to the path of third-party/include/boost.h - is that correct?

Yep!

That would mark class A, and therefore anything derived from it won't get devirtualized. I guess in your example above, you are trying to prevent the devirtualization in a.cpp since there are hidden overrides (class C) in boost.a native objects.

Exactly, we saw this scenario causing issues when enabling WPD.

The example included with the patch applies the option to the source file of the test case, and therefore its entire TU. It would be helpful to have a test case structured like your example above, where the file path is just that of the header.

Makes sense and yeah the test case is confusing. Changed it to apply to just the header file.

Maybe a better option name is something like -funknown-vtable-visibility-filepaths= ? It seems a bit more descriptive.

Sure, changed.

modimo updated this revision to Diff 533326.Jun 21 2023, 10:33 AM

Address feedback. Allow empty string to unset the flag

modimo retitled this revision from [WPD] implement -fskip-vtable-filepaths to [WPD] implement -funknown-vtable-visibility-filepaths.Jun 21 2023, 10:34 AM
modimo edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Jun 23 2023, 12:14 PM

The big advantage of doing this in the FE is that we know which types are actually coming from the native headers. Blocking all types in the TU is overly conservative and also less stable as header changes can effectively turn on/off unrelated large chunks of WPD.

This is clearly going to be selective in punting unsafe devirt, however do you have data comparing the effectiveness of the two (module granularity vs header granularity)?

I also think introducing unknown visibility is a good idea for this to work, but this is going to be exposed to users (not hidden implementing only), so we would probably need to have spec/rule to handle conflicting visibility from different source and make those explicit here: https://clang.llvm.org/docs/LTOVisibility.html.

There's a spectrum of solutions we could use to make WPD safer, but we need to be careful not to make this whole thing too convoluted with multiple solutions implemented, but little differentiation in their incremental value (extra perf, extra safety). So having concrete data backing the incremental value of this solution would be helpful.

clang/include/clang/Basic/CodeGenOptions.h
191

Pattern string doesn't seem to be used anywhere, can we simplify this using llvm::Regex instead of RegexWithPattern?

198

update comment

clang/include/clang/Driver/Options.td
3223

nit: filepaths -> filepath - this does not accept a comma separated list of paths(regexs).

3224

Pass empty string to unset

Remove this to be concise?

clang/lib/Frontend/CompilerInvocation.cpp
2001

rename err_drv_optimization_remark_pattern so it's not specific to remarks.

llvm/include/llvm/IR/GlobalObject.h
41

This probably needs doc change too: https://clang.llvm.org/docs/LTOVisibility.html.

We also need to define when flag derived visibility conflicts with attribute derived visibility, what is the priority order.

modimo marked 4 inline comments as done.Jun 23 2023, 5:26 PM

The big advantage of doing this in the FE is that we know which types are actually coming from the native headers. Blocking all types in the TU is overly conservative and also less stable as header changes can effectively turn on/off unrelated large chunks of WPD.

This is clearly going to be selective in punting unsafe devirt, however do you have data comparing the effectiveness of the two (module granularity vs header granularity)?

Some data would help quantify the difference, I'll hack up a module granularity implementation and compare to the current one.

I also think introducing unknown visibility is a good idea for this to work, but this is going to be exposed to users (not hidden implementing only), so we would probably need to have spec/rule to handle conflicting visibility from different source and make those explicit here: https://clang.llvm.org/docs/LTOVisibility.html.

The ordering for conflicts is embedded in the logic for CodeGenModule::GetVCallVisibilityLevel which has priority order of:

  1. Unknown
  2. Public
  3. LinkageUnit
  4. TranslationUnit

I'll update the documentation to reflect this.

There's a spectrum of solutions we could use to make WPD safer, but we need to be careful not to make this whole thing too convoluted with multiple solutions implemented, but little differentiation in their incremental value (extra perf, extra safety). So having concrete data backing the incremental value of this solution would be helpful.

For concrete data are you talking about between the different solutions e.g. --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, RTTI based, FatLTO based etc or something else?

clang/include/clang/Basic/CodeGenOptions.h
191

It's used here in CompilerInvocation.cpp:

if (Opts.SkipVtableFilepaths.hasValidPattern())
  GenerateArg(Args, OPT_funknown_vtable_visibility_filepaths_EQ,
              Opts.SkipVtableFilepaths.Pattern, SA);
modimo updated this revision to Diff 534134.EditedJun 23 2023, 5:26 PM

Feedback, add documentation for flag and unknown visibility, rebase.

For concrete data are you talking about between the different solutions e.g. --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, RTTI based, FatLTO based etc or something else?

Right, between the different solutions. RTTI based solution doesn't exist yet, so maybe just compare using -fwhole-program-vtables on a known safe set of files vs using -funknown-vtable-visibility-filepaths on a known unsafe set of files first.

Some data would help quantify the difference, I'll hack up a module granularity implementation and compare to the current one.

I wasn't asking about implementing -funknown-vtable-visibility-filepaths with module (instead of header) granularity, but just the comparison mentioned above.

The ordering for conflicts is embedded in the logic for CodeGenModule::GetVCallVisibilityLevel which has priority order of

I was thinking about different source of visibility instead of absolute order of visibility itself - i.e. what is the rule if __attribute__((visibility("..."))) conflicts with -funknown-vtable-visibility-filepaths setting for a specific type? This may not be an immediately important question, but just as example of the knock on effect of added complexity, which may or may not be justified depending on the benefit, which goes back to data from experiments.

We have -wholeprogramdevirt-skip; with this patch, we will have -funknown-vtable-visibility-filepaths; later on, we will have another RTTI based solution, then there's FatObj solution. It feels like a lot of stuff trying to solve one problem, so wondering if this addition here is going to provide enough value in the end state.

For concrete data are you talking about between the different solutions e.g. --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, RTTI based, FatLTO based etc or something else?

Right, between the different solutions. RTTI based solution doesn't exist yet, so maybe just compare using -fwhole-program-vtables on a known safe set of files vs using -funknown-vtable-visibility-filepaths on a known unsafe set of files first.

On a large Meta service with manually opting in an internal source folder with -fwhole-program-vtables there's 2,933 single implementation methods that get devirtualized. Using -funknown-vtable-visibility-filepath on the same service for the third-party directory there's 32,800 single implementation method devirts.

The ordering for conflicts is embedded in the logic for CodeGenModule::GetVCallVisibilityLevel which has priority order of

I was thinking about different source of visibility instead of absolute order of visibility itself - i.e. what is the rule if __attribute__((visibility("..."))) conflicts with -funknown-vtable-visibility-filepaths setting for a specific type? This may not be an immediately important question, but just as example of the knock on effect of added complexity, which may or may not be justified depending on the benefit, which goes back to data from experiments.

That complexity already exists with -fvisibility=hidden interacting with __attribute__((visibility("..."))) where the most conservative annotation wins out. Having a type annotated with unknown visibility is just adding a more conservative option than public.

We have -wholeprogramdevirt-skip; with this patch, we will have -funknown-vtable-visibility-filepaths; later on, we will have another RTTI based solution, then there's FatObj solution. It feels like a lot of stuff trying to solve one problem, so wondering if this addition here is going to provide enough value in the end state.

My current prototype RTTI implementation doesn't really need an unknown visibility because it's generating and passing a blocklist at symbol resolution time. For FatObj, the input into WPD is identical to when everything is built with ThinLTO so unknown isn't that valuable either. The original intent was to use this to roll out WPD to select services but performance-wise opting in folders with -fwhole-program-vtables proves just as effective without having to modify LLVM. With that use case gone, there's no longer a need on my side for this change. Others may find value for this in the interim to on-board/evaluate WPD but that's not very concrete value.

modimo abandoned this revision.Jul 11 2023, 6:10 PM