This is an archive of the discontinued LLVM Phabricator instance.

Add option -fkeep-persistent-storage-variables to emit all variables that have a persistent storage duration
ClosedPublic

Authored by qianzhen on May 9 2023, 12:48 PM.

Details

Summary

This patch adds a new option -fkeep-persistent-storage-variables to emit all variables that have a persistent storage duration, including global, static and thread-local variables. This could be useful in cases where the presence of all these variables as symbols in the object file are required, so that they can be directly addressed.

Diff Detail

Event Timeline

qianzhen created this revision.May 9 2023, 12:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2023, 12:48 PM
qianzhen requested review of this revision.May 9 2023, 12:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2023, 12:48 PM

GCC has -fkeep-static-consts (which Clang supports) but not -fkeep-static-variables.

This could be useful in cases where the presence of all static variables as symbols in the object file are required.

Can you give a more compelling reason that this option is needed?

Can you give a more compelling reason that this option is needed?

This is intended to prevent "excessive transformation" to enable migration of existing applications (using a non-Clang compiler) where users further manipulate the object or assembly after compilation.

cebowleratibm added inline comments.
clang/lib/CodeGen/CodeGenModule.cpp
2209

why not fold the if and else if together given that they perform the same action.

3090

Fold to a single return.

Can you give a more compelling reason that this option is needed?

This is intended to prevent "excessive transformation" to enable migration of existing applications (using a non-Clang compiler) where users further manipulate the object or assembly after compilation.

I don't get what you mean by this? I don't currently see motivation for this? @aaron.ballman would have to make a decision here eventually, but I don't think he'd be compelled by this either.

This is intended to prevent "excessive transformation" to enable migration of existing applications (using a non-Clang compiler) where users further manipulate the object or assembly after compilation.

I don't get what you mean by this? I don't currently see motivation for this?

The intent is to apply the __attribute__((__used__)) semantic to static variables (since the front-end is likely to discard them). Additional reasons for using __attribute__((__used__)) applies: The compiler cannot optimize with the assumption that it sees all direct accesses to the variable.

This is useful in keeping the static variables in a patchable function (https://clang.llvm.org/docs/AttributeReference.html#patchable-function-entry), so that they can be directly addressed by a hot patch when the optimization to merge them is enabled (https://llvm.org/docs/doxygen/GlobalMerge_8cpp_source.html).

aaron.ballman added subscribers: efriedma, rjmccall.

This is useful in keeping the static variables in a patchable function (https://clang.llvm.org/docs/AttributeReference.html#patchable-function-entry), so that they can be directly addressed by a hot patch when the optimization to merge them is enabled (https://llvm.org/docs/doxygen/GlobalMerge_8cpp_source.html).

I would expect that the user could write __attribute__((used)) themselves in that case rather than use the heavy hammer of keeping all statics in the TU, right?

This is intended to prevent "excessive transformation" to enable migration of existing applications (using a non-Clang compiler) where users further manipulate the object or assembly after compilation.

I don't get what you mean by this? I don't currently see motivation for this?

The intent is to apply the __attribute__((__used__)) semantic to static variables (since the front-end is likely to discard them). Additional reasons for using __attribute__((__used__)) applies: The compiler cannot optimize with the assumption that it sees all direct accesses to the variable.

The part I'm having a hard time understanding is why this should be a TU-level option when the user can write __attribute__((used)) on any static that matters to them that is being dropped? This feels like a heavy hammer.

Adding @rjmccall and @efriedma as codegen code owners.

clang/lib/CodeGen/CodeGenModule.cpp
2200–2207

Reformatted with whatever clang-format does for that, as I doubt I have the formatting correct.

3090

Yup, similar suggestion here as above.

It's not unprecedented to add flags to copy the behavior of other compilers, to make porting easier, especially when it doesn't place much burden on compiler maintainers. But what compiler preserves the names/values of static variables by default? It's not the sort of feature I'd expect anyone to advertise. And we don't really want to encourage people to use global flags for this sort of thing; applying behavior everywhere has weird effects, and users often don't understand the flags they're using.

It's not unprecedented to add flags to copy the behavior of other compilers, to make porting easier, especially when it doesn't place much burden on compiler maintainers. But what compiler preserves the names/values of static variables by default? It's not the sort of feature I'd expect anyone to advertise. And we don't really want to encourage people to use global flags for this sort of thing; applying behavior everywhere has weird effects, and users often don't understand the flags they're using.

This is an adaptation of the IBM XL compiler's -qstatsym option, which is meant to generate symbol table entries for static variables. An artifact of that compiler is that static variables are often not discarded even when unused. We attempt to achieve the combined effect using -fkeep-static-variables.

This is an adaptation of the IBM XL compiler's -qstatsym option, which is meant to generate symbol table entries for static variables. An artifact of that compiler is that static variables are often not discarded even when unused.

Oh, I see; the compiler actually doesn't guarantee that the variables are preserved, it just ends up preserving them by accident because it's bad at optimizing global variables?

Do you have any idea how widespread this is?

This is an adaptation of the IBM XL compiler's -qstatsym option, which is meant to generate symbol table entries for static variables. An artifact of that compiler is that static variables are often not discarded even when unused.

Oh, I see; the compiler actually doesn't guarantee that the variables are preserved, it just ends up preserving them by accident because it's bad at optimizing global variables?

That is my understanding, yes. However, that behaviour seems to serve the use case: Users can enable functions for hot-patching proactively to allow for bug fixes in live instances, and they would not necessarily know which static variables they need preserved for use in the replacement code up front. Additionally, sometimes the variables are not eliminated but their location is obfuscated by GlobalMerge (and potentially other optimizations).

The -fkeep-static-variables option only addresses the use case when users do not apply LTO though. LTO internalization exposes more variables to the obfuscation/removal issue.

Hmm. Preserving unused global variables in case a hot patch might use them doesn't seem very compelling to me — hot patches need to be able to introduce globals of their own anyway. What I find more convincing is related to the *other* meaning of __attribute__((used)), that the compiler needs to treat the global as potentially referenced and used in ways that it cannot analyze. Considered from that perspective, a hot patch is essentially extra code that can be loaded into any translation unit dynamically. If there are hot-patchable functions in a translation unit, the compiler needs to broadly avoid optimizing any globals referenceable from that translation unit as if it understood all their uses. So it's less that we need to preserve globals that are actually unused and more that we need to preserve the use-patterns of globals. For example, we can't decide that a mutable global variable is never reassigned and so forward its constant initializer to all the loads from it, because we need to treat that global as potentially reassigned by the code loaded in a hot patch.

I'm not sure if it's better to represent that by using __attribute__((used)) on every global variable or by setting something more globally in the module. The latter seems more impervious to the LTO problem, at least.

I'm not sure if it's better to represent that by using __attribute__((used)) on every global variable or by setting something more globally in the module. The latter seems more impervious to the LTO problem, at least.

__attribute__((__used__)) has the advantage of being something that already exists. If the "more global" property can hook in as treating all variables as __attribute__((__used__)) where the query for __attribute__((__used__)) is implemented in the middle-end/back-end (including places where the symbol is marked as used with respect to linker garbage collection), then maybe it is not too intrusive in terms of implementation.

As for which aspect of __attribute__((__used__)) is more important: The explicit user request we had was for the variables to be kept in an obvious manner in the symbol table. It seems multiple aspects of the __attribute__((__used__)) semantics are helpful here.

The hot-patch use case actually shouldn't care if the compiler/linker throws away unused things because it can't possibly affect how the patch interoperates with existing code. This does rely on the unanalyzable-use part of the semantics, though, to stop the compiler from removing uses that could become significant after hot-patching.

Force-emitting every static in the translation unit can be very expensive; there are a lot of headers that declare all their constants as static const. And removing dead globals is a pretty important optimization, e.g. to eliminate the string literals used by assertions that the compiler was able to fold to true. So I don't think it's unreasonable for us to ask whether we should just be naively adding this option instead of figuring out a more targeted way of satisfying these users.

Force-emitting every static in the translation unit can be very expensive; there are a lot of headers that declare all their constants as static const. And removing dead globals is a pretty important optimization, e.g. to eliminate the string literals used by assertions that the compiler was able to fold to true. So I don't think it's unreasonable for us to ask whether we should just be naively adding this option instead of figuring out a more targeted way of satisfying these users.

Seeing as there are application developers paying the cost of--and relying on at least some aspects of--this behaviour, I am looking to see how they can move to Clang with less migration overhead. It seems to me that more targeted ways to solve the issue would require that the application developers apply a mechanism to select variables for this treatment. That involves a trade-off for them between performance and uptime-friendly serviceability. If they don't have an analysis to make those decisions, then having a more complicated tunable interface does not help them.

qianzhen updated this revision to Diff 523546.May 18 2023, 1:56 PM

Refactor as suggested

Gentle ping.

I think I'm okay with the semantics as-is.

clang/include/clang/Driver/Options.td
1703

Can you add a bit more of a description of what this does, and how it's expected to be used?

qianzhen updated this revision to Diff 526090.May 26 2023, 9:03 AM

Update the option text to be more descriptive

clang/lib/CodeGen/CodeGenModule.cpp
2200–2207

@qianzhen, I don't think the suggestion was applied/applied correctly.

There should not be an isa followed by dyn_cast. That said, I think dyn_cast_or_null is perhaps appropriate instead of plain dyn_cast. Additionally, VD being non-null should be part of the condition.

clang/test/CodeGen/keep-static-variables.cpp
2

@qianzhen, a driver option was added. IMO, there should be tests (probably in another file) for the driver option too.

This seems a little short on tests as well, are there any semantic implications here that should be validated? How does this work with function-local statics? What does the behavior look like in C?

Else, I think @rjmccall felt the strongest about the semantics/design here, so I'd like to see him be ok with them before accepting this.

clang/lib/CodeGen/CodeGenModule.cpp
2200–2207

Note dyn_cast_or_null is deprecated, but making the condition:

if (const auto *VD = dyn_cast_if_present<VarDecl>(D))

is probably the best way to simplify this here.

qianzhen updated this revision to Diff 529601.Jun 8 2023, 8:17 AM

More examples have been identified for the adaptation of IBM XL compiler's -qstatsym option for the hot patch use case, which was mentioned previously. Therefore, this option is extended to cover the following cases.

  1. Function-local static variables
  2. Thread-local variables

The test case is updated to add more coverage accordingly.
Since the option is now covering not only the variables with static storage duration, but also those with thread storage duration, the option name is changed to -fkeep-persistent-storage-variables. Any suggestions would be appreciated.

@erichkeane Thanks for the comments!

How does this work with function-local statics?

This is a valid case. The function-local statics should be kept; the missing implementation is added now.

What does the behavior look like in C?

The C behavior should be the same as those examples in the included C++ test case (except for the C++ specific ones). Do you have any special considerations about C in mind?

clang/include/clang/Basic/CodeGenOptions.def
475

Minor nit: Typo.

clang/include/clang/Driver/Options.td
1700

Please update the patch title and description to match.

clang/test/CodeGen/keep-persistent-storage-variables.cpp
18–19 ↗(On Diff #529601)

If we don't care which of llvm.compiler.used or llvm.used is used (no pun intended), then maybe we can use a regex? For example, @llvm{{(\.compiler)?}}.used.

32–39 ↗(On Diff #529601)

Why add functions that use g1 and g2? Is there some reason to suspect that using the variables will cause them not to be emitted as explicitly used?

If removing these functions, please also remove g3 and g4 as redundant.

50 ↗(On Diff #529601)

Same comment re: why add a function?

clang/test/Driver/fkeep-persistent-storage-variables.c
1 ↗(On Diff #529601)

Please add test for -fkeep-persistent-storage-variables -fno-keep-persistent-storage-variables.

qianzhen updated this revision to Diff 537175.Jul 4 2023, 2:10 PM

Update to address review comments

qianzhen retitled this revision from Add option -fkeep-static-variables to emit all static variables to Add option -fkeep-persistent-storage-variables to emit all variables that have a persistent storage duration.Jul 4 2023, 2:13 PM
qianzhen edited the summary of this revision. (Show Details)
qianzhen added inline comments.
clang/test/CodeGen/keep-persistent-storage-variables.cpp
32–39 ↗(On Diff #529601)

Right, I don't think using the variables in a function will cause then not to be emitted as explicitly used. Patch updated accordingly.

Update (looping back from offline discussion): The LTO use case is covered. There was some confusion over which meaning of "static" was meant by -fkeep-static-consts. The "static" meant was storage duration.

Furthermore, there is some discussion over covering more than just variables, but also artifacts with reasonable mangled names such as the symbols for temporaries bound to references with "persistent storage" and guard variables.

qianzhen updated this revision to Diff 540060.Jul 13 2023, 8:53 AM

Add two more test cases

Furthermore, there is some discussion over covering more than just variables, but also artifacts with reasonable mangled names such as the symbols for temporaries bound to references with "persistent storage" and guard variables.

We can follow up with a separate patch to add the extra functionality. This patch LGTM for the scope it covers.

clang/include/clang/Driver/Options.td
1703

Minor nit: Use hyphen in "thread-local".

This revision is now accepted and ready to land.Jul 13 2023, 9:05 AM