This is an archive of the discontinued LLVM Phabricator instance.

[IR][GlobalDCE] Extend !vcall_visibility to be able to specify an offset range that it applies to
Needs ReviewPublic

Authored by kubamracek on Aug 25 2021, 4:16 PM.

Details

Summary

As part of codesize optimization work for Swift, we'd like to add Virtual Function Elimination to Swift, very similarly to how GlobalDCE supports C++ VFE. This PR adds support to GlobalDCE for vtables containing references that are not virtual functions and should not participate in VFE. This is achieved by extending !vcall_visibility to be able to specify an offset range that it applies to.

Diff Detail

Event Timeline

kubamracek requested review of this revision.Aug 25 2021, 4:16 PM
kubamracek created this revision.

https://reviews.llvm.org/D107645 [GlobalDCE] Handle relative pointers in VFE (for Swift vtables)

kubamracek updated this revision to Diff 370092.Sep 1 2021, 3:55 PM
kubamracek updated this revision to Diff 370110.Sep 1 2021, 4:33 PM
kubamracek retitled this revision from [GlobalDCE] Handle non-vfunc entries in vtables during VFE (for Swift vtables) to [GlobalDCE] Handle non-vfunc entries in vtables during VFE.Sep 13 2021, 9:23 AM
fhahn added a comment.Sep 13 2021, 9:54 AM

As part of codesize optimization work for Swift, we'd like to add Virtual Function Elimination to Swift, very similarly to how GlobalDCE supports C++ VFE.

I think it would be good to drop this part from the description, as the change is not really related to swift; it just changes the code to ignore entries at slots without !type.

It seems sensible to me to ignore entries without corresponding !type metadata and this should be in line with the specification in langref. @pcc, @tejohnson Are there any issues you could think of?

llvm/lib/Transforms/IPO/GlobalDCE.cpp
133

It seems like it may be quite expensive to iterate over all types for a global, getting the pointer at the offsets and so on, possibly multiple times for the same GV. If that's the case, could we pre-compute the set of valid pointers in VTable and then just check against the set?

llvm/test/Transforms/GlobalDCE/virtual-functions-non-vfunc-entries.ll
2

Can you pre-commit this test and just include the diff with the changes caused by the patch?

kubamracek added inline comments.
llvm/lib/Transforms/IPO/GlobalDCE.cpp
133

Done.

llvm/test/Transforms/GlobalDCE/virtual-functions-non-vfunc-entries.ll
2

Done.

As part of codesize optimization work for Swift, we'd like to add Virtual Function Elimination to Swift, very similarly to how GlobalDCE supports C++ VFE.

I think it would be good to drop this part from the description, as the change is not really related to swift; it just changes the code to ignore entries at slots without !type.

It seems sensible to me to ignore entries without corresponding !type metadata and this should be in line with the specification in langref. @pcc, @tejohnson Are there any issues you could think of?

Is this specified in the langref? I looked through the type metadata documentation but don't see it mentioned there that all vfuncs will have type metadata. Looks like that was added a bit later than the original type metadata, in D47567. I think what is specified in langref is that it has type metadata for the address point. Probably the documentation needs updating. But from what I can see, it looks like this should always be emitted now. @pcc ?

Updating TypeMetadata.rst to be explicit about what happens for vtable slots without a matching !type attachment.

fhahn added a comment.Sep 17 2021, 1:39 AM

As part of codesize optimization work for Swift, we'd like to add Virtual Function Elimination to Swift, very similarly to how GlobalDCE supports C++ VFE.

I think it would be good to drop this part from the description, as the change is not really related to swift; it just changes the code to ignore entries at slots without !type.

It seems sensible to me to ignore entries without corresponding !type metadata and this should be in line with the specification in langref. @pcc, @tejohnson Are there any issues you could think of?

Is this specified in the langref? I looked through the type metadata documentation but don't see it mentioned there that all vfuncs will have type metadata. Looks like that was added a bit later than the original type metadata, in D47567. I think what is specified in langref is that it has type metadata for the address point. Probably the documentation needs updating. But from what I can see, it looks like this should always be emitted now. @pcc ?

Thanks for checking! I think the update to the langref in the latest version should spell out things clearly.

llvm/docs/TypeMetadata.rst
289

Not sure about the rst syntax here. Does this render correctly as list?

llvm/lib/Transforms/IPO/GlobalDCE.cpp
140

nit: unrelated change

214

can we instead just declare VFuncs in the loop?

233

Does C being nullptr here indicate that the type id is invalid?

kubamracek added inline comments.Sep 17 2021, 9:33 PM
llvm/docs/TypeMetadata.rst
289

Fixed!

llvm/lib/Transforms/IPO/GlobalDCE.cpp
140

Removed.

214

Done.

233

Yes, it means the offset mentioned in the !type attachment isn't found in the vtable. I restructured this to avoid the ternary operator.

fhahn accepted this revision.Sep 20 2021, 6:45 AM

LGTM with a few remaining nits. Please wait a couple of days in case there are additional comments.

llvm/lib/Transforms/IPO/GlobalDCE.cpp
134

nit: contains instead of count

135
220

nit: .... also collect the set of virtual functions in the vtable.

235
This revision is now accepted and ready to land.Sep 20 2021, 6:45 AM

LGTM with a few remaining nits. Please wait a couple of days in case there are additional comments.

Addressed the format comments, will wait a couple of days. Thanks!

pcc added a comment.Sep 20 2021, 3:42 PM

As part of codesize optimization work for Swift, we'd like to add Virtual Function Elimination to Swift, very similarly to how GlobalDCE supports C++ VFE.

I think it would be good to drop this part from the description, as the change is not really related to swift; it just changes the code to ignore entries at slots without !type.

It seems sensible to me to ignore entries without corresponding !type metadata and this should be in line with the specification in langref. @pcc, @tejohnson Are there any issues you could think of?

Is this specified in the langref? I looked through the type metadata documentation but don't see it mentioned there that all vfuncs will have type metadata. Looks like that was added a bit later than the original type metadata, in D47567. I think what is specified in langref is that it has type metadata for the address point. Probably the documentation needs updating. But from what I can see, it looks like this should always be emitted now. @pcc ?

Thanks for checking! I think the update to the langref in the latest version should spell out things clearly.

It isn't great that VFE apparently relies on the type metadata that we added to support member function pointer CFI in C++. I was unaware that this was the case, and it shouldn't be necessary for frontends that don't have something like C++'s peculiar member function pointer ABI (I assume that Swift doesn't?) It would be better if we designed an independent way of marking these entries as subject to VFE.

It isn't great that VFE apparently relies on the type metadata that we added to support member function pointer CFI in C++. I was unaware that this was the case, and it shouldn't be necessary for frontends that don't have something like C++'s peculiar member function pointer ABI (I assume that Swift doesn't?) It would be better if we designed an independent way of marking these entries as subject to VFE.

As far as I can tell, VFE was added via https://reviews.llvm.org/D63932 and it has since always used type metadata and the !type annotations on vtables. Are you asking for a reversal on how VFE is done at the IR level, including how it's done for C++ in Clang? I'd love to see more thought on that (and what exactly do you think the design should aim to do better), but it seems to me that that's out of scope for this particular small tweak that I'm pursuing.

@pcc, could I ask for some details about your last comment? I'm specifically wondering whether you're against proceeding with the change from this review.

pcc requested changes to this revision.Sep 24 2021, 4:04 PM

It isn't great that VFE apparently relies on the type metadata that we added to support member function pointer CFI in C++. I was unaware that this was the case, and it shouldn't be necessary for frontends that don't have something like C++'s peculiar member function pointer ABI (I assume that Swift doesn't?) It would be better if we designed an independent way of marking these entries as subject to VFE.

As far as I can tell, VFE was added via https://reviews.llvm.org/D63932 and it has since always used type metadata and the !type annotations on vtables. Are you asking for a reversal on how VFE is done at the IR level, including how it's done for C++ in Clang? I'd love to see more thought on that (and what exactly do you think the design should aim to do better), but it seems to me that that's out of scope for this particular small tweak that I'm pursuing.

Sorry, I think I managed to confuse myself when I was trying to figure out what VFE actually relies on. It seems that VFE doesn't rely on what I thought it did. Although VFE relies on type metadata, it only relies on it to determine the locations of the address points, and not the locations of the VFE-eligible function slots.

Historically we've only ever used VFE with C++, which is fine with any virtual function slot being eligible for VFE. As such, we implemented it to make that assumption. What we should really be doing is have a way of marking slots as eligible, and we never got a chance to design that because it wasn't necessary. Now that you're proposing to use VFE with a language which presumably does have some non-VFE-eligible virtual function slots, I think we need to design such a mechanism.

@pcc, could I ask for some details about your last comment? I'm specifically wondering whether you're against proceeding with the change from this review.

Sorry, but I think I'm against it because the mechanism that you've chosen seems like an abuse of the !type metadata, since the metadata is meant to be used as a reference point for accesses relative to it, and not as a way of marking up specific fields by itself. As a more practical concern, it has no way of representing the situation where the pointer at the address point is not eligible for VFE.

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

This revision now requires changes to proceed.Sep 24 2021, 4:04 PM

@pcc, could I ask for some details about your last comment? I'm specifically wondering whether you're against proceeding with the change from this review.

Sorry, but I think I'm against it because the mechanism that you've chosen seems like an abuse of the !type metadata, since the metadata is meant to be used as a reference point for accesses relative to it, and not as a way of marking up specific fields by itself. As a more practical concern, it has no way of representing the situation where the pointer at the address point is not eligible for VFE.

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

A potentially more lightweight solution may be providing a way to opt-in to the stricter interpretation. That way, we ensure no existing users are disturbed while not having to add more invasive changes. Do you think the additional complexity a new type of Constant carries its weight if it is only used to support the swift VFE case?

pcc added a comment.Sep 27 2021, 1:11 PM

@pcc, could I ask for some details about your last comment? I'm specifically wondering whether you're against proceeding with the change from this review.

Sorry, but I think I'm against it because the mechanism that you've chosen seems like an abuse of the !type metadata, since the metadata is meant to be used as a reference point for accesses relative to it, and not as a way of marking up specific fields by itself. As a more practical concern, it has no way of representing the situation where the pointer at the address point is not eligible for VFE.

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

A potentially more lightweight solution may be providing a way to opt-in to the stricter interpretation. That way, we ensure no existing users are disturbed while not having to add more invasive changes. Do you think the additional complexity a new type of Constant carries its weight if it is only used to support the swift VFE case?

Adding a new Constant type isn't very difficult, see e.g. D108478

We shouldn't be hesitant to extend the IR in order to model things correctly. And given that we're introducing all these new Constant types it may make sense to refactor to make adding new Constants easier (doesn't need to happen now though). In the end I think more complexity would result from an incorrect modelling (e.g. frontends would have to make up unnecessary type names to cause virtual function slots to be subject to VFE, and what if a frontend wants VFE for globals, or exempt its address points from VFE -- would we need another opt-in for those?).

Sorry, I think I managed to confuse myself when I was trying to figure out what VFE actually relies on. It seems that VFE doesn't rely on what I thought it did. Although VFE relies on type metadata, it only relies on it to determine the locations of the address points, and not the locations of the VFE-eligible function slots.

@pcc, I apologize I think I got confused the same way. I think I had the wrong understanding that if any slot in a vtable does not have a matching !type annotation (on the vtable global) with the right offset, it's basically trivially removable by VFE, but apparently, as you're pointing out, that's not the case -- it could still be used by a non-zero offset load via llvm.type.checked.load. In such a case, the slot at (vcall offset + vtable annotation offset) is actually being matched. So even if a vtable slot at offset X is never mentioned in any !type annotation, it could still be matched by any preceding !type type id with a non-zero offset at the call site.

Could you confirm that this understanding I just mentioned is actually correct now? Thanks :)

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

Just to make sure I understand this proposal, would we *require* frontends to mark all VFE eligible slots with this? I assume the only existing (publicly known) user of VFE is Clang, so as part of the change we'd fix Clang too? And the way this would look on a simple example would be something like the following?

@vtable = internal unnamed_addr constant ... {[
  i8* bitcast (void ()* vfe_eligible @vfunc1 to i8*),
  i8* bitcast (void ()* vfe_eligible @vfunc2 to i8*),
  i8* bitcast (void ()* @non_eligible to i8*)
]}, align 8, !type ...

And in the case of "relative pointers" that Swift uses in its data structures, the Constant would be present at the inner-most reference, like this?

@vtable = internal unnamed_addr constant { [2 x i32] } { [2 x i32] [
  i32 trunc (i64 sub (i64 ptrtoint (void ()* vfe_eligible @vfunc1 to i64), i64 ptrtoint ({ [2 x i32] }* @vtable to i64)) to i32),
  i32 trunc (i64 sub (i64 ptrtoint (void ()* vfe_eligible @vfunc2 to i64), i64 ptrtoint ({ [2 x i32] }* @vtable to i64)) to i32)
]}, align 8, !type ...
kubamracek retitled this revision from [GlobalDCE] Handle non-vfunc entries in vtables during VFE to [IR][GlobalDCE] Add ability to mark vtable methods as eligible for VFE and avoid eliminating non-eligible vfunc in VFE in GlobalDCE.
pcc added a comment.Sep 27 2021, 8:54 PM

Sorry, I think I managed to confuse myself when I was trying to figure out what VFE actually relies on. It seems that VFE doesn't rely on what I thought it did. Although VFE relies on type metadata, it only relies on it to determine the locations of the address points, and not the locations of the VFE-eligible function slots.

@pcc, I apologize I think I got confused the same way. I think I had the wrong understanding that if any slot in a vtable does not have a matching !type annotation (on the vtable global) with the right offset, it's basically trivially removable by VFE, but apparently, as you're pointing out, that's not the case -- it could still be used by a non-zero offset load via llvm.type.checked.load. In such a case, the slot at (vcall offset + vtable annotation offset) is actually being matched. So even if a vtable slot at offset X is never mentioned in any !type annotation, it could still be matched by any preceding !type type id with a non-zero offset at the call site.

Could you confirm that this understanding I just mentioned is actually correct now? Thanks :)

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

Just to make sure I understand this proposal, would we *require* frontends to mark all VFE eligible slots with this? I assume the only existing (publicly known) user of VFE is Clang, so as part of the change we'd fix Clang too? And the way this would look on a simple example would be something like the following?

@vtable = internal unnamed_addr constant ... {[
  i8* bitcast (void ()* vfe_eligible @vfunc1 to i8*),
  i8* bitcast (void ()* vfe_eligible @vfunc2 to i8*),
  i8* bitcast (void ()* @non_eligible to i8*)
]}, align 8, !type ...

And in the case of "relative pointers" that Swift uses in its data structures, the Constant would be present at the inner-most reference, like this?

@vtable = internal unnamed_addr constant { [2 x i32] } { [2 x i32] [
  i32 trunc (i64 sub (i64 ptrtoint (void ()* vfe_eligible @vfunc1 to i64), i64 ptrtoint ({ [2 x i32] }* @vtable to i64)) to i32),
  i32 trunc (i64 sub (i64 ptrtoint (void ()* vfe_eligible @vfunc2 to i64), i64 ptrtoint ({ [2 x i32] }* @vtable to i64)) to i32)
]}, align 8, !type ...

Yes, that's what I was thinking. We'll need to fix Clang, but old bitcode should still work without the optimization because all slots will be considered non-eligible.

@pcc, I've updated the diff to introduce a new "vfe_eligible" constant and to also change Clang to start emiting that on vfuncs in vtables.

fhahn added a comment.Sep 30 2021, 9:05 AM

@pcc, could I ask for some details about your last comment? I'm specifically wondering whether you're against proceeding with the change from this review.

Sorry, but I think I'm against it because the mechanism that you've chosen seems like an abuse of the !type metadata, since the metadata is meant to be used as a reference point for accesses relative to it, and not as a way of marking up specific fields by itself. As a more practical concern, it has no way of representing the situation where the pointer at the address point is not eligible for VFE.

Here's one alternative proposal: can we make it so that any vtable slots that are eligible for VFE are represented using a new type of Constant (similar to dso_local_equivalent)? This also has the advantage that frontends could also use it for global variables as well as functions.

A potentially more lightweight solution may be providing a way to opt-in to the stricter interpretation. That way, we ensure no existing users are disturbed while not having to add more invasive changes. Do you think the additional complexity a new type of Constant carries its weight if it is only used to support the swift VFE case?

Adding a new Constant type isn't very difficult, see e.g. D108478

Ah right, that indeed looks much less work than I thought, thanks!

fhahn added inline comments.Oct 4 2021, 11:14 AM
llvm/docs/TypeMetadata.rst
294

IIUC the must here seems to be stronger than it needs to be. Pointers in the vtable *can* be wrapped as vie_eligible to consider them for virtual function elimination. But if they aren't, it is still a valid vtable, right?

kubamracek marked an inline comment as done.
kubamracek added inline comments.
llvm/docs/TypeMetadata.rst
294

Rephrased.

kubamracek marked an inline comment as done.Oct 4 2021, 12:25 PM

@pcc does this approach look good to you now? Can you review? Thanks :)

Adding a few additional reviewers to reflect the expanded scope of the patch.

clang/lib/CodeGen/CGVTables.cpp
805 ↗(On Diff #376989)

It might be better to explain *why* the function pointer is wrapped (to allow virtual function elimination), rather than just state the fact that it is wrapped (which can be seen in the code directly below)

llvm/docs/TypeMetadata.rst
294

Thanks. I think now this could even just be a follow-up paragraph, rather than a list. vfe_eligible is not really a rule to be followed, but something that enables additional optimizations.

llvm/include/llvm/AsmParser/LLParser.h
65 ↗(On Diff #376989)

Why is this needed? Can this be handled in a similar fashion to how other constants are handled?

llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
518 ↗(On Diff #376989)

Does this need a new test or is this covered by the existing tests?

kubamracek added inline comments.Oct 11 2021, 8:48 AM
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
518 ↗(On Diff #376989)

Yes, this is a fix because existing LTO tests that attempt VFE were failing after wrapping the vtable pointer in VFEEligibleValue.

fhahn added inline comments.Oct 11 2021, 9:00 AM
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
518 ↗(On Diff #376989)

Hm, I guess what I am missing is where vfe_eligible is added to LTO tests. Looks like none are touched by this patch. Or is this a test involving Clang + LTO optimzations?

kubamracek marked 7 inline comments as done.
kubamracek added inline comments.
llvm/include/llvm/AsmParser/LLParser.h
65 ↗(On Diff #376989)

I don't think it's really needed, I took this approach from https://reviews.llvm.org/D108478. I've just replaced it with the more "local" approach of handling everything in the lltok::kw_vfe_eligible case in LLParser::parseValID.

kubamracek marked 7 inline comments as done.Oct 11 2021, 9:44 AM
kubamracek added inline comments.Oct 11 2021, 11:54 AM
llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
518 ↗(On Diff #376989)

Exactly, existing Clang LTO tests that use -fvirtual-function-elimination, see CodeGenCXX/vcall-visibility-metadata.cpp in Clang.

I don't entirely understand the purpose of vfe_eligible. It sounds like it expresses that it's okay for this value to be replaced if we can prove that nothing ever uses it. Surely that's universally true, albeit usually extremely hard to prove. So maybe the right general approach is that we should be saying something about the v-table object: that we know certain fields in it are only accessed in specially blessed patterns, and then this !type metadata or whatever lets us prove that certain patterns don't exist. I don't know that a new constant is the best way to express that, as opposed to some sort of metadata on the v-table. At the very least, maybe that gives us some direction towards a more general name for the constant.

@rjmccall, please see the history of this review, what you're saying was proposed already, and then declined by @pcc, who IIUC was involved in the original design and implementation of VFE, and who's requesting that the right way to express this in the IR is via a new constant that wraps the function pointers. @pcc provided some rationale for that -- do you disagree with it?

I did skim the thread, and it seemed to me that Peter's objection was largely to the reuse of !type, which he feels wasn't meant to carry these semantics. Peter leapt to the idea of a new constant instead of a more general metadata, but I don't understand what that leap is meant to achieve, which is why I'm engaging with the thread.

My objection to the new constant is that it doesn't seem to have any independent meaning, it's just a way of identifying particular fields in a global initializer for the purposes of a specific optimization. If you end up with a use of this constant as an instruction operand, it's meaningless and has to be stripped. So I'm not really sure how it reinforces correctness.

As far as I understand it, your optimization is basically: certain global objects have their addresses escape in ways you can't directly reason about, but you know that certain fields of those objects can only be accessed by corresponding blessed operations, and you can sometimes prove that some blessed operations don't exist in the program, which means you can prove that those fields are irrelevant, which enables various optimizations. Within that setup, you really only have one correctness problem, which is that it's vitally important that *accesses* be reliably marked as blessed, which the constant does nothing to address: failing to mark a global object reliably just introduces a failure to optimize, which is not a correctness problem. You also have a *expressiveness* problem, which is that you're treating the entire object uniformly instead of tracking blessed operations field-by-field, which will create correctness problems if you try to apply the optimization to objects which don't restrict access to all the fields. But I don't see how the new constant actually helps there at all; really you need finer-grained "blessings", both on access sites and in the object definition, and the constant doesn't carry those annotations, so...

Thank you for the detailed response, @rjmccall ! Let me try to explain my ultimate motivation here :)

I want to implement Virtual Function Elimination (VFE) on methods in classes and subclasses in Swift code, and given that LLVM and Clang already support VFE for C++ and that there's a lot of existing infrastructure for that in LLVM, it seems like an reasonable strategy to reuse all the LLVM IR parts for recognizing vtables and vcall sites and for proving that vtable slots are unused, etc. We "just" need to change the Swift language frontend to mark vtables and vcall sites in the right way so that the existing LLVM VFE logic does the right thing.

This *almost* works, expect for the fact that the Swift frontend embeds pointers to vfunctions inside large data structures, namely nominal type descriptors and full type metadata structs -- i.e. there is no standalone vtable symbol per class (and changing the layout of this in not feasible because it's ABI). Instead, a part of a larger struct is used as a vtable, and the larger struct does actually contain function pointers that are irrelevant to VFE (they're not a vtable slot). But the existing implementation of VFE happily "proves" (incorrectly) that this non-vptr function pointer is unused and removes it.

What I'm concluding from that is that the problem I need to solve is that somehow I need to be able to express in LLVM IR that only the "vtable portion" of the struct should actually participate in VFE, and these vptr-irrelevant references should just not be touched at all when doing VFE. What would be your recommended approach to achieve that? So far, I think these were on the table:

  • Reuse !type -- not correct, breaks VFE in C++, as @pcc pointed out.
  • Have a new attribute on the global that changes the semantics of !type. I am not sure if this is (a variation of) what you've suggested?
  • Use vfe_eligible constants to mark individual entries in the global. Then "not marked" references will not be touched by VFE. Current implementation in this diff.

Or do you think I'm solving the wrong problem? Note that I'm not trying to change anything about how VFE is done today for C++ code.

Okay. Thanks for the explanation, I think that was helpful.

I think I have two concerns about how we can use the existing infrastructure for VFE on Swift.

The first is that the existing infrastructure is definitely assuming that the emitted type-checked loads are to static offsets in the global constants you want to optimize. The only flexibility here is that the address point doesn't have to end up being offset zero in the global constant. In Swift, those conditions are only going to be true for pretty narrow cases: the caller needs to be able to lay out the class/protocol fully statically, and the emitter of the class/conformance needs to be emitting the metadata/wtable as a global object rather than an instantiation pattern, which generally means fully non-resilient and non-generic. There are important cases which do look like that, but it's pretty limiting.

The only way around that that I can see is to break the assumption of a static offset. You will need to tie the load to its loadee more symbolically, like by saying that you are conceptually accessing Foo.bar and annotating the field in the global as only accessed through loads tagged Foo.bar. That would get you a certain amount of the optimization, like when a virtual function is never used virtually in the program, and then you can find a way to add in hierarchy analysis to get to parity with what's done for C++ now.

The second is that !type and !vcall_visibility are surprisingly inexpressive. Do you really not to encode the inheritance hierarchy in !type metadata? All that's there right now is compatibility that happens to start at the same offset, which I'm surprised is enough. And !vcall_visibility applies to the entire global constant; you can't say that the parts of a v-table that correspond to a subclass are less visible.

If you don't need to worry about the non-static cases from the first point (which would require major changes), then I would solve your problem by tackling that last section and introducing new metadata which subsumes the information in !type and !vcall_visibility: list out all the regions in the global with explicit identities and address points, where identities describe their extents, super-identities, VFE-ability, and visibility of use.

The only way around that that I can see is to break the assumption of a static offset. You will need to tie the load to its loadee more symbolically, like by saying that you are conceptually accessing Foo.bar and annotating the field in the global as only accessed through loads tagged Foo.bar.

Yes, that is exactly the idea. I'm going to tag the loads and the slots with a name of the base class and name of the method (actually it'll be the mangled name of the base method).

That would get you a certain amount of the optimization, like when a virtual function is never used virtually in the program

This is exactly what I'm after, yes.

I have talked with @rjmccall offline, and I'm going to attempt to summarize his thought. @rjmccall, please correct me where I' wrong :)

  • What feels weird about the current patch (adding the "vfe_eligible" constant) is the fact that in the IR we already have 2 "things" (!type and !vcall_visibility) that are needed for VFE correctness and effectivity, and we're now just adding a third one.
  • It would be better to see some sort of consolidation so that *all* the information that VFE needs to know about a vtable is contained in a single "entity", perhaps a new piece of metadata, with a format that is extended to support all the features everyone (Clang, Swift) needs.
  • There is at least one interesting opportunity that we should consider and make it expressible in IR: In C++, having a subclass with a more restricted vcall visibility than the base class should allow VFE to optimize a part of the vtable with more precise information. However, that's currently not expressible because there's only one !vcall_visibility on the global.
  • The original problem I was trying to solve here can be generalized and tackled together with this C++ vtable expressivity issue: We could extend !vcall_visibility to allow specifying a "range" of offsets in the vtable that the visibility marker applies to, and also allow specifying multiple !vcall_visibility attributes on a single vtable. While this does not achieve the IR design "consolidation", we could perhaps consider that as a intermediate minimally-invasive step in the right direction.

Based on that, I would like to actually propose continuing the IR design discussion on llvm-dev (I'm going to start an email thread on that), while at the same time I'd like to pursue the change of extending !vcall_visibility (add "offset range" to it) to address the immediate problem I need solving for VFE on Swift code. Practically that means !vcall_visibility would accept two formats:

  1. !vcall_visibility !{i64 2} -- visibility only
  2. !vcall_visibility !{i64 2, i64 0, i64 32} -- visibility, range start, range end

The first form means there is an implicit range that covers the entire vtable.

@pcc, would you be okay with such a direction instead of the suggested vfe_eligible constant? I have changed the diff with an implementation of that.

kubamracek retitled this revision from [IR][GlobalDCE] Add ability to mark vtable methods as eligible for VFE and avoid eliminating non-eligible vfunc in VFE in GlobalDCE to [IR][GlobalDCE] Extend !vcall_visibility to be able to specify an offset range that it applies to.Oct 15 2021, 10:12 PM
kubamracek edited the summary of this revision. (Show Details)
fhahn added a comment.Oct 19 2021, 4:15 AM

IMO this approach is preferable to adding a new type of constant, because it is much more lightweight. The use-case also seems to fit somewhat with the original goals of vcall_visibility, which we already use to determine of the *whole* vtable is safe for VFE. Extending it to also specify which functions are safe for VFE makes sense to me.

It restricts the eligible functions to a single range, which is not as powerful as the new constant approach. Just to double check, will this be sufficient for your use cases?

llvm/include/llvm/Transforms/IPO/GlobalDCE.h
52

Could this subsume VFESafeVTables? IIUC VFuncMap should only contain VTables that are safe for VFE together with the set of functions for which VFE is safe. Maybe also use a more descriptive name, like VFESafeVTablesAndFns.

llvm/lib/Transforms/IPO/GlobalDCE.cpp
126

nit: you can just use cast instead of dyn_cast and get rid of the assert; cast will assert if the result is null.

128

nit: just return VFuncMap[VTable].contains(VPtr)?

248

Is this callback here really necessary or could it be in FindVirtualFunctionsInVTable? It seems to make it harder to see what's going on when reading FindVirtualFunctionsInVTable in isolation.

llvm/test/Transforms/GlobalDCE/virtual-functions-non-vfunc-entries.ll
34

we should also have a test with a start offset different than 0.

@rjmccall, please correct me where I'm wrong :)

No, I think that accurately summarizes our conversation. Thanks!

kubamracek updated this revision to Diff 380845.

It restricts the eligible functions to a single range, which is not as powerful as the new constant approach. Just to double check, will this be sufficient for your use cases?

Yes, I just double checked, all the data structures that Swift should apply VFE to do have a consecutive range that is "the vtable region" inside it, so being able to express a range in !vcall_visibility will cover the needs for the Swift compiler.

Could this subsume VFESafeVTables?

Updated diff to do that.

we should also have a test with a start offset different than 0.

Added.

Is this callback here really necessary or could it be in FindVirtualFunctionsInVTable? It seems to make it harder to see what's going on when reading FindVirtualFunctionsInVTable in isolation.

Removed the callback.

fhahn added inline comments.Oct 20 2021, 5:08 AM
llvm/docs/TypeMetadata.rst
296

might be good to spell out that the range is a half-open (< end)

llvm/include/llvm/IR/GlobalObject.h
141

nit: could just use std::pair?

llvm/include/llvm/Transforms/IPO/GlobalDCE.h
52

nit: I think it would be good to include some of the original wording, saying that contains global variables which are vtables for which we have enough information to safely do VFE, together with the functions in the vtable for which VFE is safe.

llvm/lib/IR/Metadata.cpp
1546

it would be good to have a the IR verifier check that the second and third operands are integer constants < UINT64_MAX.

1551

nit: consider using the more C++ like std::numeric_limits<uint64_t>::max()

llvm/lib/Transforms/IPO/GlobalDCE.cpp
129–130

nit: re-flow comment.

130–135

nit: should we use find to look up GVU here to avoid inserting new entries in the map?

169

nit: I think usually {} are omitted for single-line bodies.

184

could we use getPointerAtOffset here and below instead of recursing further? That way we would only add function pointers that fit the supported Vtable scheme and we should be able to avoid handling arbitrary constants by iterating over their operands, which might lead us to discover function pointers that are not valid VTable entries.

llvm/test/Transforms/GlobalDCE/virtual-functions-non-vfunc-entries.ll
6

It might be good to also have a test with a relative pointer vtable

Updated diff, should have all comments addressed, added a test for relative pointer vtable.

could we use getPointerAtOffset here and below instead of recursing further? That way we would only add function pointers that fit the supported Vtable scheme and we should be able to avoid handling arbitrary constants by iterating over their operands, which might lead us to discover function pointers that are not valid VTable entries.

This feels like a weird restriction, and also I think we do need to support arrays/structs in other arrays/structs, so recursing seem like the only option here? The case of a sub(@fn1, @fn2) expression seems like not worth protecting against (it's not a "reasonable" valid relative pointer, and arguably it's not even clear whether it should or shouldn't be considered for VFE).

fhahn accepted this revision.Oct 21 2021, 9:04 AM

This approach LGTM and I think it should address the original concerns raised by @pcc.

Please wait with committing until next week, in case @pcc has additional concerns about the new approach.

llvm/lib/IR/Verifier.cpp
757

thanks for adding the verifier! It would be good to have a couple of tests for it too

llvm/lib/Transforms/IPO/GlobalDCE.cpp
237

nit: sink inside if?

pcc added a comment.Oct 27 2021, 4:38 PM

Thanks, something like this seems reasonable to me and I don't have any major issues with it.

I would still like to take this opportunity to drop the special case for Functions. Can we make it so that !vcall_visibility with a single operand means that none of the constant has the given visibility, and teach Clang to emit the correct ranges? I think this will also require supporting more than one !vcall_visibility range per global (this could be done by accommodating more than one !vcall_visibility annotation, or allowing the single annotation to have 2n+1 operands).

I would still like to take this opportunity to drop the special case for Functions. Can we make it so that !vcall_visibility with a single operand means that none of the constant has the given visibility, and teach Clang to emit the correct ranges? I think this will also require supporting more than one !vcall_visibility range per global (this could be done by accommodating more than one !vcall_visibility annotation, or allowing the single annotation to have 2n+1 operands).

Could you expand on what you mean by "drop the special case for Functions"? Is my understanding that it's a *separate* request from Clang emitting correct ranges?

I think Peter is saying that if you can make Clang emit correct ranges for v-tables — specifically so that the ranges don't cover the type_info pointer, which we don't emit enough metadata to safely remove — then you can also go ahead and remove the special case where we only eliminate loads of function pointers. That's assuming we don't need to support old bitcode.

I think Peter is saying that if you can make Clang emit correct ranges for v-tables — specifically so that the ranges don't cover the type_info pointer, which we don't emit enough metadata to safely remove — then you can also go ahead and remove the special case where we only eliminate loads of function pointers. That's assuming we don't need to support old bitcode.

Makes sense, thanks for the explanation. @pcc, just to make sure I'm on the same page as you -- this special casing of Functions is already there in GlobalDCE today, correct? The patch as is right now is not adding a new restriction, or regressing the existing behavior, correct?

fhahn added a comment.Nov 1 2021, 9:14 AM

I think Peter is saying that if you can make Clang emit correct ranges for v-tables — specifically so that the ranges don't cover the type_info pointer, which we don't emit enough metadata to safely remove — then you can also go ahead and remove the special case where we only eliminate loads of function pointers. That's assuming we don't need to support old bitcode.

Can we auto-upgrade the bitcode, moving the logic to detect the special case in the IR upgrader? IIUC the worst case would be that we won't be able to optimise old bitcode, so it wouldn't be the end of the world. But if we can auto-upgrade to vcall_visibility then we should do it I think.

But I don't think we should pull this into this particular patch. I saw @kubamracek already put up D112929 and I think the rest of the cleanups (removing the special handling for VFE) should be done as follow-up patches, to avoid making the patch at hand even bigger.

But I don't think we should pull this into this particular patch. I saw @kubamracek already put up D112929 and I think the rest of the cleanups (removing the special handling for VFE) should be done as follow-up patches, to avoid making the patch at hand even bigger.

Right, I was planning to submit all the remainders as separate patches, and hopefully merge individual parts as long as they don't regress anything (like this one).

@pcc , @rjmccall , @fhahn See the other phabricator links -- I assume it might take a while to review and iterate on this. Would it be reasonable to merge *this* change in the meantime?

ormris removed a subscriber: ormris.Jan 24 2022, 11:28 AM
Via Web