This is an archive of the discontinued LLVM Phabricator instance.

CodeGen: New vtable group representation: struct of vtable arrays.
ClosedPublic

Authored by pcc on Jul 12 2016, 7:47 PM.

Details

Summary

In a future change, this representation will allow us to use the new inrange
annotation on getelementptr to allow the optimizer to split vtable groups.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 63774.Jul 12 2016, 7:47 PM
pcc retitled this revision from to CodeGen: Attach !splitpoint metadata to vtables under Itanium ABI..
pcc updated this object.
pcc added reviewers: rsmith, eugenis.
pcc added a parent revision: D22295: Introduce GlobalSplit pass..
pcc added subscribers: cfe-commits, krasin.
pcc added a comment.Jul 18 2016, 6:00 PM

An amendment to the Itanium ABI requiring that a conforming program may not
adjust a virtual table pointer loaded from an object to another virtual table
in the same virtual table group would seem to be all that would be required
to guarantee that this scheme will be ABI compatible with future compilers,
and I'd be happy to drive such an amendment.

Actually, now that I've taken a closer look at the Itanium ABI, I think we're already covered, as the ABI specifies the calling sequence for virtual calls (https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller). As mentioned in that section, it would seem that the caller has to pass a pointer to the chosen subobject as the this pointer. This would seem to require that the caller choose a virtual function pointer appropriate to the chosen subobject, rather than some other subobject in another virtual table group (such a virtual function would by definition require a different this adjustment).

pcc updated this revision to Diff 65647.Jul 26 2016, 6:58 PM

Use new representation: D22793

pcc retitled this revision from CodeGen: Attach !splitpoint metadata to vtables under Itanium ABI. to [wip] CodeGen: New vtable group representation: struct of vtable arrays..Jul 26 2016, 7:00 PM
pcc updated this object.

For reference, here is the text of the proposal I sent to cxx-abi-dev:

Hi all,

The ABI currently requires that virtual tables for a class appear consecutively in a virtual table group. I would like to propose a restriction that would require that compilers may only access the virtual table associated with the address point stored in an object's virtual table pointer, and may not rely on any knowledge that the compiler may have about the relative layout of other virtual tables in the virtual table group.

The purpose of this restriction is to allow an implementation to split a virtual table group along virtual table boundaries.

Motivation

There are at least two scenarios which would benefit from vtable splitting: clients which want to place data either before or after the ABI-required part of a virtual table, and clients which want to control the layout of virtual tables for performance or security reasons.

As an example of the first scenario, when performing whole-program virtual call optimization, Clang will apply an optimization known as virtual constant propagation [0], which causes data to be laid out at a specific offset from the address point of each virtual table in a hierarchy. If that virtual table appears in a virtual table group, padding is required to place the data at an appropriate offset for each class. Because of the current restriction that vtables must appear consecutively, the optimizer may need to add more padding than necessary, or inhibit the optimization entirely if it would require too much padding.

As an example of the second scenario, an implementation may wish to lay out virtual tables hierarchically either in order to increase the likelihood of a cache hit when repeatedly making the same virtual call over a set of heterogeneous objects, or to efficiently implement a security mitigation (specifically control flow integrity [1]) based on checking virtual table addresses for set membership. Placing only virtual tables (rather than virtual table groups) consecutively would likely increase the cache hit likelihood further and reduces the amount of metadata required to implement set membership checks.

In an experiment involving the Chromium web browser, I have measured a binary size decrease of 1.5%, and a median performance improvement of about 1% on Chromium's layout benchmarks when comparing a binary compiled with control flow integrity and whole-program virtual call optimization against a binary compiled with control flow integrity, whole-program virtual call optimization and a prototype implementation of vtable splitting.

Commentary

Although the ABI specifies [2] the calling convention for virtual calls, which requires the call to be made using the this-adjustment appropriate for the object from which the virtual table pointer was loaded, the as-if rule could in principle allow a program to make a call using a different virtual table if the virtual table group contains multiple secondary virtual tables, as the distance between these virtual tables would be fixed (the same would be possible for all virtual tables if the dynamic type were known, but in that case the program could just call the appropriate virtual function directly).

The purported benefit would be to avoid an additional virtual pointer load from the object in cases where consecutive calls are made to virtual functions introduced in different bases. However, it seems to me that cases where this is beneficial would be rare: not only would you need at least three bases and a derived class which does not override any of the called virtual functions, but when performing two consecutive calls it seems likely that the vtable would need to be reloaded anyway, either from the object or from the stack, especially with majority caller-save ABIs such as x86-64, or in any event because the first virtual call may have changed the object's dynamic type. It seems (according to experiments [3] carried out at godbolt.org) that all major compilers (gcc, clang, icc) do already use the appropriate vtable group and therefore are compliant with the proposed restriction.

(There would also seem to be nothing preventing an implementation from choosing to load the RTTI pointer or offset-to-top from another virtual table group. However I would consider this even less likely to be beneficial than a virtual call via another virtual table.)

The ABI specifies that the vtables in a group shall be laid out consecutively when referenced via a vtable group symbol, and I'm not proposing to change this. The effect of this proposal would be to allow a vtable to be split if the vtable group symbol is not referenced directly by name outside of the translation unit(s) participating in the optimization. This may be the case when a class has internal linkage, or if the program is linked with LTO, which allows the compiler to know which symbols are referenced outside of the LTO'd part of the program.

Wording

I propose to add two paragraphs to the section of the ABI describing virtual table groups, as follows:

diff --git a/abi.html b/abi.html
index 79cda2c..fce0c60 100644

  • a/abi.html

+++ b/abi.html
@@ -1193,6 +1193,18 @@ and again excluding primary bases
(which share virtual tables with the classes for which they are primary).
</ul>

+<p>
+When performing a virtual call or loading any other data from an address
+derived from the address point stored in an object's virtual table pointer,
+a program may only load from the virtual table associated with that address
+point, and not from any other virtual table in the same virtual table group
+which might be presumed to be located at a fixed offset from the address
+point as a result of the above layout algorithm.
+
+<p>
+The purpose of this restriction is to allow an implementation to split a
+virtual table group along virtual table boundaries if its symbol is not
+visible to other translation units.

<p>
<a name="vtable-construction">

Thanks,
Peter

[0] http://lists.llvm.org/pipermail/llvm-dev/2016-January/094600.html
[1] http://clang.llvm.org/docs/ControlFlowIntegrityDesign.html
[2] https://mentorembedded.github.io/cxx-abi/abi.html#vcall.caller
[3] https://godbolt.org/g/wX7Ay6 is a three-bases test case by Richard Smith, https://godbolt.org/g/7eG8A1 is a dynamic-type-known test case by me

pcc updated this revision to Diff 70908.Sep 9 2016, 2:40 PM

Refresh and split out the inrange changes into a separate patch

pcc retitled this revision from [wip] CodeGen: New vtable group representation: struct of vtable arrays. to CodeGen: New vtable group representation: struct of vtable arrays..Sep 9 2016, 2:41 PM
pcc updated this object.
pcc added a comment.Sep 9 2016, 2:42 PM

This should now be ready for review and unblocked by other changes (I split out the inrange annotation to D24431).

pcc updated this revision to Diff 75493.Oct 21 2016, 2:51 PM

Refresh

pcc updated this revision to Diff 80709.Dec 7 2016, 6:34 PM

Refresh

pcc added a reviewer: rjmccall.Dec 7 2016, 6:36 PM

John, may I ask you to take a look?

rjmccall edited edge metadata.Dec 7 2016, 8:47 PM

This seems reasonable to me. Just a few representational / API requests.

clang/include/clang/AST/VTableBuilder.h
222 ↗(On Diff #80709)

Please use a struct instead of std::pair.

225 ↗(On Diff #80709)

Does this actually grow, or should you use a representation more like VTableComponents? Or something custom that exploits the fact that, in the vast majority of cases, this vector will have size 1?

243 ↗(On Diff #80709)

Since you're changing this interface anyway, would you mind fixing it to use ArrayRef for VTableComponents and VTableThunks, too?

260 ↗(On Diff #80709)

I know you didn't add this line, but could you remove this cast? IsMicrosoftABI is not a local variable.

Also, the assert above suggests that this can be:

... = AddressPoint.find(Base)->second;

which should be slightly more efficient than lookup(). Also, I think the assertion becomes unnecessary at that point.

clang/lib/AST/VTableBuilder.cpp
986 ↗(On Diff #80709)

Assuming you stop storing a std::vector in the VTableLayout, you should make this a SmallVector.

clang/lib/CodeGen/CGVTables.cpp
644 ↗(On Diff #80709)

SmallVector, please.

648 ↗(On Diff #80709)

This seems like a completely reasonable query to add to VTableLayout:

unsigned getVTableSize(unsigned vtableIndex) const;

or maybe:

ArrayRef<VTableComponent> getVTableComponents(unsigned vtableIndex) const;

I guess you'll also need a getNumVTables().

pcc updated this revision to Diff 80844.Dec 8 2016, 4:18 PM
pcc marked 6 inline comments as done.
pcc edited edge metadata.
  • Address review comments
clang/include/clang/AST/VTableBuilder.h
225 ↗(On Diff #80709)

I wrapped up the code for VTableComponents and VTableThunks into a class, and used it here with some special handling for the size == 1 case.

260 ↗(On Diff #80709)

Also, I think the assertion becomes unnecessary at that point.

Yes, and so does IsMicrosoftABI.

Also removed a similar assertion from one of the callers in CGVTT.cpp.

A couple minor suggestions, then LGTM.

clang/include/clang/AST/VTableBuilder.h
255 ↗(On Diff #80844)

Maybe this ought to be in LLVM as OwnedArrayRef? And the more minimal implementation approach would be to inherit from MutableArrayRef<T> and just add a destructor and a move constructor.

The implicit conversion to ArrayRef is dangerous, but only in ways that ArrayRef is already dangerous.

260 ↗(On Diff #80844)

Ah. Yes, that's a reasonable solution.

clang/lib/AST/VTableBuilder.cpp
986 ↗(On Diff #80844)

Same comment here. Might as well give this enough capacity for any reasonably foreseeable case.

clang/lib/CodeGen/CGVTables.cpp
641 ↗(On Diff #80844)

While 1 is definitely the most likely size, adding small amounts of extra inline capacity to a temporary SmallVector has basically zero cost, and of course those cases do happen. 4 sounds reasonable.

pcc updated this revision to Diff 81264.Dec 13 2016, 11:21 AM
  • Address review comments
pcc marked 3 inline comments as done.Dec 13 2016, 11:23 AM
pcc added inline comments.
clang/include/clang/AST/VTableBuilder.h
255 ↗(On Diff #80844)

Good suggestions -- sent out D27723.

rjmccall accepted this revision.Dec 13 2016, 11:42 AM
rjmccall edited edge metadata.

LGTM, thanks!

This revision is now accepted and ready to land.Dec 13 2016, 11:42 AM
This revision was automatically updated to reflect the committed changes.
pcc marked an inline comment as done.