This is an archive of the discontinued LLVM Phabricator instance.

Front-end of the implementation of the interleaving algorithm
Needs ReviewPublic

Authored by zhaomo on Sep 10 2018, 6:44 PM.

Details

Summary

This patch includes the front-end of the interleaving algorithm implemented in https://reviews.llvm.org/D50754.

This patch introduces a new front-end option named -enable-vtable-interleaving. When this option is set, the LTO visibility of the module is hidden, and the cfi-vcall is enabled for the module, this front-end creates a placeholder (marked with a new type of metadata called "offset_type" metadata) for each used offset in a vtable. These placeholders will be replaced with the actually offsets in the interleaved layout when the interleaved layout is decided, which is implemented in https://reviews.llvm.org/D50754.

Diff Detail

Event Timeline

zhaomo created this revision.Sep 10 2018, 6:44 PM
pcc added a reviewer: rsmith.Sep 11 2018, 12:57 PM
pcc added a subscriber: cfe-commits.
pcc added inline comments.
clang/lib/CodeGen/CGCXXABI.h
52

Why does this need to be stored separately on the CGCXXABI? Can't we always access it via the CodeGenOptions on CGM?

clang/lib/CodeGen/CGClass.cpp
2798–2799

Do you need this function as well as the one you're adding? Can't you change the code in MicrosoftCXXABI.cpp to form a constant?

clang/lib/CodeGen/CodeGenModule.cpp
5779

Comment should say why.

clang/lib/CodeGen/ItaniumCXXABI.cpp
993

I wouldn't add a function to GlobalObject for this, you can add the metadata here directly since this is the only place in the code where you need to add it.

1787

Remind me why this needs to be here? (And the explanation needs to be in a comment.)

clang/lib/Frontend/CompilerInvocation.cpp
1400

Use hasArg here instead, there's no need to support a flag for disabling interleaving in the frontend.

This change causes all compiler-rt cfi tests to be UNSUPPORTED for me locally, do you have any idea why that might be? The lit changes don't make it immediately clear.

clang/lib/CodeGen/CGVTables.cpp
1146

member function pointers*

clang/lib/CodeGen/ItaniumCXXABI.cpp
949

doxygen comments normally go in the corresponding header file.

compiler-rt/test/cfi/CMakeLists.txt
62

nit: spacing

compiler-rt/test/cfi/cross-dso-diagnostic.cpp
7 ↗(On Diff #164783)

Why? This test has to do with error printing to do with vcalls across DSOs, not cross-DSO CFI, so I wouldn't expect that this should be affected by this change.

compiler-rt/test/cfi/lit.cfg
33 ↗(On Diff #164783)

Can get rid of at least save-temps here, do we need to pass -enable-vtable-interleaving to the linker?

zhaomo added inline comments.Sep 12 2018, 1:19 PM
clang/lib/CodeGen/ItaniumCXXABI.cpp
1787

The calculation of address point is essentially base_vtable_addr + offset, where offset is from the indices of gep.
In the interleaving pass, we replace base_vtable_addr with (addr_point_in_interleaved_layout - offset).

The LLVM language reference says that the base address of a inbounds gep must be an in bound address of the object. The new base address addr_point_in_interleaved_layout - offset, however, may not be an in bound address.

zhaomo added inline comments.Sep 13 2018, 4:21 PM
clang/lib/CodeGen/ItaniumCXXABI.cpp
949

I just followed the style of the rest of the file. All the comments above functions use ///. Do I need to change them?

zhaomo added inline comments.Sep 13 2018, 4:31 PM
compiler-rt/test/cfi/lit.cfg
33 ↗(On Diff #164783)

I was debugging with save-temps and I forgot to remove it.

Currently we need to pass -enable-vtable-interleaving to the linker because the front-end and back-end are controlled by two separate flags.

rsmith added inline comments.Sep 13 2018, 5:58 PM
clang/include/clang/Driver/CC1Options.td
409–410

We usually only expose the non-default flag in -cc1, so that there are no ordering concerns and we can determine whether a feature is enabled with just hasArg. Also, -fvtable-interleaving would seem like a more natural flag name to me.

clang/include/clang/Frontend/CodeGenOptions.h
274 ↗(On Diff #164783)

Enable here seems redundant. Is thee a reason to declare this explicitly rather than adding it to CodeGenOptions.def?

clang/lib/CodeGen/ItaniumCXXABI.cpp
983–986

The __s here are atypical. If you want a name that can't collide with a name generated by the frontend, putting a period in it is the usual strategy. Also, a name prefix that indicates what this global is for would make the IR more readable and make this less likely to collide with other things. Maybe clang.vtable.index.<CLASS>.<OFFSET> or similar?

Also, there's no guarantee that RD->getNameAsString() produces something unique within the translation unit, so you'll need to make sure that name collisions are handled somehow (I *think* GlobalVariable already deals with this for you, but I'm not sure... however, I think its strategy is to add a number to the end of the mangling, which will lead to some quite peculiar results given that you already have a number there). It might be easier to base the name of this symbol on the mangled name of the class; then you could give the global variable the same linkage as the class, which might save your interleaving pass a little bit of work later on as the duplicates will already have been combined -- but that's up to you, and I don't have enough of the big picture here to give you advice.

997

Representing this offset as a ptrtoint on a global variable seems really strange. Is there really no better way to model this in IR? (I mean, if not, then carry on as you are, but this seems like a hack.)

1787

I still find the need for this confusing. Suppose we have:

struct A { virtual void f(); };
struct B { virtual void g(); virtual void h(); };
struct C : A, B {};

such that C has a naive vtable containing 7 elements {{offset, typeid, &A::f}, {offset, typeid, &B::f, &B::h}}, with address point indexes (0, 2) and (1, 2). Here we emit a gep inbounds @vtable, 0, 1, 2 to get the B-in-C vtable. Now, if this were instead emitted as gep inbounds (cast @vtable to i8*), 24, and the interleaving process were replacing @vtable with a non-inbounds GEP on its interleaved ubervtable, I could understand the need to drop inbounds here -- because the replacement for @vtable might be before the start of the ubervtable. But based on my reading of the algorithm description, that's not what happens: first we split @vtable into multiple individual vtables, and that splitting process presumably must identify these GEPs to figure out which slice of the vtable they need. That presumably either removes this GEP entirely, or leaves this being a correct and trivial inbounds GEP on a split-out piece of the vtable. (I'm imagining we effectively rewrite gep inbounds @vtable, 0, 1, 2 as @vtable-slice-2 as part of the splitting.) And then after interleaving, I'm imagining we replace all uses of @vtable-slice-2 with the appropriate address point in the ubervtable, which is again an inbounds GEP (but that would be a new GEP, unrelated to this one).

That is, if you need this change, it seems like that indicates a bug in the vtable splitter. But maybe I'm misunderstanding how this all fits together.

1862

Do you really need to change this to load the vptr at the "wrong" type and then cast it? (Could the cast, if necessary at all, live in EmitVTableTypeCheckedLoad instead?)

3776–3780

Shouldn't you also do this if the base is non-virtual? In that case,Offset is an offset within the class layout, not an index into the vtable.

This change causes all compiler-rt cfi tests to be UNSUPPORTED for me locally, do you have any idea why that might be? The lit changes don't make it immediately clear.

Not sure why that happened. How did you run the compiler-rt tests? Did you use ninja check-cfi?

zhaomo updated this revision to Diff 165415.Sep 13 2018, 9:08 PM

Fixed issues pointed out by Peter and Vlad.

Richard:
I just saw comments but I will travel tomorrow and this weekend so I will reply to your comments when I get a chance.

This change causes all compiler-rt cfi tests to be UNSUPPORTED for me locally, do you have any idea why that might be? The lit changes don't make it immediately clear.

Not sure why that happened. How did you run the compiler-rt tests? Did you use ninja check-cfi?

I ran them with llvm-lit against the tests under test/cfi. It looks like something caused the default test suite to change where all of the tests were marked unsupported (the test suite ran without lld and my default linker otherwise is bfd so LTO was unsupported.) Not a problem with the change.

clang/lib/CodeGen/ItaniumCXXABI.cpp
949

Disregard that, I hadn't realized that was the style elsewhere.

zhaomo updated this revision to Diff 195148.Apr 15 2019, 5:56 AM
zhaomo marked 7 inline comments as done.

Fixed bugs in the previous patch.

Herald added a project: Restricted Project. · View Herald TranscriptApr 15 2019, 5:56 AM

Addressed @rsmith's comments.

clang/include/clang/Driver/CC1Options.td
409–410

Changed it in the latest patch.

clang/include/clang/Frontend/CodeGenOptions.h
274 ↗(On Diff #164783)

Changed it in the latest patch.

clang/lib/CodeGen/ItaniumCXXABI.cpp
983–986

The encoding of the name of such a global variable is just for debugging purposes. The back-end pass does not care about the name. Instead, it relies on the metadata associated with the global variable, and the metadata does not rely on RD->getNameAsString() . Also, the back-end pass does not care how many global variables associated with the same metadata.

I did not know that it is using periods is a convention in LLVM. I changed the name to clang.vtable.offset.<CLASS>.<OFFSET>.

997

The global variable we create for each referenced vtable offset is actually a zero-length array, and we use the address of the array as the placeholder for a referenced vtable offset. I was told that this is a common trick used in LLVM.

@pcc: Peter, would you provide more information here?

1787

GlobalSplit, the pass that splits vtables, relies on inrange instead of inbounds.

From the language reference:

If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange.

inbounds means something different. Again from the language reference:

If the inbounds keyword is present, the result value of the getelementptr is a poison value if the base pointer is not an in bounds address of an allocated object, or if any of the addresses that would be formed by successive addition of the offsets implied by the indices to the base address with infinitely precise signed arithmetic are not an in bounds address of that allocated object.

We need to drop inbounds because that the base pointer may not be in bounds address of an interleaved vtable.

Here is how an address point is calculated in an original vtable. original_address_pt = start_original_vtable + original_offset, where start_original_vtable is the address of the original vtable and original_offset is the offset of the address point in the original vtable.

Our interleaving pass calculates the address of this address point in the interleaved object. Let's call it new_address_pt and new_address_pt = start_interleaved_vtable + new_offset.
Then we replace start_original_vtable with (new_address_pt - original_offset). Because start_original_vtable is the base pointer of the gep, and new_offset may be smaller than original_offset, we have to drop inbounds.

I'm imagining we effectively rewrite gep inbounds @vtable, 0, 1, 2 as @vtable-slice-2

gep inbounds @vtable, 0, 1, 2 is rewritten to gep inbounds @vtable-slice-2, 0, 2 in your example.

1862

I have to because both branches of the if statement rely on OffsetConstant, which is a byte offset.

3776–3780

You are right. This bug is fixed in the latest patch.

zhaomo updated this revision to Diff 221140.Sep 20 2019, 4:56 PM

In this new patch, I added a "level" field to type metadata when vtable interleaving is enabled.
Currently type metadata is a (byte offset, type) pair. However, suppose T1 and T2 are compatible with the same offset of a vtable, the interleaving pass cannot always determine which type is the less derived one because T1 and T2 may have the same number of compatible vtables. An observation is that types that are compatible with the same address point like T1 and T2 form a linear inheritance hierarchy. Based on this, I added a third field to the type metadata representing the level of the type in the linear type hierarchy. all-vtables' level value is 1 and any other type's level is greater than 1. The more derived a type is, the higher the level is.