This is an archive of the discontinued LLVM Phabricator instance.

CFI: Implement bitset emission for the Microsoft ABI.
ClosedPublic

Authored by pcc on Jun 17 2015, 4:03 PM.

Details

Summary

Clang's control flow integrity implementation works by conceptually attaching
"tags" (in the form of bitset entries) to each virtual table, identifying
the names of the classes that the virtual table is compatible with. Under
the Itanium ABI, it is simple to assign tags to virtual tables; they are
simply the address points, which are available via VTableLayout. Because any
overridden methods receive an entry in the derived class's virtual table,
a check for an overridden method call can always be done by checking the
tag of whichever derived class overrode the method call.

The Microsoft ABI is a little different, as it does not directly use address
points, and overrides in a derived class do not cause new virtual table entries
to be added to the derived class; instead, the slot in the base class is
reused, and the compiler needs to adjust the this pointer at the call site
to (generally) the base class that initially defined the method. After the
this pointer has been adjusted, we cannot check for the derived class's tag,
as the virtual table may not be compatible with the derived class. So we
need to determine which base class we have been adjusted to.

Specifically, at each call site, we use ASTRecordLayout to identify the most
derived class whose virtual table is laid out at the "this" pointer offset
we are using to make the call, and check the virtual table for that tag.

Because address point information is unavailable, we "reconstruct" it as
follows: any virtual tables we create for a non-derived class receive a tag
for that class, and virtual tables for a base class inside a derived class
receive a tag for the base class, together with tags for any derived classes
which are laid out at the same position as the derived class (and therefore
have compatible virtual tables).

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 27887.Jun 17 2015, 4:03 PM
pcc retitled this revision from to CFI: Implement bitset emission for the Microsoft ABI..
pcc updated this object.
pcc edited the test plan for this revision. (Show Details)
pcc added a reviewer: majnemer.
pcc added subscribers: rnk, Unknown Object (MLST).
rnk accepted this revision.Jun 18 2015, 10:06 AM
rnk added a reviewer: rnk.

Looks good

lib/AST/MicrosoftMangle.cpp
2765 ↗(On Diff #27887)

What about NoLinkage, which I believe applies to types declared in functions? This check appears equivalent to !RD->isExternallyVisible(), which I think is probably what you want.

This is the test case I'm imagining:
foo.h:

struct A { virtual int f(); };
void use_a(A&);

a.cpp:

static void f() { struct B : A { virtual int f() { return 1; } } b; use_a(b); }

b.cpp:

static void f() { struct B : A { virtual int f() { return 2; } } b; use_a(b); }

It seems like you'd need two distinct tags for B, right?

lib/CodeGen/MicrosoftCXXABI.cpp
1701–1705 ↗(On Diff #27887)

Sigh. I wish we didn't have to do this. We should be able to provide the record decl that originally established the vftable slot in the method location, but it's not easy and requires a good understanding of the MS vftable code, which is pretty opaque. So let's do this for now.

I guess your algorithm also computes something subtly different. In this example, you'll get B when calling f:

struct A { virtual void f(); };
struct B : A { virtual void g(); };

Whereas I would think of it as being the vftable introduced by A.

This revision is now accepted and ready to land.Jun 18 2015, 10:06 AM
This revision was automatically updated to reflect the committed changes.
pcc added inline comments.Jun 18 2015, 7:37 PM
lib/AST/MicrosoftMangle.cpp
2765 ↗(On Diff #27887)

Good catch. !RD->isExternallyVisible() does seem like the correct test. The Itanium implementation had the same problem; I've fixed that as well and added a test for both.

lib/CodeGen/MicrosoftCXXABI.cpp
1701–1705 ↗(On Diff #27887)

Right, but I believe that we semantically analyse this as an implicit derived-to-base conversion followed by a call, so we do end up getting A in the end. (Ideally we probably do want to use the most-derived class to improve the precision of the check. See also the XFAILed test case compiler-rt/test/cfi/sibling.cpp.)