This patch introduces the -fwhole-program-vtables flag, which enables the
whole-program vtable optimization feature (D16795) in Clang.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Sorry I haven't got to this sooner. I'll try review what I can over the next day or two.
Saying that, i'm not experienced enough in the clang codebase to give a LGTM. I can comment on style, but someone else will need to give the final ok.
lib/CodeGen/CGVTables.cpp | ||
---|---|---|
904–919 ↗ | (On Diff #46697) | It looks like putting a class in a sanitizer blacklist turns off the vptr optimizations for the class and putting it in the vptr blacklist turns off CFI checks for it. Can we avoid that, perhaps by using separate bitsets for the vptr checks and CFI? |
lib/CodeGen/CodeGenModule.h | ||
492 ↗ | (On Diff #46697) | Now might be a good time to rename the SanitizerBlacklist class to something more general (but not as part of this commit). |
lib/CodeGen/ItaniumCXXABI.cpp | ||
1605 ↗ | (On Diff #46697) | You can be a lot more aggressive than this -- you can make an assumption about the value of the vptr from within EmitTypeCheck in every case where the vptr sanitizer would emit a dynamic type check. I'm not sure that doing so will allow you to deduce a lot more vptrs, but it seems like it could help in some cases. |
lib/CodeGen/CGVTables.cpp | ||
---|---|---|
904–919 ↗ | (On Diff #46697) |
This is approximately what we want to do in any case. The blacklists identify classes defined outside of the linkage unit, and both CFI checks (modulo the cross-DSO feature) and devirtualization both rely on the classes being defined in the current linkage unit. One can imagine adding classes to the blacklist that are somehow incompatible with CFI checks, but in general I'd expect that list to be pretty small (Chromium's list [1] currently has 18 entries, most of which are non-whole-program issues) so it probably isn't a huge loss for devirtualization, and we can always try to find some way to improve on that later. Likewise, I think we could be a little smarter about using the correct blacklists in cross-DSO mode, but that can probably come later.
I think we could probably share bitsets for both use cases and filter for devirt/CFI at call sites. I suspect it will be important for devirt and CFI to share bitsets, as I will want to eliminate CFI checks in devirtualized calls. |
lib/CodeGen/CodeGenModule.h | ||
492 ↗ | (On Diff #46697) | Ack. |
lib/CodeGen/ItaniumCXXABI.cpp | ||
1605 ↗ | (On Diff #46697) | Maybe, but that probably wouldn't help yet. The devirtualization optimization pass looks for a very specific IR pattern, and if we generate extra vtable loads for casts to assume against, we would need additional machinery to ensure the load from the cast is invariant with those from any calls (maybe Piotr's work would help here, not sure). One thing that I'd like to do (and this would help CFI as well) is to specifically recognize cases like this: struct B { virtual void vf(); }; struct D1 { }; struct D2 : B { virtual void vf(); }; void f(D1 *d) { d->vf(); } In this case I'd like to devirtualize the virtual call in f() to B::vf(). But because the implicit cast to B in f() removes the information that d cannot be of type D2, we cannot eliminate D2::vf() as a candidate target (see also test/cfi/sibling.cpp in the CFI test suite). Although this could possibly be emitted and pattern matched at the IR level, it seems simpler (and would probably catch enough cases) to have Clang look through the implicit cast when IR gen'ing for the call. |
OK, let's go with this as a starting point. We don't need to get maximally-precise type information in the first iteration.