This is an archive of the discontinued LLVM Phabricator instance.

[AST] Various optimizations + refactoring in DeclarationName(Table)
ClosedPublic

Authored by riccibruno on Sep 19 2018, 8:26 AM.

Details

Summary

Introduce the following optimizations in DeclarationName(Table)

  1. Store common kinds inline in DeclarationName instead of DeclarationNameExtra. Currently the kind of C++ constructor, destructor, conversion function and overloaded operator names is stored in DeclarationNameExtra. Instead store it inline in DeclarationName. To do this align IdentifierInfo, CXXSpecialName, DeclarationNameExtra and CXXOperatorIdName to 8 bytes so that we can use the lower 3 bits of DeclarationName::Ptr. This is already the case on 64 bits archs anyway. This also allow us to remove DeclarationNameExtra from CXXSpecialName and CXXOperatorIdName, which shave off a pointer from CXXSpecialName. (This might be the thing that sink this patch since you might judge that aligning IdentifierInfo to 8 bytes even on 32 bits archs is not acceptable. Although if we care about 32 bits archs there is probably a lot more that can be done.)
  1. Synchronize the enumerations DeclarationName::NameKind, DeclarationName::StoredNameKind and Selector::IdentifierInfoFlag. This makes DeclarationName::getNameKind much more efficient since we can replace the switch table by a single comparison and an addition. static_asserts are provided in DeclarationName::VerifyStaticAsserts to ensure that the enumerations remain in sync.
  1. Put the overloaded operator names inline in DeclarationNameTable to remove an indirection. This increase the size of DeclarationNameTable a little bit but this is not important since it is only used in ASTContext, and never copied nor moved from. This also get rid of the last dynamic allocation in DeclarationNameTable.

Altogether these optimizations cut the run time of parsing all of Boost by about 0.8%.
While we are at it, do the following NFC modifications:

  1. Put the internal classes CXXSpecialName, CXXDeductionGuideNameExtra, CXXOperatorIdName, CXXLiteralOperatorIdName and DeclarationNameExtra in a namespace detail since these classes are only meant to be used by DeclarationName and DeclarationNameTable. Make this more explicit by making the members of these classes private and friending DeclarationName(Table).
  1. Make DeclarationName::getFETokenInfo a non-template since every users are using it to get a void *. It was supposed to be used with a type to avoid a subsequent static_cast.
  1. Change the internal functions DeclarationName::getAs* to castAs* since when we use them we already know the correct kind. This has no external impact since all of these are private.

Diff Detail

Repository
rC Clang

Event Timeline

riccibruno created this revision.Sep 19 2018, 8:26 AM

A few drive-by comments... This is a patch with interactions I'm not sure I feel comfortable approving myself, so I'll count on @rjmccall to do so.

include/clang/AST/DeclarationName.h
255

There is an llvm type for storing something in the lower bits of a pointer. I THINK it is llvm::PointerIntPair.

I'd MUCH prefer you do that instead of the reimplementation here.

lib/AST/DeclarationName.cpp
42

I'm a touch confused by this function. It shouldn't be necessary when the static asserts can all be put at the global/class level.

riccibruno added inline comments.Sep 19 2018, 9:45 AM
include/clang/AST/DeclarationName.h
255

Well it was already like this but point taken.

lib/AST/DeclarationName.cpp
42

Yes, but the enumeration StoredNameKind is private to DeclarationName.
Therefore I cannot just put the static_asserts into a .cpp and be done with it.
An alternative is of course to put them in the class definition in the header but
look at this mess... (and this is after clang-format).

Conceptually, this change looks great. And it should be fine to require extra alignment on IdentifierInfo on 32-bit hosts; I doubt that will have measurable impact.

I believe it's possible to eliminate the need for most, perhaps all, of these static_asserts by just defining the constants more sensibly in the first place.

include/clang/AST/DeclarationName.h
45–46

Should DeclarationNameExtra be moved here? I'm not sure why it's in IdentifierTable.h in the first place, and your refactor seems to make that even more pointless.

53

If this isn't a self-contained description of the name anymore, I think adding Extra to the class name would be appropriate. And maybe the class name should directly express that is for the kinds of names that are basically uniqued by type.

Please introduce a symbolic constant for 8 here, since you use and assume it across multiple places. Either #define or a const variable works.

97

This is probably pre-existing, but the comment should clarify that this doesn't include literal or conversion operators.

160

You can express this directly by defining StoredNameKind first and then defining the corresponding enumerators in NameKind in terms of them.

166

Same deal. Just define the enumerators of NameKind appropriately.

168

Is the criterion for inclusion in the first seven really just frequency of use, or is it a combination of that and relative benefit?

The only one I would really quibble about is that multi-argument selectors are probably more common and important to Objective-C code than conversion operators are to C++. But it's quite possible that the relative benefit is much higher for C++, since selectors only appear on specific kinds of declarations that know they're declared with selectors — relatively little code actually needs to be polymorphic about them — and since they have to be defined out-of-line.

255

Using PointerIntPair probably won't work here because the pointer type is basically a union, and you can't use void* because PointerIntPair wants to assert that the pointer type is sufficiently aligned.

riccibruno marked 9 inline comments as done.

Address rjmccall comments:

  1. Renamed CXXSpecialName to CXXSpecialNameExtra
  2. Introduced a constant IdentifierInfoAlignment for alignas(IdentifierInfoAlignment)
  3. Updated some comments
  4. Cleaned up the static_assert which checked the consistency of the numerical values of the enumerators.

Regarding using a PointerIntPair it would work if we pass a custom
PointerLikeTypeTraits to state that, yes, our void * are aligned to 8 bytes
instead of 4. However I am not sure this is a good idea since the PointerIntPair
do not really simplify the code. What we really want here is a pointer union but we
also need precise control over the low bits.

Addressed some inline comments.

include/clang/AST/DeclarationName.h
45–46

DeclarationNameExtra is unfortunately needed by MultiKeywordSelector
in IdentifierInfo.cpp.

168

I did not do an in-depth analysis regarding the selection of the first seven inline kinds.
My thought process was that C++ constructors and operator names should definitely be
inline. C++ destructors seemed much more frequent than c++ literal operators and
deduction guides. This leaves one slot available and since C++ conversion operators
share the same class CXXSpecialName including it as an inline kind simplifies the code.

Also a practical point is that I don't really have a representative sample of ObjC code
to benchmark this. For C++ I am using all of Boost which I hope is representative
enough. If you deem it necessary I will try to do some benchmarks with
ObjCMultiArgSelector stored inline.

rjmccall accepted this revision.Sep 20 2018, 11:46 AM
rjmccall added inline comments.
include/clang/AST/DeclarationName.h
45–46

Oh, and one is in Basic instead of AST.

I'm not sure there's really a good reason for Selector to exist at the Basic level instead of AST; it's probably more an artifact of old divisions than something that's really useful. But changing that would probably be painful.

168

I think I can accept this as-is without extra investigation.

187

Thanks, this looks great.

This revision is now accepted and ready to land.Sep 20 2018, 11:46 AM
This revision was automatically updated to reflect the committed changes.