This is an archive of the discontinued LLVM Phabricator instance.

[AST] Add Obj-C discriminator to MS ABI RTTI
AbandonedPublic

Authored by smeenai on Sep 28 2018, 2:59 PM.

Details

Summary

Obj-C classes are mangled as C++ structs with the same name (in both the
Itanium and the Microsoft ABIs), but we want to be able to distinguish
them for the purposes of exception handling, so we need to add a
discriminator to the RTTI and the RTTI name.

The chosen discriminator (.objc) is fairly arbitrary, and I'm open to
other suggestions. It's also perhaps not ideal to have the discriminator
as a prefix, since then the RTTI won't demangle (RTTI names never
demangle anyway, so that's less of a concern). Having it infix would
make for a cleaner mangling, but it would also require keeping some
state around in the mangler to indicate when we're mangling for RTTI,
which seems like a much uglier implementation and not worth it.

Diff Detail

Event Timeline

smeenai created this revision.Sep 28 2018, 2:59 PM

Conceptually this seems fine, but I think it would be good to stop and make sure we're using a consistent style when mangling extensions. Currently it feels like every patch to add a Clang extension to the Microsoft mangling ends up inventing its own rules and crossing its fingers.

Conceptually this seems fine, but I think it would be good to stop and make sure we're using a consistent style when mangling extensions. Currently it feels like every patch to add a Clang extension to the Microsoft mangling ends up inventing its own rules and crossing its fingers.

That's a fair concern.

I believe most of the Obj-C extensions have been handled by @compnerd, and he's been following a pretty consistent scheme using the __Objc namespace, e.g. void f(id<P>) {} is mangled as void __cdecl f(struct objc_object<struct __ObjC::Protocol<struct P> > *). I could certainly try to implement something similar here, except as I mentioned, I'm pretty sure it would require maintaining some state in the demangler for indicating whether we were mangling for RTTI.

Conceptually this seems fine, but I think it would be good to stop and make sure we're using a consistent style when mangling extensions. Currently it feels like every patch to add a Clang extension to the Microsoft mangling ends up inventing its own rules and crossing its fingers.

That's a fair concern.

I believe most of the Obj-C extensions have been handled by @compnerd, and he's been following a pretty consistent scheme using the __Objc namespace, e.g. void f(id<P>) {} is mangled as void __cdecl f(struct objc_object<struct __ObjC::Protocol<struct P> > *). I could certainly try to implement something similar here, except as I mentioned, I'm pretty sure it would require maintaining some state in the demangler for indicating whether we were mangling for RTTI.

State wouldn't really be the right solution anyway, although it might work because of the structural restrictions on Objective-C types. If you wanted to keep that same rule, I think you could probably just pass a (default=false) flag down to the ObjCObjectPointerType, ObjCObjectType, and ObjCInterfaceType cases. (The pointee type of the former will always be one of the latter two.)

smeenai planned changes to this revision.Oct 2 2018, 5:49 PM

Conceptually this seems fine, but I think it would be good to stop and make sure we're using a consistent style when mangling extensions. Currently it feels like every patch to add a Clang extension to the Microsoft mangling ends up inventing its own rules and crossing its fingers.

That's a fair concern.

I believe most of the Obj-C extensions have been handled by @compnerd, and he's been following a pretty consistent scheme using the __Objc namespace, e.g. void f(id<P>) {} is mangled as void __cdecl f(struct objc_object<struct __ObjC::Protocol<struct P> > *). I could certainly try to implement something similar here, except as I mentioned, I'm pretty sure it would require maintaining some state in the demangler for indicating whether we were mangling for RTTI.

State wouldn't really be the right solution anyway, although it might work because of the structural restrictions on Objective-C types. If you wanted to keep that same rule, I think you could probably just pass a (default=false) flag down to the ObjCObjectPointerType, ObjCObjectType, and ObjCInterfaceType cases. (The pointee type of the former will always be one of the latter two.)

Ah, there are fewer hops between the mangleType call and its ultimate destination than I was expecting, so a default parameter would work. I'll change this accordingly, thanks.

smeenai requested review of this revision.Oct 2 2018, 6:30 PM

Actually, I take that back ... I just misread the stack trace.

There are a bunch of hops between the mangleCXXRTTI call and the ultimate mangling function:

MicrosoftMangleContextImpl::mangleCXXRTTI(QualType, raw_ostream &)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectPointerType *, Qualifiers, SourceRange)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectType *, Qualifiers, SourceRange)

(the last one will be ObjCInterfaceType instead of ObjCObjectType if catching anything other than id)

Threading a ForRTTI flag or similar all the way to the final call seems pretty tricky. I can add an optional paramater to mangleType(QualType, SourceRange, QualifierMangleMode), but that function uses a generated switch case to call the specific mangleType functions, and I don't know how to special-case certain types in that switch case (to pass the extra parameter along) without doing something super ugly. Adding the extra parameter to every single mangleType overload seems highly non-ideal, which is why I was thinking of maintaining some internal state instead.

Actually, I take that back ... I just misread the stack trace.

There are a bunch of hops between the mangleCXXRTTI call and the ultimate mangling function:

MicrosoftMangleContextImpl::mangleCXXRTTI(QualType, raw_ostream &)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectPointerType *, Qualifiers, SourceRange)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectType *, Qualifiers, SourceRange)

(the last one will be ObjCInterfaceType instead of ObjCObjectType if catching anything other than id)

Threading a ForRTTI flag or similar all the way to the final call seems pretty tricky. I can add an optional paramater to mangleType(QualType, SourceRange, QualifierMangleMode), but that function uses a generated switch case to call the specific mangleType functions, and I don't know how to special-case certain types in that switch case (to pass the extra parameter along) without doing something super ugly. Adding the extra parameter to every single mangleType overload seems highly non-ideal, which is why I was thinking of maintaining some internal state instead.

Well, that's why I was talking about how the pointee type of an ObjCObjectPointerType is always an ObjCObjectType (of which ObjCInterfaceType is a subclass) — the implication being that you don't actually have to do the switch to dispatch to one of those two cases.

Actually, I take that back ... I just misread the stack trace.

There are a bunch of hops between the mangleCXXRTTI call and the ultimate mangling function:

MicrosoftMangleContextImpl::mangleCXXRTTI(QualType, raw_ostream &)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectPointerType *, Qualifiers, SourceRange)
MicrosoftCXXNameMangler::mangleType(QualType, SourceRange, QualifierMangleMode)
MicrosoftCXXNameMangler::mangleType(const ObjCObjectType *, Qualifiers, SourceRange)

(the last one will be ObjCInterfaceType instead of ObjCObjectType if catching anything other than id)

Threading a ForRTTI flag or similar all the way to the final call seems pretty tricky. I can add an optional paramater to mangleType(QualType, SourceRange, QualifierMangleMode), but that function uses a generated switch case to call the specific mangleType functions, and I don't know how to special-case certain types in that switch case (to pass the extra parameter along) without doing something super ugly. Adding the extra parameter to every single mangleType overload seems highly non-ideal, which is why I was thinking of maintaining some internal state instead.

Well, that's why I was talking about how the pointee type of an ObjCObjectPointerType is always an ObjCObjectType (of which ObjCInterfaceType is a subclass) — the implication being that you don't actually have to do the switch to dispatch to one of those two cases.

The intermediate mangleType(QualType, SourceRange, QualifierMangleMode) calls have a bunch of logic in them; it's not just a direct dispatch to the actual mangling function, so I don't think we can skip over it. I think it's hard to discuss this in the abstract though (and it's also entirely possible I'm missing your point completely), so I'll just actually try out the parameter approach and put up the resulting patch and we can see how it turns out.

@rjmccall I prototyped the ForRTTI parameter approach in D53546. It could definitely be cleaned up a bit, but it demonstrates the problems I saw with the added parameter. Namely, mangleType(QualType, SourceRange, QualifierMangleMode) has a bunch of additional logic which is needed for correctness, so we need to thread the parameter through the entire chain, and the only way I could think of doing that without adding the parameter to every single mangleType overload was to have an additional switch to dispatch to the overloads with the added ForRTTI parameter, which is pretty ugly.

As I see it, there are a few ways to proceed here:

  • The approach in D53546 (cleaned up a bit). I think it's pretty ugly, but you may have suggestions on how to do it better.
  • In the mangleType overload for ObjCObjectPointerType, skipping the generic mangleType dispatch and going directly to either the ObjCObjectType or ObjCInterfaceType overloads. That avoids the ugliness in the generic mangleType, but it also requires us to duplicate some logic from it in the ObjCObjectPointerType overload, which doesn't seem great either.
  • Maintaining ForRTTI state in the mangler instead of threading a parameter through (I'm generally not a fan of state like that, but it might be cleaner than the threading in this case?)
  • Just sticking with what I'm doing in this patch.

What do you think?

Ping @rjmccall. Let me know if the approach in D53546 is what you'd been envisioning, or if I'm just doing something completely brain-dead :)

@rjmccall I prototyped the ForRTTI parameter approach in D53546. It could definitely be cleaned up a bit, but it demonstrates the problems I saw with the added parameter. Namely, mangleType(QualType, SourceRange, QualifierMangleMode) has a bunch of additional logic which is needed for correctness, so we need to thread the parameter through the entire chain, and the only way I could think of doing that without adding the parameter to every single mangleType overload was to have an additional switch to dispatch to the overloads with the added ForRTTI parameter, which is pretty ugly.

As I see it, there are a few ways to proceed here:

  • The approach in D53546 (cleaned up a bit). I think it's pretty ugly, but you may have suggestions on how to do it better.
  • In the mangleType overload for ObjCObjectPointerType, skipping the generic mangleType dispatch and going directly to either the ObjCObjectType or ObjCInterfaceType overloads. That avoids the ugliness in the generic mangleType, but it also requires us to duplicate some logic from it in the ObjCObjectPointerType overload, which doesn't seem great either.
  • Maintaining ForRTTI state in the mangler instead of threading a parameter through (I'm generally not a fan of state like that, but it might be cleaner than the threading in this case?)
  • Just sticking with what I'm doing in this patch.

What do you think?

Sorry for the delay. I think duplicating the dispatch logic for ObjCObjectPointerType is probably the best approach; the pointee type will always an ObjCObjectType of some sort, and there are only two kinds of those.

smeenai added a comment.EditedNov 7 2018, 6:59 PM

@rjmccall I prototyped the ForRTTI parameter approach in D53546. It could definitely be cleaned up a bit, but it demonstrates the problems I saw with the added parameter. Namely, mangleType(QualType, SourceRange, QualifierMangleMode) has a bunch of additional logic which is needed for correctness, so we need to thread the parameter through the entire chain, and the only way I could think of doing that without adding the parameter to every single mangleType overload was to have an additional switch to dispatch to the overloads with the added ForRTTI parameter, which is pretty ugly.

As I see it, there are a few ways to proceed here:

  • The approach in D53546 (cleaned up a bit). I think it's pretty ugly, but you may have suggestions on how to do it better.
  • In the mangleType overload for ObjCObjectPointerType, skipping the generic mangleType dispatch and going directly to either the ObjCObjectType or ObjCInterfaceType overloads. That avoids the ugliness in the generic mangleType, but it also requires us to duplicate some logic from it in the ObjCObjectPointerType overload, which doesn't seem great either.
  • Maintaining ForRTTI state in the mangler instead of threading a parameter through (I'm generally not a fan of state like that, but it might be cleaner than the threading in this case?)
  • Just sticking with what I'm doing in this patch.

What do you think?

Sorry for the delay. I think duplicating the dispatch logic for ObjCObjectPointerType is probably the best approach; the pointee type will always an ObjCObjectType of some sort, and there are only two kinds of those.

Sorry, I'm still not sure how this will work.

Duplicating the dispatch logic for ObjCObjectPointerType ends up looking like P8114, which is fine. However, when we're actually mangling RTTI or RTTI names, we're still going through the main mangleType(QualType, SourceRange, QualifierMangleMode) overload, which still requires us to thread ForRTTI through that function, which still requires us to duplicate the switch in that function (to handle the ForRTTI case, since the other switch is generated via TypeNodes.def and not easily modifiable), which is the main ugliness in my opinion. Do you also want me to add special dispatching to mangleCXXRTTI and mangleCXXRTTIName to just call the ObjCObjectPointerType function directly when they're dealing with that type? That's certainly doable, but at that point just keeping some state around in the demangler starts to feel cleaner, at least to me.

Sorry, had a leftover draft which I forgot to clean up. Edited in Phabricator now.

@rjmccall I prototyped the ForRTTI parameter approach in D53546. It could definitely be cleaned up a bit, but it demonstrates the problems I saw with the added parameter. Namely, mangleType(QualType, SourceRange, QualifierMangleMode) has a bunch of additional logic which is needed for correctness, so we need to thread the parameter through the entire chain, and the only way I could think of doing that without adding the parameter to every single mangleType overload was to have an additional switch to dispatch to the overloads with the added ForRTTI parameter, which is pretty ugly.

As I see it, there are a few ways to proceed here:

  • The approach in D53546 (cleaned up a bit). I think it's pretty ugly, but you may have suggestions on how to do it better.
  • In the mangleType overload for ObjCObjectPointerType, skipping the generic mangleType dispatch and going directly to either the ObjCObjectType or ObjCInterfaceType overloads. That avoids the ugliness in the generic mangleType, but it also requires us to duplicate some logic from it in the ObjCObjectPointerType overload, which doesn't seem great either.
  • Maintaining ForRTTI state in the mangler instead of threading a parameter through (I'm generally not a fan of state like that, but it might be cleaner than the threading in this case?)
  • Just sticking with what I'm doing in this patch.

What do you think?

Sorry for the delay. I think duplicating the dispatch logic for ObjCObjectPointerType is probably the best approach; the pointee type will always an ObjCObjectType of some sort, and there are only two kinds of those.

Sorry, I'm still not sure how this will work.

Duplicating the dispatch logic for ObjCObjectPointerType ends up looking like P8114, which is fine. However, when we're actually mangling RTTI or RTTI names, we're still going through the main mangleType(QualType, SourceRange, QualifierMangleMode) overload, which still requires us to thread ForRTTI through that function, which still requires us to duplicate the switch in that function (to handle the ForRTTI case, since the other switch is generated via TypeNodes.def and not easily modifiable), which is the main ugliness in my opinion. Do you also want me to add special dispatching to mangleCXXRTTI and mangleCXXRTTIName to just call the ObjCObjectPointerType function directly when they're dealing with that type? That's certainly doable, but at that point just keeping some state around in the demangler starts to feel cleaner, at least to me.

Well, you could check for an ObjCObjectPointerType in the top-level routine.

But yeah, probably the cleanest thing to do would to be push the state through the main dispatch. Don't push a single bool through, though; make a TypeManglingOptions struct to encapsulate it, in case we have a similar problem elsewhere. It should have a method for getting options that should be propagated down to component type manglings, and that method can drop the "for RTTI" bit; that's the part that ObjCObjectPointerType can handle differently.

@rjmccall I prototyped the ForRTTI parameter approach in D53546. It could definitely be cleaned up a bit, but it demonstrates the problems I saw with the added parameter. Namely, mangleType(QualType, SourceRange, QualifierMangleMode) has a bunch of additional logic which is needed for correctness, so we need to thread the parameter through the entire chain, and the only way I could think of doing that without adding the parameter to every single mangleType overload was to have an additional switch to dispatch to the overloads with the added ForRTTI parameter, which is pretty ugly.

As I see it, there are a few ways to proceed here:

  • The approach in D53546 (cleaned up a bit). I think it's pretty ugly, but you may have suggestions on how to do it better.
  • In the mangleType overload for ObjCObjectPointerType, skipping the generic mangleType dispatch and going directly to either the ObjCObjectType or ObjCInterfaceType overloads. That avoids the ugliness in the generic mangleType, but it also requires us to duplicate some logic from it in the ObjCObjectPointerType overload, which doesn't seem great either.
  • Maintaining ForRTTI state in the mangler instead of threading a parameter through (I'm generally not a fan of state like that, but it might be cleaner than the threading in this case?)
  • Just sticking with what I'm doing in this patch.

What do you think?

Sorry for the delay. I think duplicating the dispatch logic for ObjCObjectPointerType is probably the best approach; the pointee type will always an ObjCObjectType of some sort, and there are only two kinds of those.

Sorry, I'm still not sure how this will work.

Duplicating the dispatch logic for ObjCObjectPointerType ends up looking like P8114, which is fine. However, when we're actually mangling RTTI or RTTI names, we're still going through the main mangleType(QualType, SourceRange, QualifierMangleMode) overload, which still requires us to thread ForRTTI through that function, which still requires us to duplicate the switch in that function (to handle the ForRTTI case, since the other switch is generated via TypeNodes.def and not easily modifiable), which is the main ugliness in my opinion. Do you also want me to add special dispatching to mangleCXXRTTI and mangleCXXRTTIName to just call the ObjCObjectPointerType function directly when they're dealing with that type? That's certainly doable, but at that point just keeping some state around in the demangler starts to feel cleaner, at least to me.

Well, you could check for an ObjCObjectPointerType in the top-level routine.

But yeah, probably the cleanest thing to do would to be push the state through the main dispatch. Don't push a single bool through, though; make a TypeManglingOptions struct to encapsulate it, in case we have a similar problem elsewhere. It should have a method for getting options that should be propagated down to component type manglings, and that method can drop the "for RTTI" bit; that's the part that ObjCObjectPointerType can handle differently.

All right, the struct is a good idea. It would require adding the TypeManglingOptions parameter to all the mangleType overloads (unless we want to stick with the second switch in the main mangleType dispatch function, but that seems super ugly, especially if we're doing a more general solution), so I wanted to confirm @rnk is okay with adding a parameter to all those overloads as well before proceeding.

rnk added a comment.Nov 9 2018, 12:22 PM

Threading a new options argument through mangleType that includes QualifierMangleMode as well as these obj-c options seems reasonable.

smeenai updated this revision to Diff 173965.Nov 13 2018, 4:48 PM

Stateful approach

@rnk pointed out on IRC that the MicrosoftCXXNameMangler is actually specifically designed to manage the mangling of only a single name, in which case adding state to it for handling RTTI seems like a natural approach. @rjmccall, what do you think? I think this is much cleaner than having to thread through the RTTI state to every individual method. The ForRTTI_t enum is modeled after the ForDefinition_t enum used in CGM, but I'm happy to switch to a more general struct (as you'd mentioned before) if you prefer.

I'm not worried about the mangler being re-used for multiple declarations, I'm worried about a global flag changing how we mangle all components of a type when we only mean to change it at the top level.

I'm not worried about the mangler being re-used for multiple declarations, I'm worried about a global flag changing how we mangle all components of a type when we only mean to change it at the top level.

Hmm, but don't we want it to affect all components? For example, consider something like:

@interface I
@end

template <class T> class C {};

void f();
void g() {
  try {
    f();
  } catch (C<I> *) {
  }
}

I would say that we want the RTTI for C<I> * to have the discriminator for I. It turns out my current patch doesn't actually do that; I guess there's a sub-mangler being constructed somewhere that's not inheriting the RTTI-ness. Or did you mean something else?

I'm not worried about the mangler being re-used for multiple declarations, I'm worried about a global flag changing how we mangle all components of a type when we only mean to change it at the top level.

Hmm, but don't we want it to affect all components? For example, consider something like:

@interface I
@end

template <class T> class C {};

void f();
void g() {
  try {
    f();
  } catch (C<I> *) {
  }
}

I would say that we want the RTTI for C<I> * to have the discriminator for I.

Why? IIUC, you're adding the discriminator to distinguish between two different RTTI objects. It's not like there are two different types. You should mangle C<I> normally here.

smeenai planned changes to this revision.Nov 15 2018, 1:43 PM

I'm not worried about the mangler being re-used for multiple declarations, I'm worried about a global flag changing how we mangle all components of a type when we only mean to change it at the top level.

Hmm, but don't we want it to affect all components? For example, consider something like:

@interface I
@end

template <class T> class C {};

void f();
void g() {
  try {
    f();
  } catch (C<I> *) {
  }
}

I would say that we want the RTTI for C<I> * to have the discriminator for I.

Why? IIUC, you're adding the discriminator to distinguish between two different RTTI objects. It's not like there are two different types. You should mangle C<I> normally here.

You're right – I wasn't thinking this through correctly. Back to the drawing board.

rnk added a comment.Nov 20 2018, 2:51 PM

Well, you could go further down the route of what we do for "structors", and store the top-level decl being mangled in the mangler. Would that solve the problem?

smeenai abandoned this revision.Apr 28 2022, 1:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2022, 1:23 PM