This is an archive of the discontinued LLVM Phabricator instance.

[clang] Extend ParsedAttr to allow custom handling for type attributes
AbandonedPublic

Authored by mboehme on Nov 19 2021, 4:38 AM.

Details

Summary

A typical use case would be to allow custom attributes for pointer types. See examples/Attribute/Attribute.cpp for an example.

Unfortuantely, it is hard to provide meaningful test coverage for this change. The existing handleDeclAttribute() covered by lit tests for
attributes because multiple declaration attributes are marked 'SimpleHandler', which causes ClangAttrEmitter.cpp to generate a
handleDeclAttribute() handler for them. However, all of the existing type attributes need custom semantic handling, so it is not possible to simply implement them with 'SimpleHandler'.

The modified Attribute.cpp example does at least demonstrate that it is possible to use the new API without any build errors, but this is all that it does.

Diff Detail

Event Timeline

mboehme requested review of this revision.Nov 19 2021, 4:38 AM
mboehme created this revision.

Aaron, have you had any chance to look at this yet?

I added you as a reviewer because you expressed interest in something like this in this discussion: https://lists.llvm.org/pipermail/cfe-dev/2021-October/069097.html

Is there someone else who would be more appropriate to review this?

Aaron, have you had any chance to look at this yet?

I added you as a reviewer because you expressed interest in something like this in this discussion: https://lists.llvm.org/pipermail/cfe-dev/2021-October/069097.html

Is there someone else who would be more appropriate to review this?

I've not had the chance, sorry! My review queue is rather full at the moment, so I'm still digging out from under quite a few reviews (around 50 at last count) and it may be a bit before I can give this one the attention it deserves. This one is more complex because we don't do much of any automatic checking for anything about type attributes (tablegen doesn't do much for them currently), and touching the type system can have significant (and surprising) performance impacts, so we need to be particularly careful here. It's definitely on my queue though and I do intend to get to it as soon as I can.

I've not had the chance, sorry! My review queue is rather full at the moment, so I'm still digging out from under quite a few reviews (around 50 at last count) and it may be a bit before I can give this one the attention it deserves. This one is more complex because we don't do much of any automatic checking for anything about type attributes (tablegen doesn't do much for them currently), and touching the type system can have significant (and surprising) performance impacts, so we need to be particularly careful here. It's definitely on my queue though and I do intend to get to it as soon as I can.

Thanks for the quick reply!

There's absolutely no rush here -- and I figured you might have a lot on your plate, just wanted to make sure this hadn't dropped through the cracks.

First off, thank you so much for your patience while I took the time to think about this. I know it can be frustrating to not hear review feedback in a timely manner, so I'm sorry for that.

I've been sitting on this for a while because type attributes are a complex beast. There's the attribute parsing component to it, which is generally pretty straightforward and is largely what's handled here in your patch. But the bigger issue is the effects on the type system, which is one of the only ways to make a type attribute particularly useful. Unfortunately, type system effects are scattered all over the code base and can have drastic impact on compiler performance (template instantiation depth limits, memory overhead pressures, etc), and we tend to have quite a few assertions in the compiler to help users catch the places they need to update.

Given that utility from this would require significant changes in Clang itself, I'm not certain what a plugin buys us in terms of functionality compared to the risk of type attribute plugins making the compiler somewhat unstable in terms of performance and potentially even assertions, etc. tl;dr: type attributes aren't easily pluggable yet so by the time you make them useful, you basically have a clang fork instead of a plugin augmenting Clang. Based on that, I'm inclined to not ask you to do further work on it (that work would be akin to me asking you "please rework large parts of the type system" which would be a high risk activity and not a reasonable request as part of this patch). However, I also wanted to hear what your ideas are on how you were expecting to use the functionality here from the patch, as maybe there's utility that I'm not thinking of.

Thanks for the feedback! And no worries about the delay -- I know you've got a lot on your plate, and the proposed change is invasive.

To make sure I understand correctly: The issue is that if a Type is replaced by an AttributedType in places where Clang does not (yet) expect this to happen, this can cause performance regressions or assertions?

The motivation behind making type attributes pluggable is that I'd like to annotate pointers with additional information for use in a memory safety static analysis. The goal here is the same as for the proposal I discussed with you a while ago of extending the annotate attribute to types (or possibly adding a new annotate_type attribute) on this lengthy mailing list thread (no need to reread it):

https://lists.llvm.org/pipermail/cfe-dev/2021-October/thread.html#69087

In this discussion, I mentioned that I was thinking about making type attributes pluggable too. I eventually realized that this might actually be an easier avenue to pursue than extending annotate to types. (As we discussed, there might be cases where the GNU spelling of the annotate attribute would associate with a declaration while the C++ spelling would associate with a type, and we hadn't reached a firm conclusion on how best to resolve this. An entirely new pluggable attribute wouldn't have this problem.)

From your feedback, it sounds as if I should return to my earlier idea of extending annotate to types. I wonder though: Wouldn't this suffer from the same problems that you raise above since, again, Clang would see AttributedTypes in places where it might not expect them?

If the same concerns apply to both of these approaches, do you have any suggestions for alternative ways that I could add annotations to types? Or does this mean that I would have to do the work of making sure that Clang can handle AttributedTypes in these new places after all?

aaron.ballman added a subscriber: erichkeane.

Thanks for the feedback! And no worries about the delay -- I know you've got a lot on your plate, and the proposed change is invasive.

To make sure I understand correctly: The issue is that if a Type is replaced by an AttributedType in places where Clang does not (yet) expect this to happen, this can cause performance regressions or assertions?

More that an attributed type generally carries extra information that needs to be stored somewhere (like, a calling convention type attribute needs to store which calling convention is represented by the attributed type) and that extra overhead can cause performance regressions. Additionally, depending on the kind of type attribute the plugin wants to create, there may be failed assertions without updating all of the places in clang that expect to handle a class type attribute (such as a plugin attempting to add a new calling convention attribute). And on top of that, there are very likely places where the compiler is inspecting the type via isa<> (etc) and that's not going to look through the attributed type because one isn't expected yet (so you may not get crashes, but obtuse compile errors about mismatched types).

The motivation behind making type attributes pluggable is that I'd like to annotate pointers with additional information for use in a memory safety static analysis. The goal here is the same as for the proposal I discussed with you a while ago of extending the annotate attribute to types (or possibly adding a new annotate_type attribute) on this lengthy mailing list thread (no need to reread it):

https://lists.llvm.org/pipermail/cfe-dev/2021-October/thread.html#69087

In this discussion, I mentioned that I was thinking about making type attributes pluggable too. I eventually realized that this might actually be an easier avenue to pursue than extending annotate to types. (As we discussed, there might be cases where the GNU spelling of the annotate attribute would associate with a declaration while the C++ spelling would associate with a type, and we hadn't reached a firm conclusion on how best to resolve this. An entirely new pluggable attribute wouldn't have this problem.)

From your feedback, it sounds as if I should return to my earlier idea of extending annotate to types. I wonder though: Wouldn't this suffer from the same problems that you raise above since, again, Clang would see AttributedTypes in places where it might not expect them?

An AttributedType for an annotation attribute would require a new type in the type system, so there could likely be places that need updating to look "through" the attributed type. But unlike with arbitrary plugins, the number of places is hopefully more reasonable. There's still the concern about memory overhead, but that should only be paid by people who use the annotated type and hopefully isn't too much of a problem.

If the same concerns apply to both of these approaches, do you have any suggestions for alternative ways that I could add annotations to types? Or does this mean that I would have to do the work of making sure that Clang can handle AttributedTypes in these new places after all?

No, I think you basically have to muck about with the type system no matter what. I love the idea of plugin type attributes in theory, but I don't think we're architecturally in a place where we can do that very easily. I suspect a dedicated attribute may be easier, but that's largely a guess. Both approaches strike me as likely to have a long tail of bugs we'll have to fight through.

Adding @erichkeane as he's also thought a fair amount about attributes before, just in case I'm forgetting anything.

Right, yeah, so there are a couple of problems with AttributedType. First, it gets lost almost as soon as you get out of SemaType about 90% of the time. Anything that does some level of canonicalization ends up losing it, so the AttributedType information is lost almost immediately. This is why the current ones all store information in the ExtInfo. The worst place for this ends up being in the template instantiator, which immediately canonicalizes/desugars types all over the place.

However, making AttributedType 'survive' is actually going to be troublesome as well. A lot of the type-checking compares types using == on their pointer values, so that would be broken if they are an AttributedType.

So I think the 'first' thing that needs to happen is to make the entire CFE 'AttributedType' maintaining, AND tolerant. I can't think of a good way to do that reliably (my naive thought would be to come up with some way to temporarily (during development) wrap EVERY type in an AttributedType with a random attribute (so that they are all unique!) and do comparisons that way. Additionally, you'd need SOMETHING to validate that the AttributedTypes are the only one that survives (again, during development) to make sure it 'works right'.

Additionally, you'll likely have a lot of work to do in the template engine to make sure that the attributes are inherited correctly through instantiations, specializations, etc.

Aaron and Eric,

I'm currently on an extended leave, so I'll only get back to this in a few weeks' time, but I did want to let you know that I've seen this. Thanks for the detailed comments!

Cheers,

Martin

mboehme added a comment.EditedFeb 23 2022, 6:53 AM

Sorry for the delay -- finally back from leave.

Thanks to both of you for the detailed feedback!

High-level, I think I'd like to return to my earlier approach of adding a new annotate_type attribute (see also discussion above).

An AttributedType for an annotation attribute would require a new type in the type system,

Not sure I understand what you mean here?

FWIW, the intent would be to consider types with different annotate_type attributes to be the same type (and also the same as the un-attributed type). For example, it would not be possible for two overloads to differ merely by annotate_type attributes on their parameters.

Or are you saying this would require us to add a new subclass of Type to the Clang implementation -- e.g. AnnotatedType?

so there could likely be places that need updating to look "through" the attributed type. But unlike with arbitrary plugins, the number of places is hopefully more reasonable.

IIUC, this already happens to some extent for other type attributes -- correct? For example, the nullability attributes (_Nonnull and the like). However, these can only be applied to pointer types (e.g. int * _Nonnull ptr), while an annotate_type attribute could also be applied to other types (e.g. int [[clang::annotate_type("foo")]] i). Is this an example of the places you're thinking of that would need to be touched?

If the same concerns apply to both of these approaches, do you have any suggestions for alternative ways that I could add annotations to types? Or does this mean that I would have to do the work of making sure that Clang can handle AttributedTypes in these new places after all?

No, I think you basically have to muck about with the type system no matter what. I love the idea of plugin type attributes in theory, but I don't think we're architecturally in a place where we can do that very easily. I suspect a dedicated attribute may be easier, but that's largely a guess. Both approaches strike me as likely to have a long tail of bugs we'll have to fight through.

What would be your recommendation for the best way to approach this? I.e. after I've implemented an annotate_type attribute (I already have a draft patch at https://reviews.llvm.org/D111548), how would I best test it for possible issues? I would obviously try to make my tests as comprehensive as possible -- but is there anything else I could do? There obviously isn't any existing code that uses this attribute -- so should I try to locally patch the compiler in some way where it would add the attribute to as many places in the AST as possible, in an attempt to break things?

Right, yeah, so there are a couple of problems with AttributedType. First, it gets lost almost as soon as you get out of SemaType about 90% of the time. Anything that does some level of canonicalization ends up losing it, so the AttributedType information is lost almost immediately. This is why the current ones all store information in the ExtInfo. The worst place for this ends up being in the template instantiator, which immediately canonicalizes/desugars types all over the place.

However, making AttributedType 'survive' is actually going to be troublesome as well. A lot of the type-checking compares types using == on their pointer values, so that would be broken if they are an AttributedType.

So I think the 'first' thing that needs to happen is to make the entire CFE 'AttributedType' maintaining, AND tolerant. I can't think of a good way to do that reliably (my naive thought would be to come up with some way to temporarily (during development) wrap EVERY type in an AttributedType with a random attribute (so that they are all unique!) and do comparisons that way. Additionally, you'd need SOMETHING to validate that the AttributedTypes are the only one that survives (again, during development) to make sure it 'works right'.

Additionally, you'll likely have a lot of work to do in the template engine to make sure that the attributes are inherited correctly through instantiations, specializations, etc.

Thanks for pointing out those issues!

I think, if possible, I'd like to avoid having to make AttributedType survive. In the testing I've done locally with my prototype change, it appears that Clang preserves the attribute in the places that I care about.

As a side note: Do your comments apply equally to AttributedTypeLoc as they do to AttributedType? The reason I ask is that I believe I would need to use AttributedTypeLoc (rather than AttributedType) to be able to do anything meaningful with the proposed new annotate_type attribute, as the AttributedType only provides access to the attr::Kind, which doesn't tell you anything about the name of the annotation or its arguments. It would of course be possible to add that information to AttributedType in the case where it is an annotate_type attribute, but I worry that this would bloat AttributedType in a way that would not be considered acceptable. Any guidance that you can provide on this?

Sorry for the incomplete comment -- I pressed "send" too soon. I've edited
my comment on Phabricator to add the rest of my reply. (Pointing this out
here for people who are reading this in email.)

Sorry for the delay -- finally back from leave.

High-level, I think I'd like to return to my earlier approach of adding a new annotate_type attribute (see also discussion above).

An AttributedType for an annotation attribute would require a new type in the type system,

Not sure I understand what you mean here?

FWIW, the intent would be to consider types with different annotate_type attributes to be the same type (and also the same as the un-attributed type). For example, it would not be possible for two overloads to differ merely by annotate_type attributes on their parameters.

Or are you saying this would require us to add a new subclass of Type to the Clang implementation -- e.g. AnnotatedType?

It would essentially be something like that, then you'd have to alter every single place we touch a 'type' and de-annotate it for the purposes of comparison/etc. This is actually quite a difficult problem.

so there could likely be places that need updating to look "through" the attributed type. But unlike with arbitrary plugins, the number of places is hopefully more reasonable.

IIUC, this already happens to some extent for other type attributes -- correct? For example, the nullability attributes (_Nonnull and the like). However, these can only be applied to pointer types (e.g. int * _Nonnull ptr), while an annotate_type attribute could also be applied to other types (e.g. int [[clang::annotate_type("foo")]] i). Is this an example of the places you're thinking of that would need to be touched?

_Nonnull has exactly the problems we are talking about; it is put on the attributed-type, which end up getting lost in canonicalization just about immediately.

If the same concerns apply to both of these approaches, do you have any suggestions for alternative ways that I could add annotations to types? Or does this mean that I would have to do the work of making sure that Clang can handle AttributedTypes in these new places after all?

No, I think you basically have to muck about with the type system no matter what. I love the idea of plugin type attributes in theory, but I don't think we're architecturally in a place where we can do that very easily. I suspect a dedicated attribute may be easier, but that's largely a guess. Both approaches strike me as likely to have a long tail of bugs we'll have to fight through.

What would be your recommendation for the best way to approach this? I.e. after I've implemented an annotate_type attribute (I already have a draft patch at https://reviews.llvm.org/D111548), how would I best test it for possible issues? I would obviously try to make my tests as comprehensive as possible -- but is there anything else I could do? There obviously isn't any existing code that uses this attribute -- so should I try to locally patch the compiler in some way where it would add the attribute to as many places in the AST as possible, in an attempt to break things?

This is going to be an incredibly heavy lift, I just don't see how you can make it 'last' through the type system. For example:

template<typename T>
struct S {
  using my_type = T;
};

using my_attr_type = int [[annotated_type("asdf")]];

S<int>::my_type;
S<my_attr_type>::my_type; <-- will have lost the attribute already, since you want them to behave as the same type.
std::is_same<int, my_attr_type>::value; // Do you want this to be true?

std::is_same<S<int>, S<my_attr_type>::value; // What about this one?

This problem ends up being generally intractable as far as I can imagine. The C++ language doesn't have the idea of strong-typedefs, a type is just the type. You can't have 1 instance of the type have an attribute, and others not without them being DIFFERENT types. So you'd have to have some level of AnnotatedType in the type system that does its best to ACT like its underlying type (at great development cost to make sure it doesn't get lost, basically everywhere), but actually isn't. For example, it would have to end up failing the 2nd is_same above, which, IMO, would be language breaking unless the 1st is_same ALSO stopped being true.

Right, yeah, so there are a couple of problems with AttributedType. First, it gets lost almost as soon as you get out of SemaType about 90% of the time. Anything that does some level of canonicalization ends up losing it, so the AttributedType information is lost almost immediately. This is why the current ones all store information in the ExtInfo. The worst place for this ends up being in the template instantiator, which immediately canonicalizes/desugars types all over the place.

However, making AttributedType 'survive' is actually going to be troublesome as well. A lot of the type-checking compares types using == on their pointer values, so that would be broken if they are an AttributedType.

So I think the 'first' thing that needs to happen is to make the entire CFE 'AttributedType' maintaining, AND tolerant. I can't think of a good way to do that reliably (my naive thought would be to come up with some way to temporarily (during development) wrap EVERY type in an AttributedType with a random attribute (so that they are all unique!) and do comparisons that way. Additionally, you'd need SOMETHING to validate that the AttributedTypes are the only one that survives (again, during development) to make sure it 'works right'.

Additionally, you'll likely have a lot of work to do in the template engine to make sure that the attributes are inherited correctly through instantiations, specializations, etc.

Thanks for pointing out those issues!

I think, if possible, I'd like to avoid having to make AttributedType

Edit: Sorry, pressed "submit" too soon. Currently editing the comment, will remove this line once I'm done.

so there could likely be places that need updating to look "through" the attributed type. But unlike with arbitrary plugins, the number of places is hopefully more reasonable.

IIUC, this already happens to some extent for other type attributes -- correct? For example, the nullability attributes (_Nonnull and the like). However, these can only be applied to pointer types (e.g. int * _Nonnull ptr), while an annotate_type attribute could also be applied to other types (e.g. int [[clang::annotate_type("foo")]] i). Is this an example of the places you're thinking of that would need to be touched?

_Nonnull has exactly the problems we are talking about; it is put on the attributed-type, which end up getting lost in canonicalization just about immediately.

For my purposes, it would be fine, and maybe even desirable, for the proposed annotate_type to have the same limitations as _Nonnull.

Are you saying that a) this would not be acceptable for a more general-purpose attribute such as a putative annotate_type, or even that b) the behavior of _Nonnull itself is seen as undesirable and should be changed if possible?

This is going to be an incredibly heavy lift, I just don't see how you can make it 'last' through the type system. For example:

template<typename T>
struct S {
  using my_type = T;
};

using my_attr_type = int [[annotated_type("asdf")]];

S<int>::my_type;
S<my_attr_type>::my_type; <-- will have lost the attribute already, since you want them to behave as the same type.
std::is_same<int, my_attr_type>::value; // Do you want this to be true?

Yes.

std::is_same<S<int>, S<my_attr_type>::value; // What about this one?

Yes.

Both of these are what the _Nullable attribute does today (godbolt), and I would be happy for the proposed annotate_type attribute to have the same semantics.

This problem ends up being generally intractable as far as I can imagine. The C++ language doesn't have the idea of strong-typedefs, a type is just the type. You can't have 1 instance of the type have an attribute, and others not without them being DIFFERENT types. So you'd have to have some level of AnnotatedType in the type system that does its best to ACT like its underlying type (at great development cost to make sure it doesn't get lost, basically everywhere), but actually isn't. For example, it would have to end up failing the 2nd is_same above, which, IMO, would be language breaking unless the 1st is_same ALSO stopped being true.

As noted above, this isn't actually what I'm trying to achieve.

Some colleagues and I are currently preparing an RFC for the specific use case we'd like to use the proposed annotate_type attribute for. That use case might give more context to the discussion. If you think it would help to put this discussion on hold until we've written up and posted the RFC, I'd be happy to do that. If you have any additional comments right now, I'd be happy to hear those too of course!

For my purposes, it would be fine, and maybe even desirable, for the proposed annotate_type to have the same limitations as _Nonnull.

Are you saying that a) this would not be acceptable for a more general-purpose attribute such as a putative annotate_type, or even that b) the behavior of _Nonnull itself is seen as undesirable and should be changed if possible?

Yes and Maybe? The non-type-modifying type attributes we have are here for compatibility with GCC/MSVC, but are, IMO, a complete mistake otherwise. The strange/obnoxious behavior of them disappearing off of something as soon as they are looked at funny makes them so non-intuitive that I see them as harmful.

Something as general as an 'annotate_type' would, IMO, need to be held to an even higher standard than our normal type attributes due to its nature. And I don't see how a type attribute with the behaviors you require as being both intuitive and working in the C++ type system.

mboehme abandoned this revision.Mar 3 2022, 5:32 AM

Thanks for all of the input!

Rather than going deeper into the discussion of the attribute here (on an only vaguely related change), I think it would be better to continue the discussion on the forum.

I'm currently in the process of writing up an RFC and will add a link here once I've published it. That RFC will also contain a link to a second RFC that describes our intended use case for the attribute in detail.

As I have come to the conclusion that I'd like to pursue a proposal for a new attribute rather than extending ParsedAttr to allow custom handling for type attributes (which is what this change is about), I'll retract this change from review.

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 5:32 AM

As mentioned in my previous comment, here is the RFC proposing a new annotate_type attribute:

https://discourse.llvm.org/t/rfc-new-attribute-annotate-type-iteration-2/61378