This is an archive of the discontinued LLVM Phabricator instance.

Generalize method overloading on addr spaces to C++
Needs ReviewPublic

Authored by Anastasia on Jan 30 2019, 10:03 AM.

Details

Summary

Here is my attempt to generalize the earlier work on method overloading with address spaces https://reviews.llvm.org/D55850 to work in C++ mode (not just OpenCL!).

This implementation doesn't apply yet to templated addr spaces because I would need to perform modifications on the extended function info. Also a couple of smaller corner cases are not handled. However, if the general direction is right, the rest can be build on top of this patch.

Diff Detail

Event Timeline

Anastasia created this revision.Jan 30 2019, 10:03 AM
ebevhan added inline comments.Jan 30 2019, 3:21 PM
lib/Parse/ParseDeclCXX.cpp
2309

Are we interested in preserving the spelling?

2313

Is there a reason that the attributes are parsed here and not in ParseFunctionDeclarator like the rest of the ref-qualifiers (and the OpenCL++ ASes)?

That is possibly why the parser is getting confused with the trailing return.

lib/Sema/SemaType.cpp
4883

This lambda is a bit unwieldy, it would probably be better as a separate function.

5865

"Returns value" could be more specific. "Returns true if the attribute was processed successfully."

Anastasia marked 4 inline comments as done.Jan 31 2019, 3:50 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2309

I think it is computed based on the properties of ParsedAttr (like Name or Syntax) https://clang.llvm.org/doxygen/classclang_1_1ParsedAttr.html#a5e53f185ac34c208f6e734d6d7103d8c.

At least I don't see any member that stores this in ParsedAttr.

2313

Good question! I have a feeling that this was done to unify parsing of all CXX members, not just methods. For collecting the method qualifiers it would certainly help making flow more coherent if this was moved to ParseFunctionDeclarator. I will try to experiment with this a bit more. What I found strange that the attributes here are attached to the function type itself instead of its qualifiers. May be @rjmccall can shed some more light on the overall flow...

lib/Sema/SemaType.cpp
4883

Sure! I will rewrite this with just a lang mode condition inside the loop. I was trying to avoid this extra check inside the loop but perhaps complexity is not worth it. This code is full of checks anyway.

5865

Sure! Thanks!

Anastasia marked an inline comment as done.Jan 31 2019, 4:18 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2303

I think I need to create this only if we parsed AS attr successfully....

Anastasia marked an inline comment as done.Jan 31 2019, 4:45 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2313

I looked at this a bit more and it seems that all the GNU attributes other than addr spaces belong to the function (not to be applied to this) for example https://blog.timac.org/2016/0716-constructor-and-destructor-attributes/. It seems correct to parse them here and apply to the function type. Although we could separate parsing of addr space attribute only and move into ParseFunctionDeclarator. However this will only be needed for the function type not for the data members. So not sure whether it will make the flow any cleaner.

ebevhan added inline comments.Jan 31 2019, 5:48 AM
lib/Parse/ParseDeclCXX.cpp
2313

I looked at this a bit more and it seems that all the GNU attributes other than addr spaces belong to the function (not to be applied to this)

Hm, I know what you mean. It's why Clang usually complains when you try placing an AS attribute after a function declarator, because it's trying to apply it to the function rather than the method qualifiers.

However this will only be needed for the function type not for the data members.

But we aren't really interested in data members, are we? Like struct fields, non-static data members cannot be qualified with ASes (they should 'inherit' the AS from the object type), and static ones should probably just accept ASes as part of the member type itself.

These patches are to enable AS-qualifiers on methods, so it only stands to reason that those ASes would be parsed along with the other method qualifiers.

ParseFunctionDeclarator calls ParseTypeQualifierListOpt to get the cv-qualifier-seq for the method qualifiers. Currently it's set to AR_NoAttributesParsed. If we enable parsing attributes there, then the qualifier parsing might 'eat' attributes that were possibly meant for the function.

This is just a guess, but what I think you could do is include attributes in the cv-qualifier parsing with AR_GNUAttributesParsed, look for any AS attributes, take their AS and mark them invalid, then move the attributes (somehow) from the qualifier-DeclSpec to the FnAttrs function attribute list.

(The reason I'm concerned about where/how the qualifiers are parsed is because this approach only works for custom ASes applied with the GNU attribute directly. It won't work if custom address spaces are given with a custom keyword qualifier, like in OpenCL. If the ASes are parsed into the DeclSpec used for the other ref-qualifiers, then both of these cases should 'just work'.)

test/SemaCXX/address-space-method-overloading.cpp
28

Just as an aside (I think I mentioned this on one of the earlier patches as well); treating a non-AS-qualified object as being in a 'generic' AS even for normal C++ isn't a great idea IMO. The object initialization code should be checking if the ASes of the actual object and the desired object type are compatible, and only if so, permit the overload.

Anastasia marked 2 inline comments as done.Feb 1 2019, 3:49 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2313

But we aren't really interested in data members, are we? Like struct fields, non-static data members cannot be qualified with ASes (they should 'inherit' the AS from the object type), and static ones should probably just accept ASes as part of the member type itself.

Pointer data members can be qualified by and address space, but this should be parsed before.

This is just a guess, but what I think you could do is include attributes in the cv-qualifier parsing with AR_GNUAttributesParsed, look for any AS attributes, take their AS and mark them invalid, then move the attributes (somehow) from the qualifier-DeclSpec to the FnAttrs function attribute list.

Ok, I will try that. Thanks for sharing your ideas!

test/SemaCXX/address-space-method-overloading.cpp
28

I think changing this would require changing AS compatibility rules in general, not just for overloading but for example for conversions too. This seems like a wider scope change that might affect backwards compatibility. We might need to start an RFC on this first. John has suggested adding a target setting for this on the other review. I think that should work. Another idea could be to add some compiler directives to specify what generic address space is (if any).

Using default (no address space) as generic has a lot of advantages since it doesn't need many parser changes. In OpenCL we weren't lucky that initial implementation was added that used default for private memory and therefore when generic was introduced we had to map it to a new lang addr space value. That required a lot of changes in the parser. But once we fix those actually, anyone should be able to map generic to anything. I initially thought it has no other use cases than in OpenCL but now I feel there might be a value to map default case (no address space) for something specific and then use generic to specify a placeholder address space on pointers or references. We could just expose generic address space generically and also have a mode with no generic address space. The latter means that conversions between different address spaces is never valid. Would this make sense?

ebevhan added inline comments.Feb 1 2019, 8:09 AM
lib/Parse/ParseDeclCXX.cpp
2313

Pointer data members can be qualified by and address space, but this should be parsed before.

Right, pointer-to-member is a thing. I always forget about those.

test/SemaCXX/address-space-method-overloading.cpp
28

The big problem with the address space support in Clang is that it isn't really very well formalized unless you're on OpenCL. There's no real way for a target to express the relations between its address spaces; most of the relations that exist are hard-coded.

The support should probably be generalized for both targets and for LangASes like OpenCL's. Maybe:

  • the ASes would be defined in a TableGen file which specifies which ASes exist, which ones are compatible/subsets/supersets, etc,
  • or, just have a target hook that lets a target express that a conversion from AS A to AS B is prohibited/permitted explicitly/permitted implicitly.

Just some loose ideas; an RFC for this is possibly the right way forward.

The reason I bring all of this up is primarily because in our target, the 'default' AS is disjoint from our other ones, so there's no 'generic' AS to be had. Both implicit and explicit conversion between these ASes is prohibited. Since Clang currently doesn't allow you to express that ASes are truly incompatible, we have a flag that blocks such conversions unconditionally. Ideally it would be target-expressible.


I think changing this would require changing AS compatibility rules in general, not just for overloading but for example for conversions too. This seems like a wider scope change that might affect backwards compatibility. We might need to start an RFC on this first. John has suggested adding a target setting for this on the other review. I think that should work. Another idea could be to add some compiler directives to specify what generic address space is (if any).

I'm unsure whether any changes to the current semantics really need to be done, at least for the overloading problem.

For example, explicit conversion from a non-address space qualified pointer type to an address space qualified pointer type (and vice versa) is permitted, but implicit conversion is an error in both C and C++: https://godbolt.org/z/UhOC0g

For an overload candidate to be viable, there has to be an implicit conversion sequence for the implicit object argument to the implicit object parameter type. But no such implicit conversion sequence exists for types in different ASes, even today, or the implicit example in the godbolt would pass. So, an overload candidate with a different AS qualification than the object should not be viable.

This is clearly not compatible with OpenCL's notion of the __generic AS, but OpenCL already has logic for __generic in Qualifiers to handle that case (isAddressSpaceSupersetOf). Arguably, that behavior should probably not be hard coded, but that's how it is today.

(Also, just because an AS is a superset of another AS does not necessarily mean that the language should permit an implicit conversion between them, but it's a reasonable behavior, I guess.)


Ultimately, whether or not a conversion of a pointer/reference from address space A to address space B is permitted should both be a function of what the target and the language allows, but that support doesn't exist. I also don't think that exposing a 'generic' AS is the right approach. There are ASes, and some conversions from ASes to other ASes are legal, while others are not. There's also the question of the difference between implicit and explicit conversions; blocking all implicit conversions between ASes would not work for OpenCL, so there must be a way to express that as well.


These are all just loose thoughts from my head. I'm not terribly familiar with OpenCL except for what I've seen around the Clang code, so there might be details there that I'm not aware of.

rjmccall added inline comments.Feb 1 2019, 9:19 AM
test/SemaCXX/address-space-method-overloading.cpp
28

just have a target hook that lets a target express that a conversion from AS A to AS B is prohibited/permitted explicitly/permitted implicitly.

I think it has to be this, given the possibility of target-specific subspacing rules in numbered address spaces. We can't make targets use disjoint sets of numbered address spaces.

I agree that, in general, the rule has to be that we check whether the object type is convertible to the expected address space of the method; we can't assume that there's a generic address space that works for everything. That's not even true in OpenCL, since constant is not a subspace of the generic address space.

John has suggested adding a target setting for this on the other review.

The suggestion is just that we recognize some way to control what the default address space for methods is.

This is difficult with member pointers because a function type can be turned into a member pointer type arbitrarily later; that is, we need to be able to canonically distinguish a function type that was written without any address space qualifier (and hence is generic if turned into a member pointer) from one that was written with our representationally-default address space qualifier (private). We can deal with this later, I think, but it might require rethinking some things.

ebevhan added inline comments.Feb 1 2019, 2:14 PM
test/SemaCXX/address-space-method-overloading.cpp
28

I think it has to be this, given the possibility of target-specific subspacing rules in numbered address spaces. We can't make targets use disjoint sets of numbered address spaces.

I'm in favor of this option as well.

This is difficult with member pointers because a function type can be turned into a member pointer type arbitrarily later; that is, we need to be able to canonically distinguish a function type that was written without any address space qualifier (and hence is generic if turned into a member pointer) from one that was written with our representationally-default address space qualifier (private). We can deal with this later, I think, but it might require rethinking some things.

Oh, now I get the problem with those. That is tricky.

So when a function type is used as the type of a pointer-to-member and it is method-qualified with Default, it should be changed to be method-qualified with __generic instead?

rjmccall added inline comments.Feb 1 2019, 2:37 PM
test/SemaCXX/address-space-method-overloading.cpp
28

If it isn't method-qualified at all, yes. If that's representationally the same as being method-qualified with private, we have a problem — we can fix that in the short term by banning method qualification with private, but we do need a real fix.

Anastasia marked an inline comment as done.Feb 4 2019, 9:41 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2313

It seems unfortunately that moving parsing of GNU attributes into ParseFunctionDeclarator might not be a good viable alternative. One of the issues I encountered that some attributes accept class members, example fragment from test/SemaCXX/warn-thread-safety-analysis.cpp:

class LateFoo {

...

void foo() __attribute__((requires_capability(mu))) { }

...

Mutex mu;
};

As CXX name lookup is not setup in ParseFunctionDeclarator it seems unreasonable to replicate all these logic there.

I have also tried to move just address space parsing only into ParseFunctionDeclarator but there is a corner cases of GNU attribute syntax where multiple attr are specified as a list:

__attribute__((overloadable, address_space(11)))

that means we can't always parse them separately. :(

Based on this, it seems better to leave the parsing here in ParseCXXMemberDeclaratorBeforeInitializer.

Can you explain a bit more why you prefer it to be in ParseFunctionDeclarator (other than avoiding two separate places with similar functionality that I do agree would be better to avoid)?

(The reason I'm concerned about where/how the qualifiers are parsed is because this approach only works for custom ASes applied with the GNU attribute directly. It won't work if custom address spaces are given with a custom keyword qualifier, like in OpenCL. If the ASes are parsed into the DeclSpec used for the other ref-qualifiers, then both of these cases should 'just work'.)

The keywords based address spaces can easily be handled in ParseFunctionDeclarator just like for OpenCL. If there is something I can generalize there I am happy to do. Just let me know.

ebevhan added inline comments.Feb 5 2019, 1:59 AM
lib/Parse/ParseDeclCXX.cpp
2313

Well, that's unfortunate...

As CXX name lookup is not setup in ParseFunctionDeclarator it seems unreasonable to replicate all these logic there.

I think the issue is rather that attributes parsed during a class construction are added to the LateAttrs list to be parsed after the class is finished (to enable the member lookup), and there's currently no way (as far as I can see) to get to the LateAttrs from ParseFunctionDeclarator. Also, ParseTypeQualifierListOpt does not take a LateParsedAttrList to give to its invocation of ParseGNUAttributes anyway. There's just no immediate way to 'defer' the parsing of the attributes when parsing the function declarator (because we don't necessarily have to be in class scope to be parsing one).

Only parsing AS attrs was an option, but that list syntax gets in the way...

Can you explain a bit more why you prefer it to be in ParseFunctionDeclarator (other than avoiding two separate places with similar functionality that I do agree would be better to avoid)?

It's more or less a combination of

  • not wanting code that does the same thing in multiple places
  • parsing the AS qualifier along with the other method qualifiers seems like The Right Thing to do; for example, it should solve the issue with the misparse on trailing return
  • only __attribute__ ASes will be parsed, not specifier-based ones; it could be solved by adding something that looks for AT_AddressSpace in ParseFunctionAttribute even though the parser won't ever parse a real AS attribute, but it doesn't feel like the cleanest solution

Ultimately it boils down to the parser not being aware of the class for which it is parsing a function declarator, so there's no way to tell it 'wait with parsing this until the class is finished'. None of the existing method qualifiers needs this, so there's no way of doing it (except for exception specifications, but those are handled specially).

I'm not really sure how to solve this in a way that makes me satisfied... John maybe has a different opinion, though.

Anastasia marked an inline comment as done.Feb 5 2019, 4:08 AM
Anastasia added inline comments.
lib/Parse/ParseDeclCXX.cpp
2313

Ok, agreed let's see if @rjmccall can suggest something.

only attribute ASes will be parsed, not specifier-based ones;

Do you refer to keyword-based ASes? Because this should work now after OpenCL was fixed. However, some generalization might be needed in several places to un-condition some code on OpenCL. I can try to look into it if I get an example code. :)

ebevhan added inline comments.Feb 5 2019, 5:37 AM
lib/Parse/ParseDeclCXX.cpp
2313

Yes, keyword based ones like in OpenCL. Basically, in our downstream we look for keywords in all of the qualifier/specifier parsing routines (ParseTypeQualifierListOpt, ParseDeclarationSpecifiers) and apply an artificial address_space attribute to the DeclSpec being parsed. This has the same effect as using a regular AS attribute, but it's applied with a keyword instead.

The parsing in ParseCXXMemberDeclaratorBeforeInitializer cannot parse anything but physical GNU address_space attributes, so it can't be used to parse these. It would work for us if the AS application part (that calls asOpenCLLangAS) in ParseFunctionDeclarator looked for both the OpenCL AS attributes and AT_AddressSpace, but it's hard to justify looking for AT_AddressSpace there since the parser will never actually parse those (it's disabled with AR_NoAttributesParsed).

I suppose a solution could be to rename asOpenCLLangAS and have it return ASes from regular AT_AddressSpace attributes as well?

Moving parsed attributes between lists isn't unreasonable if that's what you have to do; we already do that when processing the ObjC ARC qualifiers. The ambiguity with function attributes is pretty much inherent.

Alternatively, you shouldn't feel like you can't impose restrictions that make your life easier; this is a new language feature.

Moving parsed attributes between lists isn't unreasonable if that's what you have to do; we already do that when processing the ObjC ARC qualifiers. The ambiguity with function attributes is pretty much inherent.

The problem isn't strictly the moving of attributes; this would normally be possible. However, the problem is that the attributes cannot be parsed in the first place, as some attributes take a member variable argument and must therefore be parsed after a class is finalized to get proper member lookup.

Doing this deferral in ParseFunctionDeclarator is not possible at the moment without passing down a LateParsedAttributes from ParseCXXMemberDeclaratorBeforeInitializer, and then further passing it into ParseTypeQualifierListOpt and ParseGNUAttributes. Then the member lookup-dependent attributes can be deferred, the AS attribute will be parsed on the spot, and any other attributes will be added to the FnAttrs for the function declarator.

I think this would work, but it's a larger change, and it introduces a bit of complexity into the declarator parsing chain (suddenly, ParseDeclarator/ParseDirectDeclarator/ParseFunctionDeclarator will be made aware of whether or not a class is being parsed).

Perhaps LateAttrs for a class could be stored in a class parsing scope and fetched directly in ParseFunctionDeclarator? I'm not terribly familiar with the scope handling part of the code.

Anastasia marked an inline comment as done.Feb 6 2019, 9:33 AM

Btw, if we would implement something like this:
http://lists.llvm.org/pipermail/cfe-dev/2019-January/060798.html

Would it allow you to switch to more generic address space attributes and therefore help to decrease diverging from upstream too much?

lib/Parse/ParseDeclCXX.cpp
2313

Ok so it seems the difference to OpenCL is just that you don't map to special language address space attributes but instead reuse AT_AddressSpace? If that's the case then I feel OpenCL parsing flow would just work fine? We can surely generalize it to let's call it language address space concept by renaming or providing extra level of APIs, for example we can add asLangAS that would call asOpenCLLangAS. I think I would need to keep asOpenCLLangAS too because it's used in other places too.

Hmm. Richard, I've mostly let you stay out of this, but do you have any thoughts about the right manage to parse attributes here? We want to allow address_space attributes to be written in the method-qualifiers list, but when we do that, it's hard to avoid parsing arbitrary attributes and then potentially needing to move them to the declarator; also, doing that might mean propagating awkward information in to deal with of delayed attribute parsing.

Any more input on this?

Any more input on this?

I think not. :( But I am wondering if we could proceed for now in some general direction and then make any improvements later. Probably the biggest part of this patch is not going to change. Would this make sense?

I think not. :( But I am wondering if we could proceed for now in some general direction and then make any improvements later. Probably the biggest part of this patch is not going to change. Would this make sense?

Well, I'm still not convinced that it's the right way to do it... And it feels a bit off to fix things later if we pretty much know what the correct way is now. It feels a bit unreasonable to ask for a larger redesign since this has been laying for a while at this point, but I think that if we're going to do it we should do it right from the start.

I think that the way to approach it is either to

  • Pass a pointer to the LateParsedAttrList down through the Parse*Declarator functions and ParseTypeQualifierListOpt and use it appropriately in ParseFunctionDeclarator. That's fairly invasive in all of the callers of the declarator parsing functions, though.
  • Store a pointer to the LateParsedAttrList in Declarator and use it the same way as above. This avoids messing with most of the existing functions, but makes Declarator a bit larger.

I think I would lean towards the latter since it means less fudging around with a whole bunch of unrelated methods. Do @rjmccall or @rsmith have any further opinions on this?

Sorry, I've gotten trapped under a backlog and haven't had time to think much about this. I don't want to block progress here.

I think I would lean towards the latter since it means less fudging around with a whole bunch of unrelated methods. Do @rjmccall or @rsmith have any further opinions on this?

Ok, I can change the patch to prototype this approach. I might need some example test cases though.

I think I would lean towards the latter since it means less fudging around with a whole bunch of unrelated methods. Do @rjmccall or @rsmith have any further opinions on this?

Ok, I can change the patch to prototype this approach. I might need some example test cases though.

Alright!

Just to make sure of something here - are you waiting for me to provide some example cases? There hasn't been activity here in a while and I was just wondering if it was because you were waiting for this.

I think I would lean towards the latter since it means less fudging around with a whole bunch of unrelated methods. Do @rjmccall or @rsmith have any further opinions on this?

Ok, I can change the patch to prototype this approach. I might need some example test cases though.

Alright!

Just to make sure of something here - are you waiting for me to provide some example cases? There hasn't been activity here in a while and I was just wondering if it was because you were waiting for this.

Sorry for delay. Examples would be helpful. But I am not blocked on them, I just can't find extra bandwidth at the moment unfortunately and I am not sure I will be able to do this in the time frame of clang9 development. Feel free to pick it up if you have time and I will be happy to review your rework. Otherwise it would have to wait. :(

kpet added a subscriber: kpet.May 3 2019, 8:06 AM
bader added a subscriber: bader.Jan 22 2020, 6:52 AM
tra added a subscriber: tra.Mar 4 2020, 4:12 PM