This is an archive of the discontinued LLVM Phabricator instance.

[Attr] Fix parameter indexing for attributes
ClosedPublic

Authored by jdenny on Feb 13 2018, 10:57 AM.

Details

Summary

The patch fixes a number of bugs related to parameter indexing in
attributes:

  • Parameter indices in some attributes (argument_with_type_tag, pointer_with_type_tag, nonnull, ownership_takes, ownership_holds, and ownership_returns) are specified in source as one-origin including any C++ implicit this parameter, were stored as zero-origin excluding any this parameter, and were erroneously printing (-ast-print) and confusingly dumping (-ast-dump) as the stored values.
  • For alloc_size, the C++ implicit this parameter was not subtracted correctly in Sema, leading to assert failures or to silent failures of __builtin_object_size to compute a value.
  • For argument_with_type_tag, pointer_with_type_tag, and ownership_returns, the C++ implicit this parameter was not added back to parameter indices in some diagnostics.

This patch fixes the above bugs and aims to prevent similar bugs in
the future by introducing careful mechanisms for handling parameter
indices in attributes. ParamIdx stores a parameter index and is
designed to hide the stored encoding while providing accessors that
require each use (such as printing) to make explicit the encoding that
is needed. Attribute declarations declare parameter index arguments
as [Variadic]ParamIdxArgument, which are exposed as ParamIdx[*]. This
patch rewrites all attribute arguments that are processed by
checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp to be declared
as [Variadic]ParamIdxArgument. The only exception is xray_log_args's
argument, which is encoded as a count not an index.

An alternative approach is to attempt to hide the issue of parameter
index encoding at each use by somehow implicitly exposing exactly the
encoding needed. That approach is essentially the approach that
already existed, and I made several attempts to improve it, but I have
concluded that it is both futile and harmful. The trouble is that the
needed encoding varies among attribute arguments and even among uses
of a single attribute argument. Making the exposed encoding implicit
then requires developers to think about not only what encoding is
needed at every use but also what encoding is implicitly being
exposed. Thus, making the encoding explicit at every use should
actually reduce the cognitive burden and the potential for mistakes.

Diff Detail

Repository
rL LLVM

Event Timeline

jdenny created this revision.Feb 13 2018, 10:57 AM
aaron.ballman added inline comments.Feb 15 2018, 9:48 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

Is there much benefit to forcing the attribute author to pick a name for this? It seems like this is more of a Boolean value: either implicit this is allowed or not. It would be really nice if we could hide these mechanics as implementation details so that the user only needs to write VariadicParamIndexArgument<"name", DisallowImplicitThis> (or something similar) when defining the attribute, and ideally not have to do any extra work in SemaDeclAttr.cpp by taking care of it in ClangAttrEmitter.cpp if possible.

lib/Sema/SemaDeclAttr.cpp
355–357 ↗(On Diff #134085)

Formatting is off here -- the else if should go up a line.

4612 ↗(On Diff #134085)

This value cannot be different from the previously-read value, correct? Might want to assert that.

test/Sema/attr-print.cpp
1 ↗(On Diff #134085)

-ast-print tests generally live in Misc rather than Sema -- can you move this test over to that directory?

33 ↗(On Diff #134085)

Can you add some tests for __attribute__((format)) and __attribute__((format_arg)) to demonstrate that their pretty-printed argument values are also sensible?

jdenny added inline comments.Feb 15 2018, 10:09 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

Thanks for the review. I'll work on that.

lib/Sema/SemaDeclAttr.cpp
355–357 ↗(On Diff #134085)

I don't follow. It looks right to me.

4612 ↗(On Diff #134085)

Right. Hopefully this will go away after I apply your initial suggestion.

test/Sema/attr-print.cpp
1 ↗(On Diff #134085)

Sure. Should the existing test/Sema/attr-print.c move too?

33 ↗(On Diff #134085)

Will look into it.

aaron.ballman added inline comments.Feb 15 2018, 10:16 AM
lib/Sema/SemaDeclAttr.cpp
355–357 ↗(On Diff #134085)

Our coding standard is to write:

if (foo) {
} else if (bar) {
}

rather than

if (foo) {
}
else if (bar) {
}

One good way to catch this sort of thing is to run your patch through clang-format: https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting

test/Sema/attr-print.cpp
1 ↗(On Diff #134085)

Ah, I didn't see we already had that one here. In that case, it's fine to leave this where it is.

jdenny marked 3 inline comments as done.Feb 15 2018, 4:12 PM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

Is there much benefit to forcing the attribute author to pick a name for this? It seems like this is more of a Boolean value: either implicit this is allowed or not.

If the attribute author picks the name, then the attribute author can ensure there's only one of these per attribute. I could rewrite it to have one per VariadicParamIndexArgument and one per ParamIndexArgument if you like. In that case, ArgumentWithTypeTagAttr would end up with two of these, and future attributes could potentially have more, but they should all have the same value within a single attribute. I didn't investigate how that redundancy would actually impact memory usage. What do you think?

It would be really nice if we could hide these mechanics as implementation details so that the user only needs to write VariadicParamIndexArgument<"name", DisallowImplicitThis> (or something similar) when defining the attribute, and ideally not have to do any extra work in SemaDeclAttr.cpp by taking care of it in ClangAttrEmitter.cpp if possible.

So far, I haven't found a good way to accomplish that, or maybe I've misunderstood you....

The logic of checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp seems pretty tightly coupled with its users' logic. For example, handleNonNullAttr uses the indices as adjusted by checkFunctionOrMethodParameterIndex to determine which indices belong in the array to be passed to the NonNullAttr constructor. We could try to have NonNullAttr (in the constructor, presumably) perform the adjustment of indices so that SemaDeclAttr.cpp doesn't need that logic, but then it would be too late to for handleNonNullAttr to filter them.

The only extra work this patch adds to SemaDeclAttr.cpp beyond what's already there is to reuse the DisallowImplicitThis that is essentially already computed in checkFunctionOrMethodParameterIndex.

Another possibility is to have SemaDeclAttr.cpp fully encapsulate the index adjustment logic and pass an index offset to attribute constructors, so ClangAttrEmitter.cpp would just know it has to print indices with some given offset. But a general index offset is wider than the single bool being stored now. Again, I haven't investigated the actual impact on memory usage.

Do you see a better way to achieve the encapsulation you're looking for?

lib/Sema/SemaDeclAttr.cpp
355–357 ↗(On Diff #134085)

Thanks.

aaron.ballman added inline comments.Feb 16 2018, 7:23 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

In that case, ArgumentWithTypeTagAttr would end up with two of these, and future attributes could potentially have more, but they should all have the same value within a single attribute. I didn't investigate how that redundancy would actually impact memory usage. What do you think?

That redundancy could probably be worked around in ClangAttrEmitter.cpp by inspecting the attribute argument list and noticing duplicate [Variadic]ParamIndexArgument before generating the code for the arguments, perhaps?

Another possibility is to have SemaDeclAttr.cpp fully encapsulate the index adjustment logic and pass an index offset to attribute constructors, so ClangAttrEmitter.cpp would just know it has to print indices with some given offset.

This was more along the lines of what I was thinking of.

Basically, it seems like the declarative part in Attr.td should be able to specify an attribute argument is intended to be a function parameter positional index, and whether implicit this needs special handling or not. Past that, the semantics of the attribute really shouldn't matter -- when asking for the index, it should be automatically adjusted so that users of the attribute don't have to do anything, and when printing the attribute, the index should be appropriately adjusted back. I'd like to avoid needing collusion between the declarative part in Attr.td and the semantic part in SemaDeclAttr.cpp because that collusion can get out of sync. It feels like we should be able to hide that collusion in ClangAttrEmitter.cpp by using some extra flags on the attribute that get set (automatically) during construction. The way you have things set up right now is very close to this, except the information has to be passed to the constructor manually in SemaDeclAttr after being calculated by checkFunctionOrMethodParameterIndex().

I'm not too worried about the memory usage at this point; if it looks problematic, we can likely address it.

jdenny added inline comments.Feb 16 2018, 8:55 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

That redundancy could probably be worked around in ClangAttrEmitter.cpp by inspecting the attribute argument list and noticing duplicate [Variadic]ParamIndexArgument before generating the code for the arguments, perhaps?

Seems reasonable.

There's also the question of the constructor interface: does it accept the same information multiple times or one time?

For now, the memory optimization doesn't seem too important, so I won't work on detecting duplicates. If we implement that later, we should be able to easily adjust callers for the resulting interface change.

Another possibility is to have SemaDeclAttr.cpp fully encapsulate the index adjustment logic and pass an index offset to attribute constructors, so ClangAttrEmitter.cpp would just know it has to print indices with some given offset.

This was more along the lines of what I was thinking of.

Sorry, does "This" refer to the alternative proposal I just described above or your following paragraph?

Basically, it seems like the declarative part in Attr.td should be able to specify an attribute argument is intended to be a function parameter positional index, and whether implicit this needs special handling or not.

Attr.td cannot specify declaratively whether implicit this needs special handling (I've been calling this DisallowImplicitThisParam, but let's call it AdjustForImplicitThis for clarity) because part of that decision is per function: whether the function actually has an implicit this (HasImplicitThis). Attr.td might always be able to specify whether implicit this is allowed (AllowImplicitThis) and thus, when there actually is an implicit this (HasImplicitThis), whether an extra increment should be added for printing (AdjustForImplicitThis = HasImplicitThis && !AllowImplicitThis). ClangAttrEmitter.cpp would then have to learn how to compute AdjustForImplicitThis from the other two Booleans, but SemaDeclAttr.cpp already does that.

In my current patch, only SemaDeclAttr.cpp understands how to compute AdjustForImplicitThis, but ClangAttrEmitter.cpp documents it too and knows there's always another increment (to make indices one-origin) for printing. In my alternative proposal above, ClangAttrEmitter.cpp would not document or understand any of this. Instead, ClangAttrEmitter.cpp would only know that indices must be adjusted by some arbitrary value for the sake of printing, and SemaDeclAttr.cpp would tell it how much. That seems like the cleanest separation.

Past that, the semantics of the attribute really shouldn't matter -- when asking for the index, it should be automatically adjusted so that users of the attribute don't have to do anything, and when printing the attribute, the index should be appropriately adjusted back.

Agreed. Both my current patch and my alternative proposal above offer that.

I'd like to avoid needing collusion between the declarative part in Attr.td and the semantic part in SemaDeclAttr.cpp because that collusion can get out of sync.

The only collusion I see in my current patch (besides some perhaps unnecessary documentation in ClangAttrEmitter.cpp) is the understanding that the Boolean indicates that one increment beyond the usual single increment is needed when printing. I believe my alternative proposal eliminates even that: then there's just some arbitrary increment when printing, and ClangAttrEmitter.cpp doesn't know or care why.

It feels like we should be able to hide that collusion in ClangAttrEmitter.cpp by using some extra flags on the attribute that get set (automatically) during construction. The way you have things set up right now is very close to this, except the information has to be passed to the constructor manually in SemaDeclAttr after being calculated by checkFunctionOrMethodParameterIndex().

In my last response, I was explaining why checkFunctionOrMethodParameterIndex probably needs to calculate that information anyway for the sake of its users, and it seems that having both SemaDeclAttr and ClangAttrEmitter know how to calculate it is worse for maintenance than having SemaDeclAttr pass its result to ClangAttrEmitter. My alternative proposal above does not require ClangAttrEmitter to calculate the adjustment at all. It just receives the net adjustment from SemaDeclAttr.

I feel like my alternative proposal would satisfy your desire for a better separation of concerns. The difference from what I think you're describing is that my alternative proposal keeps the index adjustment amount computation in SemaDeclAttr rather than trying to move any of that computation to ClangAttrEmitter. Does that seem reasonable to you?

aaron.ballman added inline comments.Feb 16 2018, 9:56 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

Sorry, does "This" refer to the alternative proposal I just described above or your following paragraph?

Your proposal, which I thought was solving what I described in the following paragraph.

Attr.td might always be able to specify whether implicit this is allowed (AllowImplicitThis) and thus, when there actually is an implicit this (HasImplicitThis), whether an extra increment should be added for printing (AdjustForImplicitThis = HasImplicitThis && !AllowImplicitThis). ClangAttrEmitter.cpp would then have to learn how to compute AdjustForImplicitThis from the other two Booleans, but SemaDeclAttr.cpp already does that.

Okay, I think I see where we may not quite be on the same page. The Attr.td bit knows whether the attribute needs to adjust for implicit this or not, but only SemaDeclAttr.cpp knows about the actual function the attribute is trying to be applied to. What I was envisioning was to move the logic from SemaDeclAttr.cpp into something accessible from the code generated by ClangAttrEmitter.cpp, but that may not be sufficient because the semantic attribute needs that information from the specific Decl, which by definition requires something additional to be passed in from SemaDeclAttr.cpp.

Thanks for bearing with me. :-)

I feel like my alternative proposal would satisfy your desire for a better separation of concerns. The difference from what I think you're describing is that my alternative proposal keeps the index adjustment amount computation in SemaDeclAttr rather than trying to move any of that computation to ClangAttrEmitter. Does that seem reasonable to you?

I think it's reasonable, depending on how it's surfaced. Would it be reasonable to make it look something like this?

// Attr.td
// ...
  let Args = [VariadicParamIndexArgument<"Args", AdjustsImplicitThis>];
// ...

// SemaDeclAttr.cpp
// ...
ImplicitThisAdjustment AdjustsImplicitThis;
if (!checkFunctionOrMethodParameterIndex(..., ..., AdjustsImplicitThis))
  return;

// ...
D->addAttr(::new (S.Context) NonNullAttr(..., ..., AdjustsImplicitThis));

(I'm using enumerations in both files rather than Boolean values so it's more clear what's happening).

This implies that any attribute accepting a parameter index will have a constructor that requires you to specify whether that instance of the attribute adjusts implicit this or not.

jdenny added inline comments.Feb 16 2018, 10:30 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

What I was envisioning was to move the logic from SemaDeclAttr.cpp into something accessible from the code generated by ClangAttrEmitter.cpp, but that may not be sufficient because the semantic attribute needs that information from the specific Decl, which by definition requires something additional to be passed in from SemaDeclAttr.cpp.

Well, maybe we can do that. ClangAttrEmitter.cpp could generate for the attribute a static member function that massages one index into the form users expect. That function's parameters would be the original index and HasImplicitThis (or maybe the function Decl itself). AllowsImplicitThis would be declared in Attr.td as you want. checkFunctionOrMethodParameterIndex could use that function to compute each index before it examines it, and callers could pass that resulting index to the attribute constructor. It seems slightly messy because users of the attribute must know to call that function on indices before passing them to the constructor, but it means ClangAttrEmitter.cpp contains all the index adjustment logic, and Attr.td declares the property AllowsImplicitThis, which I agree feels like a static property of an attribute.

At this point, I think I'll just write a patch and see what you think. We can adjust if it's still too ugly.

Thanks for bearing with me. :-)

Likewise. It sounds like we're converging.

aaron.ballman added inline comments.Feb 16 2018, 10:32 AM
include/clang/Basic/Attr.td
182 ↗(On Diff #134085)

At this point, I think I'll just write a patch and see what you think. We can adjust if it's still too ugly.

I think that sounds like a good idea.

jdenny updated this revision to Diff 135839.Feb 25 2018, 11:35 AM
jdenny retitled this revision from [Attr] Fix printing of parameter indices in attributes to [Attr] Fix parameter indexing for attributes.
jdenny edited the summary of this revision. (Show Details)

After several attempts at strategies Aaron and I discussed, I ended up going a different way. See the new summary for details. I believe this version addresses all the issues Aaron raised so far.

jdenny marked 8 inline comments as done.Feb 25 2018, 11:38 AM

(Review is incomplete, but here are some initial comments.)

include/clang/AST/Attr.h
210–212 ↗(On Diff #135839)

I think it might be best to mash these together using bit-fields:

unsigned Idx : 30;
unsigned HasThis : 1;
unsigned IsValid : 1;
218 ↗(On Diff #135839)

Add some newlines between the various declarations in the class -- everything is condensed a bit too much without it.

225 ↗(On Diff #135839)

const auto *

227–228 ↗(On Diff #135839)

Might as well move this into the member initializer.

238–243 ↗(On Diff #135839)

Is this necessary? There should be an implicit copy assignment operator for this class, and it's a bit strange to have a copy assignment but no copy constructor.

aaron.ballman added inline comments.Feb 25 2018, 1:00 PM
include/clang/AST/Attr.h
264 ↗(On Diff #135839)

I'd prefer more clear names, like "getSourceIndex()" and "getASTIndex()".

282 ↗(On Diff #135839)

Perhaps getLLVMIndex() for this one.

291–292 ↗(On Diff #135839)

std::iterator was deprecated in C++17, so you should manually expose these fields.

jdenny updated this revision to Diff 136019.Feb 26 2018, 6:16 PM
jdenny marked 8 inline comments as done.
jdenny edited the summary of this revision. (Show Details)

This revision should address all issues raised.

include/clang/AST/Attr.h
210–212 ↗(On Diff #135839)

Good point. Thanks.

238–243 ↗(On Diff #135839)

Oops. Thanks.

291–292 ↗(On Diff #135839)

Now that sizeof(ParamIdx) == sizeof(unsigned), I can no longer find a reason to avoid ParamIdx arrays, and so I can no longer find a good justification for the iterator classes.

jdenny marked 3 inline comments as done.Feb 26 2018, 6:18 PM
jdenny marked 8 inline comments as done.Mar 1 2018, 7:10 AM
jdenny added a comment.Mar 1 2018, 7:41 AM

Hi Aaron. It occurs to me now that this patch has grown rather large and, in some places, a little subtle. Would it help the review if I were to break it up into a patch series that introduces ParamIdx to each attribute, one at a time? I'm not trying to rush you, but I hate for the review to be painful for you if it doesn't have to be.

aaron.ballman added inline comments.Mar 1 2018, 11:09 AM
include/clang/AST/Attr.h
206 ↗(On Diff #136019)

The name here can be improved. How about checkInvariants()? Might as well make this inline while you're at it.

236 ↗(On Diff #136019)

Is this constructor used anywhere? I didn't see it being used, but I could have missed something. If it's not used, go ahead and remove it.

267 ↗(On Diff #136019)

Please assert that Idx won't wrap before doing the return.

276 ↗(On Diff #136019)

Likewise here.

281–284 ↗(On Diff #136019)

Are all of these operations required for the class?

include/clang/Basic/Attr.td
172–174 ↗(On Diff #136019)

I still find the AllowThis parts to be hard to follow, so I want to be sure I understand your design properly. Everything that uses this new argument type sets AllowsThis to 0. As written, it sounds like setting that to 0 means that the parameter index cannot be used on a C++ instance method, and I know that's not the correct interpretation. Under what circumstances would this be set to 1 instead?

Looking at the existing code, the only circumstances under which checkFunctionOrMethodParameterIndex() was being passed true for "allow this" was for XRayLogArgs, which doesn't even use ParamIdx. Perhaps this member isn't necessary any longer?

lib/Sema/SemaChecking.cpp
2622 ↗(On Diff #136019)

const ParamIdx &

10083 ↗(On Diff #136019)

const ParamIdx &

12244 ↗(On Diff #136019)

Hoist the AST index so you don't have to call for it twice. (Same applies elsewhere.)

lib/Sema/SemaDeclAttr.cpp
785–786 ↗(On Diff #136019)

Good catch about this not being needed any longer!

1650–1651 ↗(On Diff #136019)

This change looks to be unrelated to the patch?

4549–4551 ↗(On Diff #136019)

Is this formatting produced by clang-format?

lib/StaticAnalyzer/Checkers/NonNullParamChecker.cpp
61 ↗(On Diff #136019)

const ParamIdx &

Hi Aaron. It occurs to me now that this patch has grown rather large and, in some places, a little subtle. Would it help the review if I were to break it up into a patch series that introduces ParamIdx to each attribute, one at a time? I'm not trying to rush you, but I hate for the review to be painful for you if it doesn't have to be.

No need to do that -- this review just takes a bit more time for me to complete, but it's reasonably well-factored. Thank you, though!

Hi Aaron. It occurs to me now that this patch has grown rather large and, in some places, a little subtle. Would it help the review if I were to break it up into a patch series that introduces ParamIdx to each attribute, one at a time? I'm not trying to rush you, but I hate for the review to be painful for you if it doesn't have to be.

No need to do that -- this review just takes a bit more time for me to complete, but it's reasonably well-factored. Thank you, though!

Sure! Thanks for the review.

include/clang/AST/Attr.h
206 ↗(On Diff #136019)

Sure, I can change the name.

It's inside the class, so specifying inline is against the LLVM coding standards, right?

236 ↗(On Diff #136019)

It's used by the deserialization code generated by ClangAttrEmitter.cpp.

281–284 ↗(On Diff #136019)

operator== and operator< are needed for sorting and finding. It seems strange to me not to finish the set.

include/clang/Basic/Attr.td
172–174 ↗(On Diff #136019)

I still find the AllowThis parts to be hard to follow, so I want to be sure I understand your design properly. Everything that uses this new argument type sets AllowsThis to 0. As written, it sounds like setting that to 0 means that the parameter index cannot be used on a C++ instance method, and I know that's not the correct interpretation.

Right. AllowsThis=0 means it is an error for the attribute in the source code to specify the C++ implicit this parameter (index 1).

Under what circumstances would this be set to 1 instead?

Looking at the existing code, the only circumstances under which checkFunctionOrMethodParameterIndex() was being passed true for "allow this" was for XRayLogArgs, which doesn't even use ParamIdx. Perhaps this member isn't necessary any longer?

Right. I also noticed this issue, but I probably should have mentioned that in a comment in the source instead of just rewording the last paragraph of the patch summary. Sorry.

I thought about removing AllowsThis, but I hesitated because I had already implemented it by the time I noticed this issue and because I assumed there must be some reason why attributes for C++ have index 1 mean the this parameter, so there might be some attribute that could eventually take advantage of AllowsThis=1. Moreover, it makes the related argument to checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp clearer.

I don't feel strongly either way, so I'm happy to remove it or keep it.

lib/Sema/SemaDeclAttr.cpp
1650–1651 ↗(On Diff #136019)

Sorry, I think clang-format suggested that at some point in history, and I overlooked it when reviewing the latest version.

4549–4551 ↗(On Diff #136019)

Yes.

aaron.ballman added inline comments.Mar 1 2018, 12:51 PM
include/clang/AST/Attr.h
206 ↗(On Diff #136019)

Derp, you're correct, it's already implicitly inline. Ignore that part of the suggestion.

236 ↗(On Diff #136019)

That'd explain why I hadn't seen it.

281–284 ↗(On Diff #136019)

I don't think it's actively harmful to do so, but I also don't think it's really needed either. Your call.

include/clang/Basic/Attr.td
172–174 ↗(On Diff #136019)

Right. AllowsThis=0 means it is an error for the attribute in the source code to specify the C++ implicit this parameter (index 1).

Then if we keep this functionality, I think a better identifier would be something like CanIndexImplicitThis and the comments could be updated to more clearly state what is allowed/disallowed. Then the other uses of "allow this" can be updated to use similar terminology for clarity.

so there might be some attribute that could eventually take advantage of AllowsThis=1. Moreover, it makes the related argument to checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp clearer.

My gut feeling is that this functionality isn't going to be needed all that frequently. If we get a second case where we need it, then I'd say it might make more sense to add it.

Attributes that use positional arguments should hopefully be the exception, not the rule, because those indexes are horribly fragile.

What do you think?

lib/Sema/SemaDeclAttr.cpp
4549–4551 ↗(On Diff #136019)

How funky. :-)

jdenny marked 3 inline comments as done.Mar 1 2018, 1:41 PM
jdenny added inline comments.
include/clang/Basic/Attr.td
172–174 ↗(On Diff #136019)

My gut feeling is that this functionality isn't going to be needed all that frequently. If we get a second case where we need it, then I'd say it might make more sense to add it.

Attributes that use positional arguments should hopefully be the exception, not the rule, because those indexes are horribly fragile.

What do you think?

I'm guessing you have more experience with attributes in general, so let's go with your gut. :-)

lib/Sema/SemaDeclAttr.cpp
4549–4551 ↗(On Diff #136019)

Agreed.

jdenny updated this revision to Diff 136624.Mar 1 2018, 4:03 PM
jdenny marked 23 inline comments as done.
jdenny edited the summary of this revision. (Show Details)

This update should address all outstanding comments.

include/clang/AST/Attr.h
206 ↗(On Diff #136019)

The name here can be improved. How about checkInvariants()?

I went with assertComparable because, in my view, this is not so much a class invariant as it is an assertion about correct usage of comparison operators. But I'm not married to it.

jdenny marked an inline comment as done.Mar 1 2018, 4:10 PM
aaron.ballman accepted this revision.Mar 2 2018, 5:00 AM

Aside from two minor nits, this LGTM. Thank you for working on it!

test/Sema/attr-ownership.cpp
1 ↗(On Diff #136624)

Please pass -fsyntax-only as well.

utils/TableGen/ClangAttrEmitter.cpp
763 ↗(On Diff #136624)

Given how oddly this is wrapped, might as well make this a stream argument rather than a string literal concatenation.

This revision is now accepted and ready to land.Mar 2 2018, 5:00 AM
jdenny added inline comments.Mar 2 2018, 8:34 AM
test/Sema/attr-ownership.cpp
1 ↗(On Diff #136624)

Sure. Would you like me to change test/Sema/attr-ownership.c while we're thinking about it?

aaron.ballman added inline comments.Mar 2 2018, 8:37 AM
test/Sema/attr-ownership.cpp
1 ↗(On Diff #136624)

Because it's unrelated to this patch, that can be done in a separate commit (no review required, you can just make the change and commit it).

jdenny added a comment.Mar 2 2018, 9:17 AM

Aaron, thanks for the review. I've applied your suggestions and am ready to commit.

I've noticed a variety of styles in commit logs, I've read the coding standards, and it seems there's a lot of freedom here. Below are the two commit logs I'm planning to use. Please let me know if you have any objections. Thanks.

[Attr] Fix parameter indexing for several attributes

The patch fixes a number of bugs related to parameter indexing in
attributes:

* Parameter indices in some attributes (argument_with_type_tag,
  pointer_with_type_tag, nonnull, ownership_takes, ownership_holds,
  and ownership_returns) are specified in source as one-origin
  including any C++ implicit this parameter, were stored as
  zero-origin excluding any this parameter, and were erroneously
  printing (-ast-print) and confusingly dumping (-ast-dump) as the
  stored values.

* For alloc_size, the C++ implicit this parameter was not subtracted
  correctly in Sema, leading to assert failures or to silent failures
  of __builtin_object_size to compute a value.

* For argument_with_type_tag, pointer_with_type_tag, and
  ownership_returns, the C++ implicit this parameter was not added
  back to parameter indices in some diagnostics.

This patch fixes the above bugs and aims to prevent similar bugs in
the future by introducing careful mechanisms for handling parameter
indices in attributes.  ParamIdx stores a parameter index and is
designed to hide the stored encoding while providing accessors that
require each use (such as printing) to make explicit the encoding that
is needed.  Attribute declarations declare parameter index arguments
as [Variadic]ParamIdxArgument, which are exposed as ParamIdx[*].  This
patch rewrites all attribute arguments that are processed by
checkFunctionOrMethodParameterIndex in SemaDeclAttr.cpp to be declared
as [Variadic]ParamIdxArgument.  The only exception is xray_log_args's
argument, which is encoded as a count not an index.

Differential Revision: https://reviews.llvm.org/D43248

For the test/Sema/attr-ownership.c change:

[Attr] Use -fsyntax-only in test

Suggested at: https://reviews.llvm.org/D43248

Aaron, thanks for the review. I've applied your suggestions and am ready to commit.

You're correct that we have a lot of freedom with the commit message, but the important piece is in describing what's changed and why (if it's not immediately obvious). Both commit messages look good to me.

This revision was automatically updated to reflect the committed changes.
jdenny marked 4 inline comments as done.Mar 2 2018, 11:08 AM
jdenny updated this revision to Diff 137964.Mar 11 2018, 5:42 PM

This commit was reverted by r326862 due to:

https://bugs.llvm.org/show_bug.cgi?id=36620

This revision includes a new test case and a fix.

While the difference from the last revision is small, it's not trivial, so another review is probably worthwhile. The main difference is two new ParamIdx member functions: serialize and deserialize. In ClangAttrEmitter.cpp, these functions enable significant simplifications and facilitate the bug fix.

jdenny reopened this revision.Mar 11 2018, 5:42 PM
This revision is now accepted and ready to land.Mar 11 2018, 5:42 PM
jdenny requested review of this revision.Mar 11 2018, 5:43 PM

Can you regenerate the patch using the same paths as https://reviews.llvm.org/D43248?id=136811? When I try to do a diff between what was accepted & committed and the current patch, Phabricator gets confused because the paths are too different from one another.

jdenny updated this revision to Diff 138111.Mar 12 2018, 5:15 PM

OK, this diff has the svn paths, and I've rebased to a more recent master.

jdenny updated this revision to Diff 138113.Mar 12 2018, 5:20 PM

Well, that didn't work. Here's another attempt at getting the paths right.

It seems like there are some other changes than just the serialize and deserialize that I'm not opposed to, but am wondering why they're needed. It seems some functions are now getFoo() calls and it seems like some declarations moved around. Are those intended as part of this patch?

cfe/trunk/test/Frontend/ast-attr.cpp
1–2 ↗(On Diff #138113)

Can you move this below the RUN line?

cfe/trunk/test/Sema/attr-print.cpp
62

Can you move this comment below the RUN line?

It seems like there are some other changes than just the serialize and deserialize that I'm not opposed to, but am wondering why they're needed. It seems some functions are now getFoo() calls

These were originally named getFoo, and my previous patch changed them to foo. I believe I did that to make ParamIdxArgument accessors more like VariadicParamIdxArgument accessors (which inherits accessors from VariadicArgument), but I probably shouldn't have done that. In any case, this new revision implements ParamIdxArgument using SimpleArgument, and that names accessors like getFoo.

and it seems like some declarations moved around. Are those intended as part of this patch?

Are you referring to the changes in SemaDeclAttr.cpp? Those changes are needed because the ParamIdx constructor now asserts that Idx is one-origin, but that requires validating that it's actually one-origin beforehand. Sorry, I should've mentioned the new asserts.

jdenny added inline comments.Mar 12 2018, 6:09 PM
cfe/trunk/test/Frontend/ast-attr.cpp
1–2 ↗(On Diff #138113)

Sure. I'm still trying to learn the LLVM coding standards. Is this specified there?

It seems like there are some other changes than just the serialize and deserialize that I'm not opposed to, but am wondering why they're needed. It seems some functions are now getFoo() calls

These were originally named getFoo, and my previous patch changed them to foo. I believe I did that to make ParamIdxArgument accessors more like VariadicParamIdxArgument accessors (which inherits accessors from VariadicArgument), but I probably shouldn't have done that. In any case, this new revision implements ParamIdxArgument using SimpleArgument, and that names accessors like getFoo.

Ahhh, thank you for the explanation, that makes sense.

and it seems like some declarations moved around. Are those intended as part of this patch?

Are you referring to the changes in SemaDeclAttr.cpp? Those changes are needed because the ParamIdx constructor now asserts that Idx is one-origin, but that requires validating that it's actually one-origin beforehand. Sorry, I should've mentioned the new asserts.

Ah, okay, thank you!

cfe/trunk/test/Frontend/ast-attr.cpp
1–2 ↗(On Diff #138113)

Nope! I'm just used to looking at the very first line of the test to know what it's running, and that seems consistent with other tests.

jdenny added inline comments.Mar 12 2018, 6:39 PM
cfe/trunk/test/Frontend/ast-attr.cpp
1–2 ↗(On Diff #138113)

OK, that makes sense. I'll be sure to change both of those.

aaron.ballman added inline comments.Mar 12 2018, 6:46 PM
cfe/trunk/test/Frontend/ast-attr.cpp
5 ↗(On Diff #138113)

Just to verify my understanding, this test is checking the serialization and deserialization?

jdenny added inline comments.Mar 12 2018, 6:59 PM
cfe/trunk/test/Frontend/ast-attr.cpp
5 ↗(On Diff #138113)

That's right, and it failed without the other changes in this revision because alloc_size's second argument serialized/deserialized as 0 when it was actually invalid (that is, unspecified). Of course, this test is more thorough than just exercising alloc_size.

aaron.ballman accepted this revision.Mar 12 2018, 7:02 PM

LGTM with the test comments fixed up.

cfe/trunk/test/Frontend/ast-attr.cpp
5 ↗(On Diff #138113)

Perfect, thank you!

This revision is now accepted and ready to land.Mar 12 2018, 7:02 PM

LGTM with the test comments fixed up.

Thanks! I'll commit tomorrow.

This revision was automatically updated to reflect the committed changes.
jdenny marked 8 inline comments as done.
echristo added inline comments.
test/Sema/attr-print.cpp
3 ↗(On Diff #138193)

Relatedly I don't think we use files as input files to other directories and I don't think we should really. What are you trying to test here? This breaks the hermeticness of each particular test directory.

jdenny added inline comments.Mar 13 2018, 1:25 PM
test/Sema/attr-print.cpp
3 ↗(On Diff #138193)

Grep for "%S/.." and you'll see that a few other tests do something like this as well.

In test/Sema/attr-print.cpp, I am testing printing of attributes. I chose to put that there because of the existing attr-print.c there.

In test/Frontend/ast-attr.cpp, I am testing serialization of attributes. I chose to put that there because I see other -emit-ast tests there and because, if I put it in test/Sema instead, I get the complaint "Do not use the driver in Sema tests".

The same C++ code makes sense in both of these, but replicating it in two files would worsen maintainability.

I could try to combine into one file in, say, tests/Misc if that works.

I have no strong opinions on where these live. Just for my own education, is something real breaking right now because of their current locations?

Thanks.

echristo added inline comments.Mar 13 2018, 1:56 PM
test/Sema/attr-print.cpp
3 ↗(On Diff #138193)

Basically it's breaking an external build system (bazel) that has fairly distinct and specific dependency requirements and so layering and other dependencies are pretty entertaining.

Right now we avoid testing those particular tests and have TODOs of fixing them and the requirements to use a single directory and I was trying to avoid one more here.

All of that said I totally understand the desire to keep the maintenance burden minimized and an external build system shouldn't affect whether or not we do one particular thing or another - I was trying to get it written so that we could enable it without much undue burden. I'm uncertain if a comment of:

// If you change this file you should also change blah file.

or moving it to another directory where you can do both tests at the same time.

I'd prefer to move it than to expect people to obey such a comment. Let's see what Aaron says.

I'd prefer to move it than to expect people to obey such a comment. Let's see what Aaron says.

I have a slight preference for moving the tests now that I know they're causing problems, unless that turns out to be overly onerous for some reason.

Thank you, @echristo for pointing out the issues with the tests, I hadn't considered that.

I'd prefer to move it than to expect people to obey such a comment. Let's see what Aaron says.

I have a slight preference for moving the tests now that I know they're causing problems, unless that turns out to be overly onerous for some reason.

Thank you, @echristo for pointing out the issues with the tests, I hadn't considered that.

No worries :)

That said, please don't revert and reapply to move. Just a followup commit is just fine - and whenever you get a chance. (Though if you let me know when you do I'd appreciate it :)

I'd prefer to move it than to expect people to obey such a comment. Let's see what Aaron says.

I have a slight preference for moving the tests now that I know they're causing problems, unless that turns out to be overly onerous for some reason.

Thank you, @echristo for pointing out the issues with the tests, I hadn't considered that.

No worries :)

That said, please don't revert and reapply to move. Just a followup commit is just fine - and whenever you get a chance. (Though if you let me know when you do I'd appreciate it :)

Sure.

So, I'm planning to remove test/Frontend/ast-attr.cpp, rename test/Sema/attr-print.cpp to test/Misc/attr-print-emit.cpp, and change its run lines to:

// RUN: %clang_cc1 %s -ast-print | FileCheck %s
// RUN: %clang -emit-ast -o %t.ast %s
// RUN: %clang_cc1 %t.ast -ast-print | FileCheck %s

That seems to work fine locally. I'll just leave the old test/Sema/attr-print.c alone as it's not part of this problem. Any objections?

So, I'm planning to remove test/Frontend/ast-attr.cpp, rename test/Sema/attr-print.cpp to test/Misc/attr-print-emit.cpp, and change its run lines to:

// RUN: %clang_cc1 %s -ast-print | FileCheck %s
// RUN: %clang -emit-ast -o %t.ast %s
// RUN: %clang_cc1 %t.ast -ast-print | FileCheck %s

That seems to work fine locally. I'll just leave the old test/Sema/attr-print.c alone as it's not part of this problem. Any objections?

I went ahead and committed. It's in r327456.

jdenny marked 3 inline comments as done.Mar 13 2018, 3:26 PM