This is an archive of the discontinued LLVM Phabricator instance.

[-Wcalled-once-parameter] Introduce 'called_once' attribute
ClosedPublic

Authored by vsavchenko on Nov 24 2020, 8:39 AM.

Details

Summary

This commit introduces a new attribute called_once.
It can be applied to function-like parameters to signify that
this parameter should be called exactly once. This concept
is particularly widespread in asynchronous programs.

Additionally, this commit introduce a new group of dataflow
analysis-based warnings to check this property. It identifies
and reports the following situations:

  • parameter is called twice
  • parameter is never called
  • parameter is not called on one of the paths

Current implementation can also automatically infer called_once
attribute for completion handler paramaters that should follow the
same principle by convention. This behavior is OFF by default and
can be turned on by using -Wcompletion-handler.

Diff Detail

Event Timeline

vsavchenko created this revision.Nov 24 2020, 8:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2020, 8:39 AM
vsavchenko requested review of this revision.Nov 24 2020, 8:39 AM

Probably irrelevant comment from the C++ world: If I needed this concept in C++, I'd probably piggyback on the existing semantic analysis of std::move: I'd design a class OnceCallable with an operator() && that both called and nulled-out its object parameter, and then call it like std::move(myCallback)(). Then if I ever tried to call it a second time, the static analyzer would (hopefully) complain that I was accessing a moved-from myCallback. But you want this for existing APIs that take the built-in block type, so never mind. :)

Probably irrelevant scope creep: Is it worth trying to support the possibility that someone might want to pass both a successHandler and a failureHandler, with the invariant that we'd make exactly one call to one of them (but never call both)? Is that just not a thing that ever happens in NSWhatever?

clang/lib/Analysis/CalledOnceCheck.cpp
1224

s/paramater/parameter/

Probably irrelevant comment from the C++ world: If I needed this concept in C++, I'd probably piggyback on the existing semantic analysis of std::move: I'd design a class OnceCallable with an operator() && that both called and nulled-out its object parameter, and then call it like std::move(myCallback)(). Then if I ever tried to call it a second time, the static analyzer would (hopefully) complain that I was accessing a moved-from myCallback. But you want this for existing APIs that take the built-in block type, so never mind. :)

That's neat! But yeah, it's designed for existing Obj-C APIs. However, I do want to add this attribute for C++, but it's way more things to support (lambda, call operators, pointers to member functions and so on).

Probably irrelevant scope creep: Is it worth trying to support the possibility that someone might want to pass both a successHandler and a failureHandler, with the invariant that we'd make exactly one call to one of them (but never call both)? Is that just not a thing that ever happens in NSWhatever?

This is a somewhat usual thing, actually. However, the naming convention is pretty strong, so parameter in addSuccessHandler and addFailureHandler won't be named completionHandler. In this case, the analysis won't consider those as calls, but rather as escapes and won't report when it sees two of those.

Fix tests and a typo

vsavchenko marked an inline comment as done.Nov 24 2020, 10:35 AM
NoQ added a comment.Nov 24 2020, 6:49 PM

A few random comments here and there as i slowly wrap my head around the overall analysis algorithm.

clang/lib/Analysis/CalledOnceCheck.cpp
48

Around this point I would have given up and declared using namespace llvm;.

77

Maybe static_assert at least some of those?

136

Let's say this out loud: The state is defined as a vector of parameter status values.

253

Every time I see such code, I want the behavior with respect to implicit lvalue-to-rvalue casts considered or at least explained explicitly. Because they're the ones that make the expression suddenly represent a completely unrelated value and in this regard they're very different from all other kinds of casts. Like, they're not some immaterial bureaucratic clutter to satisfy the type system, they're a huge transformation, basically one of the most important operations in the entire language, an entire freakin' memory access! So i'm curious how exactly do you want your code to behave in this specific case and i want you to comment loudly about that.

This is also at least the 3rd implementation of this functionality i've seen in Clang (and there are probably a lot more that i haven't seen) but it sounds like nobody's willing to define what they actually want when they do this precisely enough to figure out if they're all doing the same thing or a different thing. Everybody's like "Oh I just want some DeclRefExpr to point to, I don't care what role does it play in the expression" but it occasionally turns out that they do in fact care a lot. So, like, i don't insist on doing this right now but it'd be great to finally figure out what causes people to write such code and whether they have something in common.

422

Why not? What if the programmer was a big fan of shell scripts?

427

Also there's no test for this, right? Otherwise you would have come up with a better message and this case would have been unnecessary.

Add more comments and fix another test.

vsavchenko marked 2 inline comments as done.Nov 25 2020, 3:21 AM
vsavchenko added inline comments.
clang/lib/Analysis/CalledOnceCheck.cpp
77

That would've been awesome, and I spent some time on figuring out what would be a nice way of doing that. However, what hurts the most to put at least few of those, is the fact that operator | is not declared as constexpr.

422

The only situation where this can be useful if the user has something like this:

if (cond1 && cond2 && [obj methodWithCompletionHandler:completionHandler]) { ... }

which is way less common (I didn't see that) than a very simple:

if (cond1 && cond2 && ...) {
  completionHandler(...);
}

And it is pretty frustrating to see N warnings on the same if statement in the latter case. That's why I decided to ignore them for reporting purposes.

427

Exactly!

vsavchenko marked an inline comment as done.

Add another comment

ravikandhadai added inline comments.Nov 30 2020, 12:58 PM
clang/lib/Analysis/CalledOnceCheck.cpp
890

typo: th -> the

dexonsmith resigned from this revision.Nov 30 2020, 5:15 PM

@aaron.ballman Can you please take a look at the attribute side of this?

@aaron.ballman Can you please take a look at the attribute side of this?

Thank you for your patience, sorry it took me a while to get to this!

clang/include/clang/Basic/Attr.td
1805

No new undocumented attributes, please.

clang/lib/Sema/SemaDeclAttr.cpp
3705

I don't think there's a need for this check -- attaching the attribute to an invalid declaration doesn't hurt anything downstream (I believe), but failing to attach the attribute harms AST fidelity for the folks doing work on retaining erroneous AST nodes.

3712

Because there can be multiple parameters in the declaration, passing the range to the specific problematic parameter is helpful.

clang/test/SemaObjC/warn-called-once.m
888

Can you also add a test that shows this works with lambda or block parameters that contain the attribute? e.g.,

[](void (*fp CALLED_ONCE)()) { ... }(some_fp);

Also, you should add some tests that the attribute diagnoses when applied to something other than a parameter, is given an argument, and a test that the attribute is rejected unless in ObjC.

vsavchenko marked an inline comment as done.Dec 11 2020, 8:24 AM
vsavchenko added inline comments.
clang/include/clang/Basic/Attr.td
1805

Sure, will do!

clang/test/SemaObjC/warn-called-once.m
888

There is a test here called block_with_called_once that covers that.

vsavchenko added inline comments.Dec 11 2020, 8:24 AM
clang/lib/Sema/SemaDeclAttr.cpp
3705

Sure, simply did it like it's done for some other attribute for parameters.

3712

I was just thinking that because attribute applies to each parameter separately and we already show this error at the location of the attribute, it is clear which parameter is implied.

vsavchenko marked 4 inline comments as done.

Add documentation for new attribute and fix review remarks

Refine conventions for completion handlers

Fix minor things

Introduce a suppression for double call warning by nullifying the parameter.

aaron.ballman added inline comments.Dec 15 2020, 1:26 PM
clang/include/clang/Basic/AttrDocs.td
5174
5177–5181

I think you may have to remove the newlines here to make this a single bulleted list in Sphinx.

5189

I think it'd be helpful to show an example where the attribute is used correctly and another one where the parameter is diagnosed.

I think it'd also be helpful to explain why someone should write this attribute on their own declaration. As it stands, it sort of sounds like the attribute is only useful to the author of a method to help double-check that they've written their method properly.

clang/lib/Sema/SemaDeclAttr.cpp
3712

That's a good point; I think it's fine as-is. I think I was thinking about this as a function attribute by accident. :-)

Turn off conventional check heuristic

Add more info to the docs

Is the attribute considered to be a property of the parameter or a property of the function the parameter is declared in? e.g.,

void someOtherFunc(void (^cb)(void)) {
  if (something())
    cb();
}

void barWithCallback(void (^callback)(void) __attribute__((called_once))) {
  someOtherFunc(callback);
}

Should code like that also diagnose given that callback is not called on every code path? From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter. This in turn makes me wonder whether a better design is for -Wcompletion-handler to diagnose any function with a completion handler-like parameter if the parameter is not called exactly once within the function, and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked (or does this have too many false positives in practice)? Alternatively, should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

clang/include/clang/Basic/AttrDocs.td
5212–5213

Oh, I was thinking this attribute was enabling some optimization opportunities or doing something more than just acting as a marker to please check this function's correctness for some properties. This means that the programmer has to annotate their code and enable the warning flag in order for either the attribute or the warning flag to have effect, which feels a bit surprising to me.

Is the attribute considered to be a property of the parameter or a property of the function the parameter is declared in? e.g.,

void someOtherFunc(void (^cb)(void)) {
  if (something())
    cb();
}

void barWithCallback(void (^callback)(void) __attribute__((called_once))) {
  someOtherFunc(callback);
}

Should code like that also diagnose given that callback is not called on every code path? From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter. This in turn makes me wonder whether a better design is for -Wcompletion-handler to diagnose any function with a completion handler-like parameter if the parameter is not called exactly once within the function, and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked (or does this have too many false positives in practice)? Alternatively, should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

That was a tough choice whether we should do it in the analyzer or not.
The analyzer check would've been easier in terms of existing infrastructure, path sensitivity, and IPA. It is also much more lenient in terms of false positives (it is expected that the analyzer can have those).
However, warnings have much broader audience and we aim for a higher numbers of people to check their code (we've been asked to deliver it for 8 years). And because the analyzer doesn't have cross-translation unit analysis, the problem with inter procedural bugs appears as well.

From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter.

I have to disagree here. Semantically this is a property of a parameter and the fact that we couldn't analyze inter-procedurally is our own problem and implementation detail. We don't want to bother users with this. As far as I understand, the majority of existing warnings are not complete (don't find every bug) and this warning is not different.

and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked

Essentially, there is a way to prevent function from being analyzed by -Wcompletion-handler - annotate it with __attribute__((swift_async(none))).

clang/include/clang/Basic/AttrDocs.td
5177–5181

I checked it, it is one list with newlines:

5212–5213

For explicitly marked parameters, it's on by default, so the users can simply annotate their parameters and get their checks.

That was a tough choice whether we should do it in the analyzer or not.
The analyzer check would've been easier in terms of existing infrastructure, path sensitivity, and IPA. It is also much more lenient in terms of false positives (it is expected that the analyzer can have those).
However, warnings have much broader audience and we aim for a higher numbers of people to check their code (we've been asked to deliver it for 8 years). And because the analyzer doesn't have cross-translation unit analysis, the problem with inter procedural bugs appears as well.

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter.

I have to disagree here. Semantically this is a property of a parameter and the fact that we couldn't analyze inter-procedurally is our own problem and implementation detail.

Okay, that's good to know that you think the semantics should follow the parameter rather than the function. I agree that the analysis part is our own problem, I'm mostly trying to make sure we add this functionality to the correct place within the toolchain.

We don't want to bother users with this. As far as I understand, the majority of existing warnings are not complete (don't find every bug) and this warning is not different.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked

Essentially, there is a way to prevent function from being analyzed by -Wcompletion-handler - annotate it with __attribute__((swift_async(none))).

Ah, that's good to know, thanks!

clang/include/clang/Basic/AttrDocs.td
5177–5181

Oh, good to know, thank you!

5212–5213

Ah, yes, there are two diagnostics being used and only one of them is off by default. Thank you for clarifying.

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

I'm a bit less optimistic on that front, the whole model of how we do the analysis should be redesigned in order to get true cross-TU inter-procedural analysis. Additionally, some checkers when essentially inter-procedural rely on function conventions to assume behavior of called functions, the same way it is done with this analysis.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

As I noted above, in some sense it is already inter-procedural because usually called functions follow completion handler conventions as well.
So, we can actually make it inter-procedural if all involved functions follow conventions. And we can warn people if they leak called_once parameter into a function that doesn't specify what it will do with that parameter.

It was a decision not to do this and to be more lenient in the first version. You can see that in a lot of cases during the analysis we assume that the user knows what they are doing. And even in this model the number of reported warnings is pretty high.
We hope that this warning will help to change those cases and, maybe in the future, we can harden the analysis to be more demanding.

clang/lib/Analysis/CalledOnceCheck.cpp
84

Detect all conventions for captured parameters as well

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

I'm a bit less optimistic on that front, the whole model of how we do the analysis should be redesigned in order to get true cross-TU inter-procedural analysis. Additionally, some checkers when essentially inter-procedural rely on function conventions to assume behavior of called functions, the same way it is done with this analysis.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

As I noted above, in some sense it is already inter-procedural because usually called functions follow completion handler conventions as well.

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

So, we can actually make it inter-procedural if all involved functions follow conventions.
And we can warn people if they leak called_once parameter into a function that doesn't specify what it will do with that parameter.

It was a decision not to do this and to be more lenient in the first version. You can see that in a lot of cases during the analysis we assume that the user knows what they are doing. And even in this model the number of reported warnings is pretty high.
We hope that this warning will help to change those cases and, maybe in the future, we can harden the analysis to be more demanding.

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

I mean, of course, it doesn't have real inter-procedural analysis. But if everything is annotated or conventional - it is identical to inter-procedural case (like in tests with indirect in their names).

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

No, we don't plan on moving it to the CSA. As I mentioned above, for hardening we can simply warn if called_once parameters leak into functions we don't know about (not annotated or not following conventions).

Add one gigantic comment on status kinds and the reasons behind some of the design choices

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

I mean, of course, it doesn't have real inter-procedural analysis. But if everything is annotated or conventional - it is identical to inter-procedural case (like in tests with indirect in their names).

Okay, I'm glad to know I wasn't misunderstanding something there!

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

No, we don't plan on moving it to the CSA. As I mentioned above, for hardening we can simply warn if called_once parameters leak into functions we don't know about (not annotated or not following conventions).

Okay, thank you for verifying. The attribute bits of the patch LG to me (aside from a minor testing request). Someone else should validate the actual analysis bits.

clang/test/SemaObjC/attr-called-once.m
8

Can you also add tests that the attribute accepts no arguments and only appertains to parameters?

vsavchenko added inline comments.Dec 18 2020, 7:48 AM
clang/test/SemaObjC/attr-called-once.m
8

Sure!

vsavchenko marked an inline comment as done.

Add more test cases

NoQ added a comment.Dec 22 2020, 10:13 PM

Add one gigantic comment on status kinds and the reasons behind some of the design choices

This was freakin' awesome, thanks a lot. With all the background in place, the rest of the patch was a really nice read. I expected it to be a lot more hacky but it turned out very sane to be honest. I only have minor remarks here and there.

clang/include/clang/Analysis/Analyses/CalledOnceCheck.h
33

Hold up, how can this happen at all without path sensitivity? If the loop is not entered then literally nothing happens, so there can't be a call. If there's no call in either case then probably it's just "there's simply no call at all"?

clang/lib/Analysis/CalledOnceCheck.cpp
68
84–85

I think there's nothing wrong with / shameful about escaping, so i wouldn't call the world without escapes a perfect world, it's unnecessarily constrained :)

164

"Can be called" sounds like "Someone's allowed to call it". I was completely confused when i saw a call site for that function.

I too am at loss because the concept of "is definitely potentially called" is hard to explain especially when the concept of "maybe called" is already defined and means something different.

Maybe seenAnyCalls() or something like that?

Maybe also rename DefinitelyCalled and MaybeCalled to CalledOnAllPaths and CalledOnSomePathsOnly? That could eliminate a bit more confusion.

305–308

Technically the same applies to operator &, eg. (&*foo)() is your normal call. Obviously, (&foo)() doesn't compile. That said, if (&foo) compiles and is not your normal null check (but if (&*foo) and if (*foo) are).

322–323

You're potentially dropping some DeclRefExprs. Also the ones that you find aren't necessarily the function pointer variables you're looking for (as you don't actually have any checks in VisitDeclRefExpr). Combined, i suspect that there may be some exotic counterexamples like

if (!foo) {
  // "foo" is found
  if (foo == completion_handler) {
    // ...
  }
}

(which is probably ok anyway)

542–543

This is one of the more subtle facilities, i really want a comment here a lot more than on the NamesCollector. Like, which blocks do you feed into this class? Do i understand correctly that Parent is the block at which we decided to emit the warning? And SuccInQuestion is its successor on which there was no call? Or on which there was a call? I can probably ultimately figure this out if i read all the code but i would be much happier if there was a comment.

561

Why not FunctionCFG(AC->getCFG()) and omit the CFG parameter? That's the same function, right?

781

I'm super used to V and E for Vertices and Edges in the big-O-notation for graph algorithms, is it just me?

823

Did you consider AnyCall? That's a universal wrapper for all kinds of AST calls for exactly these cases. It's not super compile-time but it adds a lot of convenience. (Also uh-oh, its doxygen page seems to be broken). It's ok if you find it unsuitable but i kind of still want to popularize it.

1125
1133

So the contract of this function is "We haven't seen any actual calls, which is why we're wondering", otherwise it's not expected to do what it says it does?

1248

(:

1311

"Successor is has status" doesn't sound right, maybe anySuccessorHasStatus?

1443

I feel really bad bringing this up in this context but i guess it kind of makes sense to canonicalize the type first?

vsavchenko added inline comments.Dec 23 2020, 5:29 AM
clang/include/clang/Analysis/Analyses/CalledOnceCheck.h
33

You can see examples in tests 😉

clang/lib/Analysis/CalledOnceCheck.cpp
542–543

These are correct questions and I will add a comment, but I want to point out that these questions should be asked not about a private constructor (I always consider those as "don't go there girlfriend!"). The only "entry point" to this class,clarify, has more reasonable parameter names.

561

Right, I thought about it! It is simply how it was done in some other analysis-based warning

781

Yep, it is a rudiment of the previous version of this comment. I'll change it!

823

It doesn't seem to have iteration over arguments.

1125

Oh, didn't notice that one!

1133

Right!

1248

So true 😄

1311

Dern it! I noticed that before, but forgot to change it, thanks!

1443

Will do

vsavchenko marked 13 inline comments as done.

Correct some comments and names

clang/lib/Analysis/CalledOnceCheck.cpp
164

DefinitelyCalled is already a mouthful, so I prefer to keep it together with MaybeCalled. Also I agree about canBeCalled and I like your alternative!

305–308

I mean, we can add case UO_AddrOf:, but I don't think that we need more complex logic (in terms of * and &) here.

322–323

Yep, I thought about it and, honestly, decided to ignore 😊

NoQ accepted this revision.Dec 24 2020, 12:07 AM

Amazing, thank you. I'm happy with the analysis and i have nothing more to say really :)

clang/lib/Analysis/CalledOnceCheck.cpp
823

If you see some actual benefits and that's the only thing that's holding you back, you should add it. Or is it hard to add for whatever reason?

This revision is now accepted and ready to land.Dec 24 2020, 12:07 AM
vsavchenko added inline comments.Jan 5 2021, 7:05 AM
clang/lib/Analysis/CalledOnceCheck.cpp
823

As of now, I prefer not to pay extra runtime costs while I can. When I do the C++ part of it, I'll consider it :)

This revision was landed with ongoing or failed builds.Jan 5 2021, 7:28 AM
This revision was automatically updated to reflect the committed changes.