Page MenuHomePhabricator

[-Wcalled-once-parameter] Introduce 'called_once' attribute
ClosedPublic

Authored by vsavchenko on Nov 24 2020, 8:39 AM.

Details

Summary

This commit introduces a new attribute called_once.
It can be applied to function-like parameters to signify that
this parameter should be called exactly once. This concept
is particularly widespread in asynchronous programs.

Additionally, this commit introduce a new group of dataflow
analysis-based warnings to check this property. It identifies
and reports the following situations:

  • parameter is called twice
  • parameter is never called
  • parameter is not called on one of the paths

Current implementation can also automatically infer called_once
attribute for completion handler paramaters that should follow the
same principle by convention. This behavior is OFF by default and
can be turned on by using -Wcompletion-handler.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vsavchenko added inline comments.Dec 11 2020, 8:24 AM
clang/lib/Sema/SemaDeclAttr.cpp
3705

Sure, simply did it like it's done for some other attribute for parameters.

3712

I was just thinking that because attribute applies to each parameter separately and we already show this error at the location of the attribute, it is clear which parameter is implied.

vsavchenko marked 4 inline comments as done.

Add documentation for new attribute and fix review remarks

Refine conventions for completion handlers

Fix minor things

Introduce a suppression for double call warning by nullifying the parameter.

aaron.ballman added inline comments.Dec 15 2020, 1:26 PM
clang/include/clang/Basic/AttrDocs.td
5174
5177–5181

I think you may have to remove the newlines here to make this a single bulleted list in Sphinx.

5189

I think it'd be helpful to show an example where the attribute is used correctly and another one where the parameter is diagnosed.

I think it'd also be helpful to explain why someone should write this attribute on their own declaration. As it stands, it sort of sounds like the attribute is only useful to the author of a method to help double-check that they've written their method properly.

clang/lib/Sema/SemaDeclAttr.cpp
3712

That's a good point; I think it's fine as-is. I think I was thinking about this as a function attribute by accident. :-)

Turn off conventional check heuristic

Add more info to the docs

Is the attribute considered to be a property of the parameter or a property of the function the parameter is declared in? e.g.,

void someOtherFunc(void (^cb)(void)) {
  if (something())
    cb();
}

void barWithCallback(void (^callback)(void) __attribute__((called_once))) {
  someOtherFunc(callback);
}

Should code like that also diagnose given that callback is not called on every code path? From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter. This in turn makes me wonder whether a better design is for -Wcompletion-handler to diagnose any function with a completion handler-like parameter if the parameter is not called exactly once within the function, and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked (or does this have too many false positives in practice)? Alternatively, should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

clang/include/clang/Basic/AttrDocs.td
5212–5213

Oh, I was thinking this attribute was enabling some optimization opportunities or doing something more than just acting as a marker to please check this function's correctness for some properties. This means that the programmer has to annotate their code and enable the warning flag in order for either the attribute or the warning flag to have effect, which feels a bit surprising to me.

Is the attribute considered to be a property of the parameter or a property of the function the parameter is declared in? e.g.,

void someOtherFunc(void (^cb)(void)) {
  if (something())
    cb();
}

void barWithCallback(void (^callback)(void) __attribute__((called_once))) {
  someOtherFunc(callback);
}

Should code like that also diagnose given that callback is not called on every code path? From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter. This in turn makes me wonder whether a better design is for -Wcompletion-handler to diagnose any function with a completion handler-like parameter if the parameter is not called exactly once within the function, and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked (or does this have too many false positives in practice)? Alternatively, should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

should the check be implemented as a clang static analyzer check that tracks an annotated parameter across the inter-procedural CFG and diagnoses such code?

That was a tough choice whether we should do it in the analyzer or not.
The analyzer check would've been easier in terms of existing infrastructure, path sensitivity, and IPA. It is also much more lenient in terms of false positives (it is expected that the analyzer can have those).
However, warnings have much broader audience and we aim for a higher numbers of people to check their code (we've been asked to deliver it for 8 years). And because the analyzer doesn't have cross-translation unit analysis, the problem with inter procedural bugs appears as well.

From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter.

I have to disagree here. Semantically this is a property of a parameter and the fact that we couldn't analyze inter-procedurally is our own problem and implementation detail. We don't want to bother users with this. As far as I understand, the majority of existing warnings are not complete (don't find every bug) and this warning is not different.

and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked

Essentially, there is a way to prevent function from being analyzed by -Wcompletion-handler - annotate it with __attribute__((swift_async(none))).

clang/include/clang/Basic/AttrDocs.td
5177–5181

I checked it, it is one list with newlines:

5212–5213

For explicitly marked parameters, it's on by default, so the users can simply annotate their parameters and get their checks.

That was a tough choice whether we should do it in the analyzer or not.
The analyzer check would've been easier in terms of existing infrastructure, path sensitivity, and IPA. It is also much more lenient in terms of false positives (it is expected that the analyzer can have those).
However, warnings have much broader audience and we aim for a higher numbers of people to check their code (we've been asked to deliver it for 8 years). And because the analyzer doesn't have cross-translation unit analysis, the problem with inter procedural bugs appears as well.

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

From the test cases you have, it looks like you explicitly allow this code without diagnosing it, which suggests the attribute is really a property of the function and not a property of the parameter.

I have to disagree here. Semantically this is a property of a parameter and the fact that we couldn't analyze inter-procedurally is our own problem and implementation detail.

Okay, that's good to know that you think the semantics should follow the parameter rather than the function. I agree that the analysis part is our own problem, I'm mostly trying to make sure we add this functionality to the correct place within the toolchain.

We don't want to bother users with this. As far as I understand, the majority of existing warnings are not complete (don't find every bug) and this warning is not different.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

and the user marks the functions that should be exempt from the checking rather than marking the parameters that should be checked

Essentially, there is a way to prevent function from being analyzed by -Wcompletion-handler - annotate it with __attribute__((swift_async(none))).

Ah, that's good to know, thanks!

clang/include/clang/Basic/AttrDocs.td
5177–5181

Oh, good to know, thank you!

5212–5213

Ah, yes, there are two diagnostics being used and only one of them is off by default. Thank you for clarifying.

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

I'm a bit less optimistic on that front, the whole model of how we do the analysis should be redesigned in order to get true cross-TU inter-procedural analysis. Additionally, some checkers when essentially inter-procedural rely on function conventions to assume behavior of called functions, the same way it is done with this analysis.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

As I noted above, in some sense it is already inter-procedural because usually called functions follow completion handler conventions as well.
So, we can actually make it inter-procedural if all involved functions follow conventions. And we can warn people if they leak called_once parameter into a function that doesn't specify what it will do with that parameter.

It was a decision not to do this and to be more lenient in the first version. You can see that in a lot of cases during the analysis we assume that the user knows what they are doing. And even in this model the number of reported warnings is pretty high.
We hope that this warning will help to change those cases and, maybe in the future, we can harden the analysis to be more demanding.

clang/lib/Analysis/CalledOnceCheck.cpp
84

Detect all conventions for captured parameters as well

There have been efforts to add cross-TU support to the static analyzer, so I'm less worried about cross-TU inter-procedural bugs over the long term as I would expect that situation to be improved.

I'm a bit less optimistic on that front, the whole model of how we do the analysis should be redesigned in order to get true cross-TU inter-procedural analysis. Additionally, some checkers when essentially inter-procedural rely on function conventions to assume behavior of called functions, the same way it is done with this analysis.

I think my concern is somewhat different. If the goal is for the semantics to follow the parameter (which I also think is the correct behavior), then I think adding this as a frontend analysis is the incorrect place for the check to go because someday we're going to want the inter-procedural analysis and that will require us to figure out what to do with the frontend bits vs the static analyzer bits. If it starts off in the static analyzer, then the later improvements to the analysis capabilities don't require the user to change the way they interact with the toolchain.

Or do you expect that this analysis is sufficient and you don't expect to ever extend it to be inter-procedural?

As I noted above, in some sense it is already inter-procedural because usually called functions follow completion handler conventions as well.

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

So, we can actually make it inter-procedural if all involved functions follow conventions.
And we can warn people if they leak called_once parameter into a function that doesn't specify what it will do with that parameter.

It was a decision not to do this and to be more lenient in the first version. You can see that in a lot of cases during the analysis we assume that the user knows what they are doing. And even in this model the number of reported warnings is pretty high.
We hope that this warning will help to change those cases and, maybe in the future, we can harden the analysis to be more demanding.

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

I mean, of course, it doesn't have real inter-procedural analysis. But if everything is annotated or conventional - it is identical to inter-procedural case (like in tests with indirect in their names).

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

No, we don't plan on moving it to the CSA. As I mentioned above, for hardening we can simply warn if called_once parameters leak into functions we don't know about (not annotated or not following conventions).

Add one gigantic comment on status kinds and the reasons behind some of the design choices

Your test cases suggest that this is not inter-procedurally checked; it seems to require all of the APIs to be annotated in order to get the same effect that an inter-procedural check could determine on its own.

I mean, of course, it doesn't have real inter-procedural analysis. But if everything is annotated or conventional - it is identical to inter-procedural case (like in tests with indirect in their names).

Okay, I'm glad to know I wasn't misunderstanding something there!

I'm not worried about the leniency and I'm sorry if it sounds like I'm implying that it has to be inter-procedural to land -- that's not the case. What I am worried about is that the way the user runs the frontend is very different than the way the user runs the static analyzer, and migrating the warning from the FE to the static analyzer could leave us with problems. If the intention is that this check will live in the FE as-is (not getting inter-procedural support), then that's fine. But if the intention is to extend the check in the future in a way that might need it to move out of the FE and into the CSA, then I think the check should live in the CSA from the start (even if it doesn't do inter-procedural analysis). It's not clear to me whether "in the first version" implies that you expect a subsequent version to live in the CSA or not.

No, we don't plan on moving it to the CSA. As I mentioned above, for hardening we can simply warn if called_once parameters leak into functions we don't know about (not annotated or not following conventions).

Okay, thank you for verifying. The attribute bits of the patch LG to me (aside from a minor testing request). Someone else should validate the actual analysis bits.

clang/test/SemaObjC/attr-called-once.m
8

Can you also add tests that the attribute accepts no arguments and only appertains to parameters?

vsavchenko added inline comments.Dec 18 2020, 7:48 AM
clang/test/SemaObjC/attr-called-once.m
8

Sure!

vsavchenko marked an inline comment as done.

Add more test cases

NoQ added a comment.Tue, Dec 22, 10:13 PM

Add one gigantic comment on status kinds and the reasons behind some of the design choices

This was freakin' awesome, thanks a lot. With all the background in place, the rest of the patch was a really nice read. I expected it to be a lot more hacky but it turned out very sane to be honest. I only have minor remarks here and there.

clang/include/clang/Analysis/Analyses/CalledOnceCheck.h
33

Hold up, how can this happen at all without path sensitivity? If the loop is not entered then literally nothing happens, so there can't be a call. If there's no call in either case then probably it's just "there's simply no call at all"?

clang/lib/Analysis/CalledOnceCheck.cpp
68
84–85

I think there's nothing wrong with / shameful about escaping, so i wouldn't call the world without escapes a perfect world, it's unnecessarily constrained :)

164

"Can be called" sounds like "Someone's allowed to call it". I was completely confused when i saw a call site for that function.

I too am at loss because the concept of "is definitely potentially called" is hard to explain especially when the concept of "maybe called" is already defined and means something different.

Maybe seenAnyCalls() or something like that?

Maybe also rename DefinitelyCalled and MaybeCalled to CalledOnAllPaths and CalledOnSomePathsOnly? That could eliminate a bit more confusion.

305–308

Technically the same applies to operator &, eg. (&*foo)() is your normal call. Obviously, (&foo)() doesn't compile. That said, if (&foo) compiles and is not your normal null check (but if (&*foo) and if (*foo) are).

322–323

You're potentially dropping some DeclRefExprs. Also the ones that you find aren't necessarily the function pointer variables you're looking for (as you don't actually have any checks in VisitDeclRefExpr). Combined, i suspect that there may be some exotic counterexamples like

if (!foo) {
  // "foo" is found
  if (foo == completion_handler) {
    // ...
  }
}

(which is probably ok anyway)

542–543

This is one of the more subtle facilities, i really want a comment here a lot more than on the NamesCollector. Like, which blocks do you feed into this class? Do i understand correctly that Parent is the block at which we decided to emit the warning? And SuccInQuestion is its successor on which there was no call? Or on which there was a call? I can probably ultimately figure this out if i read all the code but i would be much happier if there was a comment.

561

Why not FunctionCFG(AC->getCFG()) and omit the CFG parameter? That's the same function, right?

781

I'm super used to V and E for Vertices and Edges in the big-O-notation for graph algorithms, is it just me?

823

Did you consider AnyCall? That's a universal wrapper for all kinds of AST calls for exactly these cases. It's not super compile-time but it adds a lot of convenience. (Also uh-oh, its doxygen page seems to be broken). It's ok if you find it unsuitable but i kind of still want to popularize it.

1125
1133

So the contract of this function is "We haven't seen any actual calls, which is why we're wondering", otherwise it's not expected to do what it says it does?

1248

(:

1311

"Successor is has status" doesn't sound right, maybe anySuccessorHasStatus?

1443

I feel really bad bringing this up in this context but i guess it kind of makes sense to canonicalize the type first?

vsavchenko added inline comments.Wed, Dec 23, 5:29 AM
clang/include/clang/Analysis/Analyses/CalledOnceCheck.h
33

You can see examples in tests 😉

clang/lib/Analysis/CalledOnceCheck.cpp
542–543

These are correct questions and I will add a comment, but I want to point out that these questions should be asked not about a private constructor (I always consider those as "don't go there girlfriend!"). The only "entry point" to this class,clarify, has more reasonable parameter names.

561

Right, I thought about it! It is simply how it was done in some other analysis-based warning

781

Yep, it is a rudiment of the previous version of this comment. I'll change it!

823

It doesn't seem to have iteration over arguments.

1125

Oh, didn't notice that one!

1133

Right!

1248

So true 😄

1311

Dern it! I noticed that before, but forgot to change it, thanks!

1443

Will do

vsavchenko marked 13 inline comments as done.

Correct some comments and names

clang/lib/Analysis/CalledOnceCheck.cpp
164

DefinitelyCalled is already a mouthful, so I prefer to keep it together with MaybeCalled. Also I agree about canBeCalled and I like your alternative!

305–308

I mean, we can add case UO_AddrOf:, but I don't think that we need more complex logic (in terms of * and &) here.

322–323

Yep, I thought about it and, honestly, decided to ignore 😊

NoQ accepted this revision.Thu, Dec 24, 12:07 AM

Amazing, thank you. I'm happy with the analysis and i have nothing more to say really :)

clang/lib/Analysis/CalledOnceCheck.cpp
823

If you see some actual benefits and that's the only thing that's holding you back, you should add it. Or is it hard to add for whatever reason?

This revision is now accepted and ready to land.Thu, Dec 24, 12:07 AM
vsavchenko added inline comments.Tue, Jan 5, 7:05 AM
clang/lib/Analysis/CalledOnceCheck.cpp
823

As of now, I prefer not to pay extra runtime costs while I can. When I do the C++ part of it, I'll consider it :)

This revision was landed with ongoing or failed builds.Tue, Jan 5, 7:28 AM
This revision was automatically updated to reflect the committed changes.