This is an archive of the discontinued LLVM Phabricator instance.

[-Wunsafe-buffer-usage] Handle pointer initializations for grouping related variables
ClosedPublic

Authored by t-rasmud on May 12 2023, 3:07 PM.

Details

Summary

This patch introduces a fiaxable for pointer initializations of the type T *p = q so that p and q are added as nodes to the pointer assignment graph introduced in https://reviews.llvm.org/D145739

Diff Detail

Event Timeline

t-rasmud created this revision.May 12 2023, 3:07 PM
Herald added a project: Restricted Project. · View Herald Transcript
t-rasmud requested review of this revision.May 12 2023, 3:07 PM
ziqingluo-90 added inline comments.
clang/lib/Analysis/UnsafeBufferUsage.cpp
727–728

PtrInitExpr is unused. Why do we need to match PtrInitExpr here while we already have PointerInitGadget?

737
2015

Why can we safely remove this check? Did we move this check to somewhere else?

t-rasmud updated this revision to Diff 527135.May 31 2023, 11:15 AM
t-rasmud marked 2 inline comments as done.
ziqingluo-90 added inline comments.Jun 5 2023, 6:19 PM
clang/lib/Analysis/UnsafeBufferUsage.cpp
566

It seems that PointerInitTag never gets to bind a node? It is used in the constructor to initialize the member PI, which is also never used. I suggest we either remove the member or bind the node with the tag.

clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp
28

Did we miss to add expected-note for change type of 'p' ... or we cannot fix p, q or r or some reason (curious)?

t-rasmud marked an inline comment as done.Jun 7 2023, 3:47 PM
t-rasmud added inline comments.
clang/lib/Analysis/UnsafeBufferUsage.cpp
2015

Sorry, this version of the patch was never meant to be reviewed. I uploaded purely to make sure I don't lose those changes.

t-rasmud updated this revision to Diff 529459.Jun 7 2023, 3:48 PM
t-rasmud marked an inline comment as done.

Address comments.

t-rasmud marked an inline comment as done.Jun 7 2023, 3:49 PM
t-rasmud updated this revision to Diff 529767.Jun 8 2023, 4:17 PM

Minor fixes.

t-rasmud added inline comments.Jun 8 2023, 4:28 PM
clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp
28

That's a great observation! You are right that the fixit for p implies a fixit for q due to the assignment statement int *p = q;. However, q is part of another fixable (int *r = q) that doesn't have a fixit. Since p and q need to be fixed either both together or none at all, we do not emit fixits for either (the method impossibleToFixForVar has the check that decides whether a fixit is impossible if any variable in the group doesn't have a fixit.)

More generally, when a variable doesn't have a fixit, it either means that the operation is safe or that the operation isn't handled by our machine. Currently, we over-approximate and do not emit fixits. In this particular test case, we don't know how r will be fixed and therefore can't tell how it affects the fixit for q

t-rasmud retitled this revision from [-Wunsafe-buffer-usage][WIP] Handle pointer initializations for grouping related variables to [-Wunsafe-buffer-usage] Handle pointer initializations for grouping related variables.Jun 12 2023, 10:45 AM
NoQ added a comment.Jun 13 2023, 4:23 PM

This is interesting, your new gadget is a single-variable gadget for the purposes of grouping and DRE claiming, but a multi-variable gadget for the purposes of strategy implications.

In particular it's never put into the bucket of fixables for the LHS variable.

This sounds scary but seems to be fine in this case. Indeed,

  • If the LHS variable is safe (i.e. the strategy is wontfix even after variable grouping), we're not emitting any fixes for it so it doesn't matter what the bucket contains, so everything works well.
    • In the future we may need to implement a .data() fix for the situation where the LHS is safe but RHS isn't. But we aren't actually fixing the LHS in this case, just the RHS, so buckets are still irrelevant.
  • If the LHS variable is unsafe, the RHS variable is brought in through strategy implications, so we choose a span strategy for both variables and run impossibleToFixForVar() on each of them to confirm that the RHS can be fixed, so this scenario works well too.
    • In particular, in the hypothetical situation that both strategies are span but the new gadget refuses to produce the fix (currently it always produces a trivial fix), fixits for the entire group will be correctly abandoned because the RHS is guaranteed to be part of the group.

So I think I'm ok with keeping such clever gadgets legal. Namely, it's ok for a fixable to not declare relation to a variable that participates in a strategy implication, as long as it's the source of the strategy implication. We should probably add an assert to verify that the gadget actually behaves this way - maybe put in the code that iterates over all strategy implications?

Another side effect of such implementation is that when the new gadget isn't found on a variable declaration statement, the declaration of the variable does not constitute an unclaimed use of that variable. The machine still believes that it fully understands the variable. This may or may not be a good thing; ultimately we may need a safeguard against declarations that *may* constitute a strategy implication (but aren't covered by any initialization fixables), so that to avoid suggesting low-quality fixits in such cases. But this isn't a regression; we already suggest such low-quality fixits and the patch only makes it better (https://godbolt.org/z/Mcn3Por5d). So we may want to implement that separately, but this doesn't necessarily need to follow the usual claiming process.

clang/test/SemaCXX/warn-unsafe-buffer-usage-multi-decl-ptr-init.cpp
28

Maybe let's add a // FIXME to the test then?

NoQ added a comment.Jun 13 2023, 4:35 PM

Other than me trying to understand the architectural implications here, I think the patch looks great and almost good to go!

clang/lib/Analysis/UnsafeBufferUsage.cpp
564

Am I reading this right that isInUnspecifiedUntypedContext() is here to avoid situations like

void foo(int *p) P
  // Changing `q` to `span` is quite problematic!
  if (int *q = p) {
    // ...
  }
}

?

Don't we already avoid that when we decide on fixability of variables themselves (https://godbolt.org/z/vGK7Ezo8j)? I suspect it may be unnecessary.

564

(typo, P => {)

t-rasmud updated this revision to Diff 531912.Jun 15 2023, 2:37 PM

Address comments.

t-rasmud marked 4 inline comments as done.Jun 15 2023, 2:38 PM
NoQ accepted this revision.Jun 21 2023, 1:51 PM

We've had an offline discussion about this and I think the right way to do that architecturally is to separate concerns. The new gadget doesn't really participate in fixit generation all that much, it's only there for the strategy implications. We struggle to implement getClaimedVarUseSites() in a way that makes sense, because it doesn't make sense to implement it in the first place. This isn't really a FixableGadget, this is something else, like we can call it StrategyGadget and make getStrategyImplications() a pure virtual method of StrategyGadget while getClaimedVarUseSites() remains in FixableGadget. Then for this gadget we can leave fixit generation to the code that fixes variable declarations.

Same with the assignment gadget in the previous patch; we can split it into a StrategyGadget that returns implication but doesn't claim variables, and a FixableGadget that fixes variable uses depending on the chosen strategy without participating in deciding the strategy. In this case such separation has a valuable effect of removing the UUC check from the StrategyGadget's matcher, so that strategy implications were always visible regardless of our ability to enumerate all possible contexts in which it becomes fixable.

It also plays nicely with the idea that FixableGadget should have always been a very thin abstraction layer on top of RewriteRule. At first we were like "But muh strategy implications!" but now it's quite clear that we shouldn't really be conflating them together.

We should address this separately though. It'll keep biting us, so we *should* address it soon-ish, but the current patch works well and I that shouldn't stop us from having progress, so LGTM!

clang/lib/Analysis/UnsafeBufferUsage.cpp
564

I think they're both statement matchers, no need for an extra layer.

This revision is now accepted and ready to land.Jun 21 2023, 1:51 PM
This revision was landed with ongoing or failed builds.Jun 21 2023, 3:54 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2023, 3:54 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript